Too many fields in MySQL?

I developed a statistics site for a game as a learning project several years ago. It is still in use today, and I would like to clean it up a little.

The database is one area that needs improvement. I have a game statistics table that has GameID, PlayerID, Kill, Deaths, DamageDealt, DamageTaken, etc. In total, this single table has about 50 fields, and many more could be added in the future. At what point is that too many fields? It currently has 57,341 rows and is 153.6 MiB on its own.

I also have several fields in the same table that store arrays in a BLOB. One example is player-versus-player matchups: the array stores how many times this player killed each other player in the game. These are the largest fields in terms of size. Is storing arrays in a BLOB advisable?

The array looks like this:

[Killed] => Array ( [SomeDude] => 13 [GameGuy] => 10 [AnotherPlayer] => 8 [YetAnother] => 7 [BestPlayer] => 3 [APlayer] => 9 [WorstPlayer] => 2 ) 

The arrays never contain more than 10 players.

+6
4 answers

I prefer not to have one table with an open-ended number of columns (with more still to come), but rather a linked table of labels and values: each user has an identifier, and you use that identifier as a key into the label/value table. That way you store only the data each user actually needs. I believe this approach is called EAV (entity-attribute-value, per Triztian's comment), and it is also how medical databases are often stored, since there are many potential fields per patient even though any given patient has actual data in only a few of them.

So you will have

 user: id | username | some_other_required_field
 user_data: id | user_id | label | value

You can now have as many or as few user_data rows as you need for each user.
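A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere. The table and column names follow the answer; the UNIQUE constraint, the `load_stats` helper, and the sample labels and values are assumptions added for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample data below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    CREATE TABLE user_data (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES user(id),
        label   TEXT NOT NULL,
        value   TEXT,
        UNIQUE (user_id, label)  -- assumption: one value per label per user
    );
""")
conn.execute("INSERT INTO user (id, username) VALUES (1, 'SomeDude')")
conn.executemany(
    "INSERT INTO user_data (user_id, label, value) VALUES (?, ?, ?)",
    [(1, "Kills", "13"), (1, "DamageDealt", "4200")],
)


def load_stats(conn, user_id):
    """Return one user's label/value rows as a dict (hypothetical helper)."""
    rows = conn.execute(
        "SELECT label, value FROM user_data WHERE user_id = ?", (user_id,)
    )
    return dict(rows.fetchall())


stats = load_stats(conn, 1)
```

Note that every value comes back as text in this layout, so numeric statistics have to be converted by the application. That is one of the usual trade-offs of a label/value table.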

[edit]

As for your array, I would handle that with a relational table as well. Something like:

 player_interraction: id | player_id | player_id | interraction_type 

Here you would store the two players who had an interaction and what type of interaction it was.
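For the kill-count array from the question, one possible shape is a row per (game, killer, victim) with a count. This is only a sketch: the `player_kills` table, its column names, and the sample IDs are hypothetical, and SQLite (via Python's sqlite3) again stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per killer/victim pair per game,
# instead of a serialized array in a BLOB.
conn.execute("""
    CREATE TABLE player_kills (
        game_id   INTEGER NOT NULL,
        killer_id INTEGER NOT NULL,
        victim_id INTEGER NOT NULL,
        kills     INTEGER NOT NULL,
        PRIMARY KEY (game_id, killer_id, victim_id)
    )
""")
conn.executemany(
    "INSERT INTO player_kills VALUES (?, ?, ?, ?)",
    [(1, 100, 101, 13), (1, 100, 102, 10), (2, 100, 101, 4)],
)


def kills_by_victim(conn, killer_id):
    """Total kills per victim across all games for one killer."""
    rows = conn.execute(
        "SELECT victim_id, SUM(kills) FROM player_kills "
        "WHERE killer_id = ? GROUP BY victim_id ORDER BY victim_id",
        (killer_id,),
    )
    return rows.fetchall()


totals = kills_by_victim(conn, 100)
```

With at most 10 players per game this adds only a handful of rows per player per game, and questions like "who has this player killed most often across all games" become a plain GROUP BY instead of unpacking BLOBs in application code.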

+2

The table design seems mostly fine, as long as the columns you are storing cannot be calculated from other columns in the same row. I.e., you are not storing SelfKills, OtherDeath and TotalDeaths (where TotalDeaths = SelfKills + OtherDeath). That would not make sense, and such a column could be cut from your table.
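To illustrate the point about derived columns: the total can be computed at query time rather than stored. A small sketch, assuming a cut-down `game_stats` table with only the columns named above (the table name and the sample row are made up; SQLite via Python's sqlite3 stands in for MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the two base columns are stored; TotalDeaths is not a column.
conn.execute("""
    CREATE TABLE game_stats (
        GameID     INTEGER NOT NULL,
        PlayerID   INTEGER NOT NULL,
        SelfKills  INTEGER NOT NULL,
        OtherDeath INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO game_stats VALUES (1, 100, 2, 7)")

# The derived value is calculated in the SELECT when it is needed.
row = conn.execute(
    "SELECT SelfKills + OtherDeath AS TotalDeaths "
    "FROM game_stats WHERE GameID = ? AND PlayerID = ?",
    (1, 100),
).fetchone()
total_deaths = row[0]
```

This keeps the stored columns from ever disagreeing with each other, since the sum cannot go stale.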

I would be interested to know more about how you store these arrays in a BLOB. What purpose do they serve there? Why are they not normalized into a table to simplify data manipulation and analytics? (Or are they, and the arrays are simply stored here as well for ease of displaying data to end users?)

Also, I would be curious how much of the table's size comes from the BLOBs and how much from the rest of the columns. Generally speaking, row size matters less than the number of rows, and ~60K rows is not much at all, as long as you are not writing queries that have to check every column value (ideally, you leave the BLOBs out of your WHERE clauses).

+1

With MySQL, you have a hard limit of 4,096 columns (fields) per table and a maximum row size of 65,535 bytes (about 65 KB). If you need to store large strings, use a TEXT field; its contents are stored separately from the rest of the row. BLOBs should really be reserved for non-textual data (if you must use them at all).

Don't worry about the overall size of your db; think about its structure and how it is organized and indexed. I have seen small databases run like crap.

If you still want numbers: start to worry once your total size gets into the GB range or you have a couple of hundred thousand rows in a single table. 150 MB in 60K rows is not much, and table scans will not cost you much in performance. However, now is the time to make sure you create good covering indexes for your heavily used queries.
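As a rough illustration of a covering index: if an index contains every column a query touches, the query can be answered from the index alone. The sketch below uses SQLite through Python's sqlite3, not MySQL, and the table, the `Kills` column, and the index name are hypothetical. In MySQL you would check the same thing with EXPLAIN, where a covered query shows "Using index" in the Extra column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE game_stats (
        GameID      INTEGER NOT NULL,
        PlayerID    INTEGER NOT NULL,
        Kills       INTEGER NOT NULL,
        DamageDealt INTEGER NOT NULL
    )
""")
# The index holds both the filter column and the selected column,
# so the query below never needs to read the table rows themselves.
conn.execute("CREATE INDEX idx_player_kills ON game_stats (PlayerID, Kills)")

query = "SELECT SUM(Kills) FROM game_stats WHERE PlayerID = ?"

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = " ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (100,))
)
```

Printing `plan` shows which index SQLite chose and whether it is used as a covering index. A query that also selected DamageDealt would no longer be covered by this index.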

+1

There is nothing wrong with adding columns to a database table over time. Database designs change all the time. Keep in mind how data is grouped. I have always considered a database table as a collection of similar elements.

The things that I consider are as follows:

When inserting data into a row, how many of the columns will be NULL?
Does this new column apply to 80% of my data that already exists?
Will I be doing frequent updates to several columns in this table?
If so, do I need to keep track of what the previous values were?

Thinking about your data this way, you may find that you need to split the table into several smaller tables linked to each other by foreign keys.

+1

Source: https://habr.com/ru/post/909217/

