Many columns in a table

The table in our diagram already contains 100 columns. We need to add another 600 columns if we follow the horizontal method of data storage. If we use a vertical data warehouse, which means creating a new table and creating referential integrity with a table containing 100 columns, there will be a problem joining the table, since a table containing 100 columns has 53 million records, and the new table created will have much more than that. Which approach is better in this case.

I would like to add an interesting test case here. I added 600 columns to my table, already having 87 columns and 53 million records. Then I tried updating it in batches

a> Time spent updating 1000 records β†’ 2.10 sec. b> Time required to update 10,000 records β†’ 5.57 sec. c> Time spent on updating 1,000,000 records β†’ 5.42 min. d> Time spent updating 53 million records β†’ 4. 5 hours (exhausted table space and we needed to expand the table space)

Can anyone suggest a faster update method?

+3
source share
5 answers

Questions you need to ask yourself:

  • Most of the fields in my row have default values ​​or empty values? If this is the case, the vertical layout may be more appropriate.
  • , , ? , , , , , , , , 1:1 .
+2

: , . , , . ! , , , ( , , ). , - .

53 ? , , MySQL/SQL, .

600+ . , , " "? , .

+1

... , , 100 , 53 , normalized

, . , , (, 1000 1000 53 . , , , ). , 53 , , , , . . .

, . , . .

+1

, .

, XML-, db...

0

, , HBase (http://hadoop.apache.org/hbase/), , , Google.

0

Source: https://habr.com/ru/post/1719503/


All Articles