Structured or unstructured data in a database

The question is one of design. I am collecting a large amount of performance data with a lot of key-value pairs: almost everything in /proc/cpuinfo, /proc/meminfo, /proc/loadavg, plus many other things, from several hundred hosts. Right now, I just need to display the latest sample in my user interface. I will probably end up analyzing the collected data to track down performance issues in the future, but this is a new application, so I'm not sure yet exactly what I'll be looking for.

I could structure the data in the database, with a column for each key that I collect. The table would end up O(100) columns wide, it would be painful to load into the database, and I would need to add a new column whenever I start collecting a new stat. But it would be easy to sort and analyze the data using plain SQL.
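To make the trade-off concrete, here is a minimal sketch of the wide-table option using Python's built-in `sqlite3` module. The table name `host_stats` and the column names are hypothetical, and only three of the roughly one hundred columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide table: one column per collected key (three shown).
conn.execute("""
    CREATE TABLE host_stats (
        host_id      INTEGER NOT NULL,
        ts           INTEGER NOT NULL,
        load_avg_1m  REAL,
        mem_total_kb INTEGER,
        mem_free_kb  INTEGER
    )
""")
conn.executemany(
    "INSERT INTO host_stats VALUES (?, ?, ?, ?, ?)",
    [(1, 1000, 0.42, 8000000, 2000000),
     (2, 1000, 3.10, 16000000, 500000)],
)

# Collecting a new stat requires a schema change.
conn.execute("ALTER TABLE host_stats ADD COLUMN swap_free_kb INTEGER")
column_names = [row[1] for row in conn.execute("PRAGMA table_info(host_stats)")]

# Sorting and analysis, on the other hand, is plain SQL.
busiest = conn.execute(
    "SELECT host_id FROM host_stats ORDER BY load_avg_1m DESC LIMIT 1"
).fetchone()[0]
```

Rows inserted before the `ALTER TABLE` simply hold NULL in the new column.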

Or I could just dump the data into a table as an unstructured blob: maybe three columns (host id, timestamp, and a serialized version of my array), probably as JSON in a TEXT field.
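A rough sketch of the three-column blob option, again with `sqlite3`. The names `raw_stats`, `store_sample`, and `latest_sample` are made up for illustration, and the sample keys are only examples.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE raw_stats (
        host_id INTEGER NOT NULL,
        ts      INTEGER NOT NULL,
        payload TEXT    NOT NULL  -- JSON-serialized key-value pairs
    )
""")

def store_sample(host_id, ts, stats):
    """Serialize the whole dict; new keys need no schema change."""
    conn.execute(
        "INSERT INTO raw_stats VALUES (?, ?, ?)",
        (host_id, ts, json.dumps(stats)),
    )

def latest_sample(host_id):
    """Return the most recent sample for a host, or None if there is none."""
    row = conn.execute(
        "SELECT payload FROM raw_stats WHERE host_id = ? "
        "ORDER BY ts DESC LIMIT 1",
        (host_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None

store_sample(1, 1000, {"MemFree": 2000000, "loadavg_1m": 0.42})
store_sample(1, 1060, {"MemFree": 1900000, "loadavg_1m": 0.55, "SwapFree": 100})
```

The second sample carries an extra key without any change to the table, which is the main appeal of this layout. The cost is that the database cannot sort or filter on individual stats without parsing the payload.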

What should I do? Will I be sorry if I go with the unstructured approach? When it comes time to analyze, should I just extract the fields I'm interested in and create a new, more structured table? What trade-offs am I missing here?
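The "extract the fields of interest later" idea could look roughly like the sketch below. It assumes the blobs are stored as JSON text; the table names `raw_stats` and `mem_free` and the helper `extract_field` are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_stats (host_id INTEGER, ts INTEGER, payload TEXT)")
conn.execute("CREATE TABLE mem_free (host_id INTEGER, ts INTEGER, mem_free_kb INTEGER)")
conn.executemany(
    "INSERT INTO raw_stats VALUES (?, ?, ?)",
    [
        (1, 1000, json.dumps({"MemFree": 2000000, "loadavg_1m": 0.42})),
        (1, 1060, json.dumps({"MemFree": 1500000, "loadavg_1m": 0.90})),
        (2, 1000, json.dumps({"loadavg_1m": 0.10})),  # key not collected here
    ],
)

def extract_field(key):
    """Copy one key out of every blob into the structured table."""
    extracted = []
    for host_id, ts, payload in conn.execute(
        "SELECT host_id, ts, payload FROM raw_stats"
    ).fetchall():
        stats = json.loads(payload)
        if key in stats:  # skip samples that lack the key
            extracted.append((host_id, ts, stats[key]))
    conn.executemany("INSERT INTO mem_free VALUES (?, ?, ?)", extracted)
    return len(extracted)

copied = extract_field("MemFree")

# Once extracted, ordinary SQL aggregates work again.
avg_free = conn.execute(
    "SELECT AVG(mem_free_kb) FROM mem_free WHERE host_id = 1"
).fetchone()[0]
```

Every blob has to be read and parsed once per extraction, so this step gets slower as history accumulates.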

+3
5 answers

Thanks for your suggestions.

Having thought about this problem some more, I decided to go with a two-pronged approach. One table holds the most recent raw data dump, in the same JSON format in which I originally receive it. I use this to display the latest statistics, which is the most common use case, and it would be silly to parse out all the fields from the dump just to assemble them again whenever someone wants to see the current status.
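A minimal sketch of that first table, assuming one row per host that is overwritten on every new dump. The names `latest_stats`, `record_latest`, and `current_status` are hypothetical, and `INSERT OR REPLACE` is SQLite-specific syntax.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE latest_stats (
        host_id INTEGER PRIMARY KEY,  -- one row per host
        ts      INTEGER NOT NULL,
        payload TEXT    NOT NULL      -- raw JSON dump, stored as received
    )
""")

def record_latest(host_id, ts, stats):
    """Overwrite the host's previous dump with the newest one."""
    conn.execute(
        "INSERT OR REPLACE INTO latest_stats (host_id, ts, payload) "
        "VALUES (?, ?, ?)",
        (host_id, ts, json.dumps(stats)),
    )

def current_status(host_id):
    """Return the latest dump for display, with no per-field reassembly."""
    row = conn.execute(
        "SELECT payload FROM latest_stats WHERE host_id = ?", (host_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

record_latest(1, 1000, {"loadavg_1m": 0.42})
record_latest(1, 1060, {"loadavg_1m": 0.55})
row_count = conn.execute("SELECT COUNT(*) FROM latest_stats").fetchone()[0]
```

Because the table never grows beyond one row per host, the status page reads a single row by primary key.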

, , ( ). .

EAV, , . (40-way JOIN ), , .

0

, SQL- , min/max/avg, , , 100+. , .

, , 100 .

, Entity-Attribute-Value antipattern - /, . / , , , EAV. SQL, .

+3

,

performance_data

        host_id
        key
        value
        timestamp

- . .
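For reference, the four-column `performance_data` layout listed above can be sketched as follows. The column types and sample rows are assumptions, and the pivot query uses conditional aggregation as one way to turn keys back into columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the listing; the types are assumed.
conn.execute("""
    CREATE TABLE performance_data (
        host_id     INTEGER NOT NULL,
        "key"       TEXT    NOT NULL,
        "value"     TEXT,
        "timestamp" INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO performance_data VALUES (?, ?, ?, ?)",
    [
        (1, "MemFree", "2000000", 1000),
        (1, "loadavg_1m", "0.42", 1000),
        (2, "MemFree", "500000", 1000),
        (2, "loadavg_1m", "3.10", 1000),
    ],
)

# Each stat that should appear as a column needs its own CASE expression
# (or its own self-join), and values must be cast back from text.
rows = conn.execute("""
    SELECT host_id,
           MAX(CASE WHEN "key" = 'MemFree'
                    THEN CAST("value" AS INTEGER) END) AS mem_free,
           MAX(CASE WHEN "key" = 'loadavg_1m'
                    THEN CAST("value" AS REAL) END)    AS load_1m
    FROM performance_data
    WHERE "timestamp" = 1000
    GROUP BY host_id
    ORDER BY host_id
""").fetchall()
```

New keys need no schema change here, but the query grows by one expression for every stat that is reported side by side.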

0

: .

cpuinfo, meminfo, loadavg .. , miscellaneous_stats, , " ".

:

  • .
  • , . meminfo. , .
  • . cpuinfo, , One Big Table 1-15 94.
  • . , cpuinfo , meminfo.

stats_runs , HOST, TIMESTAMP .., .

, :

  • ( , , ?).
  • SQL , .
0

Source: https://habr.com/ru/post/1742373/

