This is a design question. I'm collecting a large set of performance data with many key-value pairs: almost everything in /proc/cpuinfo, /proc/meminfo, and /proc/loadavg, plus many other things, from several hundred hosts. Right now I only need to display the latest data point in my user interface. I will probably end up analyzing the collected data to track down performance issues later, but this is a new application, so I don't yet know exactly what I'm looking for.
I could structure the data in the database, with a column for each key I collect. The table would eventually be on the order of 100 columns wide, which would be painful to maintain: I'd have to add a new column every time I start collecting a new stat. But it would make the data easy to sort and analyze with plain SQL, along the lines of the sketch below.
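A minimal sketch of what the structured option might look like (the stat columns here are hypothetical examples; in practice there would be ~100 of them):

    -- One column per collected key; wide but fully typed and queryable.
    CREATE TABLE host_stats (
        host_id      INTEGER   NOT NULL,
        collected_at TIMESTAMP NOT NULL,
        loadavg_1min REAL,    -- from /proc/loadavg
        mem_total_kb BIGINT,  -- from /proc/meminfo
        cpu_mhz      REAL,    -- from /proc/cpuinfo
        -- ...eventually ~100 such columns, one per stat
        PRIMARY KEY (host_id, collected_at)
    );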
Or I could just dump the data into a table as an unstructured blob: maybe three columns, host ID, timestamp, and a serialized version of my key-value array, probably JSON in a TEXT field. Something like the sketch below.
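The unstructured option might look like this (again just a sketch; the table and column names are made up):

    -- Three columns: who, when, and everything else as one serialized blob.
    CREATE TABLE host_stats_blob (
        host_id      INTEGER   NOT NULL,
        collected_at TIMESTAMP NOT NULL,
        stats        TEXT      NOT NULL,  -- serialized key-value array, e.g. JSON
        PRIMARY KEY (host_id, collected_at)
    );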
What should I do? Will I be sorry if I go with the unstructured approach? When I get to the analysis stage, should I just extract the fields of interest into a new, more structured table, as in the sketch below? What trade-offs am I missing here?
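For the extract-later idea, I'm picturing something like this (PostgreSQL syntax; the key name loadavg_1min is hypothetical, and other databases spell their JSON functions differently, e.g. MySQL's JSON_EXTRACT):

    -- Pull one field of interest out of the JSON blobs into a typed table.
    CREATE TABLE loadavg_history AS
    SELECT host_id,
           collected_at,
           (stats::json ->> 'loadavg_1min')::real AS loadavg_1min
    FROM   host_stats_blob;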