I have a large, complex, outdated relational database containing our user data. I want to build an application that segments users into groups according to various criteria (e.g. show everyone who weighs more than 200 pounds and wears a red shirt). Queries will be composed of predefined, parameterized predicates (think of the message rule interface in Outlook or Gmail). Completely ad-hoc queries will be rare.
Building SQL queries on the source data is impractical due to the complexity of the legacy schema.
The first naive idea would be to denormalize the data that will be used for segmentation into a very wide table in the RDBMS:
id | hat size | shirt color | weight | ....
123 | 7 | blue | 175 |
456 | 6 | red | 205 |
But this is not very attractive, because the data will be sparse, and the columns will change quite often (weekly?). Schema changes are logistically difficult in my environment.
I could normalize the table into a simple key/value table, but at this point, NoSQL becomes interesting.
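To make the key/value option concrete, here is a minimal sketch of what I have in mind, using an in-memory SQLite database. The table name `user_attr`, its columns, and the helper `heavy_users_in_shirt` are made up for illustration and are not part of the legacy schema. Each predicate costs one self-join, which is part of why this layout does not excite me.

```python
import sqlite3

# Hypothetical key/value (entity-attribute-value) layout; names are
# illustrative only and not taken from the legacy schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_attr (user_id INTEGER, attr TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO user_attr VALUES (?, ?, ?)",
    [
        (123, "hat size", "7"),
        (123, "shirt color", "blue"),
        (123, "weight", "175"),
        (456, "hat size", "6"),
        (456, "shirt color", "red"),
        (456, "weight", "205"),
    ],
)

# One self-join per predicate: weight > ? AND shirt color = ?
SEGMENT_SQL = """
SELECT w.user_id
FROM user_attr AS w
JOIN user_attr AS s ON s.user_id = w.user_id
WHERE w.attr = 'weight' AND CAST(w.value AS INTEGER) > ?
  AND s.attr = 'shirt color' AND s.value = ?
"""

def heavy_users_in_shirt(min_weight, color):
    """Return ids of users heavier than min_weight wearing the given color."""
    rows = conn.execute(SEGMENT_SQL, (min_weight, color)).fetchall()
    return [user_id for (user_id,) in rows]

print(heavy_users_in_shirt(200, "red"))  # [456]
```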
So here is my question:
Would a document database such as MongoDB or CouchDB be suitable for this use case?
I don't have huge amounts of data (10 million rows, 300 or so columns in a hypothetical denormalized table). Writes are fairly infrequent (10,000 per day). Queries will be run several times a day, and response times should be in the seconds.
I spent the last couple of days looking at various NoSQL approaches, and document-oriented databases seem to me the most suitable. Feel free to suggest a better approach.
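Roughly, I picture each predefined predicate producing a fragment of a document-style filter, with the fragments ANDed together. The sketch below is only an assumption about how that could look: the `PREDICATES` registry, `build_filter`, and the tiny `matches` evaluator are hypothetical, and the filter merely borrows the shape of MongoDB's `$and` and `$gt` operators so it can run without a database. Missing fields simply fail the predicate, which suits sparse data.

```python
# Hypothetical predefined predicates; each returns a MongoDB-style filter fragment.
PREDICATES = {
    "weighs_more_than": lambda pounds: {"weight": {"$gt": pounds}},
    "shirt_color_is": lambda color: {"shirt color": color},
}

def build_filter(rules):
    """rules: list of (predicate_name, argument) pairs, ANDed together."""
    return {"$and": [PREDICATES[name](arg) for name, arg in rules]}

def matches(doc, flt):
    """Stand-in evaluator covering only $and, $gt and plain equality."""
    for field, cond in flt.items():
        if field == "$and":
            if not all(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            # Only the $gt operator is handled in this sketch.
            if not (field in doc and doc[field] > cond["$gt"]):
                return False
        elif doc.get(field) != cond:
            return False
    return True

# Sparse documents: a user only carries the attributes that are known.
users = [
    {"_id": 123, "hat size": 7, "shirt color": "blue", "weight": 175},
    {"_id": 456, "hat size": 6, "shirt color": "red", "weight": 205},
    {"_id": 789, "shirt color": "red"},
]

segment = build_filter([("weighs_more_than", 200), ("shirt_color_is", "red")])
print([u["_id"] for u in users if matches(u, segment)])  # [456]
```

With a real document store, the dictionary returned by `build_filter` would presumably be handed to the database's query call instead of the stand-in `matches`.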
Bonus question: _Do the benefits of a document db justify the overhead of introducing new technology into our data centers?_
I mean, I could probably satisfy the performance requirements with an existing relational database, but I'm interested in dipping into NoSQL waters, because I have other applications in the pipeline where a document-oriented database will really pay off, and I would like to get my feet wet with a simple application first.