How to create a searchable, versioned library with varying properties and multiple editors?

What I need:

I am developing a backend for a product library that must meet the following requirements:

  • Several editors will simultaneously edit different elements - there must be some kind of lock at the element level.

  • Widely varying properties per element - there are about 100 subcategories, each of which can have 10+ properties specific to it.

  • The entire repository of elements must be versioned - several changes (insert, update and delete) can be made before publishing the whole set of changes to the site; un-publishing should be possible.

  • I should be able to search across all properties and filter on some of them - that is, find a keyword anywhere in the library, or find all products that meet a set of criteria - within a data set of at least 10 MB (i.e. 5000 items, 2 KB each) and possibly twice as much.

The solution should be either MySQL-specific or, even better, vendor-agnostic.

What I thought:

I am considering using one large XML object containing all the elements (to satisfy 2), stored in the database (to satisfy 3), but this makes 1 impossible and 4 difficult. I have used something like this before, but with smaller XML objects and no element-level locking.

Another solution that I am considering is a classic database solution using a separate table for each subcategory, which makes 1 and 2 trivial, but 3 and 4 are quite complicated. This is also a bit cumbersome given the number of different subcategories and therefore the number of different tables in the database, but I think it can be automated.
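The automation mentioned above could look roughly like the following. This is only a minimal sketch: the `SUBCATEGORIES` mapping, the table names and the column names are all invented for illustration, and SQLite (through Python's built-in `sqlite3` module) stands in for MySQL.

```python
import sqlite3

# Hypothetical subcategory definitions: table name -> {property: SQL type}.
SUBCATEGORIES = {
    "laptops": {"screen_inches": "REAL", "ram_gb": "INTEGER"},
    "chairs": {"material": "TEXT", "max_load_kg": "INTEGER"},
}


def create_table_sql(name, properties):
    """Build a CREATE TABLE statement for one subcategory."""
    columns = ['"id" INTEGER PRIMARY KEY', '"title" TEXT NOT NULL']
    columns += [f'"{prop}" {sql_type}' for prop, sql_type in properties.items()]
    column_list = ", ".join(columns)
    return f'CREATE TABLE "{name}" ({column_list})'


def create_schema(conn, subcategories):
    """Create one table per subcategory from the definitions."""
    for name, properties in subcategories.items():
        conn.execute(create_table_sql(name, properties))


conn = sqlite3.connect(":memory:")
create_schema(conn, SUBCATEGORIES)
conn.execute(
    'INSERT INTO "laptops" (title, screen_inches, ram_gb) VALUES (?, ?, ?)',
    ("Model A", 14.0, 16),
)
```

With about 100 subcategories, the definitions would presumably live in a configuration file or a metadata table rather than in code.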

Another possibility is a hybrid of the two, with one large database table holding all elements. Each row would contain an XML object with all the properties of the element and, additionally, all filterable properties as table columns. This solves 1 and 2 and partially solves 4, but it does not provide full-text search and still makes 3 quite difficult to achieve.
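To make the hybrid layout concrete, here is a minimal sketch. The `elements` table, its columns and the helper functions are assumptions made up for this example, and SQLite (Python's built-in `sqlite3`) stands in for MySQL. Filtering runs against ordinary columns, while the subcategory-specific properties travel in the XML text.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE elements (
        id INTEGER PRIMARY KEY,
        subcategory TEXT NOT NULL,    -- filterable column
        price REAL,                   -- filterable column
        properties_xml TEXT NOT NULL  -- every property of the element
    )
""")


def insert_element(conn, subcategory, price, properties):
    """Serialize the property dict to XML and store it beside the filter columns."""
    root = ET.Element("element")
    for key, value in properties.items():
        ET.SubElement(root, key).text = str(value)
    xml_text = ET.tostring(root, encoding="unicode")
    cur = conn.execute(
        "INSERT INTO elements (subcategory, price, properties_xml) VALUES (?, ?, ?)",
        (subcategory, price, xml_text),
    )
    return cur.lastrowid


def find_by_filter(conn, subcategory, max_price):
    """Filter on the plain columns, then decode the XML of the matching rows."""
    rows = conn.execute(
        "SELECT id, properties_xml FROM elements WHERE subcategory = ? AND price <= ?",
        (subcategory, max_price),
    )
    return [
        (row_id, {child.tag: child.text for child in ET.fromstring(xml_text)})
        for row_id, xml_text in rows
    ]
```

Because each element is its own row, a row-level lock also covers requirement 1; keyword search inside `properties_xml` remains the weak spot.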

If you have made it this far:

I will probably have several weeks to solve this, which should leave enough time for discussion. I will be very grateful for any thoughts and ideas that the SO community can provide. Thanks in advance.

+4
4 answers

Option 2, the classic database solution you describe, is well suited to this case.

It takes care of 1, 2 (a little difficult, but you can overcome most of that by designing a small manager component) and 3.

For point 4, I suggest you look at Apache Solr, which integrates easily with an RDBMS and can index data 100 times faster than SQL.

+3

Couchbase

It meets all your requirements.

  • Uses version-based (optimistic) concurrency control to manage concurrent access to database items and to ensure consistency; explicit locks can also be used to restrict an item to a single client.
  • Flexible schema
  • Versioning is easy to implement: http://blog.couchbase.com/simple-document-versioning-couchdb
  • Supports map/reduce and views for retrieving data (beware that queries are not ad hoc)
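As an illustration of the first point, version-based concurrency control boils down to a compare-and-swap on a version number. The sketch below is a generic in-memory illustration, not the Couchbase API; `VersionedStore`, `ConflictError` and their methods are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale version of an item."""


class VersionedStore:
    """In-memory stand-in for a document store with per-item versions."""

    def __init__(self):
        self._items = {}  # key -> (value, version)

    def get(self, key):
        """Return (value, version) for an existing key."""
        return self._items[key]

    def put(self, key, value, expected_version=0):
        """Write only if the stored version still equals expected_version.

        A key that does not exist yet counts as version 0.
        """
        _, current_version = self._items.get(key, (None, 0))
        if current_version != expected_version:
            raise ConflictError(
                f"{key!r}: expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._items[key] = (value, new_version)
        return new_version
```

An editor reads an item together with its version, edits it, and writes it back with that version; if somebody else saved in the meantime, the write fails and the editor has to reload and retry.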
+2

I think that the hybrid model shows the most promise.

In particular, I would use a relational model for the data that your application should be able to reason about. I would include version control logic and locking in this relational model.
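One way to picture locking and versioning inside the relational model is sketched below. Everything here is an assumption made for illustration: the `locks` and `element_versions` tables, the revision numbering scheme and the helper names are invented, and SQLite (Python's built-in `sqlite3`) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locks (element_id INTEGER PRIMARY KEY, editor TEXT NOT NULL);
    CREATE TABLE element_versions (
        element_id INTEGER NOT NULL,
        revision   INTEGER NOT NULL,
        name       TEXT,
        deleted    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (element_id, revision)
    );
""")


def acquire_lock(conn, element_id, editor):
    """Return True if `editor` holds the element-level lock after the call."""
    try:
        conn.execute(
            "INSERT INTO locks (element_id, editor) VALUES (?, ?)",
            (element_id, editor),
        )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists, so somebody holds the lock.
        row = conn.execute(
            "SELECT editor FROM locks WHERE element_id = ?", (element_id,)
        ).fetchone()
        return row is not None and row[0] == editor


def release_lock(conn, element_id, editor):
    """Drop the lock, but only if `editor` is the one holding it."""
    conn.execute(
        "DELETE FROM locks WHERE element_id = ? AND editor = ?", (element_id, editor)
    )


def save_version(conn, element_id, revision, editor, name, deleted=False):
    """Record an insert, change or delete as a new row in a draft revision."""
    if not acquire_lock(conn, element_id, editor):
        raise PermissionError(f"element {element_id} is locked by another editor")
    conn.execute(
        "INSERT OR REPLACE INTO element_versions VALUES (?, ?, ?, ?)",
        (element_id, revision, name, int(deleted)),
    )


def published_view(conn, published_revision):
    """Latest non-deleted state of each element at or below the given revision."""
    rows = conn.execute(
        """
        SELECT v.element_id, v.name FROM element_versions v
        WHERE v.deleted = 0 AND v.revision = (
            SELECT MAX(revision) FROM element_versions
            WHERE element_id = v.element_id AND revision <= ?)
        ORDER BY v.element_id
        """,
        (published_revision,),
    )
    return dict(rows.fetchall())
```

In this scheme, publishing means moving a single "published revision" number forward, and un-publishing means moving it back; the rows themselves are never overwritten by a later revision.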

I would use XML or similar to store data that the application does not need to reason about.

For search and filtering, I would use a dedicated search engine - something like Lucene. I would update the Lucene index as part of the "publish new version" procedure. Of course, you could use your database server's full-text search instead of Lucene.

I would not use the same data model for "transactional" logic and for search / filtering - these are different tasks, and a single model makes it difficult to deal with a large number of schema changes.
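To show what a separate search model means in practice, here is a toy inverted index that would be rebuilt from the published snapshot. It is a plain-Python stand-in for what Lucene or Solr do far more thoroughly; `tokenize`, `build_index` and `search` are hypothetical helpers, not part of any library.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(elements):
    """Map each token to the ids of the elements whose properties contain it.

    `elements` has the shape {element_id: {property_name: value}}.
    """
    index = defaultdict(set)
    for element_id, properties in elements.items():
        for value in properties.values():
            for token in tokenize(str(value)):
                index[token].add(element_id)
    return index


def search(index, query):
    """Return the ids of elements that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

Since the index is derived data, it can be thrown away and rebuilt on every publish, so schema changes in the transactional model never have to be migrated into it.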

+1

It seems like you are boxing yourself in by heading straight for a relational database or an XML file for persistence. Have you considered NoSQL or polyglot persistence? There are many different NoSQL options with various strengths and weaknesses. Martin Fowler recently published an excellent overview of NoSQL databases and polyglot persistence, which you can find by searching the web.

I have no personal experience outside of using a relational database for persistence, but I have been reading about NoSQL and polyglot persistence, and I feel like coding up a solution to play with the concept.

Hope this helps.

0

Source: https://habr.com/ru/post/1394819/

