I am currently working on a project centered around the medical nomenclature SNOMED. At the heart of SNOMED are three relational datasets of roughly 350,000, 1.1 million, and 1.3 million records. We want to be able to query this data quickly on the data-entry side, where we would like some form of autocompletion / suggestion.
It currently lives in a MySQL MyISAM DB for dev purposes only, but we want to start experimenting with in-memory options. The tables are currently 30 MB + 90 MB + 70 MB including indexes. The MySQL MEMORY engine and memcached are the obvious candidates, so my question is: which of these would you suggest, or is there something better?
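Whichever cache is chosen, the application-side access pattern would likely be cache-aside: check the cache first, fall back to MySQL on a miss, then populate the cache. A minimal Python sketch, where a plain dict stands in for a memcached client and `fetch_from_mysql` is a hypothetical placeholder for the real SNOMED query:

```python
# Cache-aside lookup: try the cache first, fall back to MySQL on a miss.
# A plain dict stands in for the memcached client here; with a real client
# the get/set calls would take the same shape (plus an expiry time).

cache = {}

def fetch_from_mysql(prefix):
    # Hypothetical placeholder for the real MySQL query against SNOMED.
    return ["Myocardial infarction (disorder)"]

def lookup(prefix):
    key = "snomed:" + prefix.lower()
    hit = cache.get(key)
    if hit is not None:
        return hit                     # cache hit: no DB round trip
    rows = fetch_from_mysql(prefix)
    cache[key] = rows                  # with memcached: mc.set(key, rows, time=3600)
    return rows
```

Since the dataset is read-mostly and small enough to fit in RAM, warming the cache once and serving suggestions from it should remove nearly all query latency.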
We work primarily in Python at the application level, if that matters. We also run on a single small dedicated server, which will soon be upgraded to 4 GB of DDR2.
Edit: Additional Information
We are interested in fast suggestion and auto-completion, and want to serve those kinds of queries well. Each term in SNOMED usually has several synonyms, abbreviations, and a preferred name. We will be querying this dataset intensively (it is 90 MB including the index). We are also considering building an inverted index to speed things up and return more relevant results, since many of the terms are long (e.g., "Entire spiral artery of decidua basalis (body structure)"). Lucene or some other full-text search engine may be appropriate.
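The inverted-index idea can be sketched in a few lines of Python: map each lowercase token to the set of descriptions containing it, then answer a multi-word query by intersecting the postings. This is a toy sketch (the three sample descriptions and the tokenizer are illustrative, not the real SNOMED tables), but it is the core of what Lucene would do at scale:

```python
from collections import defaultdict

# Sample SNOMED-style descriptions; in practice these come from the
# description table, including synonyms and preferred names.
terms = [
    "Entire spiral artery of decidua basalis (body structure)",
    "Myocardial infarction (disorder)",
    "Heart structure (body structure)",
]

# Build the inverted index: token -> set of descriptions containing it.
index = defaultdict(set)
for term in terms:
    for token in term.lower().replace("(", " ").replace(")", " ").split():
        index[token].add(term)

def search(query):
    """Return descriptions containing every token of the query."""
    postings = [index[token] for token in query.lower().split()]
    return set.intersection(*postings) if postings else set()
```

For prefix-style autocomplete you would index token prefixes (or use a trie) rather than whole tokens, but the intersect-the-postings query model stays the same.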