Traditionally, an inverted index is written directly to a file and stored on disk. If you want to run boolean retrieval queries (a document either contains all the words in the query or it does not), the postings might look like this, stored contiguously in the file:
Term_ID_1: Frequency_N: Doc_ID_1, Doc_ID_2, Doc_ID_N.Term_ID_2: Frequency_N: Doc_ID_1, Doc_ID_2, Doc_ID_N.Term_ID_N: Frequency_N: Doc_ID_1, Doc_ID_2, Doc_ID_N
The term ID is the identifier of the term, the frequency is the number of documents the term appears in (in other words, the length of the postings list), and each doc ID identifies a document containing the term.
Along with the index, you need to know where everything is in the file, so mappings also have to be stored somewhere, in another file. For example, given a term_id, the map should return the file position of that term's entry, and you can then seek to that position. Since the frequency is recorded in the postings, you know how many doc_ids to read from the file. In addition, there need to be mappings from the IDs to the actual term and document names.
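As a rough illustration of the layout described above, here is a minimal Python sketch. The binary record format (4-byte little-endian unsigned integers), the function names write_index, read_postings and boolean_and, and the in-memory offsets dictionary are all assumptions made for the example; a real index would also persist the offset map and the ID-to-name mappings in separate files.

```python
import struct

def write_index(path, postings):
    """Write postings contiguously and return {term_id: byte offset}.

    Assumed record layout: term_id, frequency, then `frequency` doc_ids,
    each stored as a 4-byte little-endian unsigned int.
    """
    offsets = {}
    with open(path, "wb") as f:
        for term_id, doc_ids in sorted(postings.items()):
            ids = sorted(set(doc_ids))
            offsets[term_id] = f.tell()
            f.write(struct.pack("<II", term_id, len(ids)))
            f.write(struct.pack(f"<{len(ids)}I", *ids))
    return offsets

def read_postings(path, offsets, term_id):
    """Seek to the term's record and read exactly `frequency` doc_ids."""
    with open(path, "rb") as f:
        f.seek(offsets[term_id])
        stored_id, freq = struct.unpack("<II", f.read(8))
        if stored_id != term_id:
            raise ValueError("offset map does not match the postings file")
        return list(struct.unpack(f"<{freq}I", f.read(4 * freq)))

def boolean_and(path, offsets, term_ids):
    """Return the sorted doc_ids that contain every term in term_ids."""
    result = None
    for term_id in term_ids:
        if term_id in offsets:
            docs = set(read_postings(path, offsets, term_id))
        else:
            docs = set()  # unknown term: no document can match
        result = docs if result is None else result & docs
    return sorted(result or [])
```

Because the frequency is stored in front of each list, the reader never has to scan for a delimiter: it seeks once and reads a known number of bytes.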
If you have a small use case, you may be able to pull this off with SQL, using a blob for each postings list and handling the intersection yourself at query time.
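One way this could look, sketched with Python's built-in sqlite3 module. The table name postings, its columns, the packed 4-byte integer blob encoding, and the helper names build_sql_index, lookup and sql_search are assumptions for the example, not a prescribed schema.

```python
import sqlite3
import struct

def build_sql_index(conn, postings):
    """Store each term's sorted doc_ids as one packed blob."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "term TEXT PRIMARY KEY, doc_ids BLOB NOT NULL)"
    )
    for term, doc_ids in postings.items():
        ids = sorted(set(doc_ids))
        blob = struct.pack(f"<{len(ids)}I", *ids)
        conn.execute(
            "INSERT OR REPLACE INTO postings (term, doc_ids) VALUES (?, ?)",
            (term, blob),
        )
    conn.commit()

def lookup(conn, term):
    """Fetch and decode one postings list; unknown terms give an empty set."""
    row = conn.execute(
        "SELECT doc_ids FROM postings WHERE term = ?", (term,)
    ).fetchone()
    if row is None:
        return set()
    blob = row[0]
    return set(struct.unpack(f"<{len(blob) // 4}I", blob))

def sql_search(conn, terms):
    """Intersect the postings lists in application code, not in SQL."""
    sets = [lookup(conn, term) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []
```

The database only acts as a key-value store here; the AND logic stays in the application, which is what makes this practical only for small collections.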
Another strategy for a very small use case is to use a term-document matrix.
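A minimal sketch of that idea in plain Python, assuming whitespace tokenization and lowercasing (both simplifications chosen for the example). Rows are terms, columns are documents, and a cell is True when the term occurs in the document. The names build_matrix and matrix_search are hypothetical.

```python
def build_matrix(docs):
    """Return (row_of, matrix) for a list of document strings."""
    vocab = sorted({word for text in docs for word in text.lower().split()})
    row_of = {term: i for i, term in enumerate(vocab)}
    matrix = [[False] * len(docs) for _ in vocab]
    for col, text in enumerate(docs):
        for word in text.lower().split():
            matrix[row_of[word]][col] = True
    return row_of, matrix

def matrix_search(row_of, matrix, n_docs, terms):
    """AND the rows of all query terms; return matching document indices."""
    hits = [True] * n_docs
    for term in terms:
        if term in row_of:
            row = matrix[row_of[term]]
        else:
            row = [False] * n_docs  # unknown term matches nothing
        hits = [h and r for h, r in zip(hits, row)]
    return [i for i, hit in enumerate(hits) if hit]
```

The matrix needs one cell per (term, document) pair, so memory grows with vocabulary size times collection size, which is why this only suits very small collections.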
CleoR