During pre-processing, documents uploaded to the search engine should be enriched with related terms that are reasonable and help users find them. For instance, a document containing the string paris can be enriched with french capital, capital of france, ile-de-france, ... For this you need a dictionary. You can take the data from dbpedia.org or, if you only need English, from WordNet. To avoid over-generalizing, you will need to implement some disambiguation (resolving which meaning is intended) at this first stage, since paris, for example, can also resolve to alexandros, alaksandu of wilusa, king of troy, depending on the context.
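As a rough illustration of enrichment with context-based disambiguation, here is a minimal Python sketch. The `SENSES` dictionary, its cue words, and the `enrich` function are all hypothetical stand-ins; in practice the senses and expansions would come from DBpedia or WordNet, and the scoring would likely be more sophisticated than a simple word overlap.

```python
import re

# Hypothetical toy dictionary: term -> list of possible senses.
# Each sense has context cue words and the phrases to add to the document.
# Real data would be loaded from DBpedia or WordNet instead.
SENSES = {
    "paris": [
        {
            "cues": {"france", "french", "seine", "eiffel", "louvre"},
            "expansions": ["french capital", "capital of france", "ile-de-france"],
        },
        {
            "cues": {"troy", "helen", "trojan", "priam", "iliad"},
            "expansions": ["alexandros", "alaksandu of wilusa", "king of troy"],
        },
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def enrich(text):
    """Return expansion phrases for ambiguous terms found in the text.

    A sense is chosen by counting how many of its cue words occur in the
    document. If no sense has any supporting cue, nothing is added, so the
    document is not over-generalized.
    """
    context = set(tokenize(text))
    added = []
    for term, senses in SENSES.items():
        if term not in context:
            continue
        # Pick the sense with the largest cue overlap (first one wins on ties).
        best = max(senses, key=lambda s: len(s["cues"] & context))
        if not best["cues"] & context:
            continue  # no contextual evidence for any sense
        added.extend(best["expansions"])
    return added


print(enrich("Paris lies on the Seine in France"))
print(enrich("Paris abducted Helen of Troy"))
print(enrich("Paris is nice"))
```

The returned phrases would be indexed alongside the original document text. Skipping enrichment when no cue matches is a deliberate choice in this sketch: it trades recall for fewer wrong expansions.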