Combining part of a query with Lucene and part in a database (MySQL)

I have an application that filters and retrieves results from a list of articles. I use MySQL as the database and NHibernate as the ORM. The query also performs keyword-based full-text search, for which it uses Lucene.Net.

My problem is that the query spans both domains. For example, I might need all articles that contain the keyword "traffic control" and have PublishedOn < 2012-10-01. The query is also paginated, for example page 2 with a page size of 50. The question is how to build a query that uses both MySQL (for the PublishedOn part) and Lucene.Net (for the full-text part).

If I query MySQL first, I can't just take the first 50 rows, because Lucene may filter the results further and I still need 50 per page. The same thing happens if I start with Lucene.Net. In addition, results should preferably be ordered by relevance, which Lucene can do and MySQL cannot.

My current approach is to filter in MySQL first and retrieve ALL the primary keys of the matching records. I then run the Lucene query with an extra clause that restricts the primary key to that list. However, Lucene is very slow for this kind of query, and the database can contain more than 200,000 entries. Such a query takes a long time to complete in Lucene, whereas plain full-text searches are extremely fast.

Any ideas on how to solve this problem?

1 answer

Lucene is not just a full-text search engine. You can add the PublishedOn property to the Lucene document as a field and run the whole query in Lucene:

Text:"traffic control" AND PublishedOn:[00000000 TO 20121001] 

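See the "Range Searches" section of the Lucene query syntax documentation. Range terms are compared as strings, so PublishedOn has to be indexed in a sortable format such as yyyyMMdd, and the square brackets make both bounds inclusive.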
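As a rough illustration of the idea, here is a minimal sketch in plain Python (standard library only, no Lucene calls) that builds the query string shown above and works out which slice of the ranked hits belongs to a given page. The helper names `to_index_date`, `build_query` and `page_window` are made up for this example, and it assumes the date was indexed as a yyyyMMdd string in a field named PublishedOn.

```python
from datetime import date
from typing import Tuple


def to_index_date(d: date) -> str:
    """Format a date as yyyyMMdd so that string order matches date order."""
    return d.strftime("%Y%m%d")


def build_query(phrase: str, latest: date) -> str:
    """Build a query-syntax string: phrase match AND inclusive date range.

    Field names Text and PublishedOn follow the query in the answer;
    both helper and parameter names are hypothetical.
    """
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return 'Text:"{}" AND PublishedOn:[00000000 TO {}]'.format(
        escaped, to_index_date(latest)
    )


def page_window(page: int, page_size: int) -> Tuple[int, int]:
    """Return (skip, end) for a 1-based page over relevance-ranked hits."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    skip = (page - 1) * page_size
    return skip, skip + page_size


if __name__ == "__main__":
    print(build_query("traffic control", date(2012, 10, 1)))
    print(page_window(2, 50))
```

With everything in one Lucene query, paging becomes simple: ask the searcher for the top `end` hits in relevance order and skip the first `skip` of them, then load only those primary keys from MySQL. Because the range is inclusive, a strict "before 2012-10-01" filter would pass the previous day as the upper bound.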


Source: https://habr.com/ru/post/1436867/

