Recommendations for using lucene.net in a web service application?

I have just started reading about Lucene.net, and I would like some of my REST-based web services to use its powerful search tools.

However, I came across a post saying that I have to create a Windows service (with WCF) to do all the searching, Lucene indexing, etc., because IIS recycles the application pool, which causes all kinds of locking problems.

My question is: is this correct? If so, is there another way to solve this problem without creating a Windows service (with WCF)? Also, since I have REST-based services, wouldn't calling a Windows service over WCF from these services slow things down?

+4
2 answers

Indexing

As you will have read, indexing is done using the IndexWriter class. Lucene only allows one IndexWriter instance to be open on an index at a time. While a writer is in use it holds a lock (by default, a lock file created in the index directory), and any further instances of IndexWriter are prevented from being created. For this reason, it may be better to implement indexing in a process that you have more control over.

If the indexing process is terminated with extreme prejudice and your IndexWriter is not closed, the lock on your index folder remains in place and no further instances are allowed. Because of this, Lucene allows you to unlock the index folder (using IndexWriter.unlock ). This is a dangerous method, because if two IndexWriters end up open on the same index, the index will be corrupted. If you have a Windows service that performs indexing, and it is the only process in your solution that performs indexing (and any updates), you can safely unlock the index folder when the service starts. In a web-service-based environment, where you index from a web method, managing and recovering from locking problems becomes problematic.
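The locking behaviour described above can be illustrated with a small stand-in. This is only a sketch in Python using an exclusive lock file; it is not the Lucene.Net API, and the names `SingleWriter`, `WriterLockedError`, `clear_stale_lock` and the lock file name are assumptions made up for the example.

```python
import os
import tempfile

LOCK_NAME = "write.lock"  # hypothetical lock file name for this sketch


class WriterLockedError(Exception):
    """Raised when another writer already holds the index lock."""


class SingleWriter:
    """Stand-in for an index writer that holds an exclusive lock file."""

    def __init__(self, index_dir):
        self.lock_path = os.path.join(index_dir, LOCK_NAME)
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            raise WriterLockedError(self.lock_path)
        os.close(fd)
        self.closed = False

    def close(self):
        # A clean shutdown releases the lock.
        if not self.closed:
            os.remove(self.lock_path)
            self.closed = True


def clear_stale_lock(index_dir):
    """Remove a leftover lock. Only safe if this process is the sole writer."""
    lock_path = os.path.join(index_dir, LOCK_NAME)
    if os.path.exists(lock_path):
        os.remove(lock_path)
        return True
    return False


index_dir = tempfile.mkdtemp()
crashed = SingleWriter(index_dir)  # simulates a writer that was never closed

try:
    SingleWriter(index_dir)
    second_writer_blocked = False
except WriterLockedError:
    second_writer_blocked = True

# What a dedicated indexing service could do once, at startup.
stale_lock_removed = clear_stale_lock(index_dir)
recovered = SingleWriter(index_dir)
recovered.close()
```

The point of the sketch is the ownership rule: clearing the lock is only reasonable when a single, known process does all the writing, which is easy to guarantee in a dedicated service and hard to guarantee across web requests.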

Search

The IndexSearcher class is used for searching. This read-only access can be done from your services-based code. I do not think it is necessary to create a separate set of WCF methods for this.

Optimization

An index may need periodic optimization for performance, depending on volumes. Once again, with indexing in a separate process, you can schedule optimization nightly, weekly, or whenever it is needed. Optimization is done by calling a single method.

Indexing New Data

How and when the indexing process picks up new data: I don't know what data you are indexing, so it's hard to say. In my scenario, I have WCF methods that are responsible for taking in data, and a large amount of it. I require that the data that has been received be searchable as soon as possible. So:

There is a notification layer in my model layer. When new records of the required type have been successfully committed, it inserts a simple notification message into a local MSMQ queue.

The reason for MSMQ is that the queue is persistent and transactional, so any messages in it are still available after a system failure or reboot, which allows me to never (cough!) lose any messages.

The indexing service picks up a notification, builds a Lucene Document, and indexes the data.

The indexing service can also be started for a full reindex, deleting the existing index and crawling the DB.
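The flow above (notify on commit, drain the queue, build a document, index it, or rebuild everything) can be sketched roughly as follows. This is an illustration only: Python's in-memory `queue.Queue` stands in for MSMQ and is neither persistent nor transactional, a plain dict stands in for the Lucene index, and every name here (`notify_saved`, `build_document`, `drain_queue`, `full_reindex`, the sample `database`) is hypothetical.

```python
import queue

# Stand-ins: a real setup would use MSMQ and a Lucene index.
notifications = queue.Queue()
index = {}  # record id -> indexed "document"
database = {1: "first record", 2: "second record"}  # hypothetical data


def notify_saved(record_id):
    """Called by the model layer after a record has been committed."""
    notifications.put({"id": record_id})


def build_document(record_id):
    """Turn a stored record into the fields we want to search on."""
    return {"id": record_id, "text": database[record_id]}


def drain_queue():
    """Index every pending notification; returns how many were processed."""
    processed = 0
    while True:
        try:
            message = notifications.get_nowait()
        except queue.Empty:
            return processed
        index[message["id"]] = build_document(message["id"])
        processed += 1


def full_reindex():
    """Drop the existing index and rebuild it by crawling the database."""
    index.clear()
    for record_id in database:
        index[record_id] = build_document(record_id)


notify_saved(1)
processed = drain_queue()
```

Keeping the notification message tiny (just an id) means the indexing service always reads the current state of the record from the database when it builds the document.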

EDIT:

Architecture Example:

WCF services take in the data and pass it to the model layer. The model layer notifies a listener that a CRUD operation on the items succeeded. The listener sends a notification to the queue.

The Windows service handles indexing by watching the queue for indexing requests.

ASP.NET provides the user interface with the search functions.

+4

You can simply disable application pool recycling and host your application / service in IIS. To disable recycling on configuration changes, use the disallowRotationOnConfigChange setting.

You can also split the application into two parts: index updates and search queries.

Host the index updates in a Windows service, and let the IIS part handle searching (read-only). You would do this by setting up a mechanism that detects index updates and refreshes the IndexSearchers . This way, heavy indexing load on the service does not affect search time, which is the important aspect for users. With this setup, you can even update the index on a master node and distribute queries across different web servers in a farm. The only drawback is that you do not get the near-real-time search functions that are built into the IndexWriter class.

http://wiki.apache.org/lucene-java/NearRealtimeSearch
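One way to picture the "detect index updates and refresh the searchers" mechanism is a holder that compares a version number before handing out a searcher. The sketch below is a Python stand-in under that assumption, not Lucene.Net code; `Index`, `Searcher` and `SearcherHolder` are hypothetical names, and the version counter is an assumed way of detecting changes.

```python
class Index:
    """Stand-in for an on-disk index whose version grows on every commit."""

    def __init__(self):
        self.version = 0
        self.docs = []

    def commit(self, doc):
        self.docs.append(doc)
        self.version += 1


class Searcher:
    """Read-only snapshot of the index, taken when the searcher is opened."""

    def __init__(self, index):
        self.version = index.version
        self.docs = list(index.docs)

    def search(self, term):
        return [d for d in self.docs if term in d]


class SearcherHolder:
    """Hands out a searcher, reopening it only when the index has changed."""

    def __init__(self, index):
        self.index = index
        self.searcher = Searcher(index)

    def current(self):
        if self.searcher.version != self.index.version:
            self.searcher = Searcher(self.index)
        return self.searcher


idx = Index()
holder = SearcherHolder(idx)
before = holder.current().search("lucene")
idx.commit("lucene in a windows service")  # done by the indexing side
after = holder.current().search("lucene")
```

Searches keep using the old snapshot until a commit is detected, so the web side never blocks on the writer; the cost is that new documents only become visible after the next refresh.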

I have never had a performance problem with Lucene functionality exposed through a WCF service, especially when running on the same machine with NetNamedPipe or on the local LAN using NetTcp.

+1

Source: https://habr.com/ru/post/1432343/

