Lucene gotchas with punctuation

While creating some unit tests for my Lucene queries, I noticed some strange punctuation behavior, in particular around parentheses.

What are some of the best ways to work with search fields that contain significant punctuation?

+3
2 answers

If you have not customized the query parser, Lucene should behave according to the query parser syntax. Are you seeing something different? Do you want punctuation to carry special meaning, or do you just want it stripped from searches? Another common suspect here is the Analyzer, which determines how your field is indexed and how the query is split into terms for searching. Can you post specific examples of the bad behavior?

+3

It is not only parentheses; other punctuation characters, such as the colon and the hyphen, will cause problems as well. Here is a way to deal with them.
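One common approach is to backslash-escape the characters that the query syntax treats as operators before handing user input to the parser. Lucene's classic query parser ships a static `QueryParser.escape(String)` for this purpose, and in real code that is probably the better choice. The sketch below only illustrates the idea in plain Java so it runs without Lucene on the classpath; the class name `LuceneEscape` is hypothetical, and the character list is an assumption based on the classic query syntax and may differ between Lucene versions.

```java
public final class LuceneEscape {
    // Assumed set of characters with special meaning in the classic
    // Lucene query syntax: \ + - ! ( ) : ^ [ ] " { } ~ * ? | & /
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so the parser
    // treats it as literal text instead of an operator.
    public static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Parentheses, colon and hyphen are all escaped.
        System.out.println(escape("foo(bar):baz-1"));
    }
}
```

Escaping only stops the parser from interpreting the punctuation. Whether those characters can actually be matched still depends on the Analyzer used for the field, since many analyzers drop punctuation at index time.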

+1

Source: https://habr.com/ru/post/1757274/

