Warning: Solr will use Highlighter instead of FastVectorHighlighter

Hi, I am developing a Rails application that uses the Solr 4.1 search engine.

When I add highlighting to my Solr search, the tomcat6 log starts filling up with this warning:

Jan 29, 2015 12:13:38 PM org.apache.solr.highlight.DefaultSolrHighlighter useFastVectorHighlighter WARNING: Solr will use Highlighter instead of FastVectorHighlighter because *Field_Name* field does not store TermPositions and TermOffsets. 

An example of my field in schema.xml:

<field name="name" type="text" indexed="true" stored="true" multiValued="true"/>

What I found in the documentation:

The Standard Highlighter is the Swiss Army knife of the highlighters. It has the most sophisticated and fine-grained query representation of the three highlighters. For example, this highlighter can provide precise matches even for advanced query parsers such as the surround parser. It does not require any special data structures such as termVectors, although it will use them if they are present. If they are not, this highlighter will re-analyze the document on the fly to highlight it. This highlighter is a good choice for a wide variety of search use cases.

FastVector Highlighter

The FastVector Highlighter requires term vector options (termVectors, termPositions and termOffsets) on the field and is optimized with that in mind. It tends to work better for more languages than the Standard Highlighter because it supports Unicode break iterators. On the other hand, its query representation is less advanced than the Standard Highlighter's: for example, it will not work well with the surround parser. This highlighter is a good choice for large documents and for highlighting text in a variety of languages.

FastVector highlighting is also reported to make searches faster: http://solr.pl/en/2011/06/13/solr-3-1-fastvectorhighlighting/

But what is the difference between configuring the standard Highlighter and the FastVectorHighlighter?

And will users see any difference in the search results if I switch to FastVectorHighlighting?

Is adding termVectors="on" termPositions="on" termOffsets="on" to each field in schema.xml all I need to do to enable FastVectorHighlighting? Like this:

<field name="name" type="text" indexed="true" stored="true" multiValued="true" termVectors="on" termPositions="on" termOffsets="on"/>

I also found this issue in the Solr issue tracker: https://issues.apache.org/jira/browse/SOLR-5544

But I still don't know how to get rid of the WARNING, and my log file grows by 500 MB per second! This is critical because the search server stops when there is no free space left on the volume.

Please help.

1 answer

I found fields in my schema.xml that had the termVectors="true" attribute without termPositions="true" termOffsets="true".

That was the reason for the warnings.
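
A check like this can be automated. The following is a minimal Python sketch, not part of Solr: it assumes schema.xml declares fields as plain <field> elements and that the attributes are enabled with the value "true", as in this answer. The function name incomplete_vector_fields and the SAMPLE schema are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three attributes FastVectorHighlighter needs on a field.
REQUIRED = ("termVectors", "termPositions", "termOffsets")


def incomplete_vector_fields(schema_xml):
    """Map field name -> missing attributes, for fields that enable
    some but not all of the three term vector attributes."""
    root = ET.fromstring(schema_xml)
    result = {}
    for field in root.iter("field"):
        # Assumption: only the literal value "true" counts as enabled.
        enabled = [a for a in REQUIRED if field.get(a, "").lower() == "true"]
        if enabled and len(enabled) < len(REQUIRED):
            result[field.get("name")] = [a for a in REQUIRED if a not in enabled]
    return result


# Hypothetical schema fragment used only for illustration.
SAMPLE = """
<schema name="example" version="1.5">
  <fields>
    <field name="name" type="text" indexed="true" stored="true" termVectors="true"/>
    <field name="phone" type="text" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true"/>
  </fields>
</schema>
"""
```

Calling incomplete_vector_fields(SAMPLE) reports only the "name" field, with termPositions and termOffsets missing. Fields with none of the three attributes, such as "phone" here, are not reported, since they only matter if they are actually highlighted.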

So what I did:

  • added termPositions="true" termOffsets="true" to the fields in schema.xml that had only the termVectors="true" attribute
  • added termVectors="true" termPositions="true" termOffsets="true" to each field that I found in the warnings ("... phone field does not store TermPositions and TermOffsets ...").
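
When the log is large, the affected field names can be pulled out of the warnings with a short script. This is a rough Python sketch that assumes the warning text has exactly the format quoted in the question; fields_from_warnings and SAMPLE_LOG are hypothetical names, and the sample lines are made up for illustration.

```python
import re

# Matches the field name in the warning message quoted in the question.
WARNING_RE = re.compile(
    r"instead of FastVectorHighlighter because (\S+) field does not store"
)


def fields_from_warnings(log_lines):
    """Return the sorted, de-duplicated field names named in the warnings."""
    names = set()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


# Made-up log lines in the same format as the real warning.
SAMPLE_LOG = [
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter "
    "because phone field does not store TermPositions and TermOffsets.",
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter "
    "because name field does not store TermPositions and TermOffsets.",
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter "
    "because phone field does not store TermPositions and TermOffsets.",
    "INFO: unrelated line",
]
```

For a real log, the same function can be fed an open file object, since it only iterates over lines.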

Then I reindexed, but that did not stop the flood of warnings in the logs.

The reason: Solr does not see the schema.xml updates until Tomcat is restarted.

So I restarted Tomcat:

  • sudo /etc/init.d/tomcat6 restart

  • Then I started reindexing again, because all the indexes were lost

Thanks a lot @chefe for the help!


Source: https://habr.com/ru/post/911521/

