How do I get the matching spans for a term query in Lucene 5?

In Lucene, to get words around a term, it is recommended to use Span Queries. There is a good walkthrough at http://lucidworks.com/blog/accessing-words-around-a-positional-match-in-lucene/

The walkthrough assumes that spans are obtained through the getSpans() method:

SpanTermQuery fleeceQ = new SpanTermQuery(new Term("content", "fleece"));
Spans spans = fleeceQ.getSpans(searcher.getIndexReader());

Then the API changed in Lucene 4, and the getSpans() method became more complex. Finally, in the latest Lucene release (5.3.0), the method was removed from the query (apparently it moved to the SpanWeight class).

So what is the current way of accessing the spans matched by a term query?

1 answer

The way to do this would be as follows.

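// Wrap the (possibly multi-segment) reader so the whole index is exposed as a single leaf.
LeafReader pseudoAtomicReader = SlowCompositeReaderWrapper.wrap(reader);
Term term = new Term("field", "fox");
SpanTermQuery spanTermQuery = new SpanTermQuery(term);
// "is" is an IndexSearcher over the same reader; false means scores are not needed.
SpanWeight spanWeight = spanTermQuery.createWeight(is, false);
Spans spans = spanWeight.getSpans(pseudoAtomicReader.getContext(), SpanWeight.Postings.POSITIONS);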

Also, spans.next() is no longer available in the Lucene 5.3 API, so to traverse the spans you can do the following:

int nxtDoc;
while ((nxtDoc = spans.nextDoc()) != Spans.NO_MORE_DOCS) {
  System.out.println(spans.toString());
  System.out.println("doc_id=" + nxtDoc);
  Document doc = reader.document(nxtDoc);
  System.out.println(doc.getField("field"));
  // A document may contain several matches, so advance through every position.
  while (spans.nextStartPosition() != Spans.NO_MORE_POSITIONS) {
    System.out.println(spans.startPosition());
    System.out.println(spans.endPosition());
  }
}
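
Since the original goal was to get the words around a match, here is a minimal plain-Java sketch of how a start/end position pair can be turned into a window of tokens. It does not call Lucene. The class SpanWindow and the method wordsAround are hypothetical names, and the sketch assumes you already have the document's tokens in position order with no position gaps (for example from term vectors or by re-analyzing the stored field). As with Spans, the start position is inclusive and the end position is exclusive.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class SpanWindow {
    /**
     * Returns the tokens from (start - window) up to (end + window),
     * clamped to the bounds of the token list.
     * start is inclusive and end is exclusive, like Spans.startPosition()/endPosition().
     */
    static List<String> wordsAround(List<String> tokens, int start, int end, int window) {
        int from = Math.max(0, start - window);
        int to = Math.min(tokens.size(), end + window);
        return tokens.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog");
        // A single-term match on "fox" would report start=3, end=4.
        System.out.println(wordsAround(tokens, 3, 4, 2)); // prints [quick, brown, fox, jumps, over]
    }
}
```

In the loop above, you would pass spans.startPosition() and spans.endPosition() as start and end for each match.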

Source: https://habr.com/ru/post/1605846/

