Nested prohibit / require operators in Lucene searches

I use Lucene for Java, and I need to figure out what the engine does when I execute some obscure queries. Take the following query:

+(foo -bar)

If I use QueryParser to parse input, I get a BooleanQuery object that looks like this:

org.apache.lucene.search.BooleanQuery:
    org.apache.lucene.search.BooleanClause(required=true, prohibited=false):
        org.apache.lucene.search.BooleanQuery:
            org.apache.lucene.search.BooleanClause(required=false, prohibited=false):
                org.apache.lucene.search.TermQuery: foo
            org.apache.lucene.search.BooleanClause(required=false, prohibited=true):
                org.apache.lucene.search.TermQuery: bar

What does Lucene look for here? Is it documents that MUST contain "foo" and MUST NOT contain "bar"? And what if I search for:

-(foo +bar)

Is that documents that MUST NOT contain "foo" and MUST NOT contain "bar"? Or perhaps documents that MUST NOT contain "foo" but MUST contain "bar"?

If it helps, here is the code I used to inspect the QueryParser output:

// Parse the raw query text against the "contents" field and dump the resulting tree.
QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
Query query = parser.parse(text);
debug(query, 0);

// Recursively prints a query tree, one node per line, indented by depth.
public static void debug(Object o, int depth) {
    for (int i = 0; i < depth; i++) System.out.print("\t");
    System.out.print(o.getClass().getName());

    if (o instanceof BooleanQuery) {
        // A BooleanQuery is a list of clauses; print each one a level deeper.
        System.out.println(":");
        for (BooleanClause clause : ((BooleanQuery) o).getClauses()) {
            debug(clause, depth + 1);
        }
    } else if (o instanceof BooleanClause) {
        // A clause wraps a sub-query together with its required/prohibited flags.
        BooleanClause clause = (BooleanClause) o;
        System.out.println("(required=" + clause.isRequired() + ", prohibited=" + clause.isProhibited() + "):");
        debug(clause.getQuery(), depth + 1);
    } else if (o instanceof TermQuery) {
        // Leaf node: print the term text.
        TermQuery term = (TermQuery) o;
        System.out.println(": " + term.getTerm().text());
    } else {
        // Only the node types above are handled by this helper.
        throw new IllegalArgumentException("Unknown object type");
    }
}

Lucene uses OR as the default operator, so your query is equivalent to:

+(foo OR -bar)

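That is, inside the group (the parentheses), "foo" is an "optional" clause.

The "+" means "required", so a document must match the group through "foo", and because of "-bar", it must not contain "bar".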
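The matching rules for both queries from the question can be tried out with a small toy model. This is only a sketch and does not use the Lucene API: the names ToyQuery, ToyTerm, ToyBoolean, ToyDemo and Occur are made up for illustration, documents are modelled as plain sets of terms, and scoring is ignored. It assumes the usual boolean rules: every required clause must match, no prohibited clause may match, and when there is no required clause at least one optional clause has to match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of boolean clause matching. Hypothetical types, NOT the Lucene API.
interface ToyQuery {
    boolean matches(Set<String> docTerms);
}

enum Occur { MUST, SHOULD, MUST_NOT }

// Leaf query: matches when the document contains the term.
final class ToyTerm implements ToyQuery {
    private final String term;
    ToyTerm(String term) { this.term = term; }
    public boolean matches(Set<String> docTerms) { return docTerms.contains(term); }
}

// Group of clauses, each with its own required/optional/prohibited flag.
final class ToyBoolean implements ToyQuery {
    private final List<Occur> occurs = new ArrayList<>();
    private final List<ToyQuery> queries = new ArrayList<>();

    ToyBoolean add(Occur occur, ToyQuery query) {
        occurs.add(occur);
        queries.add(query);
        return this;
    }

    public boolean matches(Set<String> docTerms) {
        boolean hasMust = false;   // at least one required clause exists (and matched)
        boolean anyShould = false; // at least one optional clause matched
        for (int i = 0; i < queries.size(); i++) {
            boolean hit = queries.get(i).matches(docTerms);
            switch (occurs.get(i)) {
                case MUST:
                    if (!hit) return false; // a required clause failed
                    hasMust = true;
                    break;
                case MUST_NOT:
                    if (hit) return false;  // a prohibited clause matched
                    break;
                case SHOULD:
                    anyShould |= hit;
                    break;
            }
        }
        // Prohibited clauses only exclude documents; something positive must select them.
        return hasMust || anyShould;
    }
}

final class ToyDemo {
    // Models +(foo -bar)
    static ToyQuery requiredGroup() {
        ToyBoolean inner = new ToyBoolean()
            .add(Occur.SHOULD, new ToyTerm("foo"))
            .add(Occur.MUST_NOT, new ToyTerm("bar"));
        return new ToyBoolean().add(Occur.MUST, inner);
    }

    // Models -(foo +bar)
    static ToyQuery prohibitedGroup() {
        ToyBoolean inner = new ToyBoolean()
            .add(Occur.SHOULD, new ToyTerm("foo"))
            .add(Occur.MUST, new ToyTerm("bar"));
        return new ToyBoolean().add(Occur.MUST_NOT, inner);
    }

    public static void main(String[] args) {
        List<Set<String>> docs = List.of(
            Set.of("foo"), Set.of("foo", "bar"), Set.of("bar"), Set.of("baz"));
        for (Set<String> doc : docs) {
            System.out.println(doc
                + " +(foo -bar): " + requiredGroup().matches(doc)
                + " | -(foo +bar): " + prohibitedGroup().matches(doc));
        }
    }
}
```

In this model +(foo -bar) selects only documents that contain "foo" and lack "bar", while -(foo +bar) selects nothing at all, because the top-level query has only a prohibited clause and nothing positive to select documents from. Lucene treats a purely negative boolean query the same way, so the second query from the question would need a positive clause next to the prohibited group to return anything.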


Source: https://habr.com/ru/post/1708119/

