Lucene field from a TokenStream with a stored value

I have a field whose content should come from a token stream; it cannot be built from a string that is then parsed into tokens. For example, I may combine data from several columns (in my RDBMS) into a single Lucene field, but I want to analyze each column in its own way. So I cannot simply concatenate the columns into one string and analyze the result.

The problem I ran into is that fields created from token streams cannot be stored. That makes sense in the general case, since a stream may not have an obvious string representation. In my case, however, I know the string representation, and I would like to store it.

I tried adding the same field twice, once stored with the string data and once built from the token stream, but that does not seem to work. Short of a hack such as adding a second field named "myfield__stored", is there any way to do this?

I am using 2.9.2.

1 answer

I found a way. You can sneak it in by creating a normal field and then calling SetTokenStream on it:

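// Name, StringValue, Store, Analyzed, TV and TokenStreamValue are placeholders.
Field f = new Field(Name, StringValue, Store, Analyzed, TV);
// Override what gets indexed; the string value is still what gets stored.
f.SetTokenStream(TokenStreamValue);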

The string/reader value is indexed only when the token stream value is null, so once a token stream is set, the token stream is what gets indexed. The storing code looks at the string/reader value regardless of the token stream, so the string value is what gets stored.
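For anyone who wants to try this end to end, here is a minimal sketch. It assumes Lucene.NET 2.9.x, which is what the PascalCase SetTokenStream call suggests. The field name, the sample strings, the WhitespaceAnalyzer and the in-memory RAMDirectory are illustrative choices of mine, not part of the original answer. The token stream comes from an analyzer only for brevity; any TokenStream, including one that merges several differently analyzed columns, would be passed the same way.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

class StoredTokenStreamFieldSketch
{
    static void Main()
    {
        // Hypothetical data: the string to store and the text to tokenize.
        string storedText = "Widget, blue (SKU 42)";
        string textToIndex = "widget blue 42";

        Analyzer analyzer = new WhitespaceAnalyzer();
        TokenStream tokens = analyzer.TokenStream("myfield", new StringReader(textToIndex));

        // Create the field from the string so that it can be stored...
        Field f = new Field("myfield", storedText,
                            Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO);
        // ...then supply the tokens that should be indexed in place of the string.
        f.SetTokenStream(tokens);

        Document doc = new Document();
        doc.Add(f);

        // Fully qualified to avoid a clash with System.IO.Directory.
        Lucene.Net.Store.Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, analyzer, true,
                                             IndexWriter.MaxFieldLength.UNLIMITED);
        writer.AddDocument(doc);
        writer.Close();

        // Read the stored value back; per the explanation above this should be
        // the original string, not the tokenized text.
        IndexReader reader = IndexReader.Open(dir, true);
        Console.WriteLine(reader.Document(0).Get("myfield"));
        reader.Close();
    }
}
```

If the behaviour matches the explanation above, a search on the field matches the terms produced by the token stream, while the value read back from the document is the stored string.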


Source: https://habr.com/ru/post/1781328/

