Indexing Lucene: Storage and Indexing Modes

I think I still do not fully understand the Lucene indexing options.

The following options:

  • Store.Yes
  • Store.No

and

  • Index.Tokenized
  • Index.Un_Tokenized
  • Index.No
  • Index.No_Norms

I do not understand the store option. Why would you ever want to NOT store your field?
Tokenizing splits the content into terms and removes noise words and separators (e.g., "and", "or", etc.).
I have no idea what norms are. How are tokenized values stored? What happens if I store the value "my string" in the field "fieldName"? Why doesn't the query

 fieldName:my string 

return anything?

+43
c# lucene
Mar 16 '09 at 14:24
3 answers

Store.Yes

means that the field value will be stored in the index

Store.No

means that the field value will NOT be stored in the index

Store.Yes/No does not affect indexing or searching with Lucene. It only tells Lucene whether you want it to act as a data store for the field's values. If you use Store.Yes, the value of that field will be included in the Documents returned in your search results.

If you keep your data in a database and use the Lucene index only for searching, you can use Store.No on all of your fields. However, if you also use the index as your data store, you will need Store.Yes.
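To make the distinction concrete, here is a minimal toy model in Python. It is not the Lucene API: ToyIndex, add and search are hypothetical names, and the whitespace tokenization is a simplification. It only illustrates that searching uses the inverted index, while the store flag decides whether the original value comes back with a hit.

```python
# Toy model only: this is NOT the Lucene API. ToyIndex, add and search are
# hypothetical names used to illustrate stored vs. indexed field values.
from collections import defaultdict


class ToyIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> doc ids, used for searching
        self.stored = {}                  # doc id -> {field: original value}, used for retrieval

    def add(self, doc_id, field, value, store):
        # The field is indexed regardless of the store flag.
        for term in value.lower().split():
            self.postings[(field, term)].add(doc_id)
        self.stored.setdefault(doc_id, {})
        if store:
            self.stored[doc_id][field] = value

    def search(self, field, term):
        # Each hit carries only the stored fields of the matching document.
        doc_ids = self.postings.get((field, term.lower()), ())
        return [(doc_id, self.stored[doc_id]) for doc_id in sorted(doc_ids)]


index = ToyIndex()
index.add(1, "title", "Lucene in Action", store=True)
index.add(1, "body", "Full text search with Lucene", store=False)

title_hits = index.search("title", "lucene")
body_hits = index.search("body", "search")
```

Both searches find document 1, but the hit only contains the title value. The body was searchable yet never stored, so its text would have to be fetched from wherever the original data lives (for example, a database).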

Index.Tokenized

means that the field will be tokenized when it is indexed (you got that one). This is useful for long fields with multiple words.

Index.Un_Tokenized

means that the field will not be analyzed and will be indexed as a single value. This is useful for keyword or single-word fields and for some short multi-word fields.

Index.No

Exactly what it says. The field will not be indexed and therefore cannot be searched. However, you can use Index.No together with Store.Yes to store a value that you do not want to be searchable.

Index.No_Norms

Same as Index.Un_Tokenized, except that a few bytes are saved by not storing some normalization data. This data is what is used for boosting and field-length normalization.
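The four index modes can be sketched with another toy Python function. Again, this is not the Lucene API: index_field and STOP_WORDS are hypothetical, a real analyzer tokenizes and filters differently, and the norm here is a simplified length normalization (1 / sqrt(number of terms)) in the spirit of Lucene's classic similarity, with boosts left out.

```python
# Toy model only: not the Lucene API. index_field and STOP_WORDS are
# hypothetical; a real analyzer's tokenization and stop-word list differ.
import math
import re

STOP_WORDS = {"and", "or", "the", "a"}


def index_field(value, mode):
    """Return (terms, norm) for a field value under a simplified index mode."""
    if mode == "no":
        return [], None              # nothing indexed, so the field is not searchable
    if mode == "tokenized":
        words = re.findall(r"\w+", value.lower())
        terms = [w for w in words if w not in STOP_WORDS]
        # Simplified length normalization: shorter fields get a larger norm.
        norm = 1.0 / math.sqrt(len(terms)) if terms else 0.0
        return terms, norm
    if mode == "un_tokenized":
        return [value], 1.0          # the whole value is a single term
    if mode == "no_norms":
        return [value], None         # single term, and no normalization data is kept
    raise ValueError(f"unknown mode: {mode}")


tokenized = index_field("My String and more", "tokenized")
untokenized = index_field("my string", "un_tokenized")
no_norms = index_field("my string", "no_norms")
not_indexed = index_field("my string", "no")
```

In this model an un-tokenized "my string" is one term, so only a query for the exact value "my string" would match it, whereas the tokenized field can be matched by "my" or "string" individually.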

For further reading, the Lucene javadocs are invaluable (current API version 4.4.0).

As for your last question, why your query returns nothing: without knowing more about how you index that field, I would say it is because the fieldName qualifier binds only to the term "my", and "string" is searched in the default field. To search for the phrase "my string", you want:

fieldName:"my string"

To search for the terms "my" and "string" in the fieldName field (combined using the query parser's default operator):

fieldName:(my string)
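The binding rule can be illustrated with a small toy parser in Python. This is not Lucene's QueryParser: parse_query and DEFAULT_FIELD are hypothetical names, and only the three forms discussed here (field:term, field:"phrase" and field:(a b)) are handled.

```python
# Toy sketch only: not Lucene's QueryParser. parse_query and DEFAULT_FIELD are
# hypothetical; only field:term, field:"phrase" and field:(a b) are handled.
import re

DEFAULT_FIELD = "contents"

# Optional "field:" prefix, followed by a quoted phrase, a parenthesized group, or a bare term.
TOKEN = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|\(([^)]*)\)|(\S+))')


def parse_query(query):
    """Return a list of (field, kind, text) clauses."""
    clauses = []
    for match in TOKEN.finditer(query):
        field = match.group(1) or DEFAULT_FIELD
        phrase, group, term = match.group(2), match.group(3), match.group(4)
        if phrase is not None:
            clauses.append((field, "phrase", phrase))
        elif group is not None:
            # Every term inside the parentheses inherits the field qualifier.
            clauses.extend((field, "term", t) for t in group.split())
        else:
            clauses.append((field, "term", term))
    return clauses


unquoted = parse_query("fieldName:my string")
phrase = parse_query('fieldName:"my string"')
grouped = parse_query("fieldName:(my string)")
```

In the unquoted case the second clause falls back to the default field, which is why the original query can come back empty when nothing in that default field matches.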

+84
Mar 17 '09 at 4:05

In case any Java users stumble upon this: the options described in the March 2009 answer still exist in the Lucene 4.6.0 Java library, but they are deprecated. The current way to set these options is via FieldType.

+2
Jan 14 '14 at 10:11

Store.YES also gives you the ability to highlight the words that match your search keywords (through the highlighter). So the stored value is useful not only for retrieval but also for display.

0
Oct 19 '17 at 1:50


