Removing Empty Dynamic Fields from Solr 1.4 Index

I have a Solr index that uses quite a few dynamic fields. I recently changed my code to reduce the amount of data that we index with Solr, greatly reducing the number of dynamic fields used.

I reindexed my data, and the doc count (as shown in the admin schema browser) for the old fields dropped to zero. But I am confused about why the fields still exist. I ran an optimize and restarted the server, but I cannot find any information on whether there is a way to make these fields disappear.

Am I stuck with these fields unless I rebuild the index from scratch? We are talking about a significant reduction in fields (about 200 → 30), and I'm worried about the performance impact of keeping them around.

I am using Solr 1.4.

Edit: the dynamic field definitions still exist in schema.xml because I still use them in several cases. It is just that the number of fields based on them has been significantly reduced.

Edit:

None of these fields are stored, only indexed. So I cannot see them just by inspecting the returned documents, but I can facet on them.

Here are my results for a facet query on a field that I'm still using ...

Query:

/?q=*:*&facet=on&facet.field=books_isbn_10_s_exact 

Result:

 <lst name="books_isbn_10_s_exact">
   <int name="1010102457">2</int>
   <int name="1110011010">2</int>
   <int name="1110011013">2</int>
   ...

Here are my results for one of the empty ones ...

Query:

 /?q=*:*&facet=on&facet.field=mobiles_infrared_s_exact 

Result:

 <lst name="mobiles_infrared_s_exact"/> 

Both fields use this field definition in my schema.xml:

 <dynamicField name="*_s_exact" type="string" indexed="true" stored="false" termVectors="true" omitNorms="true" multiValued="false" /> 

The only place where I see the old fields (for example, mobiles_infrared_s_exact and about 100 others) is in the Solr schema browser under /admin/. There I can see every dynamic field I have ever used, even though the doc count for most of them is 0.

I'm just trying to find out whether there is a way to remove them from the schema browser, and whether they have any performance implications, bearing in mind that I have a 10 million document index.

2 answers

What happens when you do something like this:

 /?q=mobiles_infrared_s_exact:xyzzy 

Are you getting zero documents or getting an error message?


I ran into this on several Solr cores after several rounds of schema migration. You can automate the check by pulling the field list directly from the Lucene data, for example:

/solr/your_core/admin/luke?numTerms=0&wt=json

 [
   // ...
   fields: {
     _version_: { type: "long", schema: "IS-----OF------", index: "-TS-------------", docs: 761997 },
     abstract_display: { type: "string", schema: "--SM----------l", dynamicBase: "*_display" },
     abstract_t: { type: "text", schema: "ITS-M-----------", dynamicBase: "*_t" }
     //...
   }
 ]

Then filter the fields for a non-zero docs count. As for removing them from the schema browser, I have only managed that when migrating Solr to a new installation or rebuilding the core from scratch. There may be other ways, but it's not really something Solr is set up to manipulate. It probably treats this list of fields as an internal artifact.
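
The filtering step could be sketched roughly as follows in Python. This is only an illustration under some assumptions: that the parsed Luke response has a top-level "fields" object shaped like the excerpt above, and that an entry with no docs key (or docs equal to 0) can be treated as empty. The function names, the host, and the core name are placeholders, not anything Solr provides.

```python
import json
from urllib.request import urlopen


def split_fields_by_docs(luke_response):
    """Split field names into (populated, empty) based on the "docs" count.

    Assumes luke_response is the parsed JSON from /admin/luke and that it
    contains a "fields" object mapping field names to info dictionaries.
    """
    fields = luke_response.get("fields", {})
    populated, empty = [], []
    for name, info in sorted(fields.items()):
        # Assumption: a missing "docs" key means no documents use the field.
        if info.get("docs", 0) > 0:
            populated.append(name)
        else:
            empty.append(name)
    return populated, empty


def fetch_luke(base_url):
    """Fetch and parse the Luke output.

    base_url is a placeholder, e.g. http://localhost:8983/solr/your_core
    (hypothetical host and core name).
    """
    with urlopen(base_url + "/admin/luke?numTerms=0&wt=json") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Sample data copied from the excerpt above, so no server is needed.
    sample = {
        "fields": {
            "_version_": {"type": "long", "docs": 761997},
            "abstract_display": {"type": "string", "dynamicBase": "*_display"},
            "abstract_t": {"type": "text", "dynamicBase": "*_t"},
        }
    }
    populated, empty = split_fields_by_docs(sample)
    print("populated:", populated)
    print("empty:", empty)
```

Against a live core, you would pass the result of fetch_luke(...) to split_fields_by_docs instead of the sample dictionary.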

Effectively, this is more a question about the Solr schema browser than about Solr itself.


Source: https://habr.com/ru/post/1397019/

