How to disable indexing of a property in Elasticsearch

I index data using the default mechanism, without supplying any schema or mapping. I just POST the JSON documents.

I want to use:

  • one single index
  • various types, but not tied to the data itself

The problem I am facing is that my JSON documents have one specific property that sometimes nests recursively. When this happens, Elasticsearch returns an error during the PUT operation that indexes the data.

The content of this property does not matter for my searches or indexes. I know I could strip it from the data, but I still want it to be stored, because I use Elasticsearch as a NoSQL store.

Example:

{prop1: "something", dirty_prop: {someprop: 123, dirty_prop: {....}}}

As you can see above, dirty_prop nests inside itself, and indexing such a document fails.

So the question is: how do I avoid these errors while still storing the data? I assume that excluding dirty_prop from indexing would let the document go through. What is the easiest way to exclude it without having to supply a complete mapping? (I cannot provide a complete structure or schema, because new attributes keep appearing in my data.)

1 answer

I would say that having JSON like this is probably not a good idea, but if that is what you have and you cannot fix it, take a look at the enabled property, which can be set on fields of type object in your mapping; the Elasticsearch mapping documentation describes it. With enabled: false, that branch of the JSON is neither parsed nor indexed, but it is still stored in the _source field, which is what you want.
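
To make this concrete, here is a minimal sketch in Python that only builds the mapping body; it does not talk to a cluster. The helper name disabled_object_mapping is hypothetical, and the typeless "mappings" layout assumes a recent Elasticsearch version (older versions that still use mapping types nest "properties" under the type name).

```python
import json


def disabled_object_mapping(field_name):
    """Build a mapping body in which `field_name` is an object field
    with enabled: false, so it is kept in _source but not parsed or indexed."""
    return {
        "mappings": {
            "properties": {
                # Only this field is declared; all other fields stay
                # dynamically mapped, so no complete schema is needed.
                field_name: {"type": "object", "enabled": False}
            }
        }
    }


body = disabled_object_mapping("dirty_prop")
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of the index-creation request, before any documents are indexed.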

On the other hand, I am not 100% sure this will work; it depends on how bad your JSON is. The JSON parser (which uses a pull approach) still has to be able to find where that object ends and continue parsing the other JSON fields.

The fact that you cannot provide the complete structure of your JSON makes things a little more complicated. You can use dynamic templates to define a template that matches every object that needs to be ignored and maps it with enabled: false.
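
As a rough sketch of that approach, the Python below builds a mapping body that uses the dynamic_templates mapping key. The helper name ignore_objects_template and the template name ignored_objects are hypothetical, the name pattern is an assumption about which fields you want to skip, and the typeless layout again assumes a recent Elasticsearch version.

```python
import json


def ignore_objects_template(name_pattern):
    """Build a mapping body with one dynamic template: any newly seen
    object field whose name matches `name_pattern` is mapped with
    enabled: false instead of being parsed and indexed."""
    return {
        "mappings": {
            "dynamic_templates": [
                {
                    "ignored_objects": {  # template name, chosen for illustration
                        "match": name_pattern,
                        "match_mapping_type": "object",
                        "mapping": {"type": "object", "enabled": False},
                    }
                }
            ]
        }
    }


template_body = ignore_objects_template("dirty_*")
print(json.dumps(template_body, indent=2))
```

Because the template matches by field name, it should also cover fields that appear later without the mapping being touched again; fields that do not match keep the default dynamic mapping.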


Source: https://habr.com/ru/post/1492203/

