Add multiValued field to SolrInputDocument

We are using an embedded Solr instance with the Java SolrJ client.

I want to add a multi-valued field to a document. The value for the multi-valued field arrives as a comma-separated String.

In Java, I want:

solrInputDocument.addField(Field1, "value1,value2,value3");

The definition of Field1 in the schema is as follows:

<field name="Field1" type="multiValuedField"   indexed="true"  stored="true"  multiValued="true" required="false"/>

<fieldType name="multiValuedField" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
         <tokenizer class="solr.ClassicTokenizerFactory"/>
     </analyzer>
</fieldType> 

With this configuration, we expected that when addField was called, Solr would see that the field is multi-valued and convert the String into an ArrayList holding the individual values.

Instead, we get an ArrayList with a single value, which is the original string that was added to the document.

Question: should the tokenizer take care of this, or should we do it ourselves when we add multi-valued fields to the document?

Thanks.


Pass an ArrayList to the addField method of SolrInputDocument, and SolrJ will add each element as a separate value:

String[] valuesArray = {"value1", "value2", "value3"};
ArrayList<String> values = new ArrayList<String>(Arrays.asList(valuesArray));
solrInputDocument.addField("Field1", values);

SolrInputDocument.addField(String name, Object value) can be called either several times, passing a single Object each time, or once, passing a Collection as the value.

Example #1:

List<String> values = Arrays.asList("value1", "value2", "value3");
solrInputDocument.addField("field", values);

Example #2:

solrInputDocument.addField("field", "value1");
solrInputDocument.addField("field", "value2");
solrInputDocument.addField("field", "value3");

Both examples give the same result. To see why, follow the calls in the Solr source code down to SolrInputField.addValue(Object v, float b).

/**
 * Add values to a field.  If the added value is a collection, each value
 * will be added individually.
 */
@SuppressWarnings("unchecked")
public void addValue(Object v, float b) {
  if( value == null ) {
    if ( v instanceof Collection ) {
      Collection<Object> c = new ArrayList<Object>( 3 );
      for ( Object o : (Collection<Object>)v ) {
        c.add( o );
      }
      setValue(c, b);
    } else {
      setValue(v, b);
    }

    return;
  }

  boost *= b;

  Collection<Object> vals = null;
  if( value instanceof Collection ) {
    vals = (Collection<Object>)value;
  }
  else {
    vals = new ArrayList<Object>( 3 );
    vals.add( value );
    value = vals;
  }

  // Add the new values to a collection
  if( v instanceof Iterable ) {
    for( Object o : (Iterable<Object>)v ) {
      vals.add( o );
    }
  }
  else if( v instanceof Object[] ) {
    for( Object o : (Object[])v ) {
      vals.add( o );
    }
  }
  else {
    vals.add( v );
  }
}

With SolrJ and Solr, adding the values one at a time with separate addField calls also works:

solrInputDocument.addField(Field1, "value1");
solrInputDocument.addField(Field1, "value2");
solrInputDocument.addField(Field1, "value3");

Confirmed. Tokenizers do not convert the data into multiple values for you. The approach is therefore to process the data at load time, so that it is already in the correct format when it is added to the document.
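
For reference, here is a minimal sketch of that load-time step, assuming the values never contain a comma themselves. The class name MultiValueSplitter and its split method are hypothetical helpers, not part of SolrJ. The sketch uses only the standard library, so the SolrJ call appears as a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of SolrJ): turns a comma-separated
// string into a list of individual values.
public class MultiValueSplitter {

    // Split on commas, trim whitespace, and skip empty entries.
    // Assumes no value contains a comma of its own.
    public static List<String> split(String raw) {
        List<String> values = new ArrayList<>();
        if (raw == null) {
            return values;
        }
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> values = split("value1,value2,value3");
        System.out.println(values); // prints [value1, value2, value3]

        // With SolrJ on the classpath, the list would then be passed
        // to the document as in the answers above:
        // solrInputDocument.addField("Field1", values);
    }
}
```

Because addField accepts a Collection, the resulting list can be handed over in a single call instead of looping over the values.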

Thanks for the help.


Source: https://habr.com/ru/post/1524335/

