Solr and document encoding problems

I am using SolrJ 1.4, and it does not index UTF-16 encoded documents correctly. I think that when it converts the text to Unicode, it replaces the problematic UTF-16 surrogate pairs with the Unicode replacement character U+FFFD. Can someone explain how to configure SolrJ 1.4 to index and search UTF-16 documents as well as UTF-8 ones?

1 answer

The Solr index is stored in UTF-8 (see "Why don't international characters work?"). To index or search content in other encodings, you can always transcode it in your own code that talks to Solr, before the text reaches Solr.
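As a rough illustration of doing the translation on the client side, here is a minimal sketch that uses only the JDK. The class and method names (`Utf16Transcoder`, `decodeUtf16`, `toUtf8`) are hypothetical and not part of SolrJ. It assumes the raw document is available as a byte array and decodes it strictly, so a broken surrogate pair raises an exception instead of quietly turning into U+FFFD.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of SolrJ.
class Utf16Transcoder {

    // Decode UTF-16 bytes strictly. The UTF_16 charset honours a byte-order
    // mark and falls back to big-endian when none is present. Malformed input
    // (for example an unpaired surrogate) throws instead of becoming U+FFFD.
    static String decodeUtf16(byte[] raw) throws CharacterCodingException {
        return StandardCharsets.UTF_16.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw))
                .toString();
    }

    // Re-encode already decoded text as UTF-8 bytes.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

Once the text has been decoded into a Java `String`, it could be handed to SolrJ in the usual way, for example through `SolrInputDocument.addField`. If the strict decode throws, the input bytes were probably not valid UTF-16 to begin with (or were read with the wrong byte order), which would be worth checking before looking at Solr itself.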


Source: https://habr.com/ru/post/1368889/

