I am trying to configure full-text search for localized content (Russian in particular). The problem is that the default configuration, as well as my custom one, does not handle uppercase letters. Example:
SELECT * FROM to_tsvector('test_russian', ' ');
> '':1 '':4 '':6 '':3 '':5 '':2
'On' is a stop word and should be removed, but in the result vector it is not even lowercased. If I pass an all-lowercase string, everything works correctly:
SELECT * FROM to_tsvector('test_russian', ' ');
> '':4 '':6 '':3 '':5 '':2
Of course, I can pass lowercase strings myself, but the manual says:
"The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words."
The test_russian configuration is defined as follows:
CREATE TEXT SEARCH CONFIGURATION test_russian (COPY = 'russian');

CREATE TEXT SEARCH DICTIONARY russian_simple (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = russian
);

CREATE TEXT SEARCH DICTIONARY russian_snowball (
    TEMPLATE = snowball,
    LANGUAGE = russian,
    STOPWORDS = russian
);

ALTER TEXT SEARCH CONFIGURATION test_russian
    ALTER MAPPING FOR word
    WITH russian_simple, russian_snowball;
But I get exactly the same results with the built-in russian config.
I tried ts_debug, and the tokens are classified as word, as I expected.
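For reference, here is a rough sketch of the kind of diagnostic queries involved, assuming the test_russian configuration and the russian_simple dictionary defined above. The input 'На' is only a placeholder (a capitalised Russian stop word chosen for illustration), not my original test string. ts_debug shows the token type and which dictionary handled each token, and ts_lexize calls a single dictionary directly, so the capitalised and lowercase forms can be compared in isolation.

```sql
-- Show how each token is classified and which dictionary produced the lexemes.
-- 'На' is a placeholder input (assumption), not the original test string.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('test_russian', 'На');

-- Call the simple dictionary directly on both spellings of the same word.
-- If case folding worked, both columns should give the same result.
SELECT ts_lexize('russian_simple', 'На') AS capitalised,
       ts_lexize('russian_simple', 'на') AS lowercase;
```

If the two ts_lexize columns differ, the dictionary itself is not folding the case of the token, which would narrow the problem down to the lowercasing step rather than the mapping.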
Any ideas?