The text representation of a tsvector lists, for each lexeme, the positions at which it occurs:
    test=# select to_tsvector ( 'english', 'new bar in New York' );
            to_tsvector
    ----------------------------
     'bar':2 'new':1,4 'york':5
Here is an example function based on this. It takes text parameters and converts them to tsvector internally, but it could easily be rewritten to accept a tsvector directly.
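    CREATE OR REPLACE FUNCTION lexeme_occurrences (
        IN  _document text
    ,   IN  _word text
    ,   IN  _config regconfig
    ,   OUT lexeme_count int
    ,   OUT lexeme_positions int[]
    ) RETURNS RECORD
    AS $$
    DECLARE
        -- Document as a tsvector, e.g. 'fat':2 'rat':3
        _lexemes tsvector := to_tsvector ( _config, _document );
        -- Normalized search word without positions, e.g. 'rat'
        _searched_lexeme tsvector := strip ( to_tsvector ( _config, _word ) );
        -- Regex capturing the position list that follows the lexeme
        _occurences_pattern text := _searched_lexeme::text || ':([0-9,]+)';
        -- Comma-separated positions, or NULL if the lexeme is absent
        _occurences_list text := substring ( _lexemes::text, _occurences_pattern );
    BEGIN
        SELECT
            count ( a )
        ,   array_agg ( a::int )
        INTO lexeme_count, lexeme_positions
        FROM regexp_split_to_table ( _occurences_list, ',' ) a
        -- An empty lexeme (e.g. a stop word) would match any entry; skip it
        WHERE _searched_lexeme::text != '';
    END;
    $$ LANGUAGE plpgsql;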
Usage example:
    select * from lexeme_occurrences ( 'The Fat Rats', 'rat', 'english' );
     lexeme_count | lexeme_positions
    --------------+------------------
                1 | {3}
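As for the variant that accepts a tsvector directly, here is a minimal sketch. It swaps the regex parsing for the built-in unnest(tsvector), so it assumes PostgreSQL 9.6 or later and a vector that still carries positions (one that has not been passed through strip). The name lexeme_occurrences_tsv is hypothetical, chosen only for illustration.

```sql
-- Hypothetical variant: takes a prebuilt tsvector instead of raw text.
-- Assumes PostgreSQL 9.6+ (unnest(tsvector)) and a non-stripped vector.
CREATE OR REPLACE FUNCTION lexeme_occurrences_tsv (
    IN  _lexemes tsvector
,   IN  _word text
,   IN  _config regconfig
,   OUT lexeme_count int
,   OUT lexeme_positions int[]
) RETURNS RECORD
AS $$
BEGIN
    -- Expand both vectors into (lexeme, positions) rows and match on lexeme.
    SELECT
        cardinality ( d.positions )
    ,   d.positions::int[]
    INTO lexeme_count, lexeme_positions
    FROM unnest ( _lexemes ) AS d
    JOIN unnest ( to_tsvector ( _config, _word ) ) AS w
        ON w.lexeme = d.lexeme
    LIMIT 1;

    -- No matching row (word absent, or a stop word): report zero occurrences.
    IF NOT FOUND THEN
        lexeme_count := 0;
    END IF;
END;
$$ LANGUAGE plpgsql;
```

Called as select * from lexeme_occurrences_tsv ( to_tsvector ( 'english', 'new bar in New York' ), 'new', 'english' ); this should report a count of 2 with positions {1,4}, matching the vector shown at the top. Joining on the lexeme column rather than matching text also avoids trouble with lexemes that contain regex metacharacters.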