ORDER BY with diacritics in Postgres

I need to select data from a table and sort it with an ORDER BY clause. The problem is that the column contains text with Czech diacritics. I cannot use COLLATE because the database is part of a Postgres cluster that was created with lc_collate = en_US.UTF-8, and I cannot afford the downtime caused by recreating the cluster with the correct lc_collate.

Sample data:

    CREATE TABLE test (
        id serial PRIMARY KEY,
        name text
    );

    INSERT INTO test (name) VALUES
        ('Žoo'), ('Zoo'), ('ŽOO'), ('ZOO'),
        ('ŽoA'), ('ŽóA'), ('ŽoÁ'), ('ŽóÁ');

Desired output:

    SELECT * FROM test ORDER BY name COLLATE "cs_CZ.utf8";
     id | name
    ----+------
      2 | Zoo
      4 | ZOO
      5 | ŽoA
      7 | ŽoÁ
      6 | ŽóA
      8 | ŽóÁ
      1 | Žoo
      3 | ŽOO
    (8 rows)

Here is the workaround I found:

    SELECT * FROM test ORDER BY name USING ~<~;
     id | name
    ----+------
      4 | ZOO
      2 | Zoo
      3 | ŽOO
      5 | ŽoA
      1 | Žoo
      7 | ŽoÁ
      6 | ŽóA
      8 | ŽóÁ
    (8 rows)

The result is close enough (for my use): letters with a caron sort AFTER those without a caron.
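
For context, ~<~ belongs to the text_pattern_ops operator class and compares strings character by character, ignoring locale rules. Assuming a UTF-8 database, the built-in "C" collation should produce the same byte-wise order and may read more clearly than USING ~<~. A minimal sketch against the sample table above:

```sql
-- Byte-wise ordering through the built-in "C" collation.
-- "C" is available regardless of the cluster's lc_collate setting.
SELECT * FROM test ORDER BY name COLLATE "C";
```

This is only an alternative spelling of the same workaround; it does not give the Czech dictionary order shown in the desired output.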


A slightly off-topic side question about an analogue of PostgreSQL's ~<~ operator.

Edit: moved to a new question.


Back to the question: is there any other way to get the perfect order, short of recreating the Postgres cluster with the correct locale?

It would also be nice to bind ~<~ to the operator.

3 answers

As @Igor noted in his comment, there is no need to recreate the Postgres cluster with a different lc_collate and incur the resulting downtime.

The exact steps that solved the problem:

  • add / uncomment the line cs_CZ.UTF-8 UTF-8 in /etc/locale.gen

  • generate the new locale:

    # locale-gen

  • define a new collation in postgres:

    CREATE COLLATION "cs_CZ.utf8" ( locale = 'cs_CZ.UTF-8' );
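
Once the collation exists, it can be checked and used per query. The following is a sketch under the assumption that the steps above succeeded and the collation was created under the name "cs_CZ.utf8" in the current database:

```sql
-- Confirm that the new collation is registered in this database.
SELECT collname
FROM pg_collation
WHERE collname LIKE 'cs_CZ%';

-- Apply it to a single query without touching the table definition.
SELECT * FROM test ORDER BY name COLLATE "cs_CZ.utf8";
```

The collation name is an identifier, so it goes in double quotes rather than single quotes.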


I am not sure I understand the question, because it seems that you have already found a solution. The only thing I can suggest is to add a new czechName column with the correct collation:

http://www.postgresql.org/docs/current/static/sql-altertable.html

 ADD [ COLUMN ] column_name data_type [ COLLATE collation ] [ column_constraint [ ... ] ] 
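
Applied to the sample table, that syntax might look like the sketch below. The column name czech_name is hypothetical, and the example assumes a "cs_CZ.utf8" collation is already available in the database:

```sql
-- czech_name is an illustrative column name, not part of the original schema.
ALTER TABLE test ADD COLUMN czech_name text COLLATE "cs_CZ.utf8";

-- Copy the existing values into the new column.
UPDATE test SET czech_name = name;

-- Sorting on the new column uses its Czech collation by default.
SELECT id, name FROM test ORDER BY czech_name;
```

Note that the copy has to be kept in sync with name on later writes, which is the main cost of this approach.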

Similar to what Juan Carlos Oropeza suggests, you can try changing the collation of the column:

    ALTER TABLE test ALTER COLUMN "name" TYPE text COLLATE "cs_CZ.utf8";

Link: http://www.postgresql.org/docs/current/static/sql-altertable.html
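
To check that the change took effect, the column's collation can be read from the information schema, after which a plain ORDER BY should pick it up. A small sketch, assuming the ALTER TABLE above has been run on the sample table:

```sql
-- collation_name is NULL when the column still uses the database default.
SELECT column_name, collation_name
FROM information_schema.columns
WHERE table_name = 'test' AND column_name = 'name';

-- No explicit COLLATE needed once the column carries the collation.
SELECT * FROM test ORDER BY name;
```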


Source: https://habr.com/ru/post/1244478/

