How to speed up Redshift queries

I am using the json_extract_path_text function to extract values from JSON. As the amount of row data grows, the query takes a long time to run and sometimes crashes.

Is there a way to shorten the query execution time or to speed up the json_extract_path_text function?

+5
2 answers

Solution: store your data in a tabular format instead of JSON. JSON is not a good choice for storing large datasets: because it packs disparate data into a single column, it cannot take advantage of Amazon Redshift's columnar storage architecture. Alternatively, change the node type to a larger one.
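As a rough illustration of the "tabular instead of JSON" advice, the JSON can be flattened into plain columns before it is loaded. This is only a minimal sketch: the field names (user_id, event, amount) and the sample records are hypothetical, and it assumes each JSON document is a flat object.

```python
import csv
import io
import json

# Hypothetical field names; replace them with the keys present in your JSON.
COLUMNS = ["user_id", "event", "amount"]


def flatten(json_lines, columns=COLUMNS):
    """Turn JSON strings into rows with one value per column (None if a key is missing)."""
    rows = []
    for line in json_lines:
        record = json.loads(line)
        rows.append([record.get(col) for col in columns])
    return rows


def to_csv(rows, columns=COLUMNS):
    """Serialize the rows as CSV text with a header row; None becomes an empty field."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample data, for illustration only.
sample = [
    '{"user_id": 1, "event": "click", "amount": 2.5}',
    '{"user_id": 2, "event": "view"}',
]
print(to_csv(flatten(sample)))
```

Once the values live in their own columns, queries can filter and aggregate on them directly instead of calling json_extract_path_text on every row.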

+1

Redshift is a columnar store; storing data in JSON format will not make queries on it faster. That would work with a NoSQL database built on a document model, but not with Redshift. To make Redshift queries efficient, the distribution style of the tables matters (EVEN for the case where the data follows no specific order or is random), relative to the number of nodes in your cluster. In addition, a distribution key on the primary key column (what would be the primary key in an RDBMS model) and a sort key on the same column will help with joins (Redshift will use a merge join instead of a slower hash join).
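The distribution key and sort key described above are declared when the table is created. Here is a minimal sketch that builds such a statement; the table name, column names and types are hypothetical, and build_create_table is an illustrative helper, not part of any Redshift client library. A merge join generally also requires the other table in the join to be distributed and sorted on the join column.

```python
def build_create_table(table, columns, key):
    """Build a Redshift CREATE TABLE statement using `key` as both DISTKEY and SORTKEY.

    `columns` is a list of (name, sql_type) pairs; `key` must be one of the names.
    """
    if key not in dict(columns):
        raise ValueError(f"{key!r} is not one of the declared columns")
    col_defs = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n    {col_defs}\n)\n"
        f"DISTSTYLE KEY\nDISTKEY ({key})\nSORTKEY ({key});"
    )


# Hypothetical table layout, for illustration only.
ddl = build_create_table(
    "events",
    [("user_id", "BIGINT"), ("event", "VARCHAR(32)"), ("amount", "DOUBLE PRECISION")],
    "user_id",
)
print(ddl)
```

For data with no natural join column, DISTSTYLE EVEN (without a DISTKEY) is the option the answer refers to.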

See the documentation for more details. RTFM is your friend here.

-1

Source: https://habr.com/ru/post/1206796/

