Proper use of columnstore indexes

I just found out about the wonders of columnstore indexes and how you can: "Use a columnstore index to achieve up to 10 times better query performance than traditional row-oriented storage, and up to 7x data compression over the uncompressed data size."

With such significant performance gains, is there really any reason NOT to use them?

+6
3 answers

The main disadvantage is that it is difficult to read only part of the index when the query contains a selective predicate. There are ways to do this (partitioning, segment elimination), but they are not easy to implement reliably and do not scale to complex requirements.

For scan-only workloads, columnstore indexes are almost ideal.
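
To illustrate why segment elimination is hard to rely on, here is a minimal Python sketch of the idea. It is a toy model, not SQL Server's implementation: the segment size, the `build_segments` and `scan_equal` helpers, and the sample data are all assumptions made for illustration. Each segment keeps min/max metadata for its column values, and a selective predicate can skip a segment only when the searched value falls outside that range.

```python
def build_segments(column, segment_size):
    """Split one column into segments and record min/max metadata for each."""
    segments = []
    for start in range(0, len(column), segment_size):
        chunk = column[start:start + segment_size]
        segments.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return segments


def scan_equal(segments, target):
    """Return (matches, segments_read), skipping segments whose range excludes target."""
    matches, segments_read = [], 0
    for seg in segments:
        if target < seg["min"] or target > seg["max"]:
            continue  # segment eliminated using metadata only
        segments_read += 1
        matches.extend(v for v in seg["values"] if v == target)
    return matches, segments_read


# Hypothetical data: the same 1000 values, once sorted and once scattered.
sorted_col = list(range(1000))
scattered_col = [(i * 37) % 1000 for i in range(1000)]

matches_sorted, read_sorted = scan_equal(build_segments(sorted_col, 100), 250)
matches_scattered, read_scattered = scan_equal(build_segments(scattered_col, 100), 250)

print("sorted data, segments read:", read_sorted)
print("scattered data, segments read:", read_scattered)
```

In this model, elimination only pays off when the data happens to be ordered on the filtered column; with scattered values, nearly every segment's min/max range covers the target and has to be read anyway.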

+4

Columnstore indexes are especially useful for data warehousing (DW). This means that you will only be updating or deleting data at specific times.

This is due to their special design, which includes delta loading and a number of other features. This video gives a detailed and solid basic overview of exactly how a columnstore index differs.

Traditional

If your application has high I/O (input and output), a columnstore index is not ideal; a traditional row index is better suited to this purpose, because it finds and manipulates individual rows located through the index. An example is an ATM application, which frequently changes values in account data rows.

Columnstore

A columnstore index organizes data by COLUMN, which is not ideal in this case, since the values of a single row are spread across segments (one per column).

I highly recommend the video!

I also want to touch on clustered versus non-clustered columnstore storage:

The non-clustered columnstore index (introduced in SQL Server 2012) stores the WHOLE data again, which means twice the data (2X).

Whereas with a clustered columnstore index (introduced in SQL Server 2014), only about 5 MB is required for approximately 16 GB of data. This is due to RLE (run-length encoding), which stores how many times a value repeats in each column instead of storing every duplicate, so the index takes up less space.
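
As a rough illustration of the run-length encoding idea, here is a minimal Python sketch. It is not how SQL Server stores data internally; the `rle_encode` and `rle_decode` helpers and the sample `status_column` are hypothetical names chosen for this example. Each run of repeated values is collapsed into a (value, count) pair.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of consecutive equal values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    column = []
    for value, count in pairs:
        column.extend([value] * count)
    return column


# Hypothetical column with long runs of duplicates, as is common in DW data.
status_column = ["open"] * 5 + ["closed"] * 3 + ["open"] * 2

encoded = rle_encode(status_column)
print(encoded)  # three pairs instead of ten values
```

The more duplicates sit next to each other in a column, the fewer pairs are needed, which is why columns with few distinct values tend to compress best under this scheme.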

+4

Hello, a very detailed explanation of the columnstore index can be found here.

Columnstore index

A columnstore index is a technology for storing, retrieving, and managing data using a columnar data format called a columnstore.

This feature was introduced with SQL Server 2012 and is intended to significantly speed up the processing of common data warehouse queries. Columnstore indexes are designed for typical data warehouse datasets, and they improve query performance whenever data is extracted from huge datasets.

They are column-based indexes that can transform the data warehouse experience for users, providing better performance for common data warehouse queries, such as filtering, aggregation, grouping, and star-join queries. They store data column by column rather than row by row, as traditional indexes do.
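
The difference between the two layouts can be sketched in a few lines of Python. This is only a conceptual model; the `to_columns` helper and the sample order data are assumptions made for illustration, not anything taken from SQL Server.

```python
# Hypothetical fact table in row-by-row form: each record holds every column.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [row[name] for row in rows] for name in rows[0]}


columns = to_columns(rows)

# An aggregation over one column touches only that column's list.
total_from_columns = sum(columns["amount"])

# The row layout has to walk every full row to produce the same answer.
total_from_rows = sum(row["amount"] for row in rows)

print(total_from_columns, total_from_rows)
```

Aggregations and filters that touch a few columns of a wide table benefit from the columnar layout, while fetching or updating one complete row means visiting every column list, which matches the trade-off described in the answers above.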

+2

Source: https://habr.com/ru/post/971468/

