What is a columnstore index and how is it different from clustered and non-clustered indexes?

Hi, I am confused about the columnstore index. What is a columnstore index, and how does it differ from clustered and non-clustered indexes?

+10
3 answers

The clustered columnstore index is a new feature in SQL Server 2014. A columnstore index lets you store data in a columnar format instead of the traditional row-based storage. Columnstore indexes (non-clustered) were originally introduced in SQL Server 2012 to deliver high query performance on the large data volumes typical of data warehouses and reporting stores.

Key points:

  • It stores data in a columnar structure, which makes reads faster. The data is also stored compressed, so the total I/O cost is much lower.
  • A columnstore is a single structure that holds both the data and the index, in contrast to storing the data in one structure and the indexes in separate ones.
  • It is most useful for wide tables where queries usually select only a few columns. For example, with a ProductSalesFact table you typically ask what the number of sales is for a given product, or what the sales are for a given quarter. Although the table has hundreds of columns, such a query touches only the two columns it needs.

My blog post on columnstore indexes, which includes a performance study on 300 million records comparing columnstore with rowstore:

https://sqlserver101.wordpress.com/2016/01/25/why-clustered-columnstore-index-in-sql-server-2014/

MSDN link covering columnstore indexes across the various SQL Server versions:

https://msdn.microsoft.com/en-us/library/dn934994.aspx

+2

Suppose you have a table like the one below, with col1 as your primary key:

    col1  col2  col3
    1     2     3
    4     5     6

A normal (rowstore) index is stored as shown below, assuming a page can hold only one row:

    row1  1  2  3   --page1-- all columns reside in one page
    row2  4  5  6   --page2

So when you want to compute something like SUM(col3), SQL Server has to read both page1 and page2 to get the values 3 and 6, i.e. a cost of two pages.

Now, with a columnstore index, the same table is stored as below:

    page1  page2  page3
    1      2      3
    4      5      6

Now, if you want the sum of col3, you only need to read one page (page3).

The advantage of columnstore indexes is that you touch only the necessary pages on disk. Memory is also used more efficiently, since you do not read data you do not need.
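The page counts in this example can be reproduced with a small simulation. The following is a minimal Python sketch, not SQL Server's actual storage engine: it assumes, as in the example, that a page holds exactly one row (rowstore) or exactly one column (columnstore). All function names are made up for illustration.

```python
# Toy model of the example table; not how SQL Server really lays out pages.
COLUMNS = ["col1", "col2", "col3"]
ROWS = [(1, 2, 3), (4, 5, 6)]


def rowstore_pages(rows):
    """One page per row: each page holds all columns of that row."""
    return [dict(zip(COLUMNS, row)) for row in rows]


def columnstore_pages(rows):
    """One page per column: each page holds all values of that column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def sum_column_rowstore(pages, column):
    """Return (sum, pages_read); every row page has to be read."""
    total = 0
    pages_read = 0
    for page in pages:
        pages_read += 1
        total += page[column]
    return total, pages_read


def sum_column_columnstore(pages, column):
    """Return (sum, pages_read); only the requested column's page is read."""
    return sum(pages[column]), 1


row_result = sum_column_rowstore(rowstore_pages(ROWS), "col3")
col_result = sum_column_columnstore(columnstore_pages(ROWS), "col3")
print(row_result)  # (9, 2)
print(col_result)  # (9, 1)
```

Both layouts return the same sum, 9, but the rowstore model reads two pages while the columnstore model reads one.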

+15

The columnstore index is explained very well here: http://www.patrickkeisler.com/2014/04/what-is-non-clustered-columnstore-index.html

The traditional clustered and non-clustered indexes you mentioned are rowstore indexes, where the database stores the index row by row. The index is spread across many pages, so even when we select only one column, the database still has to scan all of those pages to get the data and therefore does a lot of I/O.

A columnstore index, on the other hand, stores the index column by column. Usually all the data for a column is stored together in a small number of pages, since the data of a single column is not that large when combined. Now, when we select one column from the index, the database can return the data from just those pages, which reduces the number of I/O operations. Moreover, a columnstore index often achieves significant compression, so I/O is even more efficient and the entire index may fit in memory, which can make queries 10x to 100x faster.
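To see why a single column tends to compress well, here is a minimal Python sketch using run-length encoding. This is one simple technique chosen for illustration and is not meant to describe SQL Server's actual compression; the `quarter` column and the `run_length_encode` function are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical column from a sales fact table: long runs of repeated values.
quarter = ["Q1"] * 4 + ["Q2"] * 3 + ["Q3"] * 5

encoded = run_length_encode(quarter)
print(encoded)                     # [('Q1', 4), ('Q2', 3), ('Q3', 5)]
print(len(quarter), len(encoded))  # 12 3
```

Values in one column share a type and often repeat, so storing them together gives a scheme like this more to work with than a row that mixes unrelated columns.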

A columnstore index is not always better than a rowstore index. Columnstore indexes suit scenarios such as data warehousing and BI, where data is usually processed in bulk, for example for aggregates. However, they perform worse than rowstore indexes in scenarios where data is mostly looked up as individual rows.
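This trade-off can be sketched with a toy page model. The Python below is an illustration only: it assumes one page per row for rowstore and one page per column for columnstore, and it ignores index seeks, compression, and caching. The helper names and table sizes are hypothetical.

```python
def pages_for_row_lookup(layout, n_rows, n_cols):
    """Pages touched to fetch every column of a single row."""
    # Rowstore: the whole row sits on one page.
    # Columnstore: the row's values are spread over one page per column.
    return 1 if layout == "rowstore" else n_cols


def pages_for_column_aggregate(layout, n_rows, n_cols):
    """Pages touched to aggregate one column over all rows."""
    # Rowstore: every row page is read.
    # Columnstore: only the page of the requested column is read.
    return n_rows if layout == "rowstore" else 1


n_rows, n_cols = 1000, 100  # illustrative sizes, not measurements
for layout in ("rowstore", "columnstore"):
    print(layout,
          pages_for_row_lookup(layout, n_rows, n_cols),
          pages_for_column_aggregate(layout, n_rows, n_cols))
# rowstore 1 1000
# columnstore 100 1
```

In this model the single-row lookup favors rowstore and the aggregate favors columnstore, which matches the guidance of choosing by workload.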

It is worth noting that in SQL Server 2012 and 2014 a non-clustered columnstore index blocks modification of your table (although there are workarounds for modifying the data), while a clustered columnstore index still lets you edit data without dropping or disabling the index.

See the article above for more information on this topic, and try reading the MSDN docs.

+3

Source: https://habr.com/ru/post/1257638/

