Should I store large data types in a separate table?

This is an old question, and I don't think it has a single correct answer, but I would like to hear advice from more experienced people.

Using SQL Server 2008 R2.

I have a table that will hold millions of rows. Most columns describe (date, status, name, ...) the data stored in the varbinary(max) columns. There are also two columns of the xml data type; these XML values are small and will be queried frequently. So:

    MyTable (
        SomeID             varchar(20)     -- queried most often
        Date               DateTime        -- queried most often
        Status             VarChar(10)     -- queried most often
        Title              VarChar(50)     -- queried most often
        -- some more columns here
        SomeSmallXML       xml             -- queried quite often
        SomeOtherSmallXML  xml             -- queried quite often
        MyData             varbinary(max)  -- queried rarely
        MyOtherData        varbinary(max)  -- queried rarely
    )

If I move all large value types to another table:

  • I can rebuild the clustered index online. But then I would need to move the xml columns to another table as well, and since they are queried often, that does not seem sensible. (I expect fragmentation because the SomeID column comes from the client application. It does not seem wise to add another surrogate key just to serve as the clustered index, so SomeID will be the clustered index key.)
  • I can move the large data to slower storage. But I think I can achieve the same thing by partitioning the table (old data in a slow filegroup) and keeping the indexes on fast storage.
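
To illustrate the first point, here is a minimal sketch of the two rebuild variants. The index name PK_MyTable is an assumption (the question does not name the clustered index), and the ONLINE option requires an edition of SQL Server that supports online index operations.

```sql
-- PK_MyTable is a hypothetical name for the clustered index on SomeID.
-- On SQL Server 2008 R2 an online rebuild of the clustered index is rejected
-- while the table still contains varbinary(max) or xml columns.
ALTER INDEX PK_MyTable ON MyTable
REBUILD WITH (ONLINE = ON);

-- The offline rebuild works regardless of LOB columns,
-- but the table is unavailable for the duration of the operation.
ALTER INDEX PK_MyTable ON MyTable
REBUILD;
```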

In this case, I see very good reasons to move large data types to another table. I see a reason to set sp_tableoption N'MyTable', 'large value types out of row', 'ON'.
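
For reference, a sketch of what that call looks like as a full statement, using the table name from the question. Two points worth keeping in mind: the option applies to the whole table, so it affects the xml columns as well as the varbinary(max) ones, and existing values are not relocated until they are next updated.

```sql
-- Store large value types (varbinary(max), xml, ...) out of row,
-- leaving only a 16-byte pointer in the data row.
EXEC sp_tableoption N'MyTable', 'large value types out of row', 'ON';

-- Revert to the default: values that fit are kept in the data row.
EXEC sp_tableoption N'MyTable', 'large value types out of row', 'OFF';
```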

What is your advice? What else should I consider?

1 answer

After discussing this with colleagues, I made a decision: I split SOME of the LOB data (together with the SomeID and Date columns) out into a separate, partitioned table.

Most importantly, I had neglected to consider how often the columns are updated, how often the data is queried, and when the data becomes old enough that it is no longer of interest in the vast majority of cases (but not all).

And that is what makes the difference in this case.

So, I came up with:

    MyTable (
        SomeID             varchar(20)     -- queried most often  / Updated never
        Date               DateTime        -- queried most often  / Updated never
        Status             VarChar(10)     -- queried most often  / Updated few times after insert
        Title              VarChar(50)     -- queried most often  / Updated never
        -- some more columns here
        SomeSmallXML       xml             -- queried quite often / Updated few times after insert
        SomeOtherSmallXML  xml             -- queried quite often / Updated never
        MyData             varbinary(max)  -- queried rarely      / Updated never
        MyOtherData        varbinary(max)  -- queried rarely      / Updated 1 shortly after insert
    )

So, as you can see, some of the LOB data (MyData and MyOtherData, both varbinary(max)) becomes static after a short time. These values are fairly large, so I would like to store them on cheap disk and at some point move them to a read-only partition. The older the date, the less often I need MyData or MyOtherData.

So, the final design looks something like this:

    MyTable (
        SomeID             varchar(20)
        Date               DateTime
        Status             VarChar(10)
        Title              VarChar(50)
        -- some more columns here
        SomeSmallXML       xml
        SomeOtherSmallXML  xml
    )

    MyTableLOB (
        SomeID             varchar(20)
        Date               DateTime        -- used for partitioning
        MyData             varbinary(max)
        MyOtherData        varbinary(max)
    )
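
One way the LOB table could be declared is sketched below. This is not the exact DDL I used: the partition boundaries, the filegroup names (FG_Archive, FG_Current), the object names and the choice of primary key are all assumptions made for illustration.

```sql
-- Hypothetical yearly boundaries; adjust to the real data retention pattern.
CREATE PARTITION FUNCTION pfLobByDate (datetime)
AS RANGE RIGHT FOR VALUES ('20110101', '20120101');

-- FG_Archive (slow, cheap disk) and FG_Current are hypothetical filegroups
-- that must already exist in the database.
CREATE PARTITION SCHEME psLobByDate
AS PARTITION pfLobByDate
TO (FG_Archive, FG_Archive, FG_Current);

-- The partitioning column must be part of the unique clustered key.
CREATE TABLE MyTableLOB (
    SomeID      varchar(20)    NOT NULL,
    [Date]      datetime       NOT NULL,  -- used for partitioning
    MyData      varbinary(max) NULL,
    MyOtherData varbinary(max) NULL,
    CONSTRAINT PK_MyTableLOB PRIMARY KEY CLUSTERED (SomeID, [Date])
) ON psLobByDate ([Date]);
```

Once the data in an old filegroup is no longer modified, that filegroup can be marked read-only with ALTER DATABASE ... MODIFY FILEGROUP ... READ_ONLY.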

Source: https://habr.com/ru/post/1402200/

