What is the advantage of putting a varbinary column in a separate 1-1 table?

I need to store binary files in a varbinary(max) column in SQL Server 2005, as follows:

FileInfo

  • FileInfoId int, PK, identity
  • FileText varchar(max) (can be null)
  • FileCreatedDate datetime, etc.

FileContent

  • FileInfoId int, PK, FK
  • FileContent varbinary(max)

FileInfo has a one-to-one relationship with FileContent. FileText is meant to be used when there is no file to upload and only text is entered manually for an item. I'm not sure what percentage of items will have a binary.

Should I create the second table? Will there be any performance improvement with the two-table design? Are there any logical benefits?

I found this page, but I'm not sure whether it applies in my case.

3 answers

There are no performance or operational benefits. Since SQL Server 2005, LOB types are already stored for you by the engine in a separate allocation unit, a separate B-tree. If you study Table and Index Organization in SQL Server, you will see that each partition has up to three allocation units: data, LOB, and row-overflow:

Table organization

A LOB field (varchar(max), nvarchar(max), varbinary(max), XML, CLR UDTs, as well as the deprecated types text, ntext and image) has only a very small footprint in the data record itself, in the clustered index: a pointer to the LOB allocation unit; see Anatomy of a Record.

By storing the LOB explicitly in a separate table, you gain absolutely nothing. You just add unnecessary complexity, because updates that were previously atomic now have to be spread across two separate tables, which complicates both the application and its transaction structure.
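The extra transaction handling that a two-table layout imposes can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for SQL Server so the example runs anywhere, it says nothing about how SQL Server stores LOBs, and save_file is a hypothetical helper, not something from the question.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
# Table and column names follow the question; save_file is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FileInfo (
        FileInfoId      INTEGER PRIMARY KEY,
        FileText        TEXT NULL,
        FileCreatedDate TEXT NOT NULL
    );
    CREATE TABLE FileContent (
        FileInfoId  INTEGER PRIMARY KEY REFERENCES FileInfo(FileInfoId),
        FileContent BLOB NOT NULL
    );
""")


def save_file(conn, created, content):
    """Insert the header row and its content row as one unit of work."""
    # The connection context manager commits if the block succeeds and
    # rolls back if either INSERT raises, so no orphan header is left.
    with conn:
        cur = conn.execute(
            "INSERT INTO FileInfo (FileText, FileCreatedDate) VALUES (NULL, ?)",
            (created,),
        )
        conn.execute(
            "INSERT INTO FileContent (FileInfoId, FileContent) VALUES (?, ?)",
            (cur.lastrowid, content),
        )
        return cur.lastrowid
```

With a single table the same save is one INSERT and is atomic by itself; with two tables, every write path in the application has to remember the surrounding transaction.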

If the LOB content is an entire file, then perhaps you should consider upgrading to SQL Server 2008 and using FILESTREAM.


There is no real logical advantage to this two-table design: since the relationship is 1-1, you could keep all the information in the FileInfo table. However, there are serious operational and performance advantages, in particular if your binary data averages more than a few hundred bytes.

EDIT: As Remus Rusanu noted, in some DBMS implementations, such as SQL Server 2005, large object types are transparently stored separately from the rest of the row, which effectively removes the practical drawback of having large records. The very existence of this feature implicitly confirms the weakness of a [true] single-table approach.

I only skimmed the SO post referenced in the question. That post makes a few valid points, such as intrinsic data integrity (since all CRUD actions on a given item are atomic). On the whole, though, and barring relatively atypical use cases (such as using the table as a repository that is mostly queried for single items, one at a time), the performance advantage lies with the two-table approach: indexes on the "header" table will be more efficient, queries that do not need the binary data will return much faster, and so on.

The two-table approach has a further benefit if the design evolves to supply different types of binary objects in different contexts. For example, say these items are images (GIF, JPG, etc.). At a later date you may also want to provide a small preview version of these images (and/or a high-resolution version), with the choice depending on context (user preference, low-bandwidth clients, subscriber vs. visitor, etc.). In that case, not only do the operational issues of the single-table approach become more acute, but the two-table model also becomes more versatile.
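One way to picture that evolution is a content table keyed by file and variant instead of by file alone. The sketch below is an assumption-laden illustration rather than the answerer's design: it uses Python's sqlite3 as a stand-in for SQL Server, and the Variant column, its labels ("thumbnail", "full") and both helper functions are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server. The Variant column and its
# labels are hypothetical additions; they do not appear in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FileInfo (
        FileInfoId INTEGER PRIMARY KEY,
        FileText   TEXT NULL
    );
    CREATE TABLE FileContent (
        FileInfoId  INTEGER NOT NULL REFERENCES FileInfo(FileInfoId),
        Variant     TEXT    NOT NULL,
        FileContent BLOB    NOT NULL,
        PRIMARY KEY (FileInfoId, Variant)
    );
""")


def add_variant(conn, file_id, variant, content):
    """Store one rendition of a file; (file_id, variant) must be unique."""
    with conn:
        conn.execute(
            "INSERT INTO FileContent (FileInfoId, Variant, FileContent) "
            "VALUES (?, ?, ?)",
            (file_id, variant, content),
        )


def fetch_variant(conn, file_id, variant):
    """Return the bytes of the requested rendition, or None if absent."""
    row = conn.execute(
        "SELECT FileContent FROM FileContent "
        "WHERE FileInfoId = ? AND Variant = ?",
        (file_id, variant),
    ).fetchone()
    return row[0] if row else None


with conn:
    file_id = conn.execute(
        "INSERT INTO FileInfo (FileText) VALUES (NULL)"
    ).lastrowid
add_variant(conn, file_id, "thumbnail", b"small-bytes")
add_variant(conn, file_id, "full", b"full-size-bytes")
```

The header table is unchanged; only the key of the content table grows, which is the kind of flexibility a single wide table cannot offer without adding one binary column per variant.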


It can help to separate IMAGE, (N)TEXT, (N)VARCHAR(max) and VARBINARY(max) columns out of wider tables, purely because of certain SQL Server restrictions.

For example, before SQL Server 2012 it was not possible to rebuild a clustered index online if the table contained LOB columns. On the other hand, you may not care about these restrictions, in which case laying out the tables the way your data is naturally shaped is the better choice.

If you want the LOB data physically stored outside the table's in-row allocation unit, you can still set the "large value types out of row" table option (via sp_tableoption).


Source: https://habr.com/ru/post/904932/