Considerations for where to store documents: on a file server or in a database?

I have a design decision to make about documents uploaded to my website: I can either store them somewhere on my file server, or store them as blobs in my database (MSSQL 2005). In case it matters for the design, these documents are confidential and must have a certain degree of protection.

The considerations I was thinking of are as follows:

  • Saving to a file server means a HUGE number of files all dumped in one directory and, therefore, slower access, unless I can come up with a sensible semantic definition for the directory tree structure
  • OTOH, I assume the file server can handle compression somewhat better than the DB... or am I wrong?
  • My instincts tell me that database security is stronger than file server security, but I'm not sure that's necessarily true.
  • I don't know how having terabytes of blobs in my database will affect performance.

I'd really appreciate some recommendations here. Thanks!

+4
3 answers

In SQL Server 2005, your only choices are to store files inside a database table using VARBINARY(MAX), or to keep them outside the database.

The obvious drawback of leaving them outside the database is that the database can't really control what happens to them; they can be moved, renamed, deleted...

SQL Server 2008 introduces the FILESTREAM attribute on VARBINARY(MAX) types, which lets you leave the files outside the database table while keeping them under the transactional control of the database. For example, you cannot just delete the files from disk; the files are an integral part of the database and thus get backed up and restored with it. Great if you need it, but it can make for some huge backups! :-)

The SQL Server 2008 release came with some "best practices" on when to store data directly in the database and when to use FILESTREAM. They are:

  • If the files are typically smaller than 256 KB, a database table is the best choice
  • If the files are typically larger than 1 MB, or could be larger than 2 GB, then FILESTREAM (or in your case: the plain old file system) is your best bet
  • There is no recommendation for files between those two thresholds
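
As a rough illustration, the thresholds above could be encoded like this. This is only a sketch: the function name `choose_storage` and the returned labels are made up for the example, and it assumes KB and MB mean 1024-based units.

```python
KB = 1024
MB = 1024 * KB


def choose_storage(typical_size_bytes: int) -> str:
    """Map a typical file size to the guidance quoted above.

    Hypothetical helper: the name and the returned labels are
    illustrative, not part of any SQL Server API.
    """
    if typical_size_bytes < 256 * KB:
        return "database table"
    if typical_size_bytes > 1 * MB:
        return "FILESTREAM or file system"
    # Between 256 KB and 1 MB the guidance gives no recommendation.
    return "no recommendation; measure both"
```

For sizes in the middle band, the only safe answer is to test both options against your own workload.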

In addition, so as not to hurt the performance of your queries, it is often a good idea to put the large files into a separate table altogether. Don't make the huge blobs part of the regular tables that you query; instead, create a separate table that you only ever query when you really need the megabytes of documents or images.
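
The separate-table idea might look roughly like the sketch below. It uses Python's built-in `sqlite3` purely as a stand-in so the example is runnable; on SQL Server the `content` column would be VARBINARY(MAX). The table names `documents` and `document_blobs` and the helper functions are assumptions for illustration.

```python
import sqlite3


def build_schema(conn: sqlite3.Connection) -> None:
    # Metadata lives in one table, the heavy blob in another,
    # linked one-to-one by the document id.
    conn.executescript("""
        CREATE TABLE documents (
            id          INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            uploaded_by TEXT NOT NULL
        );
        CREATE TABLE document_blobs (
            document_id INTEGER PRIMARY KEY REFERENCES documents(id),
            content     BLOB NOT NULL
        );
    """)


def add_document(conn, title: str, uploaded_by: str, content: bytes) -> int:
    cur = conn.execute(
        "INSERT INTO documents (title, uploaded_by) VALUES (?, ?)",
        (title, uploaded_by),
    )
    doc_id = cur.lastrowid
    conn.execute(
        "INSERT INTO document_blobs (document_id, content) VALUES (?, ?)",
        (doc_id, content),
    )
    conn.commit()
    return doc_id


def list_titles(conn) -> list:
    # Everyday queries touch only the small metadata table.
    return [row[0] for row in conn.execute("SELECT title FROM documents ORDER BY id")]


def load_content(conn, doc_id: int) -> bytes:
    # The blob table is read only when the bytes are really needed.
    row = conn.execute(
        "SELECT content FROM document_blobs WHERE document_id = ?", (doc_id,)
    ).fetchone()
    return row[0]
```

Listing or searching documents then never drags the blobs along; only `load_content` pays that cost.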

So that might give you an idea of where to start!

+7

I highly recommend you consider a file system solution. The reasons are as follows:

  • you have better access to the files (valuable when debugging), which means you can use regular console tools
  • you can quickly and easily use the OS to distribute the load, for example with a distributed file system, or add redundancy through hardware RAID, etc.
  • you can use OS access control lists to enforce access rights
  • you don't clutter your database

If you are concerned about having a large number of entries in your directories, you can always create a branching scheme, e.g.:

 filename: hello.txt
 filename md5: 2e54144ba487ae25d03a3caba233da71
 final filesystem position: /path/2e/54/hello.txt

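
A minimal sketch of such a branching scheme, assuming (as in the example above) that the hash is taken over the file name and that the first two pairs of hex digits become directory levels. The function name `sharded_path` and the `levels` parameter are made up for illustration.

```python
import hashlib
import posixpath


def sharded_path(root: str, filename: str, levels: int = 2) -> str:
    """Build root/aa/bb/filename from the MD5 hex digest of the file name.

    MD5 is used here only to spread files evenly across directories,
    not for any security purpose.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Each directory level consumes two hex characters of the digest.
    shards = [digest[i * 2:i * 2 + 2] for i in range(levels)]
    return posixpath.join(root, *shards, filename)
```

With two levels of two hex characters each, files are spread over at most 256 × 256 = 65,536 directories. Note that hashing the name alone means two uploads with the same name map to the same path, so in practice you may prefer to hash a unique document ID instead.
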
+3

There is a lot of "it depends" in this popular question. Since you say the documents are sensitive and confidential, off the cuff I would go with storing them in the database. Here are a few reasons:

  • Potentially better security. It is often easier to hack a file system than a database.
  • Better volume handling. Thousands of files in one folder can strain the OS, whereas a database can hold millions of rows in one table without blinking.
  • Better search and scanning. Add classification columns when you load the data, or try full-text indexing to search the documents themselves.
  • Backups can be more efficient: just add another database to the backup plan and you are covered (once you have dealt with the space requirements, of course). And those backup files are another layer of obfuscation for anyone trying to get at your sensitive documents.
  • SQL Server 2008 has data compression options that could help here. Or the application could compress the files itself? (Potentially more security through obfuscation.)
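
If the application does the compression itself, one possible shape is the sketch below, which uses Python's standard `zlib` module. The helper names `pack` and `unpack` are hypothetical; the idea is simply to compress the bytes before writing them to the blob column and decompress them after reading.

```python
import zlib


def pack(content: bytes, level: int = 6) -> bytes:
    """Compress document bytes before they are stored as a blob."""
    return zlib.compress(content, level)


def unpack(stored: bytes) -> bytes:
    """Restore the original document bytes read back from the blob."""
    return zlib.decompress(stored)
```

How much this saves depends heavily on the file type: plain text shrinks well, while formats that are already compressed (JPEG, ZIP, most modern office formats) barely change. Compression on its own does not protect the contents, so it would not replace encryption or access control.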

SQL Server 2008 also has the FILESTREAM feature, which could help here, but I'm not familiar enough with it to make a recommendation for your situation.

+1

Source: https://habr.com/ru/post/1300318/

