Bulk insert in SQL Server: multiple INSERT statements or one XML insert statement?

From the application layer (ColdFusion), I need to insert multiple rows into SQL Server 2005.

I was thinking of using a loop in the application layer to build multiple INSERT statements and send them to SQL Server via JDBC over a single connection.

My colleague, however, suggests building an XML document and doing an XML bulk insert instead.

Which is the better method?

+4
3 answers

For inserts of millions of rows, I have used BULK INSERT, writing the data to a CSV file that the SQL Server instance can read. This is superior to any kind of insert through JDBC, but at the cost of reduced flexibility. For fewer rows, JDBC's Statement.addBatch() and Statement.executeBatch() can be used to avoid the overhead of sending many small commands.
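As a rough sketch of the BULK INSERT route: the table name dbo.someTable and the file path C:\data\rows.csv below are hypothetical placeholders, and the file is assumed to hold plain comma-separated values with no quoted fields.

```sql
-- Hypothetical table and file path; adjust both to your environment.
-- The path is resolved on the SQL Server machine, not on the client,
-- so the SQL Server instance must be able to read the file.
BULK INSERT dbo.someTable
FROM 'C:\data\rows.csv'
WITH (
    FIELDTERMINATOR = ',',   -- column separator used in the file
    ROWTERMINATOR   = '\n',  -- one row per line
    TABLOCK                  -- take a table-level lock for the duration of the load
);
```

The columns in the file are assumed to line up with the table's columns in order; if they do not, a format file or a staging table is the usual workaround.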

Depending on your requirements, you may have to wrap all of this in one transaction, or you can split it into several transactions if full ACID guarantees for the entire data set are not required.

Here is an article that discusses XML bulk insertion. I have no data to base a conclusion on, but my guess is that BULK INSERT of raw row data would be faster, since no OPENXML conversion is needed. Of course, if your data is already in XML, then the XML route makes sense; if not, staying with tabular data is probably the easiest and possibly the most efficient.

+4

XML would be the better approach, IMO, although I would also recommend creating a stored procedure to handle the actual processing instead of running an inline query.

Basically, you send the XML data as a single argument, then inside the SP you run an INSERT INTO ... SELECT statement that selects from the XML into some table or group of tables.

    DECLARE @FOO xml;
    SET @FOO = '<things><thing><id>1</id></thing><thing><id>2</id></thing><thing><id>3</id></thing><thing><id>4</id></thing></things>';

    SELECT ParamValues.id.value('.', 'int') AS thing_id
    FROM @FOO.nodes('/things/thing/id') AS ParamValues(id)

This returns a result set with a single column, "thing_id". Now all you have to do is something like

    INSERT INTO someTable (someID)
    SELECT ParamValues.id.value('.', 'int') AS thing_id
    FROM @FOO.nodes('/things/thing/id') AS ParamValues(id)

and you've got a single INSERT to handle however many rows of XML you have.
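Wrapped in a stored procedure, as recommended at the top of this answer, it might look roughly like the sketch below. The procedure name dbo.InsertThings and the parameter name @things are made up for illustration; someTable and someID are the placeholders used in the statement above.

```sql
-- Hypothetical procedure: takes the whole XML document as one argument
-- and inserts one row per /things/thing/id element.
CREATE PROCEDURE dbo.InsertThings
    @things xml
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO someTable (someID)
    SELECT ParamValues.id.value('.', 'int')
    FROM @things.nodes('/things/thing/id') AS ParamValues(id);
END
GO

-- Example call with the same document shape as the @FOO sample.
EXEC dbo.InsertThings
    @things = '<things><thing><id>1</id></thing><thing><id>2</id></thing></things>';
```

GO is the batch separator understood by tools such as SQL Server Management Studio and sqlcmd, not a T-SQL statement; from application code, the CREATE PROCEDURE and the EXEC would be sent as separate commands.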

+2

Sending one INSERT at a time will be unnecessarily slow because of the network latency of each round trip.

The best way is to group several inserts into one batch, send them over the network in a single round trip, and commit them as a whole.

If you have a very large number of records, consider a hybrid approach: loop over several batches and commit each one as a unit of work. This keeps any single transaction from growing too large, so you do not force your database to maintain a large rollback log for the entire load.
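A minimal sketch of one such unit of work, assuming a hypothetical table someTable with an integer column someID: the whole text is sent to the server as one command, so it costs one round trip, and the application would repeat it with the next chunk of rows.

```sql
-- One batch = one network round trip = one transaction.
-- someTable and someID are placeholders for illustration.
SET XACT_ABORT ON;  -- a run-time error rolls back the whole transaction

BEGIN TRANSACTION;
    INSERT INTO someTable (someID) VALUES (1);
    INSERT INTO someTable (someID) VALUES (2);
    INSERT INTO someTable (someID) VALUES (3);
    -- ... more rows, up to the chosen batch size ...
COMMIT TRANSACTION;
```

The batch size is a tuning choice: larger batches mean fewer round trips, smaller ones mean less log held open per transaction.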

I'm not a fan of the XML solution if it means inserting a raw XML stream as a CLOB. How will you query it once it is in the database? You lose what SQL gives you: the ability to query. If you store raw XML, all you can do is use XPath to pull out certain values. And updates mean replacing the entire CLOB.

+1

Source: https://habr.com/ru/post/1309918/

