SQL Server BULK INSERT - Insert DateTime Values

I have 6 million rows of data that I want to insert into my SQL Server database. I can do it the slow way with 6 million INSERT statements (I estimate it will take 18 hours), or I can try BULK INSERT.

BULK INSERT has its issues (it cannot escape characters), but the data in this case is simple enough that this should not be a problem.

However, SQL Server does not seem to want to insert any of my date/time data into the datetime fields.

Here is the table (pseudo-SQL):

CREATE TABLE Tasks (
    TaskId bigint NOT NULL IDENTITY(1,1) PRIMARY KEY,
    TriggerId bigint NOT NULL FOREIGN KEY,
    Created datetime NOT NULL,
    Modified datetime NOT NULL,
    ScheduledFor datetime NULL,
    LastRan datetime NULL,
    -- and about 10 more fields after this
)

Here is my BULK INSERT statement:

SET DATEFORMAT dmy

BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    -- CHECK_CONSTRAINTS is not necessary, as the only constraints on the table
    -- (UNIQUE, PRIMARY KEY, and NOT NULL) are always enforced regardless of this option
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'native',
    KEEPIDENTITY,
    MAXERRORS = 1,
    ORDER ( CallId ASC ),
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\0'
)

And here is the first row of data in TasksBulk.dat:

 1000\t1092\t01/01/2010 04:00:17\t01/01/2010 04:00:17\t\t01/01/2010 04:00:14\0 

(For readability, here is the same row with the tabs replaced by four spaces:)

 1000 1092 01/01/2010 04:00:17 01/01/2010 04:00:17 01/01/2010 04:00:14\0 

However, when I run the BULK INSERT statement, I get this error:

Msg 4864, Level 16, State 1, Line 2
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 3 (Created).

I have tried different field and row terminators and every other date/time format (including "01/01/2010" and "2010-01-01", with and without the "04:00:17" component). I do not know what I am doing wrong here.

3 answers

It turns out that changing DATAFILETYPE from 'native' to 'char' solved the problem. The 'native' type expects every column in SQL Server's native binary format, while 'char' is meant for plain-text files.
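
For reference, a minimal sketch of the statement from the question with only that option changed; the file path and remaining options are carried over from the question as-is, and the ORDER hint is omitted for brevity:

SET DATEFORMAT dmy

BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'char',   -- plain-text fields rather than SQL Server's native binary format
    KEEPIDENTITY,
    MAXERRORS = 1,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\0'
)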


You have set CODEPAGE to RAW (presumably for speed).

The error message suggests that your data contains characters outside that code page.

 CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ] 

Specifies the code page of the data in the data file. CODEPAGE is relevant only if the data contains char, varchar, or text columns with character values greater than 127 or less than 32.

But this may be misleading. Your example data row has a missing column. If you are not using a format file, you must supply a value for every field in the table.

So you could either create a format file, or create a staging table with varchar(25) columns in place of the datetime columns, bulk import into it, and then insert from the staging table into the destination table. That way you have more control over the conversions and over missing data.
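
A minimal sketch of the staging-table route, assuming a hypothetical TasksStaging table that mirrors only the columns shown in the question (style 103 in CONVERT corresponds to the dd/mm/yyyy format used in the data file):

-- Hypothetical staging table: datetime columns loaded as plain text
CREATE TABLE TasksStaging (
    TaskId bigint NOT NULL,
    TriggerId bigint NOT NULL,
    Created varchar(25) NOT NULL,
    Modified varchar(25) NOT NULL,
    ScheduledFor varchar(25) NULL,
    LastRan varchar(25) NULL
    -- ...remaining fields from the data file
);

BULK INSERT TasksStaging
FROM 'C:\TasksBulk.dat'
WITH (
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\0'
);

-- Convert the text dates while copying into the real table
-- (style 103 = dd/mm/yyyy; NULLIF keeps empty fields from becoming 1900-01-01)
SET IDENTITY_INSERT Tasks ON;
INSERT INTO Tasks (TaskId, TriggerId, Created, Modified, ScheduledFor, LastRan) -- plus the remaining columns
SELECT TaskId,
       TriggerId,
       CONVERT(datetime, Created, 103),
       CONVERT(datetime, Modified, 103),
       CONVERT(datetime, NULLIF(ScheduledFor, ''), 103),
       CONVERT(datetime, NULLIF(LastRan, ''), 103)
FROM TasksStaging;
SET IDENTITY_INSERT Tasks OFF;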


A method I'm familiar with is inserting your dates as integers.

I use the number of seconds since a reference date (I picked one about 10 years in the past, since none of the data I access or generate is older than that).

The date 2012-01-02 12:15:10.000 would be stored as 378821710 using 1 January 2000 as the reference point.

When querying the database, you can convert the column back with DATEADD(SS, column_name, '2000-01-01').

You could do this in milliseconds if such precision was needed.

I use a custom function to convert the stored seconds into whatever format I want, and another custom function to turn dates back into seconds.
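
As an illustration only, here is a minimal sketch of what those two helpers might look like in T-SQL, using the built-in DATEDIFF and DATEADD functions and a reference date of 1 January 2000; the function names are made up for this example:

-- Hypothetical helper: datetime -> seconds since the reference date
CREATE FUNCTION dbo.DateToSeconds (@d datetime)
RETURNS int
AS
BEGIN
    RETURN DATEDIFF(SECOND, '2000-01-01', @d);
END;
GO

-- Hypothetical helper: seconds since the reference date -> datetime
CREATE FUNCTION dbo.SecondsToDate (@s int)
RETURNS datetime
AS
BEGIN
    RETURN DATEADD(SECOND, @s, '2000-01-01');
END;
GO

-- Example round trip
SELECT dbo.DateToSeconds('2012-01-02 12:15:10'),   -- 378821710
       dbo.SecondsToDate(378821710);               -- 2012-01-02 12:15:10.000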

I realize this may not be a good fit here, since it would require schema and code changes on your side, but the concept may be useful to others.


Source: https://habr.com/ru/post/895513/

