Sql Bulk inserts XML file with double quotes in terminator

I am trying to insert some data into a table from a csv document that has all fields separated by ""

t

APPLICANTID,NAME,CONTACT,PHONENO,MOBILENO,FAXNO,EMAIL,ADDR1,ADDR2,ADDR3,STATE,POSTCODE "3","Snoop Dogg","Snoop Dogg","411","","","","411 High Street","USA ","","USA", "1111" "4","LL Cool J","LL Cool J","","","","","5 King Street","","","USA","1111" 

I am using an xml file to try to overcome the "delimiters" because, in my opinion, I will have to update the data again after import to remove inital "if it is not.

My format file is as follows:

 <?xml version="1.0"?> <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <RECORD> <FIELD ID="1" xsi:type="NCharTerm" TERMINATOR='",' MAX_LENGTH="12"/> <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="4" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="5" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="6" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="7" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="8" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="9" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="10" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="11" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/> <FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="Latin1_General_CI_AS"/> </RECORD> <ROW> <COLUMN SOURCE="1" NAME="APPLICANTID" xsi:type="SQLINT"/> <COLUMN SOURCE="2" NAME="NAME" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="3" NAME="CONTACT" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="4" NAME="PHONENO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="5" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="6" NAME="FAXNO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="7" NAME="EMAIL" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="8" NAME="ADDR1" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="9" NAME="ADDR2" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="10" NAME="ADDR3" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="11" NAME="STATE" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="12" NAME="POSTCODE" xsi:type="SQLCHAR"/> </ROW> </BCPFORMAT> 

and I start the import with the following:

 BULK INSERT [PracticalDB].dbo.applicant FROM 'C:\temp.csv' WITH (KEEPIDENTITY, FORMATFILE='C:\temp.xml', FIRSTROW = 2) 

I get an error message:

Msg 4864, Level 16, State 1, Line 1 Bulk data conversion error (type mismatch or invalid character for the specified code page) for line 2, column 1 (APPLICANTID).

for all lines.

I have tried various combinations for the terminator, including the use of:

 TERMINATOR="&quot;," TERMINATOR="\"," TERMINATOR='"," TERMINATOR='\"," 

and none of them work.

Is there a right way to avoid "so that it is correctly understood, assuming this is my problem here.

+6
source share
3 answers

Ok, so I figured it out!

You can use "instead" when you define xml attributes, i.e. TERMINATOR = '', then you can use "inside them without worrying."

I also needed to eat the first one with a field so that other columns could be parsed correctly. It ended up with a format file

 <?xml version="1.0"?> <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <RECORD> <FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"' /> <FIELD ID="2" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="3" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="4" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="5" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="6" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="7" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="8" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="9" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="10" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="11" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="12" xsi:type="CharTerm" TERMINATOR='","' /> <FIELD ID="13" xsi:type="CharTerm" TERMINATOR='"\r\n' /> </RECORD> <ROW> <COLUMN SOURCE="2" NAME="APPLICANTID" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="3" NAME="NAME" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="4" NAME="CONTACT" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="5" NAME="PHONENO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="6" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="7" NAME="FAXNO" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="8" NAME="EMAIL" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="9" NAME="ADDR1" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="10" NAME="ADDR2" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="11" NAME="ADDR3" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="12" NAME="STATE" xsi:type="SQLNVARCHAR"/> <COLUMN SOURCE="13" NAME="POSTCODE" xsi:type="SQLCHAR"/> </ROW> </BCPFORMAT> 

In cases where the first field simply throws one to remove the first "and the remaining fields, all separate by", "and the final division by" (new line)

+15
source

Tip. If only some of the fields are doubleqouted, use the openrowset version of the bulk insert, and you can control the contents of the field coming from the input file before inserting it into the target table.

In manipulation, you can do something with the contents of the field, for example. remove double quotes. The performance impact is not mentioned here, I have no measures in this regard.

+2
source

Hint: if your CSV file does not have a consistent format, for example, in ONE Column, some of the values ​​are doubleqouted, and some, besides this blog, will help you to do this in a simple way (this is a continuation of Estevez how to use openrowset - this is only the last step) : //ariely.info/Blog/tabid/83/EntryId/122/Using-Bulk-Insert-to-import-inconsistent-data-format-using-pure-T-SQL.aspx

+1
source

Source: https://habr.com/ru/post/903891/


All Articles