Quotes in tab delimited file

I have a simple application that opens a tab delimited text file and inserts this data into the database.

I use this CSV reader to read data: http://www.codeproject.com/KB/database/CsvReader.aspx

And everything works fine!

Now my client has added a new field to the end of the file, “ClaimDescription”, and some of these claim descriptions contain quotation marks, for example:

"SUMISEI MARU NO 2" - sea of Japan

This seems to be causing a serious headache for my application. I get an exception that looks like this:

CSV seems to be damaged near field “1470” of record “26” at position “181”. Current raw data: ...

And in that "raw data" the claim description field is, of course, the one that contains the quotation marks.

Has anyone had this problem before and found a way around it? Obviously, I could ask the client to change the data they send me, but the tab-delimited file is produced by an automated process on their side, so I would rather keep that as a last resort.

I thought that I could open the file with a standard TextReader before processing it, escape the quotes, write the contents back to a new file, and then pass that file to the CSV reader. It is probably worth mentioning that these tab-delimited files average around 40 MB in size.
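The pre-processing idea described above could look roughly like the sketch below. It streams the file line by line, wraps any field that contains a quotation mark in quotation marks, and doubles the embedded ones. The names TsvQuoteEscaper, EscapeField, EscapeLine and EscapeFile are made up for illustration, and the sketch assumes that no field contains a tab or a line break and that the source file never quotes fields on purpose.

```csharp
using System.IO;
using System.Linq;

public static class TsvQuoteEscaper
{
    // Wraps a field in double quotes and doubles the embedded quotes,
    // but only when the field actually contains a quote character.
    public static string EscapeField(string field)
    {
        if (!field.Contains("\""))
        {
            return field;
        }
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Escapes every tab-separated field on a single line.
    public static string EscapeLine(string line)
    {
        return string.Join("\t", line.Split('\t').Select(f => EscapeField(f)));
    }

    // Streams the input line by line, so a 40 MB file is never held in memory at once.
    public static void EscapeFile(string inputPath, string outputPath)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(EscapeLine(line));
            }
        }
    }
}
```

With this sketch, a field such as `"SUMISEI MARU NO 2" - sea of Japan` is written out as `"""SUMISEI MARU NO 2"" - sea of Japan"`, while fields without quotation marks pass through unchanged.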

Any help is much appreciated!

Cheers, Sean




CsvReader can read a TSV file directly: pass the tab character as the custom delimiter to the CsvReader constructor.

using System;
using System.IO;
using LumenWorks.Framework.IO.Csv;

public static class TsvExample
{
    public static void ParseTSV(string filepath)
    {
        // Pass '\t' as the delimiter so that fields are split on tabs.
        using (CsvReader csvReader = new CsvReader(new StreamReader(filepath), true, '\t')) {
        // If that does not work, passing unlikely characters for the other parameters might help:
        // using (CsvReader csvReader = new CsvReader(new StreamReader(filepath), true, '\t', '~', '`', '~', ValueTrimmingOptions.None)) {
            int fieldCount = csvReader.FieldCount;

            // Does not work, since Delimiter is a read-only property:
            // csvReader.Delimiter = "\t";

            string[] headers = csvReader.GetFieldHeaders();

            while (csvReader.ReadNextRecord()) {
                for (int i = 0; i < fieldCount; i++) {
                    // Print each value as "header = value;".
                    Console.Write(String.Format("{0} = {1};", headers[i], csvReader[i]));
                }
                Console.WriteLine();
            }
        }
    }
}
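If quotation marks never carry any meaning in the file, another option is to skip CSV-style quote handling entirely and split each line on tabs. This is a different technique from the CsvReader call above and is shown only as a minimal sketch. PlainTsvReader, SplitLine and ReadRecords are hypothetical names, and the sketch assumes that no field contains a tab or a line break.

```csharp
using System.Collections.Generic;
using System.IO;

public static class PlainTsvReader
{
    // Splits one line on tabs only; quotation marks stay ordinary characters.
    public static string[] SplitLine(string line)
    {
        return line.Split('\t');
    }

    // Lazily yields one string[] per line, so a large file is streamed
    // rather than loaded into memory as a whole.
    public static IEnumerable<string[]> ReadRecords(string filepath)
    {
        using (var reader = new StreamReader(filepath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return SplitLine(line);
            }
        }
    }
}
```

The first array returned by ReadRecords would hold the header names if the file has a header row; the caller has to treat it accordingly.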

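Under the CSV RFC (RFC 4180), a field that contains quotation marks has to be enclosed in quotation marks, and each quotation mark inside it has to be doubled.

(Tools that write CSV, Microsoft Excel among them, follow this convention.)

A field written like this one breaks that rule:

,""SUMISEI MARU NO 2" - sea of Japan",

Each inner quotation mark would have to appear as "" for this to be valid RFC 4180 CSV.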
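To make the RFC 4180 quoting rule concrete, here is a small sketch that checks a single raw field against it: an unquoted field may not contain quotation marks, and inside a quoted field every quotation mark must be doubled. Rfc4180Check and IsValidField are illustrative names and are not part of CsvReader or any other library.

```csharp
public static class Rfc4180Check
{
    // Returns true when a single raw field is well-formed under the RFC 4180 quoting rules.
    public static bool IsValidField(string raw)
    {
        if (raw.Length == 0 || raw[0] != '"')
        {
            // Unquoted field: it must not contain any quotation mark.
            return raw.IndexOf('"') < 0;
        }

        // Quoted field: walk the characters after the opening quote.
        int i = 1;
        while (i < raw.Length)
        {
            if (raw[i] != '"')
            {
                i++;                        // ordinary character
            }
            else if (i + 1 < raw.Length && raw[i + 1] == '"')
            {
                i += 2;                     // "" is one escaped quotation mark
            }
            else
            {
                return i == raw.Length - 1; // a lone quote must be the closing quote
            }
        }
        return false;                       // the closing quote never appeared
    }
}
```

For the field `""SUMISEI MARU NO 2" - sea of Japan"` this returns false, because the second quotation mark is read as the closing quote and is followed by more text. The doubled form `"""SUMISEI MARU NO 2"" - sea of Japan"` returns true.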


Source: https://habr.com/ru/post/1736159/

