How can I quickly add a lot of data from XML to my database?

I use .NET to parse an XML file with approximately 20 million rows (1.56 GB), creating LINQ objects from the data and then pasting them into the SQL database. It takes a lot of time.

To improve performance, I am considering a channel delimited file. I was also wondering if Perl could be faster. Anyone have any suggestions for speeding up this process?

+3
source share
10 answers

This is a radical thought, and I honestly don't know if it will improve your productivity, but it has a good chance to do it. I bet you instantiate your context object once and then use it to insert all the records, right? This means that the context will keep track of all these objects until it is deleted, and that may explain the deteriorating performance over time.

, . , ( , ), , . , . , , , n- . , . ?

+2

, LINQ . - XmlTextReader node node .

+4

, , SQL, . , BCP .

→ LINQ → DB?
1 20 .

+4

, , . :

  • XML
  • LINQ

? , XML, , , . , , , , :) .NET-, Stackoverflow .

.NET, , -, Perl, . , Perl .

+3

. CPU, n chunks ( n - , ) .

, Perl , .NET, , , .

+2

, , XML. XML- . .

-, , RDBMS , SQL Server 2005 2008. , , , SQL Server (SSIS). SSIS . , XML , .

(BTW, , 20 , MB ?). SSIS . . Source. ( ). , XmlReader.ReadSubTree, XmlReader, .

+1

- , , .

- , , . , .

+1

. , CSV XMLTextReader, .

+1

, , .

, , , . LINQ to SQL ( , ).

, ETL, MS SQL, .NET .

, :

  • ADO.NET LINQ to SQL;
  • - ;
  • , XML - , , XML , - , XML .

Perl , .NET, .NET , Perl - .

+1

, , xml SqlBulkCopy. xml 18 , .

        // read xml file into datatable
        DataSet ds = new DataSet();
        DataTable callList = new DataTable();

        string AppDataPath = ConfigurationManager.AppSettings["AppDataPath"];
        string dbSchema = AppDataPath + "/" + "CLBulkInsertSchema.xml";

        //Create a FileStream to the XML Schema file in Read mode
        FileStream finschema = new FileStream(dbSchema, FileMode.Open,
                               FileAccess.Read, FileShare.Read);

        //Read the Schema into the DataSet
        ds.ReadXml(finschema);

        //Close the FileStream
        finschema.Close();

        //Create a FileStream to the Xml Database file in Read mode
        FileStream findata = new FileStream(tempFilePathName, FileMode.Open,
                             FileAccess.Read, FileShare.ReadWrite);

        //Read the DataBase into the DataSet
        ds.ReadXml(findata);

        //Close the FileStream
        findata.Close();

        DataTable callList = ds.Tables["PhoneBook"];

        string conn = ConfigurationManager.ConnectionStrings["dbConnectionString"].ToString();

        using (SqlConnection connection =
               new SqlConnection(conn))
        {
            connection.Open();

            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName =
                    "dbo.CallBatchItems";

                // mappings required because we're skipping the BatchItemId column
                // and letting SQL Server handle auto incrementing of primary key.
                // mappings not required if order of columns is exactly the same
                // as destination table definition. 

                bulkCopy.ColumnMappings.Add("BatchId", "BatchId");
                bulkCopy.ColumnMappings.Add("FullName", "FullName");
                ...
                // Write from the source to the destination.
                bulkCopy.WriteToServer(callList);
            }
        }
0

Source: https://habr.com/ru/post/1710620/


All Articles