OleDB and Excel mixed data types: missing data

I have an Excel worksheet that I want to read into a DataTable. Everything is fine except for one column. The "ProductID" column holds a mix of values, such as ########## and n#########.

I tried letting OleDB handle everything automatically by reading the sheet into a DataSet/DataTable, but any "ProductID" values like n###### are missing: they are ignored and left blank. I also tried building my DataTable manually by looping through each row with a data reader, but got the same results.

Here is the code:

    // Add the column names manually to the DataTable as column_0, column_1, ...
    for (colnum = 0; colnum < num_columns; colnum++)
    {
        ds.Tables["products"].Columns.Add("column_" + colnum, typeof(string));
    }

    // Loop through each Excel row, adding a corresponding DataRow to the DataTable
    while (myDataReader.Read())
    {
        DataRow a_row = ds.Tables["products"].NewRow();
        for (col = 0; col < num_columns; col++)
        {
            try
            {
                a_row[col] = myDataReader.GetString(col);
            }
            catch
            {
                // Not a string cell: fall back to the raw value's text form
                a_row[col] = myDataReader.GetValue(col).ToString();
            }
        }
        ds.Tables["products"].Rows.Add(a_row);
    }

I do not understand why it will not let me read values such as n######. How can I do this?

excel dataset datatable oledb
Jul 12 2018-10-12T00:
5 answers

Using .NET 4.0 to read Excel files, I had a similar problem with the OleDbDataAdapter: reading a mixed data type in a "PartID" column in MS Excel, where the PartID value can be numeric (for example, 561) or text (for example, HL4354), even though the Excel column was formatted as "Text."

From what I can tell, ADO.NET chooses the data type based on the majority of the values in the column (with a tie going to the numeric data type). That is, if most of the PartIDs in the sample set are numeric, ADO.NET will declare the column numeric. ADO.NET will then try to cast each cell to a number, which fails for the text PartID values, so those text PartIDs are not imported.

My solution was to set the OleDbConnection connection string to use Extended Properties with IMEX=1;HDR=NO, indicating that this is an import and that the table(s) will not contain headers. The Excel file does have a header row, so in this case we are telling ADO.NET not to use it. Then, in code, remove the header row from the DataSet, and you end up with the mixed data type preserved for that column.
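To make the majority rule described above concrete, here is a small sketch. It does not call the Jet provider and is not the provider's actual algorithm; `TypeGuessSketch`, `GuessIsNumeric`, and `ReadColumn` are hypothetical names that only imitate the behavior as this answer describes it (sample some rows, let the majority decide, ties go to numeric, text cells in a numeric column come back empty).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TypeGuessSketch
{
    // Hypothetical helper: true when at least half of the non-empty sampled
    // cells parse as numbers (a tie counts as numeric, as described above).
    public static bool GuessIsNumeric(IEnumerable<string> sample)
    {
        var cells = sample.Where(s => !string.IsNullOrWhiteSpace(s)).ToList();
        int numeric = cells.Count(IsNumber);
        return cells.Count > 0 && numeric * 2 >= cells.Count;
    }

    // Hypothetical helper: if the sampled rows make the column look numeric,
    // every cell that is not a number is returned as null (the "missing" data).
    public static List<object> ReadColumn(IList<string> cells, int sampleRows)
    {
        bool numeric = GuessIsNumeric(cells.Take(sampleRows));
        return cells.Select(c =>
            numeric
                ? (IsNumber(c) ? (object)double.Parse(c, CultureInfo.InvariantCulture) : null)
                : (object)c).ToList();
    }

    private static bool IsNumber(string s)
    {
        double ignored;
        return double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out ignored);
    }
}
```

With the cells "561", "562", "HL4354", `ReadColumn` treats the column as numeric and returns null for "HL4354", which mirrors the blank PartID values seen in the import.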

    string sql = "SELECT F1, F2, F3, F4, F5 FROM [sheet1$] WHERE F1 IS NOT NULL";

    OleDbConnection connection = new OleDbConnection(
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + PrmPathExcelFile +
        @";Extended Properties=""Excel 8.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text""");

    OleDbCommand cmd = new OleDbCommand(sql, connection);
    OleDbDataAdapter da = new OleDbDataAdapter(cmd);

    DataSet ds = new DataSet();
    ds.Tables.Add("xlsImport", "Excel");
    da.Fill(ds, "xlsImport");

    // Remove the first row (header row)
    DataRow rowDel = ds.Tables["xlsImport"].Rows[0];
    ds.Tables["xlsImport"].Rows.Remove(rowDel);

    ds.Tables["xlsImport"].Columns[0].ColumnName = "LocationID";
    ds.Tables["xlsImport"].Columns[1].ColumnName = "PartID";
    ds.Tables["xlsImport"].Columns[2].ColumnName = "Qty";
    ds.Tables["xlsImport"].Columns[3].ColumnName = "UserNotes";
    ds.Tables["xlsImport"].Columns[4].ColumnName = "UserID";

    connection.Close();

    // Now you can use LINQ to search the fields
    var data = ds.Tables["xlsImport"].AsEnumerable();
    var query = data
        .Where(x => x.Field<string>("LocationID") == "COOKCOUNTY")
        .Select(x => new Contact
        {
            LocationID = x.Field<string>("LocationID"),
            PartID = x.Field<string>("PartID"),
            Quantity = x.Field<string>("Qty"),
            Notes = x.Field<string>("UserNotes"),
            UserID = x.Field<string>("UserID")
        });
Apr 19 '11 at 19:20

Several forums I found claim that adding IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text to the extended properties in the connection string will fix the problem, but it did not. I finally solved it by adding HDR=NO to the extended properties in the connection string (as Brian Wells shows) so that mixed types are imported.

Then I added some generic code that names the columns after the first row of data and then deletes that first row.

    public static DataTable ImportMyDataTableFromExcel(string filePath)
    {
        DataTable dt = new DataTable();
        string fullPath = Path.GetFullPath(filePath);

        string connString =
            "Provider=Microsoft.Jet.OLEDB.4.0;" +
            "Data Source=\"" + fullPath + "\";" +
            "Extended Properties=\"Excel 8.0;HDR=No;IMEX=1;\"";

        string sql = @"SELECT * FROM [sheet1$]";

        using (OleDbDataAdapter dataAdapter = new OleDbDataAdapter(sql, connString))
        {
            dataAdapter.Fill(dt);
        }

        dt = BuildHeadersFromFirstRowThenRemoveFirstRow(dt);
        return dt;
    }

    private static DataTable BuildHeadersFromFirstRowThenRemoveFirstRow(DataTable dt)
    {
        DataRow firstRow = dt.Rows[0];

        for (int i = 0; i < dt.Columns.Count; i++)
        {
            if (!string.IsNullOrWhiteSpace(firstRow[i].ToString())) // handle empty cell
                dt.Columns[i].ColumnName = firstRow[i].ToString().Trim();
        }

        dt.Rows.RemoveAt(0);
        return dt;
    }
Aug 21 '12 at 21:05

No problem, sh4; glad this helps with the mixed-type problem.

The DateTime column is a whole other animal, and as I recall it has caused me grief in the past. We process one Excel file where the OleDbDataAdapter sometimes converts dates to a double data type (apparently Excel stores dates as doubles that encode the number of days that have passed since January 0, 1900).

The workaround was to use:

    OleDbConnection mobjExcelConn = new OleDbConnection(
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + txtExcelFile.Text +
        @";Extended Properties=""Excel 8.0;IMEX=1;HDR=Yes;""");
    OleDbDataAdapter mobjExcelDataAdapter = new OleDbDataAdapter(
        "Select * from [" + txtSheet.Text + "$] where [Supplier ID] <> '' ", mobjExcelConn);

    DateTime dtShipStatus = DateTime.MinValue;
    // excelRow is a DataRow in the DataSet filled via the OleDbDataAdapter
    shipStatusOrig = excelRow["Est Ship Date"].ToString();

    if (shipStatusOrig != string.Empty)
    {
        // The date may be read in via the OleDb adapter as a double
        if (IsNumeric(shipStatusOrig))
        {
            double d = Convert.ToDouble(shipStatusOrig);
            dtShipStatus = DateTime.FromOADate(d);
            if (DateTime.TryParse(dtShipStatus.ToString(), out dtShipStatus))
            {
                validDate = true;
                // String concatenation avoids the (message, category) overload of Debug.WriteLine
                Debug.WriteLine("Converted: " + dtShipStatus.ToString("s"));
            }
        }
        else
        {
            if (ValidateShipDate(shipStatusOrig))
            {
                dtShipStatus = DateTime.Parse(shipStatusOrig);
                validDate = true;
                Debug.WriteLine("Converted: " + dtShipStatus.ToString("s"));
            }
            else
            {
                validDate = false;
                MessageBox.Show(
                    "Invalid date format in the Excel spreadsheet.\nLine # " + progressBar1.Value +
                    ", the 'Ship Status' value '" + shipStatusOrig +
                    "' is invalid.\nDate should be in a valid date time format.\n" +
                    "e.g. M/DD/YY, MDY, YYYY-MM-DD, etc.",
                    "Invalid Ship Status Date");
            }
        }
        ...
    }

    public static Boolean IsNumeric(Object Expression)
    {
        if (Expression == null || Expression is DateTime)
            return false;

        if (Expression is Int16 || Expression is Int32 || Expression is Int64 ||
            Expression is Decimal || Expression is Single || Expression is Double ||
            Expression is Boolean)
            return true;

        try
        {
            if (Expression is string)
                Double.Parse(Expression as string);
            else
                Double.Parse(Expression.ToString());
            return true;
        }
        catch { } // just dismiss errors but return false

        return false;
    }

    public bool ValidateShipDate(string shipStatus)
    {
        DateTime startDate;
        try
        {
            startDate = DateTime.Parse(shipStatus);
            return true;
        }
        catch
        {
            return false;
        }
    }
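For what it's worth, the same "date or double" handling can be condensed into one helper that avoids exceptions for control flow. This is only a sketch under assumptions: `ExcelDateSketch` and `TryReadExcelDate` are made-up names, and parsing with the invariant culture is my choice, not something the workaround above requires.

```csharp
using System;
using System.Globalization;

public static class ExcelDateSketch
{
    // Hypothetical helper: accepts a cell value that may be a DateTime, a date
    // string, or an Excel serial number, and reports whether a date was read.
    public static bool TryReadExcelDate(object cell, out DateTime result)
    {
        result = DateTime.MinValue;
        if (cell == null || cell == DBNull.Value) return false;
        if (cell is DateTime) { result = (DateTime)cell; return true; }

        string text = Convert.ToString(cell, CultureInfo.InvariantCulture).Trim();
        if (text.Length == 0) return false;

        double serial;
        if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out serial))
        {
            // The adapter handed back the serial number, so convert it as an OLE date
            try { result = DateTime.FromOADate(serial); return true; }
            catch (ArgumentException) { return false; } // outside the valid OLE date range
        }

        // Otherwise treat the cell as ordinary date text
        return DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }
}
```

For example, a cell that comes back as the double 40544 is read as January 1, 2011, while a cell containing "2011-04-19" goes through the normal date parser.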
Apr 29 2018-11-21T00:

There are two ways to handle mixed data types in Excel.

Method 1

  • Open the Excel spreadsheet and set the column format to the desired format manually. In this case, "Text."

Method 2

  • There is a "hack" that consists of adding "IMEX=1" to your connection string:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=myfile.xls;Extended Properties="Excel 8.0;IMEX=1"

  • This will try to handle mixed Excel formats according to how the behavior is set in your registry. You can change that setting on your own machine, but on a server it is probably not an option.
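One detail that is easy to get wrong with Method 2 is the quoting: the Extended Properties value contains semicolons, so it has to be wrapped in double quotes inside the connection string. The helper below is only an illustrative sketch; `ExcelConnectionStringSketch.Build` and its parameter names are hypothetical, and it does nothing more than concatenate strings.

```csharp
using System;

public static class ExcelConnectionStringSketch
{
    // Hypothetical helper: builds a Jet connection string for an .xls file,
    // quoting Extended Properties so that HDR and IMEX stay inside it.
    public static string Build(string path, bool firstRowIsHeader, bool importMode)
    {
        string extended = "Excel 8.0;HDR=" + (firstRowIsHeader ? "Yes" : "No");
        if (importMode) extended += ";IMEX=1";

        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
               ";Extended Properties=\"" + extended + "\"";
    }
}
```

Calling `Build("myfile.xls", true, true)` yields `Provider=Microsoft.Jet.OLEDB.4.0;Data Source=myfile.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1"`.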

Jul 12 '10 at 21:22

@Brian Wells Thank you, your suggestion did the trick, but not completely. It worked for a mixed int/string field, but the datetime columns came through with strange characters after that, so I applied a "hack" on top of the "hack".

1.- Use System.IO.File.Copy to create a copy of the Excel file.

2.- Change the headings of the datetime columns programmatically at runtime to something in datetime format, for example "01/01/0001".

3.- Save the Excel file, and then apply the trick by executing the query with HDR=NO against the modified file.

It is convoluted, yes, but it worked, and reasonably quickly. If anyone has an alternative to this, I will be glad to hear it.

Regards.

P.S. Sorry, English is not my native language.

Apr 28 '11 at 2:33 p.m.


