Using LOAD DATA INFILE to load a CSV into a MySQL table

I am using LOAD DATA INFILE to load CSV into a table.

This is the table I created in my db:

CREATE TABLE expenses (entry_id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(entry_id), ss_id INT, user_id INT, cost FLOAT, context VARCHAR(100), date_created DATE); 

Here is some of the sample data I'm trying to load (some of the rows have data for each column, some are missing a date column):

  1,1,20, Sandwiches after hike,
 1,1,45, Dinner at Yama,
 1,2,40, Dinner at Murphys,
 1,1,40.81, Dinner at Yama,
 1,2,1294.76, Flight to Taiwan, 1/17/2011
 1,2,118.78, Grand Hyatt @ Seoul, 1/22/2011
 1,1,268.12, Seoul cash withdrawal, 1/8/2011

Here is the LOAD DATA command that I cannot get to work:

 LOAD DATA INFILE '/tmp/expense_upload.csv' INTO TABLE expenses (ss_id, user_id, cost, context, date) ; 

This command completes and loads the correct number of rows into the table, but every field is NULL. Whenever I try to add FIELDS ENCLOSED BY ',' or LINES TERMINATED BY '\r\n', I get a syntax error.

Other notes: the CSV was created in MS Excel.

If anyone has any advice or can point me in the right direction, it would be much appreciated!

2 answers

First of all, I would change the type of cost from FLOAT to DECIMAL:

 CREATE TABLE expenses (
   entry_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
   ss_id INT,
   user_id INT,
   cost DECIMAL(19,2), -- use DECIMAL instead of FLOAT
   context VARCHAR(100),
   date_created DATE
 );

Now try

 LOAD DATA INFILE '/tmp/sampledata.csv'
 INTO TABLE expenses
 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
 LINES TERMINATED BY '\n' -- or \r\n
 (ss_id, user_id, cost, context, @date_created)
 SET date_created = IF(CHAR_LENGTH(TRIM(@date_created)) > 0, STR_TO_DATE(TRIM(@date_created), '%m/%d/%Y'), NULL);

What this does:

  • it uses the correct syntax to specify the field and line terminators
  • since the date values in your file are not in the format MySQL expects, it first reads each value into a user/session variable; then, if the value is not empty, it converts it to a date, otherwise it assigns NULL. The latter prevents zero dates (0000-00-00) from being stored.
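
The conversion in the SET clause can be tried on its own before running the load. The following is only a sketch: the literal values are taken from the sample rows in the question, and the variable name simply mirrors the one used above.

```sql
-- Non-empty value (note the leading space, as in the sample rows)
SELECT STR_TO_DATE(TRIM(' 1/17/2011'), '%m/%d/%Y') AS parsed;
-- expected: 2011-01-17

-- Empty value, as in the rows that have no date
SET @date_created = '';
SELECT IF(CHAR_LENGTH(TRIM(@date_created)) > 0,
          STR_TO_DATE(TRIM(@date_created), '%m/%d/%Y'),
          NULL) AS parsed;
-- expected: NULL
```

Note that TRIM only strips spaces by default, so a leftover '\r' at the end of a line would not count as empty. That is one reason to pick the line terminator that matches the file.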

Here is my advice. Load the data into a staging table where all columns are strings, and then insert it into the final table. This lets you check the results along the way:

 CREATE TABLE expenses_staging (
   entry_id INT NOT NULL AUTO_INCREMENT,
   PRIMARY KEY(entry_id),
   ss_id VARCHAR(255),
   user_id VARCHAR(255),
   cost VARCHAR(255),
   context VARCHAR(100),
   date_created VARCHAR(255)
 );

 LOAD DATA INFILE '/tmp/expense_upload.csv'
 INTO TABLE expenses_staging
 FIELDS TERMINATED BY ','
 (ss_id, user_id, cost, context, date_created);

This will allow you to see what is really being loaded. Then you can copy the data into the final table, applying the necessary transformations.
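
A minimal sketch of that second step, assuming the expenses and expenses_staging tables defined in this thread and the month/day/year date format seen in the sample data. The casts and the date handling are one possible choice, not the only one:

```sql
-- Inspect the raw strings first
SELECT * FROM expenses_staging LIMIT 10;

-- Copy into the final table, converting each column explicitly
INSERT INTO expenses (ss_id, user_id, cost, context, date_created)
SELECT
  CAST(TRIM(ss_id) AS SIGNED),
  CAST(TRIM(user_id) AS SIGNED),
  CAST(TRIM(cost) AS DECIMAL(19,2)),
  TRIM(context),
  -- empty or missing dates become NULL instead of a zero date
  IF(CHAR_LENGTH(TRIM(date_created)) > 0,
     STR_TO_DATE(TRIM(date_created), '%m/%d/%Y'),
     NULL)
FROM expenses_staging;
```

Because the staging columns are plain strings, any row that looks wrong in the first SELECT can be corrected or filtered out before the INSERT runs.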


Source: https://habr.com/ru/post/1491551/

