"ERROR: additional data after the last expected column" when using PostgreSQL COPY

Please carry me as this is my first post.

I am trying to run the COPY command in PostgreSQL-9.2 to add a tab delimited table from a .txt file to a PostgreSQL database, for example:

COPY raw_data FROM '/home/Projects/TestData/raw_data.txt' WITH (DELIMITER ' '); 

I already created an empty table called "raw_data" in the database using the SQL command:

 CREATE TABLE raw_data (); 

When I try to run the COPY I get the following error message:

 ERROR: extra data after last expected column CONTEXT: COPY raw_data, line 1: " 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 ..." 

(The numbers here should be column headings)

I'm not sure if this is because I did not specify table columns when creating the db table, but I try not to manually enter 800 or columns.

Any suggestions on how to fix this?

Here is an example of what a .txt file looks like:

  1 2 3 4 5 6 7 8 9 binary1 1 1 0 1 1 1 1 1 1 binary2 1 0 0 1 0 1 1 0 0 binary3 1 0 1 1 1 0 0 1 0 binary4 1 1 1 1 0 1 0 1 0 
+6
source share
2 answers

An empty table will not work. You need a table that matches the structure of the input. Sort of:

 CREATE TABLE raw_data ( col1 int , col2 int ... ); 

You do not need to declare tab as DELIMITER since this is the default:

 COPY raw_data FROM '/home/Projects/TestData/raw_data.txt'; 

800 speakers say? This number of columns usually indicates a problem with your design. In any case, there are ways to CREATE TABLE script CREATE TABLE .

automation

Assuming simplified raw data

 1 2 3 4 -- first row contains "column names" 1 1 0 1 -- tab separated 1 0 0 1 1 0 1 1 

Define another DELIMITER (one that is not present in the import data at all) and import it into a temporary staging table with one text column:

 CREATE TEMP TABLE tmp_data (raw text); COPY tmp_data FROM '/home/Projects/TestData/raw_data.txt' WITH (DELIMITER 'Β§'); 

This query creates the CREATE TABLE script:

 SELECT 'CREATE TABLE tbl (col' || replace (raw, E'\t', ' bool, col') || ' bool)' FROM (SELECT raw FROM tmp_data LIMIT 1) t; 

More general and secure request:

 SELECT 'CREATE TABLE tbl(' || string_agg(quote_ident('col' || col), ' bool, ' ORDER BY ord) || ' bool);' FROM (SELECT raw FROM tmp_data LIMIT 1) t , unnest(string_to_array(t.raw, E'\t')) WITH ORDINALITY c(col, ord); 

Returns:

 CREATE TABLE tbl (col1 bool, col2 bool, col3 bool, col4 bool); 

Run after validation or run dynamically if you trust the result:

 DO $$BEGIN EXECUTE ( SELECT 'CREATE TABLE tbl (col' || replace(raw, ' ', ' bool, col') || ' bool)' FROM (SELECT raw FROM tmp_data LIMIT 1) t ); END$$; 

Then INSERT data using this query:

 INSERT INTO tbl SELECT (('(' || replace(replace(replace( raw , '1', 't') , '0', 'f') , E'\t', ',') || ')')::tbl).* FROM (SELECT raw FROM tmp_data OFFSET 1) t; 

Or easier with translate() :

 INSERT INTO tbl SELECT (('(' || translate(raw, E'10\t', 'tf,') || ')')::tbl).* FROM (SELECT raw FROM tmp_data OFFSET 1) t; 

The string is converted to a string literal, converted to the newly created table row type, and decomposed with (row).* .

Done.

You can put all this into the plpgsql function, but you will need to protect against SQL injection. (There are several related solutions here. Try the search.

db <> fiddle here
Old SQL fiddle

+11
source

you can directly create a table from the copy command, check the HEADER option in COPY, for example: COPY FROM '/path/to/csv/SourceCSVFile.csv' DELIMITERS ',' CSV HEADER

+1
source

Source: https://habr.com/ru/post/944249/


All Articles