I am using sqlite. I am importing a dataset where an identifier is assigned externally to a temporary table before inserting them into my permanent table. The persistent table uses an external identifier (RunId) and does not have another id column.
I am importing a table from csv into a new table, Book1, where C15 is the identifier column. Then I started the insert:
INSERT INTO PrimusRuns (RunId, TransientName, RunDateStart, RunType, TestDateStart, Gross, CPS, Shares, MaxExposure, PercentWin, Duration) SELECT a.C15, a.C1, JULIANDAY(a.C2), a.C3,JULIANDAY(a.C4), a.C6, a.C8, a.C9, a.C10, a.C11, a.C14 FROM Book1 as a;
however, I get a primary key constraint error:
[19] [SQLITE_CONSTRAINT_PRIMARYKEY] A PRIMARY KEY constraint failed (UNIQUE constraint failed: PrimusRuns.RunID)
At first, I thought some of these rows are already in the table:
SELECT * FROM Book1 WHERE C15 IN( SELECT RunID from PrimusRuns );
returns nothing.
Then I realized that when importing, the lines are repeated:
SELECT * FROM Book1 GROUP BY C15 HAVING COUNT(*) > 1
This aggregate query returns 95 rows, which means you must delete at least 95 rows. How to remove it to remove duplicates?
NOTE. There are other questions like this, however my question is different in that the identifier is also a duplicate. Other questions group all other columns and remove max (id). But in my case max id returns both rows not one.