I have a tab-delimited file in the format:
sentenceID (sid) documentID (scid) sentenceText (sent)
e.g.
100004	100	<Chinese sentence text>
100005	100	<Chinese sentence text>
I want to put it in sqlite3 with the following schema:
CREATE TABLE sent ( sid INTEGER PRIMARY KEY, scid INTEGER, sent TEXT );
Is there a quick way to use the Python sqlite3 API ( http://docs.python.org/2/library/sqlite3.html ) to load these rows into a table?
This is what I tried:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import codecs
import sqlite3 as lite

con = lite.connect('mycorpus.db')
with con:  # commits on success, rolls back on error
    cur = con.cursor()
    cur.execute("CREATE TABLE Corpus(sid INT, scid INT, sent TEXT, PRIMARY KEY (sid))")
    # codecs.open decodes each line of the file as UTF-8
    for line in codecs.open('corpus.tab', 'r', 'utf8'):
        sid, scid, sent = line.strip().split("\t")
        # ? placeholders let sqlite3 handle quoting of the sentence text
        cur.execute("INSERT INTO Corpus VALUES (?, ?, ?)", (sid, scid, sent))
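For comparison, here is a minimal sketch of a batched load using executemany, which sends all rows through one prepared INSERT inside a single transaction. It assumes the file is UTF-8 and that every non-blank line has at least three tab-separated fields (a shorter line raises ValueError). The function name load_corpus is hypothetical; the table follows the schema above.

```python
import io
import sqlite3


def load_corpus(tab_path, db_path):
    """Load sid<TAB>scid<TAB>sent lines into the sent table; return rows inserted."""
    def rows():
        with io.open(tab_path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\r\n")
                if not line:
                    continue  # skip blank lines
                # Split on the first two tabs only, so a tab inside the text stays in sent
                sid, scid, sent = line.split("\t", 2)
                yield int(sid), int(scid), sent

    con = sqlite3.connect(db_path)
    try:
        with con:  # one transaction: commit on success, roll back on error
            con.execute(
                "CREATE TABLE IF NOT EXISTS sent "
                "(sid INTEGER PRIMARY KEY, scid INTEGER, sent TEXT)"
            )
            cur = con.executemany("INSERT INTO sent VALUES (?, ?, ?)", rows())
            return cur.rowcount
    finally:
        con.close()
```

With the file names from the attempt above, this would be called as load_corpus('corpus.tab', 'mycorpus.db'). Feeding executemany from a generator keeps memory use flat for a large corpus, and a duplicate sid raises sqlite3.IntegrityError and rolls back the whole load.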
alvas