I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. The command is ".import .....", but it doesn't seem to work like this. Can anyone give me an example of how to do it in sqlite3? I am using Windows, in case that matters. Thanks

Please provide the actual command that didn't work and the actual error message. "import...." could be anything. "cannot work" is too vague for us to guess at. Without details, we can't help. – S.Lott May 22 '10 at 11:30
The actual command, as I said, is ".import", and it says: syntax error near ".import" – Hossein May 22 '10 at 11:36
Please actually post the actual command in the question. Please actually post the actual error message in the question. Please do not add comments that simply repeat things. Please update the question with actual copy and paste of what you're actually doing. – S.Lott May 22 '10 at 11:46
The accepted answer reads the file with the csv module and inserts the rows with executemany():

import csv, sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here

with open('data.csv','rb') as fin: # Python 2; in Python 3 use open('data.csv', 'r', newline='')
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin) # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
In case you had the same problems I did: Make sure to change col1 and col2 to the column headers in the csv file. And close the connection to the database by calling con.close() at the end. – Jonas Aug 8 '16 at 21:31
Thanks, @Jonas. Updated post. – bernie Aug 8 '16 at 21:38
I keep getting "not all arguments converted during string formatting" when I attempt this method. – Whitecat Sep 1 '16 at 23:27

Creating an sqlite connection to a file on disk is left as an exercise for the reader ... but there is now a two-liner made possible by the pandas library

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)
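Filling in the exercise, a complete sketch might look like this (the file, table, and database names here are illustrative):

import sqlite3
import pandas

conn = sqlite3.connect('data.db') # the on-disk connection left as an exercise above
df = pandas.read_csv('data.csv')
df.to_sql('data', conn, if_exists='append', index=False) # appends rows to table 'data'
conn.close()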
Thank you. I hit an issue with pandas: my csv is delimited by ';' and has ',' inside entries. pandas gives an error on read_csv. Is there any setting to read entries containing commas without temporarily replacing them? – Alexei Martianov Jun 19 '16 at 6:42
Use sep=';'. The pandas documentation clearly outlines how to deal with this. – Tennessee Leeuwenburg Jun 20 '16 at 8:01
Is there a way to use pandas without using that much RAM? I have a huge .csv (7 GB) that I can't import as a dataframe and then append to the DB. – Pablo A Nov 18 '16 at 12:57
Yes, there's a method in pandas which will read in chunks rather than all at once. I'm afraid I can't recall exactly off the top of my head. I think you add chunksize=<number_of_rows>, and then you get back an iterator which you can then use to append to a database piecewise. Let me know if you have trouble finding it and I can dig out a recipe. – Tennessee Leeuwenburg Dec 4 '16 at 7:00
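A minimal sketch of the chunked approach described in the comment above (the chunksize value and the file and table names are illustrative):

import pandas, sqlite3

conn = sqlite3.connect('data.db')
# with chunksize set, read_csv returns an iterator of DataFrames
for chunk in pandas.read_csv('huge.csv', chunksize=100000):
    chunk.to_sql('data', conn, if_exists='append', index=False)
conn.close()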

The .import command is a feature of the sqlite3 command-line tool. To do it in Python, simply load the data using whatever facilities Python has, such as the csv module, and insert the data as usual.
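For reference, that command-line usage looks like this (table and file names are illustrative):

sqlite> .mode csv
sqlite> .import data.csv mytable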

This way, you also have control over what types are inserted, rather than relying on sqlite3's seemingly undocumented behaviour.

There is no need to prepare the insert. The source of SQL statements and compiled results are kept in a cache. – John Machin Jun 17 '10 at 4:13
@John Machin: Is there a link to how SQLite does this? – Marcelo Cantos Jun 17 '10 at 7:34
@Marcelo: If you are interested in HOW it's done (why?), look in the sqlite source or ask on the sqlite mailing list. – John Machin Jun 17 '10 at 7:54
@John Machin: I'm interested because in all the SQLite documentation that I've come across, there is not a single word about automatic caching of unprepared statements. I don't think it is reasonable to have to read source code or probe mailing lists to discover something as basic as whether I should prepare my SQL statements or not. What is your source of information on this? – Marcelo Cantos Jun 17 '10 at 8:40
@Marcelo: Actually it's done in the Python sqlite3 wrapper module. docs.python.org/library/… says """The sqlite3 module internally uses a statement cache to avoid SQL parsing overhead. If you want to explicitly set the number of statements that are cached for the connection, you can set the cached_statements parameter. The currently implemented default is to cache 100 statements.""" – John Machin Jun 17 '10 at 10:25
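For reference, the parameter the comment quotes is passed to connect(); the value shown here is just the documented default:

import sqlite3
con = sqlite3.connect(":memory:", cached_statements=100)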

Many thanks for bernie's answer! Had to tweak it a bit - here's what worked for me:

import csv, sqlite3

conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")

with open('PC.txt', 'r') as f: # `with` ensures the file gets closed
    reader = csv.reader(f, delimiter='|')
    for row in reader:
        # Python 2: decode the raw bytes; in Python 3 the str values can be used directly
        to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8"), unicode(row[2], "utf8")]
        curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

My text file (PC.txt) looks like this:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3
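Note that the sample rows above have spaces around the '|' delimiter, and those spaces end up in the parsed fields. If that matters, csv.reader's skipinitialspace option trims the space after the delimiter (trailing spaces would still need a .strip()):

reader = csv.reader(open('PC.txt', 'r'), delimiter='|', skipinitialspace=True)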

My 2 cents (more generic):

import csv, sqlite3
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft: break # We're done
        for field in fieldsLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Recompute after the loop: the in-loop snapshot goes stale once types are
    # assigned, so checking it here could raise even though every column is typed.
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if fieldsLeft:
        raise Exception("Failed to find all the columns' data types - maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)

        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))
        next(reader) # skip the header row, or it gets inserted as a data row

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()

    return con
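A usage sketch for the function above (the file name is illustrative; the returned connection is to an in-memory database, so query it before it goes out of scope):

con = csvToDb('mydata.csv')
cur = con.cursor()
print(cur.execute("SELECT COUNT(*) FROM ads").fetchone())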
if len(feildslLeft) > 0: is always true, so an exception gets raised. Please review and correct this. – shubham Jan 9 '16 at 22:34
Any way to do this without having to fseek(), so that it can be used on streams? – mwag Nov 20 '16 at 21:29

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Python 2 script: unicode() is a builtin there (see the comment below);
# in Python 3, open the file in text mode and use the str values directly.
import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1]) # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """) # drops any existing table and makes a fresh one

    with open(sys.argv[2], "rb") as f: # CSV file input, assumed to have no header row
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")] # decode the CSV bytes to text
            cur.execute("INSERT INTO t (COL1, COL2) VALUES (?, ?);", to_db)
    con.commit() # commit once after the loop rather than per row
    con.close() # closes connection to database

if __name__=='__main__':
    main()
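Invocation looks like this (the script name is illustrative):

python csv2sqlite.py mydatabase.db input.csv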
Where is the unicode() function? – Love and peace - Joe Codeswell Dec 2 '16 at 20:39

You can do this efficiently using blaze & odo:

import blaze as bz
csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo will store the csv file to data.db (a sqlite database) under the table named data.

Or you can use odo directly, without blaze. Either way is fine; see the odo documentation.
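Using odo directly would look like this, with the same target-URI convention (assuming the odo package is installed):

from odo import odo
odo('data.csv', 'sqlite:///data.db::data')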

