I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. The command is ".import .....", but it doesn't seem to work like this. Can anyone give me an example of how to do it in sqlite3? I am using Windows, in case that matters. Thanks

Please provide the actual command that didn't work and the actual error message. "import...." could be anything. "cannot work" is too vague for us to guess at. Without details, we can't help. – S.Lott May 22 '10 at 11:30
The actual command, as I said, is ".import", and it says: syntax error near ".import" – Hossein May 22 '10 at 11:36
Please actually post the actual command in the question. Please actually post the actual error message in the question. Please do not add comments that simply repeat things. Please update the question with actual copy and paste of what you're actually doing. – S.Lott May 22 '10 at 11:46
The accepted answer reads the file with the csv module and inserts the rows with executemany():

import csv, sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here

with open('data.csv','rb') as fin: # Python 2; in Python 3 use open('data.csv', 'r', newline='')
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(fin) # comma is default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
In case you had the same problems I did: Make sure to change col1 and col2 to the column headers in the csv file. And close the connection to the database by calling con.close() at the end. – Jonas Aug 8 '16 at 21:31
Thanks, @Jonas. Updated post. – bernie Aug 8 '16 at 21:38
I keep getting "not all arguments converted during string formatting" when I attempt this method. – Whitecat Sep 1 '16 at 23:27

Creating an sqlite connection to a file on disk is left as an exercise for the reader ... but there is now a two-liner made possible by the pandas library

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)
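Filling in the exercise, a complete sketch might look like this (the file, table, and database names here are illustrative):

import sqlite3
import pandas

conn = sqlite3.connect('data.db') # the on-disk connection left as an exercise above
df = pandas.read_csv('data.csv')
df.to_sql('data', conn, if_exists='append', index=False) # appends rows to table 'data'
conn.close()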
Thank you. I hit an issue with pandas: my csv is delimited by ';' and has ',' inside entries. pandas gives an error on read_csv. Is there any setting to read entries containing commas without temporarily replacing them? – Alexei Martianov Jun 19 '16 at 6:42
Use sep=';'. The pandas documentation clearly outlines how to deal with this. – Tennessee Leeuwenburg Jun 20 '16 at 8:01
Is there a way to use pandas without using that much RAM? I have a huge .csv (7 GB) that I can't import as a dataframe and then append to the DB. – Pablo A Nov 18 '16 at 12:57
Yes, there's a method in pandas which will read in chunks rather than all at once. I'm afraid I can't recall exactly off the top of my head. I think you add chunksize=<number_of_rows>, and then you get back an iterator which you can then use to append to a database piecewise. Let me know if you have trouble finding it and I can dig out a recipe. – Tennessee Leeuwenburg Dec 4 '16 at 7:00
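A minimal sketch of the chunked approach described in the comment above (the chunksize value and the file and table names are illustrative):

import pandas, sqlite3

conn = sqlite3.connect('data.db')
# with chunksize set, read_csv returns an iterator of DataFrames
for chunk in pandas.read_csv('huge.csv', chunksize=100000):
    chunk.to_sql('data', conn, if_exists='append', index=False)
conn.close()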

The .import command is a feature of the sqlite3 command-line tool. To do it in Python, simply load the data using whatever facilities Python has, such as the csv module, and insert the data as usual.
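For reference, that command-line usage looks like this (table and file names are illustrative):

sqlite> .mode csv
sqlite> .import data.csv mytable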

This way, you also have control over what types are inserted, rather than relying on sqlite3's seemingly undocumented behaviour.

There is no need to prepare the insert. The source of SQL statements and compiled results are kept in a cache. – John Machin Jun 17 '10 at 4:13
@John Machin: Is there a link to how SQLite does this? – Marcelo Cantos Jun 17 '10 at 7:34
@Marcelo: If you are interested in HOW it's done (why?), look in the sqlite source or ask on the sqlite mailing list. – John Machin Jun 17 '10 at 7:54
@John Machin: I'm interested because in all the SQLite documentation that I've come across, there is not a single word about automatic caching of unprepared statements. I don't think it is reasonable to have to read source code or probe mailing lists to discover something as basic as whether I should prepare my SQL statements or not. What is your source of information on this? – Marcelo Cantos Jun 17 '10 at 8:40
@Marcelo: Actually it's done in the Python sqlite3 wrapper module. docs.python.org/library/… says """The sqlite3 module internally uses a statement cache to avoid SQL parsing overhead. If you want to explicitly set the number of statements that are cached for the connection, you can set the cached_statements parameter. The currently implemented default is to cache 100 statements.""" – John Machin Jun 17 '10 at 10:25
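For reference, the parameter the comment quotes is passed to connect(); the value shown here is just the documented default:

import sqlite3
con = sqlite3.connect(":memory:", cached_statements=100)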

Many thanks for bernie's answer! Had to tweak it a bit - here's what worked for me:

import csv, sqlite3

conn = sqlite3.connect("pcfc.sl3")
curs = conn.cursor()
curs.execute("CREATE TABLE PCFC (id INTEGER PRIMARY KEY, type INTEGER, term TEXT, definition TEXT);")

with open('PC.txt', 'r') as f: # `with` ensures the file gets closed
    reader = csv.reader(f, delimiter='|')
    for row in reader:
        # Python 2: decode the raw bytes; in Python 3 the str values can be used directly
        to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8"), unicode(row[2], "utf8")]
        curs.execute("INSERT INTO PCFC (type, term, definition) VALUES (?, ?, ?);", to_db)
conn.commit()

My text file (PC.txt) looks like this:

1 | Term 1 | Definition 1
2 | Term 2 | Definition 2
3 | Term 3 | Definition 3
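Note that the sample rows above have spaces around the '|' delimiter, and those spaces end up in the parsed fields. If that matters, csv.reader's skipinitialspace option trims the space after the delimiter (trailing spaces would still need a .strip()):

reader = csv.reader(open('PC.txt', 'r'), delimiter='|', skipinitialspace=True)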

My 2 cents (more generic):

import csv, sqlite3
import logging

def _get_col_datatypes(fin):
    dr = csv.DictReader(fin) # comma is default delimiter
    fieldTypes = {}
    for entry in dr:
        fieldsLeft = [f for f in dr.fieldnames if f not in fieldTypes]
        if not fieldsLeft: break # We're done
        for field in fieldsLeft:
            data = entry[field]

            # Need data to decide
            if len(data) == 0:
                continue

            if data.isdigit():
                fieldTypes[field] = "INTEGER"
            else:
                fieldTypes[field] = "TEXT"
        # TODO: Currently there's no support for DATE in sqlite

    # Recompute after the loop: the in-loop snapshot goes stale once types are
    # assigned, so checking it here could raise even though every column is typed.
    fieldsLeft = [f for f in (dr.fieldnames or []) if f not in fieldTypes]
    if fieldsLeft:
        raise Exception("Failed to find all the columns' data types - maybe some are empty?")

    return fieldTypes


def escapingGenerator(f):
    for line in f:
        yield line.encode("ascii", "xmlcharrefreplace").decode("ascii")


def csvToDb(csvFile, outputToFile = False):
    # TODO: implement output to file

    with open(csvFile,mode='r', encoding="ISO-8859-1") as fin:
        dt = _get_col_datatypes(fin)

        fin.seek(0)

        reader = csv.DictReader(fin)

        # Keep the order of the columns name just as in the CSV
        fields = reader.fieldnames
        cols = []

        # Set field and type
        for f in fields:
            cols.append("%s %s" % (f, dt[f]))

        # Generate create table statement:
        stmt = "CREATE TABLE ads (%s)" % ",".join(cols)

        con = sqlite3.connect(":memory:")
        cur = con.cursor()
        cur.execute(stmt)

        fin.seek(0)


        reader = csv.reader(escapingGenerator(fin))
        next(reader) # skip the header row, or it gets inserted as a data row

        # Generate insert statement:
        stmt = "INSERT INTO ads VALUES(%s);" % ','.join('?' * len(cols))

        cur.executemany(stmt, reader)
        con.commit()

    return con
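A usage sketch for the function above (the file name is illustrative; the returned connection is to an in-memory database, so query it before it goes out of scope):

con = csvToDb('mydata.csv')
cur = con.cursor()
print(cur.execute("SELECT COUNT(*) FROM ads").fetchone())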
if len(feildslLeft) > 0: is always true, so an exception gets raised. Please review and correct this. – shubham Jan 9 '16 at 22:34
Any way to do this without having to fseek(), so that it can be used on streams? – mwag Nov 20 '16 at 21:29

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Python 2 script: unicode() is a builtin there (see the comment below);
# in Python 3, open the file in text mode and use the str values directly.
import sys, csv, sqlite3

def main():
    con = sqlite3.connect(sys.argv[1]) # database file input
    cur = con.cursor()
    cur.executescript("""
        DROP TABLE IF EXISTS t;
        CREATE TABLE t (COL1 TEXT, COL2 TEXT);
        """) # drops any existing table and makes a fresh one

    with open(sys.argv[2], "rb") as f: # CSV file input, assumed to have no header row
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            to_db = [unicode(row[0], "utf8"), unicode(row[1], "utf8")] # decode the CSV bytes to text
            cur.execute("INSERT INTO t (COL1, COL2) VALUES (?, ?);", to_db)
    con.commit() # commit once after the loop rather than per row
    con.close() # closes connection to database

if __name__=='__main__':
    main()
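Invocation looks like this (the script name is illustrative):

python csv2sqlite.py mydatabase.db input.csv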
Where is the unicode() function? – Love and peace - Joe Codeswell Dec 2 '16 at 20:39

You can do this efficiently using blaze & odo:

import blaze as bz
csv_path = 'data.csv'
bz.odo(csv_path, 'sqlite:///data.db::data')

Odo will store the csv file to data.db (a sqlite database) under the table named data.

Or you can use odo directly, without blaze. Either way is fine; see the odo documentation.
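Using odo directly would look like this, with the same target-URI convention (assuming the odo package is installed):

from odo import odo
odo('data.csv', 'sqlite:///data.db::data')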

