3

I know of two ways to insert without duplication. The first is using a WHERE NOT EXISTS clause:

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
WHERE NOT EXISTS (
    SELECT * FROM table_name AS T
    WHERE T.col1 = %s
      AND T.col2 = %s)

the other is doing a LEFT JOIN:

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
FROM ( SELECT %s, %s, %s ) A
LEFT JOIN table_name B
ON  B.COL1 = %s
AND B.COL2 = %s
WHERE B.id IS NULL
LIMIT 1

Is there a general rule as to one being faster than the other, or does it depend on the tables? Is there a different way which is better than both?

2
  • Are you sure you want a SELECT * in the first example? Select all the columns? Generally, one wouuldn't do that, but I'm no Postgres expert. Commented Jun 25, 2010 at 18:05
  • Can you run both queries with EXPLAIN and post that? The analysis tools are there to help answer these questions on a case by case basis. Commented Jun 25, 2010 at 18:05

2 Answers 2

5

I would recommend defining a UNIQUE constraint on the column(s) you need to be unique (col1 & col2 in this case), and then just do the INSERT. Handle exceptions as needed.


Re your comment about the exception demanding a rollback, the solution for PostgreSQL is to set a transaction savepoint before you try the insert that may cause an exception. If you get the exception, rollback to the savepoint.

See:

5
  • i have the unique constraints in place already. however, when they're violated, i have to either commit or rollback the entire transaction so far. each transaction of mine requires multiple inserts, though, and i don't want to partially commit some of them. is there any way to handle exceptions in a better way, that doesn't mess up the cursor/transaction? (doing this from Python, with sqlobject or psycopg2) Commented Jun 25, 2010 at 18:21
  • Yes: catch the exception. If it's a duplicate key violation, then you can ignore it because it means your row already exists. If it's another type of exception, you would have had that problem anyway (e.g. SQL syntax error, disk full, network connection broken). Commented Jun 25, 2010 at 18:34
  • if i catch the exception, then attempt to execute another query with the same cursor, i get a psycopg2.InternalError: current transaction aborted; ignoring actions until end of transaction block, or something of the sort Commented Jun 25, 2010 at 18:41
  • Here is the exact error message: psycopg2.InternalError: current transaction is aborted, commands ignored until end of transaction block. what can i do to prevent this state after failing to insert a table? Commented Jun 25, 2010 at 18:52
  • i've tried to ask the question more eloquently here: stackoverflow.com/questions/3120688/… Commented Jun 25, 2010 at 18:56
-1

I think using EXISTS is more efficient!You could do like this:

if exists(select 1 from table_name where col1 = %s and col2 = %s) then
  insert into table_name (col1, col2, col3)
  select %s, %s, %s;
end if;

under test,using EXISTS is about 50 times faster then using NOT EXISTS.

another method is using EXCEPT .

INSERT INTO table_name (col1, col2, col3)
SELECT %s, %s, %s
except
select col1, col2, col3 from table_name

under test,using EXCEPT is about 3 times faster then using NOT EXISTS.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.