PostgreSQL Tutorial

  • Home
  • Stored Procedures
  • Triggers
  • Views
  • Interfaces
    • PostgreSQL PHP
    • PostgreSQL Python
    • PostgreSQL JDBC
  • Functions
    • Aggregate Functions
    • Date / Time Functions
    • String Functions
    • Math Functions
Home / PostgreSQL Tutorial / How To Delete Duplicate Rows in PostgreSQL

How To Delete Duplicate Rows in PostgreSQL

Summary: in this tutorial, you will learn how to use various techniques to delete duplicate rows in PostgreSQL.

Preparing sample data

First, create a new table named basket that stores fruits:

1
2
3
4
CREATE TABLE basket(
    id SERIAL PRIMARY KEY,
    fruit VARCHAR(50) NOT NULL
);

Second, insert some fruits into the basket table.

1
2
3
4
5
6
7
8
INSERT INTO basket(fruit) values('apple');
INSERT INTO basket(fruit) values('apple');
 
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');
 
INSERT INTO basket(fruit) values('banana');

Third, query data from the basket table:

1
2
3
4
5
SELECT
    id,
    fruit
FROM
    basket;

fruit table

As you can see, we have some duplicate rows such as 2 apples and 3 oranges in the basket table.

Finding duplicate rows

If the table has few rows, you can see which ones are duplicate immediately. However, it is not the case with the big table.

The find the duplicate rows, you use the following statement:

1
2
3
4
5
6
7
8
9
10
11
SELECT
    fruit,
    COUNT( fruit )
FROM
    basket
GROUP BY
    fruit
HAVING
    COUNT( fruit )> 1
ORDER BY
    fruit;

Deleting duplicate rows using DELETE USING statement

The following statement uses the DELETE USING statement to remove duplicate rows:

1
2
3
4
5
6
7
DELETE
FROM
    basket a
        USING basket b
WHERE
    a.id < b.id
    AND a.fruit = b.fruit;

In this example, we joined the basket table to itself and checked if two different rows (a.id < b.id) have the same value in the fruit column.

Let’s query the basket table again to verify whether the duplicate rows were deleted:

1
2
3
4
5
SELECT
id,
fruit
FROM
basket;

delete duplicate rows in postgresql example - keep highest id

As you can see, the statement removed the duplicate rows with lowest ids and keep the one with the highest id.

If you want to keep the duplicate rows with the lowest id, you use just need to flip the operator in the WHERE clause:

1
2
3
4
5
6
DELETE  FROM
    basket a
        USING basket b
WHERE
    a.id > b.id
    AND a.fruit = b.fruit;

To check whether the statement works correctly, let’s verify the data in the basket table:

1
2
3
4
5
SELECT
    id,
    fruit
FROM
    basket;

Result:

delete duplicate rows in postgresql example - keep lowest id

Perfect! the duplicate rows with the lowest ids are retained.

Deleting duplicate rows using subquery

The following statement uses a suquery to delete duplicate rows and keep the row with the lowest id.

1
2
3
4
5
6
7
8
9
DELETE FROM basket
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
         ROW_NUMBER() OVER( PARTITION BY fruit
        ORDER BY  id ) AS row_num
        FROM basket ) t
        WHERE t.row_num > 1 );

In this example, the subquery returned the duplicate rows except for the first row in the duplicate group. And the outer DELETE statement deleted the duplicate rows returned by the subquery.

If you want to keep the duplicate row with highest id, just change the order in the subquery:

1
2
3
4
5
6
7
8
9
DELETE FROM basket
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
         ROW_NUMBER() OVER( PARTITION BY fruit
        ORDER BY  id DESC ) AS row_num
        FROM basket ) t
        WHERE t.row_num > 1 );

In case you want to delete duplicate based on values of multiple columns, here is the query template:

1
2
3
4
5
6
7
8
9
10
DELETE FROM table_name
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
         ROW_NUMBER() OVER( PARTITION BY column_1,
         column_2
        ORDER BY  id ) AS row_num
        FROM table_name ) t
        WHERE t.row_num > 1 );

In this case, the statement will delete all rows with duplicate values in the column_1 and column_2 columns.

Deleting duplicate rows using an immediate table

To delete rows using an immediate table, you use the following steps:

  1. Create a new table with the same structure as the one whose duplicate rows should be removed.
  2. Insert distinct rows from the source table to the immediate table.
  3. Drop the source table.
  4. Rename the immediate table to the name of the source table.

The following illustrates the steps of removing duplicate rows from the basket table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- step 1
CREATE TABLE basket_temp (LIKE basket);
 
-- step 2
INSERT INTO basket_temp(fruit, id)
SELECT
    DISTINCT ON (fruit) fruit,
    id
FROM basket;
 
-- step 3
DROP TABLE basket;
 
-- step 4
ALTER TABLE basket_temp
RENAME TO basket;                

In this tutorial, you have learned how to delete duplicate rows in PostgreSQL using the DELETE USING statement, subquery, and the immediate table techniques.

Previous Tutorial: Learn PostgreSQL Recursive Query By Example
Next Tutorial: PostgreSQL Window Function

PostgreSQL Quick Start

  • What is PostgreSQL?
  • Install PostgreSQL
  • Connect to Database
  • Download PostgreSQL Sample Database
  • Load Sample Database
  • Explore Server and Database Objects

PostgreSQL Fundamentals

  • PostgreSQL Select
  • PostgreSQL Order By
  • PostgreSQL Select Distinct
  • PostgreSQL Where
  • PostgreSQL LIMIT
  • PostgreSQL IN
  • PostgreSQL Between
  • PostgreSQL Like
  • PostgreSQL Inner Join
  • PostgreSQL Left Join
  • PostgreSQL Full Outer Join
  • PostgreSQL Cross Join
  • PostgreSQL Natural Join
  • PostgreSQL Group By
  • PostgreSQL Having
  • PostgreSQL Union
  • PostgreSQL Intersect
  • PostgreSQL Except
  • PostgreSQL Subquery
  • PostgreSQL Insert
  • PostgreSQL Update
  • PostgreSQL Delete

Managing Table Structure

  • PostgreSQL Data Types
  • PostgreSQL Create Table
  • PostgreSQL Alter Table
  • PostgreSQL Drop Table
  • PostgreSQL Truncate Table
  • PostgreSQL CHECK Constraint
  • PostgreSQL Not-Null Constraint
  • PostgreSQL Foreign Key
  • PostgreSQL Primary Key
  • PostgreSQL UNIQUE Constraint

PostgreSQL Views

  • Managing PostgreSQL Views
  • Creating Updatable Views
  • PostgreSQL Materialized Views
  • The WITH CHECK OPTION Views
  • PostgreSQL Recursive View

PostgreSQL Triggers

  • Introduction to Trigger
  • Creating A Trigger
  • Managing PostgreSQL Triggers

About PostgreSQL Tutorial

PostgreSQLTutorial.com is a website dedicated to developers and database administrators who are working on PostgreSQL database management system.

We constantly publish useful PostgreSQL tutorials to keep you up-to-date with the latest PostgreSQL features and technologies. All PostgreSQL tutorials are simple, easy-to-follow and practical.

Recent PostgreSQL Tutorials

  • PostgreSQL ANY Operator
  • PostgreSQL EXISTS
  • How To Delete Duplicate Rows in PostgreSQL
  • PostgreSQL TO_CHAR Function
  • PostgreSQL TO_NUMBER Function
  • PostgreSQL TO_TIMESTAMP Function
  • PostgreSQL CEIL Function
  • PostgreSQL MOD Function
  • PostgreSQL FLOOR Function
  • PostgreSQL ABS Function

More Tutorials

  • PostgreSQL Cheat Sheet
  • PostgreSQL Administration
  • PostgreSQL PHP
  • PostgreSQL Python
  • PostgreSQL JDBC
  • PostgreSQL Resources

Site Info

  • Home
  • About Us
  • Contact Us
  • Privacy Policy

Copyright © 2017 by PostgreSQL Tutorial Website. All Rights Reserved.