You are on page 1of 5

PostgreSQL SELECT DISTINCT

Summary: in this tutorial, you will learn how to use the PostgreSQL SELECT DISTINCT clause to remove duplicate rows from
a result set returned by a query.

Introduction to PostgreSQL SELECT DISTINCT clause


The DISTINCT clause is used in the SELECT (https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-select/) statement to
remove duplicate rows from a result set. The DISTINCT clause keeps one row for each group of duplicates. The
DISTINCT clause can be applied to one or more columns in the select list of the SELECT statement.

The following illustrates the syntax of the DISTINCT clause:

SELECT
DISTINCT column1
FROM
table_name;

In this statement, the values in the column1 column are used to evaluate the duplicate.

If you specify multiple columns, the DISTINCT clause will evaluate the duplicate based on the combination of values of
these columns.

SELECT
DISTINCT column1, column2
FROM
table_name;

In this case, the combination of values in both column1 and column2 columns will be used for evaluating the duplicate.

PostgreSQL also provides the DISTINCT ON (expression) to keep the “first” row of each group of duplicates using the
following syntax:

SELECT
DISTINCT ON (column1) column_alias,
column2
FROM
table_name
ORDER BY
column1,
column2;
The order of rows returned from the SELECT statement is unspecified therefore the “first” row of each group of the
duplicate is also unspecified.

It is a good practice to always use the ORDER BY (https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-order-by/) clause with


the DISTINCT ON(expression) to make the result set predictable.

Notice that the DISTINCT ON expression must match the leftmost expression in the ORDER BY clause.

PostgreSQL SELECT DISTINCT examples

Let’s create a new table (https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-create-table/) called distinct_demo and insert


data (https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-insert/) into it for practicing the DISTINCT clause.

Note that you will learn how to create a table and insert data into a table in the subsequent tutorial. In this tutorial,
you just execute the statement in psql or pgAdmin to execute the statements.

First, use the following CREATE TABLE (https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-create-table/) statement to


create the distinct_demo table that consists of three columns: id , bcolor and fcolor .

CREATE TABLE distinct_demo (


id serial NOT NULL PRIMARY KEY,
bcolor VARCHAR,
fcolor VARCHAR
);

Second, insert some rows into the distinct_demo table using the following INSERT (https://www.postgresqltutorial.com/postgresql-
tutorial/postgresql-insert/) statement:

INSERT INTO distinct_demo (bcolor, fcolor)


VALUES
('red', 'red'),
('red', 'red'),
('red', NULL),
(NULL, 'red'),
('red', 'green'),
('red', 'blue'),
('green', 'red'),
('green', 'blue'),
('green', 'green'),
('blue', 'red'),
('blue', 'green'),
('blue', 'blue');

Third, query the data from the distinct_demo table using the SELECT (https://www.postgresqltutorial.com/postgresql-

tutorial/postgresql-select/) statement:
SELECT
id,
bcolor,
fcolor
FROM
distinct_demo ;

PostgreSQL DISTINCT one column example

The following statement selects unique values in the bcolor column from the t1 table and sorts
(https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-order-by/) the result set in alphabetical order by using the ORDER BY
(https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-order-by/) clause.

SELECT
DISTINCT bcolor
FROM
distinct_demo
ORDER BY
bcolor;

PostgreSQL DISTINCT multiple columns

The following statement demonstrates how to use the DISTINCT clause on multiple columns:

SELECT
DISTINCT bcolor,
fcolor
FROM
distinct_demo
ORDER BY
bcolor,
fcolor;

Because we specified both bcolor and fcolor columns in the SELECT DISTINCT clause, PostgreSQL combined the values
in both bcolor and fcolor columns to evaluate the uniqueness of the rows.

The query returns the unique combination of bcolor and fcolor from the distinct_demo table. Notice that the
distinct_demo table has two rows with red value in both bcolor and fcolor columns. When we applied the

DISTINCT to both columns, one row was removed from the result set because it is the duplicate.

PostgreSQL DISTINCT ON example

The following statement sorts the result set by the bcolor and fcolor , and then for each group of duplicates, it keeps
the first row in the returned result set.

SELECT
DISTINCT ON (bcolor) bcolor,
fcolor
FROM
distinct_demo
ORDER BY
bcolor,
fcolor;

Here is the output:


In this tutorial, you have learned how to use PostgreSQL SELECT DISTINCT statement to remove duplicate rows returned by
a query.

You might also like