Nine years after writing about how to use the MERGE statement in Oracle, I am writing about how to implement an UPSERT statement in PostgreSQL. I wrote an initial post that goes over the basics of PostgreSQL’s upsert implementation: the INSERT statement with a DO UPDATE clause or a DO NOTHING clause.
I thought it was interesting that the PostgreSQL Upsert Using INSERT ON CONFLICT Statement web page didn’t cover using a subquery as the source for an INSERT statement.
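Since the demonstration below only exercises the DO NOTHING action, here is a minimal sketch of the DO UPDATE action fed by a subquery. The realm and realm_import tables and the realm_uq_key constraint are hypothetical names made up for this illustration; they are not part of the demonstration that follows.

```sql
-- Hypothetical target table with a one-column unique key.
CREATE TABLE realm
( realm_id    SERIAL
, realm_name  VARCHAR(20)
, population  INTEGER
, CONSTRAINT realm_uq_key UNIQUE ( realm_name ));

-- Hypothetical staging table that plays the role of the import table.
CREATE TABLE realm_import
( realm_name  VARCHAR(20)
, population  INTEGER );

INSERT INTO realm_import VALUES ('Narnia', 42100), ('Camelot', 15200);

-- Use a subquery as the source. When a realm_name already exists,
-- overwrite its population with the incoming value (EXCLUDED refers
-- to the row that was proposed for insertion).
INSERT INTO realm
( realm_name
, population )
( SELECT ri.realm_name
  ,      ri.population
  FROM   realm_import ri )
ON CONFLICT ON CONSTRAINT realm_uq_key
DO UPDATE SET population = EXCLUDED.population;
```

One caveat with this form: if the subquery returns the same realm_name twice in a single statement, PostgreSQL raises an error rather than updating the row twice, so the source may need a DISTINCT or an aggregate first.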
Demonstration
Here are the steps to accomplish an import/upload with the COPY statement and the INSERT statement with an ON CONFLICT clause. The procedure in this demonstration uses the DO NOTHING action in both INSERT statements.
Step #1 : Position your CSV file in the physical directory
The example uses the following upload directory:
/u01/app/postgres/upload
Put the following text into the kingdom_import.csv file:
Narnia,77600,Peter the Magnificent,1272-03-20,1292-06-19
Narnia,77600,Edmund the Just,1272-03-20,1292-06-19
Narnia,77600,Susan the Gentle,1272-03-20,1292-06-19
Narnia,77600,Lucy the Valiant,1272-03-20,1292-06-19
Narnia,42100,Peter the Magnificent,1531-04-12,1531-05-31
Narnia,42100,Edmund the Just,1531-04-12,1531-05-31
Narnia,42100,Susan the Gentle,1531-04-12,1531-05-31
Narnia,42100,Lucy the Valiant,1531-04-12,1531-05-31
Camelot,15200,King Arthur,0631-03-10,0686-12-12
Camelot,15200,Sir Lionel,0631-03-10,0686-12-12
Camelot,15200,Sir Bors,0631-03-10,0635-12-12
Camelot,15200,Sir Bors,0640-03-10,0686-12-12
Camelot,15200,Sir Galahad,0631-03-10,0686-12-12
Camelot,15200,Sir Gawain,0631-03-10,0686-12-12
Camelot,15200,Sir Tristram,0631-03-10,0686-12-12
Camelot,15200,Sir Percival,0631-03-10,0686-12-12
Camelot,15200,Sir Lancelot,0670-09-30,0682-12-12
Camelot,15200,Modred,0681-09-30,0682-12-12
Step #2 : Run the script that creates tables and sequences
Copy the following code into a create_kingdom_knight_tables.sql file within a directory of your choice. Then, run it as the student user. Assuming you put the code in the create_kingdom_knight_tables.sql script, you can call it like so:
\i create_kingdom_knight_tables.sql
-- Conditionally drop three tables and sequences.
DO $$
DECLARE
  /* Declare an indefinite length string and record variable. */
  sql  VARCHAR;
  row  RECORD;
  /* Declare a cursor. */
  table_cursor CURSOR FOR
    SELECT table_name
    FROM   information_schema.tables
    WHERE  table_catalog = 'videodb'
    AND    table_schema = 'public'
    AND    table_name IN ('kingdom','knight','kingdom_knight_import');
BEGIN
  /* Open the cursor. */
  OPEN table_cursor;
  LOOP
    /* Fetch table names. */
    FETCH table_cursor INTO row;
    /* Exit when no more records are found. */
    EXIT WHEN NOT FOUND;
    /* Concatenate together a DDL to drop the table with prejudice. */
    sql := 'DROP TABLE IF EXISTS '||row.table_name||' CASCADE';
    /* Execute the DDL statement. */
    EXECUTE sql;
  END LOOP;
  /* Close the cursor. */
  CLOSE table_cursor;
END;
$$;
-- Create normalized kingdom table.
CREATE TABLE kingdom
( kingdom_id    SERIAL
, kingdom_name  VARCHAR(20)
, population    INTEGER
, CONSTRAINT kingdom_uq_key
  UNIQUE ( kingdom_name
         , population ));
-- Create normalized knight table.
CREATE TABLE knight
( knight_id              SERIAL
, knight_name            VARCHAR(24)
, kingdom_allegiance_id  INTEGER
, allegiance_start_date  DATE
, allegiance_end_date    DATE
, CONSTRAINT knight_uq_key
  UNIQUE ( knight_name
         , kingdom_allegiance_id
         , allegiance_start_date
         , allegiance_end_date ));
-- Create external import table.
CREATE TABLE kingdom_knight_import
( kingdom_name           VARCHAR(20)
, population             INTEGER
, knight_name            VARCHAR(24)
, allegiance_start_date  DATE
, allegiance_end_date    DATE);
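The conditional drop at the top of the script loops over information_schema.tables. Because DROP TABLE accepts both IF EXISTS and a list of tables, a shorter form with the same effect might look like the sketch below. It assumes you are connected to the videodb database and that the three tables live in the public schema on your search path.

```sql
-- Sketch only: drop the three demonstration tables if they exist.
-- The sequences behind the SERIAL columns are owned by those columns,
-- so they are dropped together with their tables.
DROP TABLE IF EXISTS kingdom
,                    knight
,                    kingdom_knight_import CASCADE;
```

The cursor version in the script remains useful when the list of tables is not known ahead of time.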
Step #3 : Run the COPY command
Run the COPY command as the student account to move the data from the Comma Separated Values (CSV) file to the kingdom_knight_import table.
COPY kingdom_knight_import
FROM '/u01/app/postgres/upload/kingdom_import.csv' DELIMITER ',' CSV;
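Before moving on, it can help to confirm what landed in the staging table. The queries below are a small sanity check, not part of the original scripts; the staged_rows and copies aliases are made up for the illustration. Keep in mind that COPY appends rows, so running the command twice leaves two copies of every line in the import table.

```sql
-- Count the staged rows; the sample file above has 18 lines.
SELECT COUNT(*) AS staged_rows
FROM   kingdom_knight_import;

-- List any row that was staged more than once, for example after
-- running the COPY command a second time without clearing the table.
SELECT   kingdom_name
,        population
,        knight_name
,        allegiance_start_date
,        allegiance_end_date
,        COUNT(*) AS copies
FROM     kingdom_knight_import
GROUP BY kingdom_name
,        population
,        knight_name
,        allegiance_start_date
,        allegiance_end_date
HAVING   COUNT(*) > 1;
```

If the second query returns rows, a TRUNCATE TABLE kingdom_knight_import followed by a fresh COPY resets the staging table.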
Step #4 : Create the upload_kingdom procedure
Copy the following code into a create_kingdom_knight_procedure.sql file within a directory of your choice. Assuming you put the code in the create_kingdom_knight_procedure.sql script, you can call it like so:
\i create_kingdom_knight_procedure.sql
CREATE PROCEDURE upload_kingdom() AS
$$
DECLARE
  /* Declare error handling variables. SQLSTATE is a five-character
     code and SQLERRM is a message, so both are text. */
  err_num  TEXT;
  err_msg  TEXT;
BEGIN
  /* Insert only unique rows. The DISTINCT operator compresses the
     result set to a unique set and avoids consuming sequence values
     for non-unique result sets. */
  INSERT INTO kingdom
  ( kingdom_name
  , population )
  ( SELECT DISTINCT
           kki.kingdom_name
    ,      kki.population
    FROM   kingdom_knight_import kki LEFT JOIN kingdom k
    ON     kki.kingdom_name = k.kingdom_name
    AND    kki.population = k.population)
  ON CONFLICT ON CONSTRAINT kingdom_uq_key
  DO NOTHING;
  /* Insert only unique rows. */
  INSERT INTO knight
  ( knight_name
  , kingdom_allegiance_id
  , allegiance_start_date
  , allegiance_end_date )
  (SELECT kki.knight_name
   ,      k.kingdom_id
   ,      kki.allegiance_start_date AS start_date
   ,      kki.allegiance_end_date AS end_date
   FROM   kingdom_knight_import kki INNER JOIN kingdom k
   ON     kki.kingdom_name = k.kingdom_name
   AND    kki.population = k.population LEFT JOIN knight kn
   ON     k.kingdom_id = kn.kingdom_allegiance_id
   AND    kki.knight_name = kn.knight_name
   AND    kki.allegiance_start_date = kn.allegiance_start_date
   AND    kki.allegiance_end_date = kn.allegiance_end_date)
  ON CONFLICT ON CONSTRAINT knight_uq_key
  DO NOTHING;
EXCEPTION
  WHEN OTHERS THEN
    err_num := SQLSTATE;
    err_msg := SUBSTR(SQLERRM,1,100);
    RAISE NOTICE 'Trapped Error: %', err_msg;
END
$$ LANGUAGE plpgsql;
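A note on the sequence comment in the procedure: the LEFT JOIN in the first INSERT statement has no filter on the kingdom side, so rows that already exist still reach the INSERT statement and are discarded by DO NOTHING only after the SERIAL default has drawn a value. A minimal sketch of one way to avoid that, assuming the same tables, adds an IS NULL filter to turn the outer join into an anti-join. This is an alternative for comparison, not the statement the procedure above uses.

```sql
-- Sketch only: insert kingdoms that are not yet in the kingdom table.
-- The kingdom_id column is SERIAL (and therefore NOT NULL), so a NULL
-- kingdom_id after the LEFT JOIN marks an import row with no match.
INSERT INTO kingdom
( kingdom_name
, population )
( SELECT DISTINCT
         kki.kingdom_name
  ,      kki.population
  FROM   kingdom_knight_import kki LEFT JOIN kingdom k
  ON     kki.kingdom_name = k.kingdom_name
  AND    kki.population = k.population
  WHERE  k.kingdom_id IS NULL )
ON CONFLICT ON CONSTRAINT kingdom_uq_key
DO NOTHING;
```

The ON CONFLICT clause stays in place as a safety net for concurrent sessions; the filter only reduces how often it has to fire. The same idea would apply to the knight insert with a WHERE kn.knight_id IS NULL filter.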
Step #5 : Run the upload_kingdom procedure and query the results
You run the upload_kingdom procedure with the CALL statement and then query the results. Assuming you put the code in the call_kingdom_knight_procedure.sql script, you can call it like so:
\i call_kingdom_knight_procedure.sql
/* Call the upload_kingdom procedure. */
CALL upload_kingdom();
/* Query the kingdom_knight_import table. */
SELECT kingdom_name
,      population
,      knight_name
,      date_trunc('second',allegiance_start_date AT TIME ZONE 'MST') AS allegiance_start_date
,      date_trunc('second',allegiance_end_date AT TIME ZONE 'MST') AS allegiance_end_date
FROM   kingdom_knight_import;
/* Query the kingdom table. */
SELECT *
FROM   kingdom;
/* Query the knight table. */
SELECT kn.knight_id
,      kki.knight_name
,      k.kingdom_id
,      date_trunc('second',kki.allegiance_start_date AT TIME ZONE 'MST') AS start_date
,      date_trunc('second',kki.allegiance_end_date AT TIME ZONE 'MST') AS end_date
FROM   kingdom_knight_import kki INNER JOIN kingdom k
ON     kki.kingdom_name = k.kingdom_name
AND    kki.population = k.population LEFT JOIN knight kn
ON     k.kingdom_id = kn.kingdom_allegiance_id
AND    kki.knight_name = kn.knight_name
AND    kki.allegiance_start_date = kn.allegiance_start_date
AND    kki.allegiance_end_date = kn.allegiance_end_date;
It prints the following results:
CALL
 kingdom_name | population |      knight_name      | allegiance_start_date | allegiance_end_date
--------------+------------+-----------------------+-----------------------+---------------------
 Narnia       |      77600 | Peter the Magnificent | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Edmund the Just       | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Susan the Gentle      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Lucy the Valiant      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      42100 | Peter the Magnificent | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Edmund the Just       | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Susan the Gentle      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Lucy the Valiant      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Camelot      |      15200 | King Arthur           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lionel            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0631-03-09 23:59:56   | 0635-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0640-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Galahad           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Gawain            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Tristram          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Percival          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lancelot          | 0670-09-29 23:59:56   | 0682-12-11 23:59:56
 Camelot      |      15200 | Modred                | 0681-09-29 23:59:56   | 0682-12-11 23:59:56
(18 rows)
 kingdom_id | kingdom_name | population
------------+--------------+------------
          1 | Narnia       |      42100
          2 | Narnia       |      77600
          3 | Camelot      |      15200
(3 rows)
 knight_id |      knight_name      | kingdom_id |     start_date      |      end_date
-----------+-----------------------+------------+---------------------+---------------------
         1 | Peter the Magnificent |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         2 | Edmund the Just       |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         3 | Susan the Gentle      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         4 | Lucy the Valiant      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         5 | Peter the Magnificent |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         6 | Edmund the Just       |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         7 | Susan the Gentle      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         8 | Lucy the Valiant      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         9 | King Arthur           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        10 | Sir Lionel            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        11 | Sir Bors              |          3 | 0631-03-09 23:59:56 | 0635-12-11 23:59:56
        12 | Sir Bors              |          3 | 0640-03-09 23:59:56 | 0686-12-11 23:59:56
        13 | Sir Galahad           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        14 | Sir Gawain            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        15 | Sir Tristram          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        16 | Sir Percival          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        17 | Sir Lancelot          |          3 | 0670-09-29 23:59:56 | 0682-12-11 23:59:56
        69 | Modred                |          3 | 0681-09-29 23:59:56 | 0682-12-11 23:59:56
(18 rows)
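The jump in knight_id from 17 for Sir Lancelot to 69 for Modred is consistent with sequence values having been drawn by earlier calls whose rows were then skipped by DO NOTHING, although the output alone does not show how many times the procedure ran. A small sketch for inspecting the sequence behind the SERIAL column follows; it assumes the default sequence name that PostgreSQL generates for the knight.knight_id column.

```sql
-- Look up the sequence that backs the knight.knight_id column.
SELECT pg_get_serial_sequence('knight', 'knight_id') AS sequence_name;

-- Show the last value the sequence handed out and whether it has
-- been used at all. The name below is the default for a SERIAL
-- column called knight_id on a table called knight.
SELECT last_value
,      is_called
FROM   knight_knight_id_seq;
```

Gaps like this are harmless for surrogate keys, but they are worth knowing about if the import runs often against a large staging table.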
As always, I hope this helps those trying to solve a similar problem.