PostgreSQL Upsert Advanced

Nine years after writing about how to use the MERGE statement in Oracle, I am writing about how to implement an UPSERT statement in PostgreSQL. I wrote an initial post that covers the basics of PostgreSQL’s upsert implementation: the INSERT statement with an ON CONFLICT clause that takes either a DO UPDATE or a DO NOTHING action.

I thought it was interesting that the PostgreSQL Upsert Using INSERT ON CONFLICT Statement web page didn’t cover using a subquery as the source for an INSERT statement.
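As a minimal sketch of that pattern, the following uses a query as the source of an INSERT statement with an ON CONFLICT clause. The demo_import and demo_kingdom temporary tables and the demo_kingdom_uq constraint are hypothetical names made up for this illustration; they are not part of the example schema used in the steps below.

```sql
-- Hypothetical staging and target tables, used only for this sketch.
CREATE TEMP TABLE demo_import
( kingdom_name  VARCHAR(20)
, population    INTEGER );

CREATE TEMP TABLE demo_kingdom
( kingdom_id    SERIAL
, kingdom_name  VARCHAR(20)
, population    INTEGER
, CONSTRAINT demo_kingdom_uq UNIQUE ( kingdom_name ));

-- The staging data holds a duplicate row on purpose.
INSERT INTO demo_import VALUES
  ('Narnia', 77600), ('Narnia', 77600), ('Camelot', 15200);

-- First load: the query is the source; DISTINCT removes the duplicate.
INSERT INTO demo_kingdom ( kingdom_name, population )
SELECT DISTINCT kingdom_name, population
FROM   demo_import
ON CONFLICT ON CONSTRAINT demo_kingdom_uq
DO UPDATE SET population = excluded.population;

-- Change the staging data and load it again.
UPDATE demo_import SET population = 42100 WHERE kingdom_name = 'Narnia';

-- Second load: existing kingdoms are updated rather than duplicated.
INSERT INTO demo_kingdom ( kingdom_name, population )
SELECT DISTINCT kingdom_name, population
FROM   demo_import
ON CONFLICT ON CONSTRAINT demo_kingdom_uq
DO UPDATE SET population = excluded.population;

SELECT kingdom_name, population FROM demo_kingdom ORDER BY kingdom_name;
```

The DISTINCT operator matters with DO UPDATE, because PostgreSQL raises an error when a single INSERT statement tries to update the same target row twice. After the second load, the sketch should leave two rows, with Narnia's population changed to 42100.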


Here are the steps to accomplish an import/upload with the COPY statement and the INSERT statement with DO UPDATE and DO NOTHING clauses.

Step #1 : Position your CSV file in the physical directory

The example uses the following upload directory:

/u01/app/postgres/upload

Put the following text into the kingdom_import1.csv file, which is the file name that the COPY command in Step #3 reads.

Narnia,77600,Peter the Magnificent,1272-03-20,1292-06-19
Narnia,77600,Edmund the Just,1272-03-20,1292-06-19
Narnia,77600,Susan the Gentle,1272-03-20,1292-06-19
Narnia,77600,Lucy the Valiant,1272-03-20,1292-06-19
Narnia,42100,Peter the Magnificent,1531-04-12,1531-05-31
Narnia,42100,Edmund the Just,1531-04-12,1531-05-31
Narnia,42100,Susan the Gentle,1531-04-12,1531-05-31
Narnia,42100,Lucy the Valiant,1531-04-12,1531-05-31
Camelot,15200,King Arthur,0631-03-10,0686-12-12
Camelot,15200,Sir Lionel,0631-03-10,0686-12-12
Camelot,15200,Sir Bors,0631-03-10,0635-12-12
Camelot,15200,Sir Bors,0640-03-10,0686-12-12
Camelot,15200,Sir Galahad,0631-03-10,0686-12-12
Camelot,15200,Sir Gawain,0631-03-10,0686-12-12
Camelot,15200,Sir Tristram,0631-03-10,0686-12-12
Camelot,15200,Sir Percival,0631-03-10,0686-12-12
Camelot,15200,Sir Lancelot,0670-09-30,0682-12-12

Step #2 : Run the script that creates tables and sequences

Copy the following code into a create_kingdom_knight_tables.sql file within a directory of your choice, and run it as the student user. Assuming you put the code in the create_kingdom_knight_tables.sql script, you can call it from psql like so:

\i create_kingdom_knight_tables.sql

-- Conditionally drop three tables and sequences.
DO $$
DECLARE
  /* Declare an indefinite length string and record variable. */
  sql  VARCHAR;
  row  RECORD;

  /* Declare a cursor. */
  table_cursor CURSOR FOR
    SELECT table_name
    FROM   information_schema.tables
    WHERE  table_catalog = 'videodb'
    AND    table_schema = 'public'
    AND    table_name IN ('kingdom','knight','kingdom_knight_import');
BEGIN
  /* Open the cursor. */
  OPEN table_cursor;
  LOOP
    /* Fetch table names. */
    FETCH table_cursor INTO row;

    /* Exit when no more records are found. */
    EXIT WHEN NOT FOUND;

    /* Concatenate together a DDL to drop the table with prejudice. */
    sql := 'DROP TABLE IF EXISTS '||row.table_name||' CASCADE';

    /* Execute the DDL statement. */
    EXECUTE sql;
  END LOOP;

  /* Close the cursor. */
  CLOSE table_cursor;
END;
$$;

-- Create normalized kingdom table.
CREATE TABLE kingdom
( kingdom_id    SERIAL
, kingdom_name  VARCHAR(20)
, population    INTEGER
, CONSTRAINT kingdom_uq_key
  UNIQUE ( kingdom_name
         , population ));

-- Create normalized knight table.
CREATE TABLE knight
( knight_id             SERIAL
, knight_name           VARCHAR(24)
, kingdom_allegiance_id INTEGER
, allegiance_start_date DATE
, allegiance_end_date   DATE
, CONSTRAINT knight_uq_key
  UNIQUE ( knight_name
         , kingdom_allegiance_id
         , allegiance_start_date
         , allegiance_end_date ));

-- Create external import table.
CREATE TABLE kingdom_knight_import
( kingdom_name          VARCHAR(20)
, population            INTEGER
, knight_name           VARCHAR(24)
, allegiance_start_date DATE
, allegiance_end_date   DATE);

Step #3 : Run the COPY command

Run the COPY command as the student account to load the data from the comma-separated values (CSV) file into the kingdom_knight_import table.

COPY kingdom_knight_import
FROM '/u01/app/postgres/upload/kingdom_import1.csv'
WITH (FORMAT csv, DELIMITER ',');
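A server-side COPY reads the file with the privileges of the database server process, and PostgreSQL only allows it for superusers or members of the pg_read_server_files role. If the student account has neither (an assumption about your setup, not something the example states), a possible alternative is psql's client-side \copy meta-command, sketched here with the same path and table:

```sql
-- psql meta-command: the whole command must be written on a single line.
-- The file is read by the psql client, so the path is resolved on the client machine.
\copy kingdom_knight_import FROM '/u01/app/postgres/upload/kingdom_import1.csv' WITH (FORMAT csv)
```

Because \copy streams the file through the client connection, it only needs INSERT privilege on the kingdom_knight_import table.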

Step #4 : Create the upload_kingdom procedure

Copy the following code into a create_kingdom_knight_procedure.sql file within a directory of your choice. Assuming you put the code in the create_kingdom_knight_procedure.sql script, you can call it from psql like so:

\i create_kingdom_knight_procedure.sql

CREATE PROCEDURE upload_kingdom() AS
$$
DECLARE
  /* Declare error handling variables. */
  err_num      TEXT;
  err_msg      TEXT;
BEGIN
  /* Insert only unique rows. The DISTINCT operator compresses the
     result set to a unique set and avoids consuming sequence values
     for non-unique result sets. */
  INSERT INTO kingdom
  ( kingdom_name
  , population )
  ( SELECT   DISTINCT
             kki.kingdom_name
    ,        kki.population
    FROM     kingdom_knight_import kki LEFT JOIN kingdom k
    ON       kki.kingdom_name = k.kingdom_name
    AND      kki.population = k.population)
  ON CONFLICT ON CONSTRAINT kingdom_uq_key
  DO NOTHING;

  /* Insert only unique rows. */
  INSERT INTO knight
  ( knight_name
  , kingdom_allegiance_id
  , allegiance_start_date
  , allegiance_end_date )
  (SELECT kki.knight_name
   ,      k.kingdom_id
   ,      kki.allegiance_start_date AS start_date
   ,      kki.allegiance_end_date AS end_date
   FROM   kingdom_knight_import kki INNER JOIN kingdom k
   ON     kki.kingdom_name = k.kingdom_name
   AND    kki.population = k.population LEFT JOIN knight kn
   ON     k.kingdom_id = kn.kingdom_allegiance_id
   AND    kki.knight_name = kn.knight_name
   AND    kki.allegiance_start_date = kn.allegiance_start_date
   AND    kki.allegiance_end_date = kn.allegiance_end_date)
  ON CONFLICT ON CONSTRAINT knight_uq_key
  DO NOTHING;

EXCEPTION
  /* Trap any error and report the first 100 characters of its message. */
  WHEN OTHERS THEN
    err_num := SQLSTATE;
    err_msg := SUBSTR(SQLERRM,1,100);
    RAISE NOTICE 'Trapped Error: %', err_msg;
END;
$$ LANGUAGE plpgsql;

Step #5 : Run the upload_kingdom procedure and query the results

You run the upload_kingdom procedure with the CALL statement and then query the results. Assuming you put the code in the call_kingdom_knight_procedure.sql script, you can call it from psql like so:

\i call_kingdom_knight_procedure.sql

/* Call the upload_kingdom procedure. */
CALL upload_kingdom();

/* Query the kingdom_knight_import table. */
SELECT   kingdom_name
,        population
,        knight_name
,        date_trunc('second',allegiance_start_date AT TIME ZONE 'MST') AS allegiance_start_date
,        date_trunc('second',allegiance_end_date AT TIME ZONE 'MST') AS allegiance_end_date
FROM     kingdom_knight_import;

/* Query the kingdom table. */
SELECT   kingdom_id
,        kingdom_name
,        population
FROM     kingdom;

/* Query the knight table. */
SELECT   kn.knight_id
,        kki.knight_name
,        k.kingdom_id
,        date_trunc('second',kki.allegiance_start_date AT TIME ZONE 'MST') AS start_date
,        date_trunc('second',kki.allegiance_end_date AT TIME ZONE 'MST') AS end_date
FROM     kingdom_knight_import kki INNER JOIN kingdom k
ON       kki.kingdom_name = k.kingdom_name
AND      kki.population = k.population LEFT JOIN knight kn
ON       k.kingdom_id = kn.kingdom_allegiance_id
AND      kki.knight_name = kn.knight_name
AND      kki.allegiance_start_date = kn.allegiance_start_date
AND      kki.allegiance_end_date = kn.allegiance_end_date;

It prints the following results:

 kingdom_name | population |      knight_name      | allegiance_start_date | allegiance_end_date 
--------------+------------+-----------------------+-----------------------+---------------------
 Narnia       |      77600 | Peter the Magnificent | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Edmund the Just       | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Susan the Gentle      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      77600 | Lucy the Valiant      | 1272-03-19 23:59:56   | 1292-06-18 23:59:56
 Narnia       |      42100 | Peter the Magnificent | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Edmund the Just       | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Susan the Gentle      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Narnia       |      42100 | Lucy the Valiant      | 1531-04-11 23:59:56   | 1531-05-30 23:59:56
 Camelot      |      15200 | King Arthur           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lionel            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0631-03-09 23:59:56   | 0635-12-11 23:59:56
 Camelot      |      15200 | Sir Bors              | 0640-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Galahad           | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Gawain            | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Tristram          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Percival          | 0631-03-09 23:59:56   | 0686-12-11 23:59:56
 Camelot      |      15200 | Sir Lancelot          | 0670-09-29 23:59:56   | 0682-12-11 23:59:56
(18 rows)

 kingdom_id | kingdom_name | population 
------------+--------------+------------
          1 | Narnia       |      42100
          2 | Narnia       |      77600
          3 | Camelot      |      15200
(3 rows)

 knight_id |      knight_name      | kingdom_id |     start_date      |      end_date       
-----------+-----------------------+------------+---------------------+---------------------
         1 | Peter the Magnificent |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         2 | Edmund the Just       |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         3 | Susan the Gentle      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         4 | Lucy the Valiant      |          2 | 1272-03-19 23:59:56 | 1292-06-18 23:59:56
         5 | Peter the Magnificent |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         6 | Edmund the Just       |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         7 | Susan the Gentle      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         8 | Lucy the Valiant      |          1 | 1531-04-11 23:59:56 | 1531-05-30 23:59:56
         9 | King Arthur           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        10 | Sir Lionel            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        11 | Sir Bors              |          3 | 0631-03-09 23:59:56 | 0635-12-11 23:59:56
        12 | Sir Bors              |          3 | 0640-03-09 23:59:56 | 0686-12-11 23:59:56
        13 | Sir Galahad           |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        14 | Sir Gawain            |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        15 | Sir Tristram          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        16 | Sir Percival          |          3 | 0631-03-09 23:59:56 | 0686-12-11 23:59:56
        17 | Sir Lancelot          |          3 | 0670-09-29 23:59:56 | 0682-12-11 23:59:56
        69 | Modred                |          3 | 0681-09-29 23:59:56 | 0682-12-11 23:59:56
(18 rows)
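The jump from knight_id 17 to 69 in the output above is consistent with how PostgreSQL handles SERIAL columns during an upsert: the column default calls nextval before the conflict check runs, so a row skipped by DO NOTHING still consumes a sequence value. The following is a minimal sketch of that behavior. The demo_realm temporary table is a hypothetical name made up for this illustration and is not part of the example schema.

```sql
-- Hypothetical table, used only to show sequence gaps after a conflict.
CREATE TEMP TABLE demo_realm
( realm_id    SERIAL
, realm_name  VARCHAR(20) UNIQUE );

-- Inserted: takes the first sequence value.
INSERT INTO demo_realm ( realm_name ) VALUES ('Camelot')
ON CONFLICT ( realm_name ) DO NOTHING;

-- Skipped as a duplicate, but nextval has already been called for it.
INSERT INTO demo_realm ( realm_name ) VALUES ('Camelot')
ON CONFLICT ( realm_name ) DO NOTHING;

-- Inserted: takes the next sequence value after the skipped one.
INSERT INTO demo_realm ( realm_name ) VALUES ('Narnia')
ON CONFLICT ( realm_name ) DO NOTHING;

SELECT realm_id, realm_name FROM demo_realm ORDER BY realm_id;
```

The query should return Camelot with realm_id 1 and Narnia with realm_id 3, leaving a gap at 2. Repeated loads of the same import file would leave similar gaps in the knight_id sequence, so the surrogate key values should not be expected to stay contiguous.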

As always, I hope this helps those trying to solve a similar problem.

