Insert Into… Select … Where …

About a month ago one of my teammates asked to confirm how to do a Insert Statement in SQL. He wanted a quick check about using an Insert-Values style statement, but instead of Insert-Values, I recommended doing a Insert-Select-Where. One of my other teammates asked why I would do it that way and I explained my reasoning to him, and decided it would make a good post.

Let us start with a schema for some_table. (NOTE: I will be using PostgreSQL for these examples, but I originally started off doing this with SQLServer, so the concepts translate across databases although syntax might change.)

proctor=# d some_table
               Table "public.some_table"
 Column  |            Type             |   Modifiers   
---------+-----------------------------+---------------
 name    | character varying(40)       | 
 id      | integer                     | 
 size    | integer                     | 
 modtime | timestamp without time zone | default now()

A number of people would be happy doing an Insert-Values statement to get the data in the table as shown below.

proctor=# INSERT INTO some_table(id, name, size) VALUES(1, 'a', 14);
INSERT 0 1
proctor=# select * from some_table;
 name | id | size |          modtime           
------+----+------+----------------------------
 a    |  1 |   14 | 2013-12-10 22:07:58.706064
(1 row)

That works and I have nothing against that, but I have found myself more fond of doing a Insert-Select-Where style statement.

This stems from doing database migrations on SQLServer for .NET development before there were any real tools to do it. We would have to do the migrations ourselves, later with a homegrown tool to help handle it, but even then our goal was to make our scripts “SQL Safe.” SQL Safe was our term for making any SQL that we ran idempotent, so if you ran a script, it would determine if it should run again. Schema modifications checked the schema to see if the needed modification already existed. Insert statements would check to see if the values to be inserted was already represented in the system.

It started with doing an IF check against the Insert-Values.

proctor=# DO
proctor-# $BODY$
proctor-# BEGIN
proctor-#   IF NOT EXISTS (SELECT 1 from some_table where name = 'a' and size = 14) THEN
proctor-#     INSERT INTO some_table(id, name, size) VALUES(1, 'a', 14);
proctor-#   END IF;
proctor-# END;
proctor-# $BODY$;
DO
proctor=# select * from some_table;
 name | id | size |          modtime           
------+----+------+----------------------------
 a    |  1 |   14 | 2013-12-10 22:07:58.706064
(1 row)

Which is not bad, but it is not great either. We moved to the Insert-Select-Where style, which looks like the following:

proctor=# INSERT INTO some_table(id, name, size)
proctor-# SELECT 1, 'a', 14
proctor-# WHERE NOT EXISTS (SELECT 1 from some_table where name = 'a' and size = 14);
INSERT 0 0
proctor=# select * from some_table;
 name | id | size |          modtime           
------+----+------+----------------------------
 a    |  1 |   14 | 2013-12-10 22:07:58.706064
(1 row)

What is nice about this is:

  • it makes the Insert idempotent;
  • more likely to be cross platform instead of syntax differences in how IF statements are handled, and
  • it becomes easy to craft the statement I need to create the values and run a test SELECT by just commenting out the INSERT INTO portion.
proctor=# --INSERT INTO some_table(id, name, size)
proctor-# SELECT 1, 'q', 37
proctor-# WHERE NOT EXISTS (SELECT 1 from some_table where name = 'q' and size = 37);
 ?column? | ?column? | ?column? 
----------+----------+----------
        1 | q        |       37
(1 row)

I would love to hear your feedback about this.

–Proctor