Is UPDATE the same as DELETE + INSERT in PostgreSQL?

UPDATE is not the same as DELETE + INSERT! — © Laurenz Albe 2020

Table of Contents

Introduction

We know that PostgreSQL does not update a table row in place. Rather, it writes a new version of the row (the PostgreSQL term for a row version is “tuple”) and leaves the old row version in place to serve concurrent read requests. VACUUM later removes these “dead tuples”.

If you delete a row and insert a new one, the effect is similar: we have one dead tuple and one new live tuple. This is why many people (me, among others) explain to beginners that “an UPDATE in PostgreSQL is almost the same as a DELETE, followed by an INSERT”.

This article is about that “almost”.

Example showing the difference between `UPDATE` and `DELETE` + `INSERT`

Let's take this simple test table:

CREATE TABLE uptest (
   id smallint PRIMARY KEY,
   val smallint NOT NULL
);

INSERT INTO uptest VALUES (1, 42);

CREATE TABLE uptest (

id smallint PRIMARY KEY,

val smallint NOT NULL

);

INSERT INTO uptest VALUES (1, 42);

In the two following tests, we will issue statements from two concurrent sessions.

First, the UPDATE:

 Session 1                     Session 2

BEGIN;

UPDATE uptest SET id = 2
   WHERE val = 42;
                               SELECT id FROM uptest
                                  WHERE val = 42
                                  FOR UPDATE;  -- hangs
COMMIT;
                               -- one row is returned

Session 1 Session 2

BEGIN;

UPDATE uptest SET id = 2

WHERE val = 42;

SELECT id FROM uptest

WHERE val = 42

FOR UPDATE; -- hangs

COMMIT;

-- one row is returned

Let's reset the table before the second test;

TRUNCATE uptest;
INSERT INTO uptest VALUES (1, 42);

1 2	TRUNCATE uptest; INSERT INTO uptest VALUES (1, 42);

Now let's repeat the experiment with DELETE and INSERT:

 Session 1                     Session 2

BEGIN;

DELETE FROM uptest
   WHERE id = 1;

INSERT INTO uptest VALUES (2, 42);
                               SELECT id FROM uptest
                                  WHERE val = 42
                                  FOR UPDATE;  -- hangs
COMMIT;
                               -- no row is returned

Session 1 Session 2

BEGIN;

DELETE FROM uptest

WHERE id = 1;

INSERT INTO uptest VALUES (2, 42);

SELECT id FROM uptest

WHERE val = 42

FOR UPDATE; -- hangs

COMMIT;

-- no row is returned

Explanation for the observed difference

The documentation describes what happens when an SQL statement runs into a lock in a transaction with the default READ COMMITTED isolation level:

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. In the case of SELECT FOR UPDATE and SELECT FOR SHARE, this means it is the updated version of the row that is locked and returned to the client.

The above shows that there is some way for PostgreSQL to find the new version of an updated row. That is why the first experiment returned a result row. In the second experiment, there was no connection between the old, deleted row and the newly inserted one, that's why we get no result in that case.

To figure out how the old and the new version are connected, we have to look deeper.

Looking at `UPDATE` with “pageinspect”

The extension pageinspect allows us to see all data in a PostgreSQL data page. It requires superuser permissions.

Let's use it to see what's on disk after the first experiment:

TRUNCATE uptest;
INSERT INTO uptest VALUES (1, 42);

UPDATE uptest SET id = 2 WHERE val = 42;

SELECT lp,
       t_xmin AS xmin,
       t_xmax AS xmax,
       t_ctid,
       to_hex(t_infomask2) AS infomask2,
       to_hex(t_infomask) AS infomask,
       t_attrs       
FROM heap_page_item_attrs(get_raw_page('uptest', 0), 'uptest');

 lp |  xmin  |  xmax  | t_ctid | infomask2 | infomask |        t_attrs        
----+--------+--------+--------+-----------+----------+-----------------------
  1 | 385688 | 385689 | (0,2)  | 2002      | 100      | {'\x0100','\x2a00'}
  2 | 385689 |      0 | (0,2)  | 2         | 2800     | {'\x0200','\x2a00'}
(2 rows)

TRUNCATE uptest;

INSERT INTO uptest VALUES (1, 42);

UPDATE uptest SET id = 2 WHERE val = 42;

SELECT lp,

t_xmin AS xmin,

t_xmax AS xmax,

t_ctid,

to_hex(t_infomask2) AS infomask2,

to_hex(t_infomask) AS infomask,

t_attrs

FROM heap_page_item_attrs(get_raw_page('uptest', 0), 'uptest');

----+--------+--------+--------+-----------+----------+-----------------------

1 | 385688 | 385689 | (0,2) | 2002 | 100 | {'\x0100','\x2a00'}

2 | 385689 | 0 | (0,2) | 2 | 2800 | {'\x0200','\x2a00'}

(2 rows)

The first entry is the old version of the row, the second the new version.

The lp is the line pointer number, which stands for the number of the tuple within the data page. Together with the page number, this constitutes the physical address (tuple ID or tid) of a tuple.

It follows that the t_ctid stored in the tuple header is usually redundant, since it is implicit in the line pointer. However, it becomes relevant after an UPDATE: then t_ctid contains the tuple identifier of the updated version of the row.

This is the “missing link” between the old row version and the updated one!

Looking at `DELETE` + `DELETE` with “pageinspect”

Let's compare this to DELETE + INSERT:

TRUNCATE uptest;
INSERT INTO uptest VALUES (1, 42);

BEGIN;
DELETE FROM uptest WHERE id = 1;
INSERT INTO uptest VALUES (2, 42);
COMMIT;

SELECT lp,
       t_xmin AS xmin,
       t_xmax AS xmax,
       t_ctid,
       to_hex(t_infomask2) AS infomask2,
       to_hex(t_infomask) AS infomask,
       t_attrs       
FROM heap_page_item_attrs(get_raw_page('uptest', 0), 'uptest');

 lp |  xmin  |  xmax  | t_ctid | infomask2 | infomask |        t_attrs        
----+--------+--------+--------+-----------+----------+-----------------------
  1 | 385691 | 385692 | (0,1)  | 2002      | 100      | {'\x0100','\x2a00'}
  2 | 385692 |      0 | (0,2)  | 2         | 800      | {'\x0200','\x2a00'}
(2 rows)

TRUNCATE uptest;

INSERT INTO uptest VALUES (1, 42);

BEGIN;

DELETE FROM uptest WHERE id = 1;

INSERT INTO uptest VALUES (2, 42);

COMMIT;

SELECT lp,

t_xmin AS xmin,

t_xmax AS xmax,

t_ctid,

to_hex(t_infomask2) AS infomask2,

to_hex(t_infomask) AS infomask,

t_attrs

FROM heap_page_item_attrs(get_raw_page('uptest', 0), 'uptest');

----+--------+--------+--------+-----------+----------+-----------------------

1 | 385691 | 385692 | (0,1) | 2002 | 100 | {'\x0100','\x2a00'}

2 | 385692 | 0 | (0,2) | 2 | 800 | {'\x0200','\x2a00'}

(2 rows)

Here the t_ctid column from the old, deleted tuple is unchanged and there is no link to the new tuple. The second tuple is not found by the SELECT ... FOR UPDATE, since it is “invisible” to the “snapshot” used for scanning the table.

Differences in `infomask` and `infomask2`

There are also some relevant differences in the attributes infomask and infomask2. You can see the meaning of these flags in the PostgreSQL source file src/include/access/htup_details.h.

for the old tuple, the values are the same in both cases:
- infomask2: 2 is the number of columns, and HEAP_KEYS_UPDATED (0x2000) means that the tuple is deleted or updated
- infomask: HEAP_XMIN_COMMITTED (0x0100) means that the tuple was valid before it was removed (a hint bit)
for the new tuple, there are some differences:
- infomask: both cases have HEAP_XMAX_INVALID (0x0800) set (they have not been deleted), but the UPDATE case also has HEAP_UPDATED (0x2000), which shows that this is the result of an UPDATE

Conclusion

To understand the difference between UPDATE and DELETE+INSERT, we had a closer look at the tuple headers. We saw infomask, infomask2 and t_ctid, where the latter provides the link between the old and the new version of a row.

PostgreSQL's row header occupies 23 bytes, which is more storage overhead than in other databases, but is required for PostgreSQL's special multiversioning and tuple visibility implementation.

UPDATE can be challenging in PostgreSQL: if you want to read more about its problems and how to deal with them, read my article on HOT update.

Is UPDATE the same as DELETE + INSERT in PostgreSQL?

Introduction

Example showing the difference between `UPDATE` and `DELETE` + `INSERT`

Explanation for the observed difference

Looking at `UPDATE` with “pageinspect”

Looking at `DELETE` + `DELETE` with “pageinspect”

Differences in `infomask` and `infomask2`

Conclusion

Leave a Reply Cancel reply

Laurenz Albe

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts

Is UPDATE the same as DELETE + INSERT in PostgreSQL?

Introduction

Example showing the difference between UPDATE and DELETE + INSERT

Explanation for the observed difference

Looking at UPDATE with “pageinspect”

Looking at DELETE + DELETE with “pageinspect”

Differences in infomask and infomask2

Conclusion

Leave a Reply Cancel reply

Laurenz Albe

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts

Example showing the difference between `UPDATE` and `DELETE` + `INSERT`

Looking at `UPDATE` with “pageinspect”

Looking at `DELETE` + `DELETE` with “pageinspect”

Differences in `infomask` and `infomask2`