Most database tables have an artificial numeric primary key, and that number is usually generated automatically using a sequence. I wrote about auto-generated primary keys in some detail in a previous article. Occasionally, gaps in these primary key sequences can occur - which might come as a surprise to you.
This article shows the causes of sequence gaps, demonstrates the unexpected fact that sequences can even jump backwards, and gives an example of how to build a gapless sequence.
We are used to the atomic behavior of database transactions: when PostgreSQL rolls a transaction back, all its effects are undone. As the documentation tells us, that is not the case for sequence values:
To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back; that is, once a value has been fetched it is considered used and will not be returned again. This is true even if the surrounding transaction later aborts, or if the calling query ends up not using the value. For example an INSERT with an ON CONFLICT clause will compute the to-be-inserted tuple, including doing any required nextval calls, before detecting any conflict that would cause it to follow the ON CONFLICT rule instead. Such cases will leave unused “holes” in the sequence of assigned values.
This little example shows how a gap forms in a sequence:
CREATE TABLE be_positive (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   value integer CHECK (value > 0)
);

-- the identity column is backed by a sequence:
SELECT pg_get_serial_sequence('be_positive', 'id');

   pg_get_serial_sequence   
════════════════════════════
 laurenz.be_positive_id_seq
(1 row)

INSERT INTO be_positive (value) VALUES (42);
INSERT 0 1

INSERT INTO be_positive (value) VALUES (-99);
ERROR:  new row for relation "be_positive" violates check constraint "be_positive_value_check"
DETAIL:  Failing row contains (2, -99).

INSERT INTO be_positive (value) VALUES (314);
INSERT 0 1

TABLE be_positive;

 id │ value 
════╪═══════
  1 │    42
  3 │   314
(2 rows)
The second INSERT was rolled back, but sequence value 2 was not, leaving a gap.
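The ON CONFLICT case quoted from the documentation above leaves gaps the same way. Here is a minimal sketch of that scenario; the table dedup and its values are made up for illustration:

-- hypothetical table to demonstrate the ON CONFLICT case
CREATE TABLE dedup (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   name text UNIQUE
);

INSERT INTO dedup (name) VALUES ('apple');   -- gets id 1

-- nextval is called while the new tuple is computed, *before*
-- the conflict is detected, so sequence value 2 is lost
INSERT INTO dedup (name) VALUES ('apple')
ON CONFLICT (name) DO NOTHING;

INSERT INTO dedup (name) VALUES ('banana');  -- gets id 3, not 2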
This intentional behavior is necessary for good performance. After all, a sequence should not be the bottleneck for a workload consisting of many INSERTs, so it has to perform well. Rolling back sequence values would reduce concurrency and complicate processing.
Even though nextval is cheap, a sequence could still be a bottleneck in a highly concurrent workload. To work around that, you can define a sequence with a CACHE clause greater than 1. Then the first call to nextval in a database session will actually fetch that many sequence values in a single operation. Subsequent calls to nextval use those cached values, and there is no need to access the sequence.
As a consequence, these cached sequence values get lost when the database session ends, leading to gaps:
CREATE SEQUENCE seq CACHE 20;

SELECT nextval('seq');

 nextval 
═════════
       1
(1 row)

SELECT nextval('seq');

 nextval 
═════════
       2
(1 row)
Now end the database session and start a new one:
SELECT nextval('seq');

 nextval 
═════════
      21
(1 row)
As with all other objects, changes to sequences are logged to WAL, so that recovery can restore the state from a backup or after a crash. Since writing WAL impacts performance, not every call to nextval is logged to WAL. Rather, the first call logs a value 32 numbers ahead of the current value, and the next 32 calls to nextval don't log anything. That means that after recovering from a crash, the sequence may have skipped some values.
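You can observe this pre-logging from SQL, because selecting from the sequence relation itself exposes its internal state. The following is a small sketch using a throwaway sequence name; the log_cnt column shows how many future nextval calls are still covered by the last WAL record:

CREATE SEQUENCE demo_seq;

SELECT nextval('demo_seq');

-- log_cnt counts the nextval calls still covered by the last
-- WAL record; right after a call that wrote WAL it will be 32
SELECT last_value, log_cnt, is_called FROM demo_seq;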
To demonstrate, I'll use a little PL/Python function that crashes the server by sending a KILL signal to the current process:
CREATE FUNCTION seppuku() RETURNS void
   LANGUAGE plpython3u AS
'import os, signal
os.kill(os.getpid(), signal.SIGKILL)';
Now let's see this in action:
CREATE SEQUENCE seq;

SELECT nextval('seq');

 nextval 
═════════
       1
(1 row)

SELECT seppuku();
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
Upon reconnect, we find that some values are missing:
SELECT nextval('seq');

 nextval 
═════════
      34
(1 row)
It is a little-known fact that sequences can also jump backwards. A backwards jump can happen if the WAL record that logs the advancement of the sequence value has not yet been persisted to disk when the server crashes. Why? Because the transaction that contained the call to nextval has not yet committed:
CREATE SEQUENCE seq;

BEGIN;

SELECT nextval('seq');

 nextval 
═════════
       1
(1 row)

SELECT nextval('seq');

 nextval 
═════════
       2
(1 row)

SELECT nextval('seq');

 nextval 
═════════
       3
(1 row)

SELECT seppuku();
psql:seq.sql:9: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
Now reconnect and fetch the next sequence value:
SELECT nextval('seq');

 nextval 
═════════
       1
(1 row)
This looks scary, but no damage can happen to the database: since the transaction didn't commit, it was rolled back, along with all possible data modifications that used the “lost” sequence values.
However, that leads to an interesting conclusion: don't use sequence values from an uncommitted transaction outside that transaction.
First off: think twice before you decide to build a gapless sequence. Doing so serializes all transactions that use that “sequence”, which will degrade your data modification performance considerably.
You almost never need a gapless sequence. Usually it is good enough to know the order of the rows, for example from the timestamp at which each row was inserted. Then you can use the row_number window function to calculate a gapless ordering while you query the data:
SELECT created_ts,
       value,
       row_number() OVER (ORDER BY created_ts) AS gapless_seq
FROM mytable;
You can implement a truly gapless sequence using a “singleton” table:
CREATE TABLE seq (id bigint NOT NULL);

INSERT INTO seq (id) VALUES (0);

CREATE FUNCTION next_val() RETURNS bigint
   LANGUAGE sql AS
'UPDATE seq SET id = id + 1 RETURNING id';
It is important not to create an index on the table, so that you can get HOT updates and so that the table does not get bloated.
Calling the next_val function will lock the table row until the end of the transaction, so keep all transactions that use it short.
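As a usage sketch, here is how the function could feed a gapless invoice number; the invoice table is a made-up example:

-- hypothetical consumer of the gapless "sequence"
CREATE TABLE invoice (
   invoice_number bigint PRIMARY KEY,
   amount numeric NOT NULL
);

BEGIN;

-- next_val() updates the single row in "seq" and thereby locks
-- it, so concurrent transactions block here until COMMIT
INSERT INTO invoice (invoice_number, amount)
VALUES (next_val(), 99.99);

COMMIT;

If the transaction rolls back, the UPDATE performed by next_val is rolled back with it, which is exactly why this construction cannot leave gaps.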
I've shown you several different ways to make a sequence skip values — sometimes even backwards. But that is never a problem if all you need are unique primary key values.
Resist the temptation to try for a “gapless sequence”. You can get it, but the performance impact is high.
If you are interested in learning about advanced techniques to enforce integrity, check out our blogpost on constraints over multiple rows.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.
Thank you, very interesting, as always.
One day I came across another kind of gap that can be reproduced very simply:
create table if not exists check_seq (id int4, some_txt_attr text, some_int_attr int4);
create sequence if not exists check_seq_seq
increment by 1
minvalue 1
maxvalue 2147483647
start 1
cache 1
no cycle;
insert into check_seq
select (jsonb_populate_record(null::check_seq, jsonb_build_object('id', nextval('check_seq_seq'), 'some_txt_attr', 'random text seq', 'some_int_attr', 0))).*;
insert into check_seq
select (jsonb_populate_record(null::check_seq, jsonb_build_object('id', nextval('check_seq_seq'), 'some_txt_attr', 'random text seq', 'some_int_attr', 0))).*;
select * from check_seq;
and as a result:
https://uploads.disquscdn.com/images/db963f49dd55842a27d054d4ecc1113cdf61c4b1caf59655e530a87c4875dd6c.png
Nice example! In this case, the problem is that jsonb_populate_record(...nextval('check_seq_seq')...).* gets expanded to jsonb_populate_record(...nextval('check_seq_seq')...).col1, jsonb_populate_record(...nextval('check_seq_seq')...).col2, jsonb_populate_record(...nextval('check_seq_seq')...).col3, so that nextval is called three times.
Thank you
Yes, you are absolutely right!
But it wasn't obvious to me.
It works correctly this way, though:
insert into check_seq
select * from jsonb_populate_record(null::check_seq, jsonb_build_object('id', nextval('check_seq_seq'), 'some_txt_attr', 'random text seq', 'some_int_attr', 0));
Hi.
I want to store coin information in PostgreSQL. This operation is supposed to run every 24 hours; each time, about 24 or 25 thousand coins are checked, and if a new coin has appeared it is added to the database, otherwise nothing happens (ON CONFLICT DO NOTHING).
As you know, the sequence advances on every attempt, so there will be big gaps between the ids, and after one or two months the number will reach 1 million. Is it correct to use the method you described? Or do you suggest that I create the coin_id myself, or do you have another suggestion?
Just use bigint for the auto-generated primary key, and there will be no problem.

Thanks
Some databases have been known to reset sequence numbers to a lower value after a crash, causing potential data integrity issues. How can we ensure that our sequences always move forward, even in the event of a system failure?
"refusal
You cannot, but that is not a problem. The only way you can see a value that will appear again after a crash is in an uncommitted transaction. Just don't use any sequence values from a transaction that was not committed, and you are safe.