UPDATED 14.05.2022: Sometimes customers ask me about the best choice for auto-generated primary keys. In this article, I'll explore the options and give recommendations.
Every table needs a primary key. In a relational database, it is important to be able to identify an individual table row. If you wonder why, search the internet for the thousands of questions asking for help with removing duplicate entries from a table.
You are well advised to choose a primary key that is not only unique, but also never changes during the lifetime of a table row. This is because foreign key constraints typically reference primary keys, and changing a primary key that is referenced elsewhere causes trouble or unnecessary work.
Now, sometimes a table has a natural primary key, for example the social security number of a country's citizens. But typically, there is no such attribute, and you have to generate an artificial primary key. Some people even argue that you should use an artificial primary key even if there is a natural one, but I won't go into that “holy war”.
There are two basic techniques: numbers generated by a sequence, and universally unique identifiers (UUIDs).
Sequences

A sequence is a database object whose sole purpose in life is to generate unique numbers. It does this using an internal counter that it increments.

Sequences are highly optimized for concurrent access, and they will never issue the same number twice. Still, accessing a sequence from many concurrent SQL statements could become a bottleneck, so there is the CACHE option that makes the sequence hand out several values at once to database sessions.
Sequences don't follow the normal transactional rules: if a transaction rolls back, the sequence does not reset its counter. This is required for good performance, and it does not constitute a problem. If you are looking for a way to generate a gapless sequence of numbers, a sequence is not the right choice, and you will have to resort to less efficient and more complicated techniques.
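To illustrate this non-transactional behavior, here is a small sketch (the sequence name seq is made up for the example):

```sql
CREATE SEQUENCE seq;

BEGIN;
SELECT nextval('seq');  -- returns 1
ROLLBACK;

-- the rollback did not reset the counter:
SELECT nextval('seq');  -- returns 2, so the value 1 is "lost"
```

This is exactly why sequence-generated keys can have gaps: any value consumed by a rolled-back transaction is simply skipped.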
To fetch the next value from a sequence, you use the nextval function like this:

```sql
SELECT nextval('sequence_name');
```
See the documentation for other functions to manipulate sequences.
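For illustration, the most commonly used companions of nextval are currval and setval (shown here on a hypothetical freshly created sequence):

```sql
CREATE SEQUENCE sequence_name;

SELECT nextval('sequence_name');     -- fetch the next value (1)
SELECT currval('sequence_name');     -- the last value fetched in this session (1)
SELECT setval('sequence_name', 42);  -- reset the counter; the next nextval returns 43
```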
UUIDs

A UUID (universally unique identifier) is a 128-bit number that is generated with an algorithm that effectively guarantees uniqueness. There are several standardized algorithms for that. In PostgreSQL, there are a number of functions that generate UUIDs:

- The uuid-ossp extension offers functions to generate UUIDs. Note that because of the hyphen in the name, you have to quote the name of the extension (CREATE EXTENSION "uuid-ossp";).
- From PostgreSQL v13 on, you can use the core function gen_random_uuid() to generate version-4 (random) UUIDs.

Note that you should always use the PostgreSQL data type uuid for UUIDs. Don't try to convert them to strings or numeric — you will waste space and lose performance.
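As a quick sketch of both generation methods:

```sql
-- core function, available from PostgreSQL v13 on:
SELECT gen_random_uuid();

-- uuid-ossp extension (note the quotes around the hyphenated name):
CREATE EXTENSION "uuid-ossp";
SELECT uuid_generate_v4();
```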
There are four ways to define a column with automatically generated values:
The DEFAULT clause

You can use this method with sequences and UUIDs. Here are some examples:
```sql
CREATE TABLE has_integer_pkey (
   id bigint DEFAULT nextval('integer_id_seq') PRIMARY KEY,
   ...
);

CREATE TABLE has_uuid_pkey (
   id uuid DEFAULT gen_random_uuid() PRIMARY KEY,
   ...
);
```
PostgreSQL uses the DEFAULT value whenever the INSERT statement doesn't explicitly insert that column.
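For example, omitting the key column in the INSERT makes PostgreSQL fill in the default (the val column below is just a hypothetical payload column for illustration):

```sql
-- a table along the lines of has_uuid_pkey above:
CREATE TABLE has_uuid_pkey (
   id  uuid DEFAULT gen_random_uuid() PRIMARY KEY,
   val text
);

-- "id" gets a generated UUID:
INSERT INTO has_uuid_pkey (val) VALUES ('a');

-- an explicitly inserted value overrides the default:
INSERT INTO has_uuid_pkey (id, val)
   VALUES ('7d9f2dae-0000-4000-8000-000000000000', 'b');
```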
The serial and bigserial pseudo-types

This method is a shortcut for defining a sequence and setting a DEFAULT clause as above. With this method, you define a table as follows:
```sql
CREATE TABLE uses_serial (
   id bigserial PRIMARY KEY,
   ...
);
```
That is equivalent to the following:
```sql
CREATE TABLE uses_serial (
   id bigint PRIMARY KEY,
   ...
);

CREATE SEQUENCE uses_serial_id_seq
   OWNED BY uses_serial.id;

ALTER TABLE uses_serial
   ALTER id SET DEFAULT nextval('uses_serial_id_seq');
```
The “OWNED BY” clause adds a dependency between the column and the sequence, so that dropping the column automatically drops the sequence.
Using serial will create an integer column, while bigserial will create a bigint column.
Identity columns

This is another way to use a sequence, because PostgreSQL uses sequences “behind the scenes” to implement identity columns.
```sql
CREATE TABLE uses_identity (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   ...
);
```
There is also “GENERATED BY DEFAULT AS IDENTITY”, which is the same, except that you won't get an error message if you try to explicitly insert a value for the column (much like with a DEFAULT clause). See below for more!
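The difference between the two variants can be sketched like this (using a minimal version of the uses_identity table):

```sql
CREATE TABLE uses_identity (
   id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);

INSERT INTO uses_identity DEFAULT VALUES;    -- fine, "id" is generated

INSERT INTO uses_identity (id) VALUES (42);  -- ERROR with GENERATED ALWAYS

-- if you really must force a value, you have to say so explicitly:
INSERT INTO uses_identity (id) OVERRIDING SYSTEM VALUE VALUES (42);
```

With GENERATED BY DEFAULT AS IDENTITY, the second INSERT would silently succeed, which is exactly how manually inserted values can later collide with generated ones.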
You can specify sequence options for identity columns:
```sql
CREATE TABLE uses_identity (
   id bigint GENERATED ALWAYS AS IDENTITY
      (MINVALUE 0 START WITH 0 CACHE 20) PRIMARY KEY,
   ...
);
```
BEFORE INSERT triggers

This is similar to DEFAULT values, but it allows you to unconditionally override a value inserted by the user with a generated value. The big disadvantage of a trigger is the performance impact.
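A minimal sketch of such a trigger (the table, sequence, and function names are made up for illustration):

```sql
CREATE TABLE uses_trigger (
   id bigint PRIMARY KEY
);

CREATE SEQUENCE uses_trigger_id_seq OWNED BY uses_trigger.id;

CREATE FUNCTION set_id() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   -- unconditionally overwrite whatever value the user supplied
   NEW.id := nextval('uses_trigger_id_seq');
   RETURN NEW;
END;$$;

CREATE TRIGGER set_id BEFORE INSERT ON uses_trigger
   FOR EACH ROW EXECUTE FUNCTION set_id();
```

Every insert now pays the cost of a PL/pgSQL function call, which is why the other three methods are usually preferable.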
Should I use integer (serial) or bigint (bigserial) for my auto-generated primary key?

You should always use bigint.
True, an integer occupies four bytes, while a bigint needs eight. But:

- In a small table, where integer would suffice, the four wasted bytes won't matter much. Also, not every table that you designed to be small will remain small!
- A big table can exceed the maximum value of integer, which is 2147483647. Note that that could also happen if your table contains fewer rows than that: you might delete rows, and some sequence values can get “lost” by transactions that are rolled back.
- It is painful to change a primary key column from integer to bigint in a big table inside an active database without causing excessive down time, so you should save yourself that pain.

With bigint, you are certain never to exceed the maximum of 9223372036854775807: even if you insert 10000 rows per second without any pause, you have almost 30 million years before you reach the limit.
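The arithmetic behind that claim can be checked directly in SQL:

```sql
-- seconds until bigint exhaustion at 10000 inserts per second,
-- converted to years: roughly 29 million
SELECT 9223372036854775807 / 10000 / (60 * 60 * 24 * 365.25) AS years;
```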
Should I use bigserial or an identity column for my auto-generated primary key?

You should use an identity column, unless you have to support old PostgreSQL versions.
Identity columns were introduced in PostgreSQL v10, and they have two advantages over bigserial:

- Identity columns conform to the SQL standard, while bigserial is proprietary PostgreSQL syntax. This will make your code more portable.
- With GENERATED ALWAYS AS IDENTITY, you will get an error message if you try to override the generated value by explicitly inserting a number. This avoids the common problem that manually entered values will conflict with generated values later on, causing surprising application errors.

So unless you have to support PostgreSQL v9.6 or below, there is no reason to use bigserial.
Should I use bigint or uuid for an auto-generated primary key?

My advice is to use a sequence unless you use database sharding or have some other reason to generate primary keys in a “decentralized” fashion (outside a single database).
The advantages of bigint are clear:

- bigint uses only eight bytes, while uuid uses 16.

One disadvantage of using a sequence is that it is a single object in a single database. So if you use sharding, where you distribute your data across several databases, you cannot use a sequence. In such a case, UUIDs are an obvious choice. (You could use sequences defined with an INCREMENT greater than 1 and different START values, but that might lead to problems when you add additional shards.)
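A sketch of that interleaving technique for two shards:

```sql
-- on shard 1: generates 1, 3, 5, ...
CREATE SEQUENCE shard_id_seq INCREMENT 2 START 1;

-- on shard 2: generates 2, 4, 6, ...
CREATE SEQUENCE shard_id_seq INCREMENT 2 START 2;
```

With an increment of 2, there is no room for a third shard, which is exactly the problem mentioned above.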
Of course, if your primary key is not auto-generated by the database, but created in an application distributed across several application servers, you will also prefer UUIDs.
There are people that argue that UUIDs are better, because they spread the writes across different pages of the primary key index. That is supposed to reduce contention and lead to a more balanced or less fragmented index. The first is true, but that may actually be a disadvantage, because it requires the whole index to be cached for good performance. The second is definitely wrong, since B-tree indexes are always balanced. Also, a change in PostgreSQL v11 made sure that monotonically increasing values will fill an index more efficiently than random inserts ever could (but subsequent deletes will of course cause fragmentation). In short, any such advantages are either marginal or non-existent, and they are more than balanced by the fact that uuid uses twice as much storage, which will make the index bigger, causing more writes and occupying more of your cache.
People have argued (see for example the comments below) that sequence-generated primary keys can leak information, because they allow you to deduce the approximate order in which rows were inserted into a table. That is true, even though I personally find it hard to imagine a case where this is a real security problem. However, if that worries you, use UUIDs and stop worrying!
A benchmark: bigint versus uuid

My co-worker Kaarel ran a small performance test a while ago and found that uuid was slower than bigint when it came to bigger joins.
I decided to run a small insert-only benchmark with these two tables:
```sql
CREATE UNLOGGED TABLE test_bigint (
   id bigint GENERATED ALWAYS AS IDENTITY (CACHE 200) PRIMARY KEY
);

CREATE UNLOGGED TABLE test_uuid (
   id uuid DEFAULT gen_random_uuid() PRIMARY KEY
);
```
I performed the benchmark on my laptop (SSD, 8 cores) with a pgbench custom script that had 6 concurrent clients repeatedly run transactions of 1000 prepared INSERT statements for five minutes:
```sql
INSERT INTO test_bigint /* or test_uuid */ DEFAULT VALUES;
```
|                      | bigint     | uuid       |
|----------------------|------------|------------|
| inserts per second   | 107090     | 74947      |
| index growth per row | 30.5 bytes | 41.7 bytes |
Using bigint clearly wins, but the difference is not spectacular.
Numbers generated by a sequence and UUIDs are both useful as auto-generated primary keys.
Use identity columns unless you need to generate primary keys outside a single database, and make sure all your primary key columns are of type bigint.
If you are interested in reading more about primary keys, see also Hans' post on Primary Keys vs. Unique Constraints.
In other news, in PostgreSQL 15 default permissions for the public schema were modified, which can cause errors.
Find out more about this important change here.
Comments

Why not include some more “well behaved” UUID generation methods rather than random? The Sequential UUIDs extension is quite useful and helps with index bloat / WAL generation: https://pgxn.org/dist/sequential_uuids
> Now, sometimes a table has a natural primary key, for example the social security number of a country’s citizens.
1) Children are citizens at birth, but do not have SS #s at birth.
2) Back in the 1970s, a few duplicate SSNs were accidentally issued.
3) A person can request a new SSN under a few circumstances.
So social security numbers fail as natural keys on ALL requirements.
I've never really seen anyone argue that natural keys shouldn't be used when available, but rather that natural keys are nowhere near as prevalent as their proponents claim.
I won't argue that good natural primary keys are common, and I grant your points (except that I live in a country with a functional bureaucracy that doesn't issue duplicate social security numbers).
I would like to add one possible advantage of uuid, which might be a special case of the already stated advantage for distributed applications. Say you have a client mobile or web app and an API server. The app lets the user create a content item and then POSTs it to the server as JSON. With uuid as the primary key, it can be generated on the client side (in the app), so the client logic is simpler: no waiting for the item to come back in the response and replacing the client-generated item with the one whose primary key (serial id) was filled in by the server. The performance benefit might add up if you consider the offline scenario for a mobile app, where items created offline have to be posted to the API server all at once when the app is back online.
Good article, but the comics look very stupid.
My expertise is databases.
I am not trying to make a career as an artist.
Laurenz,
Thanks for the great article. I appreciated the clear advice, comparisons and benchmarks.
Is there a recommended technique available in Postgres for when you want to use bigint as the primary key, but still have a “public” identifier that is not guessable (something that looks like a UUID or hash)?

I have created a module, goldflake, which can be used to generate random bigint values in Postgres as well as in Golang, if you need to generate the same in the application. It is based on Twitter snowflake.

https://github.com/AmreeshTyagi/goldflake
https://github.com/AmreeshTyagi/goldflake-pg
And a goldflake id is not guessable: it is pure random, with no conflict for 174 years even if you generate them on 8k different machines at the same time.
Feel free to check code & raise PR if you find any issue. Thanks.
Look at squids.
Good article Laurenz, convinced me to go with bigint for my database.
I have not yet understood the advantage of putting UUIDs in all tables. However, this seems to be common practice by now.

A single table with a UUID may not be the problem yet. Maybe you should compare several tables with UUIDs and several JOINs against a solution with a “normal” ID (bigint), so that one understands what UUID generation is doing to the database and where the performance goes...
I have a customer using UUIDs running a batch job that deletes thousands of rows, does some calculations, and inserts thousands more rows. Usually the deletes are sub-second, but occasionally they balloon out over 20 seconds -- while a count(*) on the same table forced to use a full table scan takes around 16 seconds.
I wonder if there are any implications for the statistics collected by ANALYZE on a uuid key vs. a (big)int?
The where clause for the deletion contains a date, status, and an array of UUIDs.
UUIDs are a bit slower, but that won't explain what you observe. I'd turn on log_lock_waits and see if locks are involved.

I agree!
Security reviewers who approve proposals based on “security by obscurity” are shallow thinkers. I once, under pressure, had to obfuscate my Java classes (of an encryption program that uses an AES-like algorithm internally) just to satisfy the ego of a security reviewer who argued that the byte code can easily be decompiled to get the source code. Just 3 months down the line, a smarter reviewer got to my custom classloader class and found out what my obfuscation logic was. Busted! In fact, the real strength of my program lay NOT in the AES-like algorithm, BUT in the length and randomness of the shared secret. Hence, the fear that anyone could decompile the bytecode and get to the source code was baseless.