CYBERTEC PostgreSQL Logo

ER diagrams with SQL and Mermaid

02.2023 / Category: / Tags:

What does ER stand for?

An Entity Relationship (ER) diagram is one of the most important tools for database design. It helps you visualize the relationships between different entities and how they interact with each other. Many GUI tools have their own tools to build ER diagrams, e.g. pgAdmin IV, DBeaver, etc.
In this blog post, we'll explore how to create an ER diagram for a PostgreSQL database using plain SQL and Mermaid. Mermaid is a JavaScript-based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.

mermaid ER diagram
Mermaid ER diagram example

Setup

Today I will show you how to generate an ER diagram for any PostgreSQL database using only plain SQL and Mermaid. I propose to use the Pagila example database as a target. You may either install it locally, or run a Docker Compos script.

The task is to create an SQL script which will output valid Mermaid syntax. Later we can use either Mermaid Live Editor or a local installation to view and save the ER diagram in one of the preferred formats, e.g. .svg, .png, etc.

Script

If you examine Mermaid ER syntax, you will find that:

  • it's possible to define tables separately from references;
  • that you can specify table columns with types;
  • that you can add a name for the relationships.

Mermaid ER header

I've split the SQL script into 3 parts. The first one is very basic - it outputs the special keyword erDiagram to indicate how Mermaid should visualize the diagram.

Collecting tables and columns

Here's what the above snippet does:

  1. It selects the table name (c.relname) and the associated column names (a.attname) with data types (t.oid) from the pg_class, pg_namespace, pg_attribute, and pg_type tables.
  2. It uses left join for columns and types because PostgreSQL allows the creation of tables without attributes, e.g. create table foo(); is a valid DDL statement.
  3. The join condition for attributes explicitly specifies it wants only user-defined columns (a.attnum > 0) and columns that are still valid (not a.attisdropped).
  4. It filters out any entities that are not regular tables (partitioned are OK), and excludes any tables in the PostgreSQL system or information_schema schemas. The partitions themselves are not interesting for us, because they are only implementation details.
  5. To produce a correct type name it uses special format_type() system function. Pay attention, the snippet uses special string constants with C-style escapes. That allows you to easily format tab indents as well as new lines.
  6. To aggregate column names with data types, a special string_agg() function is used.
  7. It then formats the output as a new-line-delimited string with the column definitions in braces.

Collecting relationships

This snippet is much easier:

  1. It gets all the foreign key constraints in the database (c.contype = 'f').
  2. It filters out any foreign keys that are on a partition table or reference a partition table.
  3. It formats the output to show the table name, the referenced table name, and the constraint name. The cardinality is one-to-many by default.

Finally

Here is the final script to copy-paste:

And here is how the final result looks for the Pagila database (click to enlarge):

 

Thanks for reading, I hope you enjoyed it! If you liked this one, check out my blog post about usql, universal psql!

10 responses to “ER diagrams with SQL and Mermaid”

  1. I love it! Thanks. Just tried it on a schema of mine containing names
    with spaces (I like problems), and the mermaid parser failed. It should
    render names with double quotes, something that format() should be
    able to handle with a %I formatter, no?

  2. I want to make updates to your SQL sample:
    from
    ____pg_class c
    ____join pg_namespace n on n.oid = c.relnamespace
    ____left join pg_attribute a ON c.oid = a.attrelid

    I would write like this:
    FROM pg_class c
    __JOIN pg_namespace n
    ____ON n.oid = c.relnamespace
    __LEFT JOIN pg_attribute a
    ____ON a.attrelid = c.oid

    please note:
    - reserved keywords are in upper case
    - visually easy to find joined tables and conditions
    - in join condition expression for joined table is always on left

    • Thanks. UPPERCASE reserved SQL words are kind of holy war nowadays. 🙂 If you check my previous posts, I always use uppercase there. This time I decided to use lowercase.

      Are these rules specified somewhere or are they your personal preferences?

    • The choice of using all uppercase or lowercase SQL reserved words in
      coding is a matter of personal preference, which may have originated
      from a time when code editors did not have color-coding features. While I
      used to favor all uppercase SQL reserved words, I am now gravitating
      towards using all lowercase. Regardless of the choice, it's essential to
      maintain consistency throughout.

  3. The code samples don't show backslashes that are required to represent tab and newlines characters. Copying and pasting and SQL code as-is doesn't work. For example:

    should be

  4. It seems like the web page strips single backslashes from string. My comment from a few months ago had as its 2nd code sample: format(E' followed by backslash-t -- but the backslash isn't shown in my comment. Probably the same is true with the code sample.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram