CYBERTEC PostgreSQL Logo

What is a schema in PostgreSQL?

06.2023 / Category: / Tags: |

One way to organize data in PostgreSQL is to make use of schemas. What is a schema in PostgreSQL? And more importantly: What is the purpose of a schema and how can schemas be used to make your life easier? Let's dive in and find out.

The purpose of a schema

Before you figure out how to use schemas, you need to know what the purpose of a schema is in the first place. To understand that, first take a look at how PostgreSQL is structured:

  • Instance
  • Database
  • Schema
  • Table
  • Row

An “instance” is basically what you start when you deploy PostgreSQL. The next layer is a database. In reality this is what you connect to: in PostgreSQL a connection is always bound to a database inside an instance, which happens early on, right after user authentication.
What is important is the next layer down, between databases and tables: Schemas.

Schemas group tables

Basically, schemas are a way to group tables together. Let's assume there's a fairly large data structure: Having 500 tables in one place is certainly harder to manage and understand than to have 10 buckets containing 50 tables each.
It's simply like organizing pictures: You wouldn't put all of them into the same folder, but rather group them by year, location, etc. The same logic can be applied to tables.

Schemas and PostgreSQL

Now we can focus on how this concept can be applied to PostgreSQL. The first thing we have to look at is the public schema.

Using the “public” schema

The beauty of PostgreSQL is that it doesn't matter much if you know nothing at all about schemas. The reason is the existence of the public schema, which is there by default. How can we find out which schemas there are in PostgreSQL? psql provides the dn command to display this information:

In a default scenario, a table will end up in the public schema. Here's an example:

This is a basic table. The table can be found in the desired schema. d will reveal the truth:

In this case, both the schema and the sequence are found in the default schema as expected. As you can see, you don't need any knowledge about schemas to proceed. If you happen to use the public schema, we also recommend checking out the new security policy introduced in recent versions of PostgreSQL.

Basically, there are two ways to query the table.

The first method is:

Another method to query the table

However, you can also explicitly use the schema name as a prefix to the table name, which constitutes a fully qualified name. I've seen a couple of ORM's do exactly that-- in order to reduce the risk of accessing the wrong table due to misconfiguration. We'll also see it later in this post:

After this brief introduction to the public schema, we can move forward and create our first new schema.

Creating schemas

How can we create a schema in PostgreSQL? The CREATE SCHEMA command is the answer:

The syntax is quite easy and allows us to define a name as well as the schema owner. Otherwise, everything is really straightforward:

Once the schema has been created, we can create a table inside the schema:

By using a schema name as a prefix to the table name, you can define the schema you want to use. Mind that the schema itself does NOT impact the way data is stored. The data files associated with our table are still in the same PostgreSQL data directory. Therefore schemas do not impact performance and are not about storage optimization. The purpose of a schema is simply to group things together and to help organize a solid security policy by assigning permissions to schemas:

There are two things worth pointing out here: First of all, it is possible to have two tables with the same name in two different schemas. There is a public.t_product and a warehouse.t_product table. This is perfectly possible and actually quite common. The second important aspect is that we don’t have to prefix the table in the public schema. The reason is the following parameter:

Everything that is to be found in the search_path can be accessed directly without explicitly providing the name of the schema. We can easily try this out:

Note that the parameter is only changed in your session - it does not break your production system if you are running this in your interactive session.

From now on, the second table called t_product will be displayed, because PostgreSQL knows in which schema to look:

Now that the search_path has been changed, we have to prefix the public schema, as it is not in the path any more:

After this basic introduction to schemas, let's figure out what it means to use schemas in combination with views:

Views and schemas

A view is a good way to allow developers easier access to data. The important point is that schemas are not normally a barrier (For the nitty-gritty details about views and permissions, see here). A query can freely join tables from different schemas, and the view using the query can expose the data in a schema of your choice (assuming you have the permissions to do that):

However, there are implications for people who want to migrate from Oracle to PostgreSQL.
Hint: Check out the CYBERTEC Migrator

Renaming schemas in PostgreSQL

In PostgreSQL, everything that can be created can also be renamed. The same is true for schemas:

Renaming a schema causes repercussions which are outlined in the next listing. Mind what happened to the view:

The view does not reference tables directly - it references internal object IDs, which is really important here, because renaming the schema only means attaching a different text label to an internal ID. The actual view definition does not depend on names, so renaming objects does render a view invalid. By contrast, in databases such as Oracle, renaming objects can leave a view in an invalid state.

Dropping schemas

Dropping schemas in PostgreSQL follows the same logic:

Schema dependencies

We don't want to orphan objects, so we can't just drop them without collateral damage. PostgreSQL will tell us exactly what would happen, but not really drop the table - in order to avoid breaking the dependencies on the schema:

In case we really want to drop the schema and face all the consequences associated with it, the CASCADE option can be added:

As you can see, all dependent objects have been dropped and we're left with a clean, consistent database which doesn't contain any stale or invalid objects. For more info, see Laurenz Albe's blog on view dependencies.

Finally …

Even if you're not aware of it, schemas are always part of the game; they offer a good way to organize data more clearly, in a way that is easier to understand. See this blog about ALTER DEFAULT PRIVILEGES for more info on how to allow other users access to objects in a particular schema.

If you are interested in other PostgreSQL-related topics, we recommend you check out Ants Aasma's blog post about “Faceting Large Result Sets in PostgreSQL”.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram