Creating and refreshing materialized views in PostgreSQL

06.2021

A “materialized view” is a database object which stores the result of a precalculated database query and makes it easy to refresh this result as needed. Materialized views are an integral feature of pretty much all advanced database systems. Naturally, PostgreSQL also provides support for materialized views, and offers the end-user a powerful tool to handle more time-consuming requests.

The main questions are now: What are the pitfalls, and how can you make use of materialized views in the first place? Let’s dive in and find out.

Creating a materialized view

Before we can actually take a look at materialized views and figure out how they work, we have to import some sample data which we can use as the basis for our calculations:
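A minimal sketch of such sample data, assuming a hypothetical table named t_demo with a group column (grp) and a numeric payload (data); the names and the use of random() are illustrative choices:

```sql
-- Hypothetical sample table: two groups of 5 million rows each
CREATE TABLE t_demo (grp int, data numeric);

INSERT INTO t_demo
SELECT 1, random() FROM generate_series(1, 5000000);

INSERT INTO t_demo
SELECT 2, random() FROM generate_series(1, 5000000);
```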

We have created 10 million rows organized in 2 groups.

To create a materialized view in PostgreSQL, we can make use of the following syntax specification:
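For orientation, the synopsis in the PostgreSQL documentation for recent releases looks roughly like this (square brackets mark optional parts, so this is a template rather than a runnable statement):

```sql
CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] table_name
    [ (column_name [, ...] ) ]
    [ USING method ]
    [ WITH ( storage_parameter [= value] [, ... ] ) ]
    [ TABLESPACE tablespace_name ]
    AS query
    [ WITH [ NO ] DATA ]
```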

Basically, a materialized view has a name, some parameters, and is based on a query.

Here is an example of a materialized view:
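A sketch of what such a view could look like, assuming a table t_demo (grp int, data numeric) exists; the view name mat_view is a hypothetical choice:

```sql
-- Assumes t_demo(grp int, data numeric) exists.
-- Precalculates one aggregated row per group.
CREATE MATERIALIZED VIEW mat_view AS
    SELECT grp, avg(data), count(*)
    FROM t_demo
    GROUP BY grp;
```

Since the aggregates carry no aliases, PostgreSQL names the resulting columns grp, avg and count.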

What is really important to note here is the size of the materialized view compared to the underlying table:
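One way to compare the two sizes is with the built-in functions pg_relation_size and pg_size_pretty. The object names t_demo and mat_view are assumptions, and the output depends on the data:

```sql
-- Assumes the table t_demo and the materialized view mat_view exist
SELECT pg_size_pretty(pg_relation_size('t_demo'))   AS table_size,
       pg_size_pretty(pg_relation_size('mat_view')) AS mat_view_size;
```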

423 MB vs. 16 KB is a real difference. If you want to query the materialized view, you can do it just like you would query a normal PostgreSQL table:
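As a sketch, assuming a materialized view named mat_view with a grp column:

```sql
-- Read the stored result like any other relation
SELECT * FROM mat_view ORDER BY grp;

-- Inspect the plan; the scan looks like a scan on a regular table
EXPLAIN SELECT * FROM mat_view;
```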

Looking at the execution plan (EXPLAIN) shows that there is no difference between a normal table and a materialized view. We can simply query the data.
At this point, the only suboptimal thing is the query optimizer’s row estimate (1130 rows), which is far off for a view this small. However, that can easily be fixed.

The process is the same as for any other table:
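A short sketch, again assuming a materialized view named mat_view:

```sql
-- Collect optimizer statistics for the materialized view
ANALYZE mat_view;

-- The row estimate in the plan should now reflect the real row count
EXPLAIN SELECT * FROM mat_view;
```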

ANALYZE has recreated the optimizer statistics and fixed the estimate nicely. If you want to know more about ANALYZE in general, we recommend our posts about PostgreSQL optimizer statistics and autovacuum. Autovacuum is also able to recreate optimizer statistics.

CREATE MATERIALIZED VIEW … USING …

However, there is more. PostgreSQL offers some additional important features which are relevant in this field. One of those features is the USING clause. What is the purpose of this one? At the moment, PostgreSQL supports only one storage format (heap). However, in the future, we hope to see more storage formats such as zheap or zedstore.

The idea of USING is to have the ability to select the best storage format for the given requirement. At the moment, heap is the default format, and that’s perfectly fine.
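A sketch of the clause in use, assuming PostgreSQL 12 or later (where table access methods were introduced), a table t_demo (grp int, data numeric), and the hypothetical view name mat_view_heap:

```sql
-- List the table access methods available in this installation
SELECT amname FROM pg_am WHERE amtype = 't';

-- Spell out the (default) heap access method explicitly
CREATE MATERIALIZED VIEW mat_view_heap
    USING heap
AS  SELECT grp, avg(data), count(*)
    FROM t_demo
    GROUP BY grp;
```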

CREATE MATERIALIZED VIEW … WITH …

The WITH-clause is the same as in CREATE TABLE. It allows you to define storage parameters such as autovacuum behavior, FILLFACTOR and so on.

Here’s an example:
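A minimal sketch, assuming a table t_demo (grp int, data numeric); the view name mat_view_w and the parameter values are illustrative, not recommendations:

```sql
-- Storage parameters are given exactly as in CREATE TABLE
CREATE MATERIALIZED VIEW mat_view_w
    WITH (autovacuum_enabled = false, fillfactor = 90)
AS  SELECT grp, avg(data), count(*)
    FROM t_demo
    GROUP BY grp;
```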

In general, a materialized view is stored just like a table, so simply follow the same recommendations as you would follow when creating a standard table.

CREATE MATERIALIZED VIEW … TABLESPACE …

The same is true for the TABLESPACE keyword. In the default setup, the materialized view will be stored in the default tablespace which is $PGDATA:
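A sketch, assuming a table t_demo (grp int, data numeric) and the hypothetical view name mat_view_ts; pg_default is the built-in default tablespace, and any tablespace created with CREATE TABLESPACE could be named instead:

```sql
-- Where does the data directory ($PGDATA) live?
SHOW data_directory;

-- Name the tablespace explicitly when creating the view
CREATE MATERIALIZED VIEW mat_view_ts
    TABLESPACE pg_default
AS  SELECT grp, avg(data), count(*)
    FROM t_demo
    GROUP BY grp;
```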

SHOW is an easy way to figure out where that is. If you want to find out more about tablespaces, consider checking out our blog post about tablespaces in PostgreSQL.

CREATE MATERIALIZED VIEW … DATA vs NO DATA …

By default, a materialized view is created in a way that it contains the result. However, if the object is created using the NO DATA option, it is empty. Only the definition is created.

One has to be aware of the fact that in this case, the view cannot be queried until it has been refreshed to contain data. We have not seen the NO DATA option being used too frequently in database support. It’s still worth mentioning that this option does exist.
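The behavior can be sketched as follows, assuming a table t_demo (grp int, data numeric) and the hypothetical view name mat_view_empty:

```sql
-- Only the definition is stored; the query is not executed
CREATE MATERIALIZED VIEW mat_view_empty AS
    SELECT grp, avg(data), count(*)
    FROM t_demo
    GROUP BY grp
WITH NO DATA;

-- Querying at this point raises an error, because the view
-- has not been populated yet:
-- SELECT * FROM mat_view_empty;

-- Populate it; afterwards it can be queried normally
REFRESH MATERIALIZED VIEW mat_view_empty;
SELECT * FROM mat_view_empty;
```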

Modifying a materialized view in PostgreSQL

Sometimes a materialized view has to be modified later on. A classic example would be to rename a column of the view without recreating it.

Let’s take a look at the definition of the view:
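In psql, \d+ followed by the view name prints the columns and the stored query. A plain SQL alternative, assuming a materialized view named mat_view, is the pg_matviews system view:

```sql
-- Show owner, population state and the underlying query
SELECT schemaname, matviewname, ispopulated, definition
FROM pg_matviews
WHERE matviewname = 'mat_view';
```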

Modifying a materialized view is done using the ALTER MATERIALIZED VIEW command which is a powerful tool to facilitate all kinds of changes.

The syntax is pretty similar to what ALTER TABLE can do for you:

What is noteworthy here is that you can also do things such as CLUSTER ON (= mark the index by which a later CLUSTER physically orders the data), SET STATISTICS (= adjust the size of the histogram for the PostgreSQL optimizer) and so on. The process is the same as for a normal table.

Now, let’s try it out and rename a column:
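A sketch of such a rename, assuming a materialized view mat_view whose aggregate column carries the default name avg; the new name average is an arbitrary choice:

```sql
-- Rename a column in place; the stored data is not recalculated
ALTER MATERIALIZED VIEW mat_view
    RENAME COLUMN avg TO average;
```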

Before we move on to refreshing, I want to focus your attention on indexing:

It is a good idea to index a materialized view to ensure fast querying. Again, the same rules as for a table exist. Simply index entries which are diverse and offer good selectivity when the materialized view is read.
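For example, assuming a materialized view mat_view that is usually filtered by its grp column (the index name is hypothetical):

```sql
-- Indexes on materialized views are created like indexes on tables
CREATE INDEX idx_mat_view_grp ON mat_view (grp);
```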

Refresh materialized view

A materialized view contains a snapshot of the query result. It is not refreshed automatically: the content only changes when the user tells PostgreSQL to refresh it. In Oracle, materialized view support is a bit more sophisticated. Hopefully, PostgreSQL will soon catch up in this area a bit.

However, at the moment, a materialized view in PostgreSQL has to be refreshed explicitly.

Refresh manually using the REFRESH MATERIALIZED VIEW command:

Running the refresh is simple:
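A minimal sketch, assuming a materialized view named mat_view; the documented general form is shown as a comment:

```sql
-- General form:
--   REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] name [ WITH [ NO ] DATA ]

-- Re-run the stored query and replace the content of the view
REFRESH MATERIALIZED VIEW mat_view;
```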

PostgreSQL knows the query which has to be re-executed to refresh the content of the materialized view. There is just one downside: PostgreSQL has to lock the object for the time of the refresh, which means that nobody can access it while it is refreshed.

Refreshing materialized views concurrently

Sometimes it is simply not acceptable to tolerate a locked materialized view. In that case, concurrent updates are needed. However, there is a problem:

The materialized view needs a unique column to support concurrent refreshing.

That means you have to determine a “primary key” and create a unique index:
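A sketch, assuming a materialized view mat_view in which grp is unique per row (true for a view grouped by grp); the index name is hypothetical:

```sql
-- The unique index gives every row an identity
CREATE UNIQUE INDEX idx_mat_view_grp_uq ON mat_view (grp);

-- Readers can keep querying the view while it is refreshed
REFRESH MATERIALIZED VIEW CONCURRENTLY mat_view;
```

The unique index has to be built on plain columns and cover all rows, so an expression index or a partial index does not qualify.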

You need to be a bit careful and make sure that table bloat does not take over. Technically, a concurrent refresh is a DELETE / INSERT, which is prone to cause table bloat. Make sure a proper VACUUM policy is in place.
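One possible way to keep an eye on this, assuming a materialized view named mat_view; the scale factor is purely illustrative and should be tuned to the workload:

```sql
-- Dead rows and the last (auto)vacuum runs for the materialized view
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'mat_view';

-- Let autovacuum kick in earlier for this object
ALTER MATERIALIZED VIEW mat_view
    SET (autovacuum_vacuum_scale_factor = 0.05);

-- Or clean up by hand after a refresh
VACUUM mat_view;
```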

Finally …

Materialized views are an important feature in most databases, including PostgreSQL. They can help to speed up large calculations - or at least to cache them.

If you want to make sure that your materialized views are up to date and if you want to read more about PostgreSQL right now, check out our blog about pg_timetable which shows you how to schedule jobs in PostgreSQL. Why is pg_timetable so useful? Our scheduler makes sure that identical jobs cannot overlap: a job is simply not started again while the same job is still running. For long-running jobs, this is super important, especially if you want to use materialized views.
 



3 responses to “Creating and refreshing materialized views in PostgreSQL”

  1. Hi,

    Excellent post, congratulations on your work.

    Is there any change to be implemented in the future for a command like

    REFRESH MATERIALIZED VIEW MyView as (SELECT * FROM MyView WHERE condition)

    I would like to update only a few records.

    Thank you so much

    Claudio Oliveira.
    MSI Soluções.

  2. Hi,
    what is the mechanism used for MV refresh concurrently, how postgres knows which rows should be deleted or updated in the materialized view?

    thank you in advance for your details

    Fabrice

    • That's exactly what the unique index is for. Without it, rows wouldn't have an identity, so you couldn't tell which row needs to be updated or deleted.
