CYBERTEC PostgreSQL Logo

Tracking view dependencies in PostgreSQL

09.2019 / Category: / Tags: | |
Edgar Allan Poe on view dependencies
© Laurenz Albe 2018

 

We all know that in PostgreSQL we cannot drop an object if there are view dependencies on it:

Some people like it because it keeps the database consistent; some people hate it because it makes schema modifications more difficult. But that's the way it is.

In this article I want to explore the mechanics behind view dependencies and show you how to track what views depend on a certain PostgreSQL object.

Why would I need that?

Imagine you want to modify a table, e.g. change a column's data type from integer to bigint because you realize you will need to store bigger numbers.
However, you cannot do that if there are views that use the column. You first have to drop those views, then change the column and then run all the CREATE VIEW statements to create the views again.

As the example shows, editing tables can be quite a challenge if there is a deep hierarchy of views, because you have to create the views in the correct order. You cannot create a view unless all the objects it requires are present.

Best practices with views

Before I show you how to untangle the mess, I'd like to tell you what mistakes to avoid when you are using views in your database design (excuse my switching to teacher mode; I guess holding courses has that effect on you).

Views are good for two things:

  • They allow you to have a recurring SQL query or expression in one place for easy reuse.
  • They can be used as an interface to abstract from the actual table definitions, so that you can reorganize the tables without having to modify the interface.

Neither of these applications require  you to “stack” views, that is, define views over views.

There are two patterns of using views that tend to be problematic, and they both stem from the mistaken idea that a view works exactly like a table, just because it looks like one:

  • Defining many layers of views so that your final queries look deceptively simple.
    However, when you try to unravel the views, for example by looking at the execution plan, the query turns out to be so complicated that it is almost impossible to understand what is really going on and how to improve it.
  • Defining a denormalized “world view” which is just a join of all your database tables and using that for all of your queries.
    People who do that tend to be surprised when certain WHERE conditions work well, but others take impossibly long.

Never forget that a view is just a “crystallized” SQL statement and gets replaced by its definition when the query is executed.

How are views stored in PostgreSQL?

A view in PostgreSQL is not that different from a table: it is a “relation”, that is “something with columns”.
All such objects are stored in the catalog table pg_class.

As the documentation states, a view is almost the same as a table, with a few exceptions:

  • it has no data file (because it holds no data)
  • its relkind is “v” rather than “r
  • it has an ON SELECT rule called “_RETURN

This “query rewrite rule” contains the definition of the view and is stored in the ev_action column of the pg_rewrite catalog table.

Note that the view definition is not stored as a string, but in the form of a “query parse tree”. Views are parsed when they are created, which has several consequences:

  • Object names are resolved during CREATE VIEW, so the current setting of search_path applies.
  • Objects are referred to by their internal immutable “object ID” rather than by their name. Consequently, it is no problem to rename an object or column used in a view definition.
  • PostgreSQL knows exactly which objects are used in the view definition, so it can add dependencies on them.

Note that the way PostgreSQL handles views quite different from the way PostgreSQL handles functions: function bodies are stored as strings and not parsed when they are created. Consequently, PostgreSQL cannot know on which objects a given function depends.

How are the dependencies stored?

All dependencies (except those on “shared objects”) are stored in the catalog table pg_depend:

  • classid stores the object ID of the catalog table containing the dependent object
  • objid stores the ID of the dependent object
  • objsubid stores the column number if the dependency is for a column
  • refclassid, refobjid and refobjsubid are like the three columns above, but describe the object referenced by the dependency
  • deptype describes the kind of dependency

It is important to notice that there is no direct dependency of a view on the objects it uses: the dependent object is actually the view's rewrite rule. That adds another layer of indirection.

A simple example

In the following, I'll use this schema to test my queries:

I have thrown in a function, just to show that a view can depend on objects other than tables.

In the following I will concentrate on tables and columns, but the queries will work for functions too, if you replace the catalog pg_class that contains tables with the catalog pg_proc that contains functions.

Finding direct view dependencies on a table

To find out which views directly depend on table t1, you would query like this:

To find views with direct dependencies on the function f, simply replace “d.refclassid = 'pg_class'::regclass” with “d.refclassid = 'pg_proc'::regclass” and “refobjid = 't1'::regclass” with “refobjid = 'f'::regproc”.

Actually, the views will usually not depend on the table itself, but on the columns of the table (the exception is if a so-called “whole-row reference” is used in the view). That is why the view v2 shows up twice in the above list. You can remove those duplicates using DISTINCT.

Finding direct dependencies on a table column

We can modify the above query slightly to find those views that depend on a certain table column, which can be useful if you are planning to drop a column (adding a column to the base table is never a problem).

The following query finds the views that depend on the column val of table t1:

Recursively finding all dependent views

Now if you haven't heeded the advice I gave above and you went ahead and defined a complicated hierarchy of views, it doesn't stop with direct dependencies.
Rather, you need to recursively go through the whole hierarchy.

For example, let's assume that you want to DROP and re-create the table t1 from our example and you need the CREATE VIEW statements to re-create the views once you are done (dropping them won't be a problem if you use DROP TABLE t1 CASCADE).

Then you need to use the above queries in a recursive “common table expression” (CTE). The CTE is for tracking recursive view dependencies and can be reused for all such requirements; the only difference will be in the main query.

We need the GROUP BY because a view may depend on an object in more than one ways: in our example, v2 depends on t1 twice: once directly, and once indirectly via v1.

Have questions? Need PostgreSQL support? You can reach us here.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

12 responses to “Tracking view dependencies in PostgreSQL”

  1. I had such a problem where we had materialized and non-materialized views on 3-4 levels. To change them easily I created a maintenance plpgsql function. That function takes care of dropping, creating and materialization of views. Another advantage was that I could put comments at the line level (I only make changes to the views in this function).

  2. To use the parameterized live views instead of the functions, auxiliary tables can be used that can store the arguments of the functions at the user level. In the case of materialized views parameterized on the user the problem is complicated, these being common to the users..

  3. Thanks for the great article.
    In 11th postgres you can use view_column_usage view which encapsulates mentioned queries


    WITH RECURSIVE dependent_views AS (
    SELECT 1 AS lvl
    , vcu.table_schema AS table_schema
    , vcu.table_name AS table_name
    , vcu.column_name AS column_name
    , vcu.view_schema AS view_schema
    , vcu.view_name AS view_name
    FROM view_column_usage vcu
    WHERE vcu.table_schema = p_schema_name
    AND vcu.table_name = p_table_name
    AND ( vcu.column_name = p_column_name
    OR
    p_column_name IS NULL
    )
    UNION ALL
    SELECT dep.lvl 1 AS lvl
    , vcu2.table_schema AS table_schema
    , vcu2.table_name AS table_name
    , vcu2.column_name AS column_name
    , vcu2.view_schema AS view_schema
    , vcu2.view_name AS view_name
    FROM dependent_views dep
    INNER JOIN
    view_column_usage vcu2
    ON vcu2.table_schema = dep.view_schema
    AND vcu2.table_name = dep.view_name
    AND vcu2.column_name = dep.column_name
    )
    SELECT dv.lvl AS lvl
    , dv.table_schema AS table_schema
    , dv.table_name AS table_name
    , dv.column_name AS column_name
    , dv.view_schema AS view_schema
    , dv.view_name AS view_name
    FROM dependent_views dv
    ;

  4. Is there any way to find dependencies on COLUMN of a view to column of table/view level?
    All the dependencies above are for VIEW depends on columns of table/view.

    • No, PostgreSQL knows on what tables and table columns the view as a whole depends, but it does not track which column of the view depends on what.

  5. Is there a reason you did not use DISTINCT in your WITH clause? We have implemented what you have above, and it worked fine for a while. Recently we ran into an issue with a table that has a lot of columns and a lot of views that build on each other. It seems the code above may start to have issues when there are more than 4 levels. I know that's not ideal, but that's the situation we have. Essentially, this is how I how I have modified the code from the article to resolve our issue:


    WITH RECURSIVE views AS (
    -- get the directly depending views
    SELECT DISTINCT v.oid::regclass AS view,
    1 AS level
    FROM pg_depend AS d
    JOIN pg_rewrite AS r
    ON r.oid = d.objid
    JOIN pg_class AS v
    ON v.oid = r.ev_class
    WHERE v.relkind = 'v'
    AND d.classid = 'pg_rewrite'::regclass
    AND d.refclassid = 'pg_class'::regclass
    AND d.deptype = 'n'
    AND d.refobjid = 't1'::regclass
    UNION ALL
    -- add the views that depend on these
    SELECT DISTINCT v.oid::regclass,
    views.level 1
    FROM views
    JOIN pg_depend AS d
    ON d.refobjid = views.view
    JOIN pg_rewrite AS r
    ON r.oid = d.objid
    JOIN pg_class AS v
    ON v.oid = r.ev_class
    WHERE v.relkind = 'v'
    AND d.classid = 'pg_rewrite'::regclass
    AND d.refclassid = 'pg_class'::regclass
    AND d.deptype = 'n'
    AND v.oid views.view -- avoid loop
    )
    SELECT format('CREATE VIEW %s AS%s',
    view,
    pg_get_viewdef(view))

    So far, it seems to be fine. The views at a particular level may be ordered slightly differently than before, but I don't see how that matters as long as the level is correct.

  6. Hello, if you also want to inspect the materialized views you have to replace
    v.relkind = 'v'
    by
    (v.relkind = 'v' or v.relkind = 'm')

  7. Thanks that's useful ... but is it possible to do the other way around? I.e. in a complex nest of views I'd like to see all the views/tables that are the dependencies of a view.

Leave a Reply

Your email address will not be published. Required fields are marked *

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram