PostgreSQL performance: How the optimizer handles views

“How does the PostgreSQL optimizer handle views?” or “Are views good or bad?” I assume that every database consultant and every SQL performance expert has heard this kind of question already. Given the fact that views are a really essential feature of SQL, it makes sense to take a closer look at the topic in general, and hopefully help some people to write better and faster code.

Table of Contents

PostgreSQL and the optimizer: What they do to views

Let us create a simple table containing just 10 rows, which can be used throughout the blog to show how PostgreSQL works and how the optimizer treats things:

test=# CREATE TABLE data AS
	SELECT 	*
	FROM 	generate_series(1, 10) AS id;
SELECT 10

Then I have created a very simple view:

test=# CREATE VIEW v1 AS
	SELECT 	*
	FROM 	data
	WHERE 	id < 4;
CREATE VIEW

test=# CREATE TABLE data AS

SELECT *

FROM generate_series(1, 10) AS id;

SELECT 10

Then I have created a very simple view:

test=# CREATE VIEW v1 AS

SELECT *

FROM data

WHERE id < 4;

CREATE VIEW

The idea here is to filter some data and return all the columns.

Inlining and flattening

The key aspect is: The optimizer will process the view just like a “pre-processor” directive. It will try to inline the code and to flatten it. Here is an example:

test=# explain SELECT * FROM v1;
                      QUERY PLAN                       
-------------------------------------------------------
 Seq Scan on data  (cost=0.00..41.88 rows=850 width=4)
   Filter: (id < 4)
(2 rows)

test=# explain SELECT * FROM v1;

QUERY PLAN

-------------------------------------------------------

Seq Scan on data (cost=0.00..41.88 rows=850 width=4)

Filter: (id < 4)

(2 rows)

When we try to read from the view it is just like running the SQL statement directly. The optimizer will perform the following steps:

SELECT	*
FROM	(SELECT 	*
		FROM 	data
		WHERE 	id < 4
	) AS v1;

SELECT *

FROM (SELECT *

FROM data

WHERE id < 4

) AS v1;

In the next step the subselect will be flattened out completely, which leaves us with:

SELECT * FROM data WHERE id < 4;

1	SELECT * FROM data WHERE id < 4;

We could have done this transformation by ourselves, but it is easier to make the optimizer do it for us and enjoy the convenience offered by the view.

Joining views

Basically the same mechanism applies, when you are working with joins. PostgreSQL will again inline, flatten and then create a plan to join those tables together. The output is therefore not surprising:

test=# explain SELECT 	*
	FROM 	v1 AS a, v1 AS b
	WHERE 	a.id = 1
			AND a.id = b.id;
                               QUERY PLAN                               
------------------------------------------------------------------------
 Nested Loop  (cost=0.00..96.71 rows=16 width=8)
   ->  Seq Scan on data  (cost=0.00..48.25 rows=4 width=4)
         Filter: ((id < 4) AND (id = 1))
   ->  Materialize  (cost=0.00..48.27 rows=4 width=4)
         ->  Seq Scan on data data_1  (cost=0.00..48.25 rows=4 width=4)
               Filter: ((id < 4) AND (id = 1))
(6 rows)

test=# explain SELECT *

FROM v1 AS a, v1 AS b

WHERE a.id = 1

AND a.id = b.id;

QUERY PLAN

------------------------------------------------------------------------

Nested Loop (cost=0.00..96.71 rows=16 width=8)

-> Seq Scan on data (cost=0.00..48.25 rows=4 width=4)

Filter: ((id < 4) AND (id = 1))

-> Materialize (cost=0.00..48.27 rows=4 width=4)

-> Seq Scan on data data_1 (cost=0.00..48.25 rows=4 width=4)

Filter: ((id < 4) AND (id = 1))

(6 rows)

Again PostgreSQL provides us with a simple plain.

Destructive behavior

However, views are not only good. Sometimes they can be a little destructive too. Consider the following example:

test=# CREATE VIEW v2 AS
	SELECT 	*
	FROM 	data
	WHERE 	id < 4
	ORDER BY 1;
CREATE VIEW

test=# CREATE VIEW v2 AS

SELECT *

FROM data

WHERE id < 4

ORDER BY 1;

CREATE VIEW

In this case the view provides us with some ordered data. Let us see what happens in the following query:

test=# explain SELECT * FROM v2 ORDER BY 1 DESC;
                            QUERY PLAN                             
-------------------------------------------------------------------
 Sort  (cost=135.22..137.34 rows=850 width=4)
   Sort Key: data.id DESC
   ->  Sort  (cost=83.23..85.36 rows=850 width=4)
         Sort Key: data.id
         ->  Seq Scan on data  (cost=0.00..41.88 rows=850 width=4)
               Filter: (id < 4)
(6 rows)

test=# explain SELECT * FROM v2 ORDER BY 1 DESC;

QUERY PLAN

-------------------------------------------------------------------

Sort (cost=135.22..137.34 rows=850 width=4)

Sort Key: data.id DESC

-> Sort (cost=83.23..85.36 rows=850 width=4)

Sort Key: data.id

-> Seq Scan on data (cost=0.00..41.88 rows=850 width=4)

Filter: (id < 4)

(6 rows)

As you can see PostgreSQL will sort the data twice, which is definitely not good for performance. PostgreSQL won't optimize this for you, so you have to take action yourself and be very aware of the issue.

In general it is a good idea to make a mental distinction between two types of views:

• Final results
• Intermediate results

If a view already has the “final” results, you can do sorting and all kinds of stuff. However, if you are using a view to perform some large calculations on top of it, it might be a good idea to avoid formatting, sorting and alike.

However, there is more to views. Consider the following example:

test=# CREATE VIEW v3 AS
	SELECT 	*
	FROM 	data
	WHERE 	id < 4
	LIMIT 100000000000;
CREATE VIEW

test=# CREATE VIEW v3 AS

SELECT *

FROM data

WHERE id < 4

LIMIT 100000000000;

CREATE VIEW

In this case the view contains a LIMIT clause, which basically says that “all” the data will be returned. However, during those formal transformation the optimizer cannot “rely” on the fact that all data will be included in the final result. Therefore the optimizer cannot silently skip the LIMIT clause, which of course leads to the following plan:

test=# explain SELECT * FROM v3 ORDER BY 1 DESC;
                            QUERY PLAN                             
-------------------------------------------------------------------
 Sort  (cost=91.73..93.86 rows=850 width=4)
   Sort Key: data.id DESC
   ->  Limit  (cost=0.00..41.88 rows=850 width=4)
         ->  Seq Scan on data  (cost=0.00..41.88 rows=850 width=4)
               Filter: (id < 4)
(5 rows)

test=# explain SELECT * FROM v3 ORDER BY 1 DESC;

QUERY PLAN

-------------------------------------------------------------------

Sort (cost=91.73..93.86 rows=850 width=4)

Sort Key: data.id DESC

-> Limit (cost=0.00..41.88 rows=850 width=4)

-> Seq Scan on data (cost=0.00..41.88 rows=850 width=4)

Filter: (id < 4)

(5 rows)

Conclusion

As always the world is neither only black nor white. Views are a valuable feature and when used nicely there is nothing wrong with them. However, keep in mind what is going on behind the scenes and avoid building views on top of view on top of views and so on. Here is the documentation on query planning and the optimizer, for reference.

Learn more about query tuning, see our latest blogs on the topic.

PostgreSQL performance: How the optimizer handles views

PostgreSQL and the optimizer: What they do to views

Inlining and flattening

Joining views

Destructive behavior

Conclusion

Leave a Reply Cancel reply

Hans-Jürgen Schönig

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts