It happens on a quite regular basis that people contact the CYBERTEC PostgreSQL Support Desk to get support for a pretty common thing: Paginators on websites. What is the main issue? How hard can be it to display simple tables after all?
Table of Contents
Suppose you want to display a table on your website. You definitely don't want to display 1 million entries at once, so you turn to a pagination. What many people do is:
… and then …
What you have just seen is a classical recipe for disaster. Why that? The first query might be highly efficient and execute on milliseconds. However, what about the second one? Can anybody be sure that it also takes just milliseconds? No, because if the WHERE clause is not selective, the count might yield a really high number. Just recently our support team had a typical case: The second query took 9 (!) minutes to complete.
Therefore you have to ask yourself: Do you really want to wait for minutes to have an exact page count? The user has to wait minutes just to figure out that his query has yielded exactly 4.353.902.535 pages. How many people do you know, whoever checked more than maybe the first 5 pages?
To fix that problem you basically got two choices
As far as estimates are concerned: Have you ever used Google? Google will stop providing you with answers after around 30 pages. Google does that for a reason: Nobody will ever go to page number 31. Everybody will refine his search long before.
So, estimates are fine. In PostgreSQL you can simply use EXPLAIN to see, how many rows PostgreSQL expects. This is fine in many cases - especially when result sets are very large.
The second option is a query like that:
Assuming that you want to display exactly 100 rows, you can always tell the user that there are more than 100 rows or more than, say, 10 pages. However, there is an even bigger advantage: Deterministic runtime. If you fetch 101 rows only - can execution times vary? Sure, they can but they will certainly stay within an acceptable range. If there is no limit, runtimes can theoretically be infinite.
Let us assume we got 1.000.000 queries, which need 1 ms each, and let us assume we got 1.000.000 queries, which need 1 second each.
In the first scenario, life is easy: A single CPU can do that in around 16 minutes.
In the second case 12 CPUs will be busy for a day.
So just imagine what will happen, if your “slow” query does not take 1 second but 1 minute? This is not uncommon in case you really want to count everything, to display an exact page count (which is usually done in grey, font size = 6).
In case you need any assistance, please feel free to contact us.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.
You need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou need to load content from reCAPTCHA to submit the form. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply