Bonus cards, “Miles & more”, bonus points - don’t we all love and hate them at the same time? Recently we had an interesting use case which made me think about sharing some of the techniques we used in this area to reduce client code by writing some clever SQL. This post will show you how to efficiently code bonus programs in SQL.
Table of Contents
Suppose we want to run a bonus program. What we want is to know how many bonus points somebody had at any given point in time. This is how we might want to store the data:
1 2 3 4 5 6 |
CREATE TABLE t_bonus_card ( card_number text NOT NULL, d date, points int ); |
For each bonus card, we want to store how many points were awarded when. So far this is relatively easy. Let's load some sample data:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
COPY t_bonus_card FROM stdin DELIMITER ';'; A4711;2022-01-01;8 A4711;2022-01-04;7 A4711;2022-02-12;3 A4711;2022-05-05;2 A4711;2022-06-07;9 A4711;2023-02-02;4 A4711;2023-03-03;7 A4711;2023-05-02;1 B9876;2022-01-07;8 B9876;2022-02-03;5 B9876;2022-02-09;4 B9876;2022-10-18;7 . |
In my example, we have data for two bonus cards which receive some rewards from time to time. To run our bonus program using PostgreSQL, we might want to answer some basic questions:
Let's answer these questions using…
To answer all these questions we can use windowing functions along with some advanced, fancy frame clauses. Let's take a look at a basic query:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
SELECT *, array_agg(points) OVER (ORDER BY d RANGE BETWEEN '6 months' PRECEDING AND CURRENT ROW) FROM t_bonus_card WHERE card_number = 'A4711' ; card_number | d | points | array_agg -------------+------------+--------+------------- A4711 | 2022-01-01 | 8 | {8} A4711 | 2022-01-04 | 7 | {8,7} A4711 | 2022-02-12 | 3 | {8,7,3} A4711 | 2022-05-05 | 2 | {8,7,3,2} A4711 | 2022-06-07 | 9 | {8,7,3,2,9} A4711 | 2023-02-02 | 4 | {4} A4711 | 2023-03-03 | 7 | {4,7} A4711 | 2023-05-02 | 1 | {4,7,1} (8 rows) |
What this does is simple: It goes through our data set line by line (sorted by date). Then it checks if there are rows between our current row and a value 6 months earlier. For debugging purposes, we aggregate those values into an array. What we see is that on June 7th we have 5 entries. But keep in mind: The rules of our bonus program say that points awarded are taken away after 6 months. By using a sliding window, we can easily achieve this goal.
Note that in SQL we have “ROWS
”, “RANGE
” and “GROUP
” as possible keywords in our frame clause. ROWS
means that we want to see a specific number of older rows in our frame. However, this makes no sense here - what we need is an interval and this is exactly what RANGE
can do for us. Rewards might be granted at random points in time so we certainly need to operate with intervals here.
The array_agg
function is really useful to debug things. However, in a real world scenario, we need to add up those numbers using sum
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
SELECT *, sum(points) OVER (ORDER BY d RANGE BETWEEN '6 months' PRECEDING AND CURRENT ROW) FROM t_bonus_card WHERE card_number = 'A4711' ; card_number | d | points | sum -------------+------------+--------+----- A4711 | 2022-01-01 | 8 | 8 A4711 | 2022-01-04 | 7 | 15 A4711 | 2022-02-12 | 3 | 18 A4711 | 2022-05-05 | 2 | 20 A4711 | 2022-06-07 | 9 | 29 A4711 | 2023-02-02 | 4 | 4 A4711 | 2023-03-03 | 7 | 11 A4711 | 2023-05-02 | 1 | 12 (8 rows) |
We have seen that points drop in 2023 again. That's exactly what we wanted.
PARTITION BY
Maybe you have noticed that we did the entire calculation for just one card number. However, what has to be done to make this work for any number of cards? The answer is PARTITION BY
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
SELECT *, sum(points) OVER (PARTITION BY card_number, date_trunc('year', d) ORDER BY d RANGE BETWEEN '6 months' PRECEDING AND CURRENT ROW) FROM t_bonus_card ; card_number | d | points | sum -------------+------------+--------+----- A4711 | 2022-01-01 | 8 | 8 A4711 | 2022-01-04 | 7 | 15 A4711 | 2022-02-12 | 3 | 18 A4711 | 2022-05-05 | 2 | 20 A4711 | 2022-06-07 | 9 | 29 A4711 | 2023-02-02 | 4 | 4 A4711 | 2023-03-03 | 7 | 11 A4711 | 2023-05-02 | 1 | 12 B9876 | 2022-01-07 | 8 | 8 B9876 | 2022-02-03 | 5 | 13 B9876 | 2022-02-09 | 4 | 17 B9876 | 2022-10-18 | 7 | 7 (12 rows) |
PARTITION BY card_number
ensures that our calculations are done for each incarnation of card_number
separately. In other words: User A’s points cannot be mixed with user B’s points anymore. But there is more to this query: We want that at the beginning of every year those points should be set to zero and counting should resume. We can achieve this by using PARTITION BY
as well. By rounding out dates to full years we can use the year as partition criteria.
As you can see, SQL is really powerful. A lot can be done without having to write a single line of client code. A handful of SQL statements can produce terrific results and it makes sense to leverage your application.
If you want to know more about PostgreSQL 15 and if you are interested in merging data, check out my post about MERGE - which can be found here.
+43 (0) 2622 93022-0
office@cybertec.at
You are currently viewing a placeholder content from Facebook. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More InformationYou are currently viewing a placeholder content from X. To access the actual content, click the button below. Please note that doing so will share data with third-party providers.
More Information
Leave a Reply