Rolling sum and average query results are possible by combining the aggregate functions SUM()
or AVG()
with the OVER()
clause, making for powerful analytic queries. I recently learned how to compute a rolling average or sum of values by using the Windowing option of the OVER()
clause, applying the concepts to a data set I am familiar with. I wanted to share my learning with any readers who might be interested…
OS and DB used:
- Linux Mint 20 “Ulyana”
- MySQL 8.0.22
Self-Promotion:
If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like!
Rolling sum and average: OVER() clause options
The OVER()
clause accepts 3 optional parameters. I like to think of them as these broad descriptions:
- Partitioning: Sub-grouping or dividing of rows.
- Ordering: Sorting and ordering of rows.
- Windowing: Placing boundaries or restrictions on rows.
Placing boundaries on rows involves specifying either ROWS BETWEEN
or RANGE BETWEEN
in the Windowing portion of the OVER()
clause.
I have a ‘walking_stats’ table with this structure I will use for the example queries:
1
2 3 4 5 6 7 8 9 10 11 12 |
DESC walking_stats;
+————–+————–+——+—–+———+——-+ | Field | Type | Null | Key | Default | Extra | +————–+————–+——+—–+———+——-+ | day_walked | date | YES | | NULL | | | cal_burned | decimal(4,1) | YES | | NULL | | | miles_walked | decimal(4,2) | YES | | NULL | | | duration | time | YES | | NULL | | | mph | decimal(2,1) | YES | | NULL | | | shoe_id | int | YES | | NULL | | +————–+————–+——+—–+———+——-+ 6 rows in set (0.0656 sec) |
For the examples in this blog post, I am only interested in placing boundaries on actual rows of data for the ‘cal_burned’ column and not a range of values, in which case I would instead use RANGE BETWEEN
.
Related: Not sure when to use ROWS BETWEEN
or RANGE BETWEEN
? I have written a blog post, ROWS and RANGE: Windowing clause defaults – learning by example in MySQL with some examples to help clear up their differences.
Rolling sum and average: Using SUM() aggregate function
To retrieve a 3-day moving total of combined calories burned, I need to use the SUM()
aggregate function with the OVER()
clause. The ‘day_walked’ column will form the sort criteria for the ORDER BY
clause portion of the analytic. The boundary on the rows is: ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
, which stipulates the rows included in the calculation for any one row.
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
simply means, consider the current row and the 2 previous rows (PRECEDING
) in the result set as sorted by the ORDER BY
clause and apply the SUM()
function to those specific rows’ ‘cal_burned’ column values. The WHERE
clause predicate using the MONTHNAME()
date function restricts the number of rows to a smaller result set for easier on-screen presentation and handling.
Below is the rolling sum query:
1
2 3 4 |
SELECT day_walked, cal_burned,
SUM(cal_burned) OVER(ORDER BY day_walked ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS 3_day_sum FROM walking_stats WHERE MONTHNAME(day_walked) = ‘November’; |
Consider the returned query results in the screen-shot above where I have provided context for 2 different rows. On the row with date ‘2019-11-04’ the ‘3_day_sum’ value is 1101.3, which is derived from summing up the ‘cal_burned’ column values for that particular row as well as the 2 preceding rows (because of the ROWS BETWEEN 2 PRECEDING
boundary stipulation). The row for date ‘2019-11-15′ adheres to this same boundary on the rows’ calculation.
We can do the math by hand, confirming the calculations as well:
1
2 |
SELECT 367.8 + 365.6 + 367.9;
1101.3 |
1
2 |
SELECT 364.4 + 364.2 + 354.6;
1083.2 |
Rolling sum and average: Using AVG() aggregate function
We are not limited to only calculating rolling sums as we can use the AVG()
aggregate function with the OVER()
clause for a rolling average result.
Simply replace the SUM()
aggregate function with AVG()
and keep the same windowing criteria or change it up depending on your requirement.
I’ve used the ROUND()
function to restrict the decimal place to 1 value in the following query for a 5-day rolling average:
1
2 3 4 |
SELECT day_walked, cal_burned,
ROUND(AVG(cal_burned) OVER(ORDER BY day_walked ROWS BETWEEN 4 PRECEDING AND CURRENT ROW), 1) AS 5_day_avg FROM walking_stats WHERE MONTHNAME(day_walked) = ‘November’; |
Again, ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
places a boundary of the current row and the preceding 4 rows, in the AVG()
aggregate function calculation.
Here is the math for the above calculation:
1
2 |
SELECT ROUND((357.8 + 354.8 + 364.9 + 357.7 + 364.4) / 5, 1);
359.9 |
MySQL Window Functions Similar Posts
Are you interested in learning more about Window Functions? Check out these other related blog posts I have written on them:
What are some of your favorite Windowing clauses you have used for various requirements? Tell me all about them in the comments below.
Like what you have read? See anything incorrect? Please comment below and thank you for reading!!!
A Call To Action!
Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.
Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.
To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…)
Be sure and visit the “Best Of” page for a collection of my best blog posts.
Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.
Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
The post Rolling sum and average – Window Functions MySQL appeared first on Digital Owl’s Prose.