In this blog post, we will learn about the rolling average, it’s also called a moving average, a running average, a moving mean, or a rolling mean. You’ll very often see rolling averages used in time series data to analyze trends, they are calculated as the mean of the current and a specified number of immediately preceding or following values for each point in time.
This blog post is designed using SAP HANA XS Advanced (SAP Web IDE). (SAP HANA on premise and SAP HANA, express edition).
Use case: – A rolling average helps us to find trends of a certain measure for a particular time window.
In this blog post, we will be discussing three scenarios of Rolling Average.
1. Rolling Average for immediately following/forward values.
2. Rolling Average for immediately preceding/previous values.
3. Rolling Average for immediately preceding and following values.
DATA and SOLUTION
In our table, we have three columns (YEAR_MONTH, MONTH and REVENUE). This is a monthly series data set since each revenue is associated with a specific month.
Scenario 1: –
Rolling Average for immediately following/forward values.
In this case, we will calculate the average for a window of rows between the CURRENT ROW AND 3 FOLLOWING ROWS and the average must be calculated only using the revenue values of the current row and the three forward rows. Then, for each row in the result set, the rolling average will be calculated based on a different set of four revenue values. We can see this in the following formula:
Rolling Average = ( Revenue (Current Row) + Revenue (Next Row) + Revenue (Current Row+2) + Revenue (Current Row+3) ) / 4
For the month of JAN, the three-month moving average is calculated as the mean of revenues from that month (111) and the next three months that is FEB (120), MAR (123) and APR (100). So, the moving average for JAN is the average of these four values, or 113.5 as shown in the image below.
The rolling average is calculated in the same way for each of the remaining months.
To achieve this, I have used Table Function, which can be further consumed in a Calculation view to get the result.
In the below query, we’ll see how to use HANA SQL to calculate the moving average for the column “REVENUE” based on the current month’s revenue value and the 3 next values.
SELECT "YEAR_MONTH","MONTH","REVENUE",
AVG("REVENUE") OVER (ORDER BY "YEAR_MONTH" ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) RollingAverage
FROM "SCHEMA"."TABLE" ;
OUTPUT: –
NOTE: –
For the month of OCT, when we try to see the next three months, we see there are only two months left. So, in this situation, the three-month moving average is calculated as the mean of revenues from that month (132) and the next remaining two months that is NOV (134) and DEC (123). So, the moving average for OCT is the average of these THREE values and the sum will be divided by 3 instead of 4 as in the case of other month values explained above, and same goes for the NOV and DEC, the sum will be divided by the number of rows summed up.
Scenario 2: –
Rolling Average for immediately preceding/previous values.
In this case, we will calculate the average for a window of rows between the BETWEEN 2 PRECEDING AND CURRENT ROW and the average must be calculated only using the revenue values of the current row and the two preceding rows. Then, for each row in the result set, the rolling average will be calculated based on a different set of three revenue values. We can see this in the following formula:
Rolling Average = ( Revenue (Current Row) + Revenue (Previous Row) + Revenue (Current Row-2) ) /3
For the month of APR, the two-month moving average is calculated as the mean of revenues from that month (100) and the previous/preceding two months that is MAR (123) and FEB (120). So, the moving average for APR is the average of these three values, or 114.33 as shown in the image below.
The rolling average is calculated in the same way for each of the remaining months.
Directly going to see the query, being used to calculate the moving average for the column “REVENUE” based on the current month’s revenue value and 2 previous/preceding values.
SELECT "YEAR_MONTH","MONTH","REVENUE",
AVG("REVENUE") OVER (ORDER BY "YEAR_MONTH" ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) RollingAverage
FROM "SCHEMA"."TABLE" ;
OUTPUT: –
Scenario 3: –
Rolling Average for immediately preceding and following values.
In this case, we will calculate the average for a window of rows between the 2 PRECEDING AND 2 FOLLOWING for the CURRENT ROW and the average must be calculated only using the revenue values of the current row and the two preceding rows as well two following rows. Then, for each row in the result set, the rolling average will be calculated based on a different set of five revenue values. We can see this in the following formula:
Rolling Average = ( Revenue (Previous Row) + Revenue (Current Row-2) +
Revenue (Current Row) +
Revenue (Next Row) + Revenue (Current Row+2) ) / 5
For the month of APR, the two-month moving average in the preceding and following direction is calculated as the mean of revenues from that month (100) and the previous/preceding two months that is MAR (123) and FEB (120) as well as the two following months that is MAY(114) and JUN(121). So, the moving average for APR is the average of these five values, or 115.6 as shown in the image below.
The rolling average is calculated in the same way for each of the remaining months.
Directly going to see the query, being used to calculate the moving average for the column “REVENUE” based on the current month’s revenue value and 2 previous/preceding values along with 2 following values.
SELECT "YEAR_MONTH","MONTH","REVENUE",
AVG("REVENUE") OVER (ORDER BY "YEAR_MONTH" ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) RollingAverage
FROM " SCHEMA "." TABLE " ;
OUTPUT: –
NOTE:-
If you want to make it dynamic, you can use a Table Function or Procedure with an Input Parameter(s) for any of the explained cases above and calculate the average for a window of rows between the “X” PRECEDING OR “Y” FOLLOWING rows for the CURRENT ROW.
SELECT "YEAR_MONTH","MONTH","REVENUE",
AVG("REVENUE") OVER (ORDER BY "YEAR_MONTH" ROWS BETWEEN X PRECEDING AND Y FOLLOWING) RollingAverage
FROM " SCHEMA "." TABLE " ;
No comments:
Post a Comment