
openSAP SAP Analytics Cloud – Become an Augmented BI Expert
Week 4, Unit 5: Predict Future Outcomes with Smart Predict Forecasting

Forecast National Parks Visits

PUBLIC
TABLE OF CONTENTS
USE CASE
UNDERSTANDING THE PLANNING MODEL
CREATING THE PREDICTIVE SCENARIO
  Defining the Model Settings
  Training the Predictive Model
  Understanding the Predictive Model
    The Forecast Tab
    The Explanation Tab
CREATING THE STORY AND THE PLANNING VERSION
  Setting up the Story
  Creating a Planning Private Version
SAVING THE PREDICTIVE FORECASTS INTO THE VERSION
DEVELOPING THE STORY
  Calculating the Total Visits for 2021 (Actual + Predicted)
  Visualizing the Visits over Time
FORECASTING AT NATIONAL PARK LEVEL
  Creating a new Predictive Model
  Creating a new Planning Version
  Checking the Predictions (in the Predictive Model)
  Checking the Predictions (in the Story)
COMPARING PREDICTIONS & REALITY (OPTIONAL)
  Creating the Predictive Model


USE CASE

Galen is responsible, at national level, for making sure visitors enjoy the best experience when visiting
world-renowned national parks such as Yosemite or Yellowstone, to name two of the best known of these jewels.
Galen uses SAP Analytics Cloud (SAC) to create a time series forecasting model to:
▪ Understand the future trends – will people visit parks again despite the COVID-19 pandemic?
▪ Generate predictive forecasts at national level and national park level
▪ Report on the predictive forecasts in stories and compare actuals, predictions, and latest trends.

Here is the sequence of actions you will follow in this hands-on exercise:
• Create a predictive forecast at national level and build a story to report the results
• Create a predictive forecast at park level and build a story to report the results
• Forecast known month values and compare the predictions to what happened in reality

The data foundation you will use is a planning model. This lets you experience the full
end-to-end loop of SAP Analytics Cloud predictive planning.

UNDERSTANDING THE PLANNING MODEL
The source data contains 4712 records. It describes the evolution of recreational* visits per month and per
national park over the period from January 2015 to April 2021 inclusive.

Variable name      What does the variable stand for?
Park Name          The name of the national park. There are 62 different parks in the data source.
Month              The corresponding month in the period 2015-01 to 2021-04 (76 months in total).
RecreationVisits   The number of recreational visits for the corresponding national park and month.
*Recreational visits are the ones made by park visitors for leisure purposes, as opposed to professionals
visiting the park.
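
As a quick sanity check, the record count follows from the two figures given above, assuming one record per park and per month. The short Python sketch below redoes that arithmetic; the function and variable names are illustrative only and are not part of SAP Analytics Cloud.

```python
from datetime import date

def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start to end, both included."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

n_parks = 62
n_months = months_inclusive(date(2015, 1, 1), date(2021, 4, 1))
n_records = n_parks * n_months  # assumption: one record per park and per month

print(n_months, n_records)
```

Under that assumption, the product of parks and months matches the number of records stated for the source data.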
Log on to SAP Analytics Cloud.
Navigate to the folder My Files/Public/openSAP BI Expert/Week 4 Augmented Analytics /Unit 4.5
Predict Future Outcomes with Smart Predict Forecasting.
You will see a planning model named US National Park Visit Forecasting.

The source file (Excel) is also provided. For the rest of the exercise, you will use the planning model as the
data source. If you want to recreate the planning model from the Excel file, you are free to do so.
Copy the planning model to your My Files folder.

Under the My Files folder, click on the folder creation icon.

Name the new folder US National Park Visit Forecasting. Click OK.

Keep the name US National Park Visit Forecasting and click on the OK button.
You can open the planning model and display the Model Structure tab to have a look at the structure of the
planning model.

You can click on the different model dimensions to see the different values per dimension. For this, click the
top-right arrows directly in the dimensions in the diagram or click the dimension hyperlinks in the left-hand
side panel.

CREATING THE PREDICTIVE SCENARIO


In the left-hand side panel, click on Predictive Scenarios.
Then click on the Time Series Forecast icon.

Name the predictive scenario US National Park Visit Forecasting and save it in the folder My Files / US
National Park Visit Forecasting. Click the OK button.

You will now define the settings of this predictive scenario.
Defining the Model Settings
Fill the right-hand side Settings panel as follows.

• The Description should be Forecast May 2021 to December 2021 visits at national level
• The Time Series Data Source is the planning model in your My Files folder
• The Version is Actual (Actuals) (default version storing actual values)
• The Account is RecreationVisits – this is the variable that will be forecasted
• The Date is Month.
• Set the Number of Forecast Periods to 8. This corresponds to the May 2021 to December 2021
periods as the last known actual period is April 2021.

• You do not need to change any other default settings.
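
To see why 8 forecast periods cover May 2021 to December 2021, the following Python sketch lists the months that follow the last known actual month. The helper name forecast_periods is hypothetical and only illustrates the date arithmetic; it does not call any SAP Analytics Cloud API.

```python
def forecast_periods(last_year: int, last_month: int, n_periods: int) -> list[str]:
    """Return the n_periods months that follow the last known actual month."""
    periods = []
    year, month = last_year, last_month
    for _ in range(n_periods):
        month += 1
        if month > 12:  # roll over to January of the next year
            year, month = year + 1, 1
        periods.append(f"{year}-{month:02d}")
    return periods

# Last known actual period is April 2021; 8 periods are requested
horizon = forecast_periods(2021, 4, 8)
print(horizon)
```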

Training the Predictive Model


Once your predictive model is set, click the Train & Forecast button at the bottom right of the Settings
panel.

In the Predictive Models bottom panel, the status shows the model creation progress.

It shows the status Trained once the time series forecasting model is built.

Understanding the Predictive Model


Smart Predict displays detailed information about the trained predictive model.
The Forecast Tab
The Forecast tab displays different information about the predictive model.
The Expected MAPE (Mean Absolute Percentage Error) for this model equals 57.21%. This evaluates the
error that the model is making based on known actuals.
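
For readers unfamiliar with the metric, here is a minimal sketch of the textbook MAPE definition in Python. The sample values are hypothetical, and the way Smart Predict selects the data on which it evaluates the Expected MAPE is not reproduced here.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent. Pairs with a zero actual are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

# Hypothetical values for illustration only (not taken from the data set)
actuals = [100.0, 200.0, 400.0]
forecasts = [110.0, 180.0, 400.0]
print(round(mape(actuals, forecasts), 2))
```

A lower value means the forecasts are, on average, closer to the known actuals in relative terms.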

The Forecast vs. Actual shows the actual values, the forecasted values, the detected outliers, and the
minimal and maximal error:
• You can spot outliers (unusual values compared to normal) in 2020, caused by the COVID-19
pandemic impact.
• The forecasting error is due to the difference between the summer peaks and the predictions prior to
2020.

Next you can see the details of the forecasted values, min / max errors, and the outliers. Hint: you can also
transform the Forecast vs. Actual visualization into a table (to do so, click and explore the options offered
by the gear wheel icon).
The Explanation Tab
This explains what the predictive model is about and presents the different model components.

This predictive model is made of:
• a decreasing trend (due to the 2020 impact) – in red.
• yearly cycles – in orange.
• small fluctuations – in purple. Fluctuations occur when a forecasted value at time T depends on values
at times T-N.
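
To make the three components concrete, here is a minimal Python sketch of an additive model in which a prediction is the sum of a trend, a yearly cycle, and a fluctuation term. All coefficients and names are hypothetical, and the fluctuation is simplified to depend only on the residual at time T-1; this illustrates the idea and is not how Smart Predict builds its model internally.

```python
def trend(t: int) -> float:
    """Hypothetical decreasing linear trend."""
    return 1000.0 - 2.0 * t

# Hypothetical yearly cycle: one additive offset per calendar month (offsets sum to zero)
YEARLY_CYCLE = [-300, -250, -100, 0, 100, 250, 300, 250, 100, 0, -100, -250]

def fluctuation(previous_residual: float, phi: float = 0.5) -> float:
    """Simplified order-1 autoregressive term: depends on the residual at time T-1."""
    return phi * previous_residual

def predict(t: int, previous_residual: float) -> float:
    """Additive model: trend + yearly cycle + fluctuation."""
    return trend(t) + YEARLY_CYCLE[t % 12] + fluctuation(previous_residual)

print(predict(t=6, previous_residual=40.0))
```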

The Time Series Component Impact presents the relative impact of each part of the model on the predicted
values.
Note that the predictive model explains all the variability of the actuals except for 4.87% (Final Residuals).

CREATING THE STORY AND THE PLANNING VERSION


You will now create a story. This will help you visualize the data and compare predicted values and actual
values.

Setting up the Story

Create a new Canvas story. Use the Classic Design Experience.

Save your story in your work folder US National Park Visit Forecasting under My Files.

Click on Add data and select the planning model you created. Move from Data to Story mode and add a
table to your story.

Set the Builder for the table as follows.

If you expand the table, it should look like this. The recent year with the most visits at national level is 2019.

Note: do not forget to save the story on a regular basis as you progress through the hands-on.
Creating a Planning Private Version
Select the table and open the Version Management panel.

Copy the Actual version and create a new private version named Predicted Visits. This version
automatically appears in the story. It is empty for now, as you requested, and private (only visible to you).
The next step consists of saving the predicted values in this version. Do not forget to save your story first.

SAVING THE PREDICTIVE FORECASTS INTO THE VERSION


Go back to your predictive model My Files / US National Park Visit Forecasting (for instance, open the
Predictive Scenarios application, where you can find the most recent ones).

Click the factory-like icon in the predictive model toolbar, select the private version you just created, and
click Save.

Once the Status field shows Applied, you can refresh the story.

DEVELOPING THE STORY

Calculating the Total Visits for 2021 (Actual + Predicted)


You know the actual numbers from January 2021 to April 2021 and you have the predicted numbers from
May 2021 to December 2021. Now you want to have the total number of visits for 2021.

Make sure your story is in Edit mode. Select the 2021 member for the Actual category and the 2021
member for the Predicted Visits category (hold the CTRL key on Windows or the CMD key on Mac). In
the right-click menu, select Add calculation / Sum / Single. You can rename the column from Sum to Total
2021 Visits (just double-click the cell while in story Edit mode).
You can see the total number of visits (actual + predicted) for 2021. You can now gauge whether visits are
likely to return to the level of the pre-COVID-19 years.
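
The Sum calculation simply adds the four actual months to the eight predicted months. A minimal Python sketch of the same arithmetic follows; all figures are hypothetical placeholders, not the values from the planning model.

```python
# Hypothetical monthly figures for illustration only (not the real data)
actual_2021 = {"2021-01": 10, "2021-02": 12, "2021-03": 20, "2021-04": 25}
predicted_2021 = {
    "2021-05": 30, "2021-06": 45, "2021-07": 50, "2021-08": 48,
    "2021-09": 35, "2021-10": 28, "2021-11": 15, "2021-12": 11,
}

# The two versions cover disjoint months, so the yearly total is a plain sum
assert not set(actual_2021) & set(predicted_2021)
total_2021 = sum(actual_2021.values()) + sum(predicted_2021.values())
print(total_2021)
```

Because the Actual version stops in April and the Predicted Visits version starts in May, no month is counted twice.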

Filter the Month dimension on the year 2021 in the Designer to focus your table on the recent months.
For this, use the Month (Member) option.

Rename the page to 2021 Visits (Table). The page will look as below (please note the column order might
look different depending on your selections).

Save the story.


Visualizing the Visits over Time
Add a new Canvas page to the story. Move your mouse cursor to the right of the last created page tab and a +
sign will appear.

Add a new chart.

Configure the chart as follows:
• First select RecreationVisits as the measure
• Select Month as the dimension
• Set the chart structure to Trend / Time Series.
• Add the dimension Category to the field Color.
• In Filters / Category, select the versions Actual and Predicted Visits.
• You can adapt the color palette as you see fit – for instance selecting the second option in the drop
down.

Rename the page as Visits per Month (Actual + Predicted). Save the story.
You have reached the first milestone in the hands-on.

FORECASTING AT NATIONAL PARK LEVEL
Creating a new Predictive Model
Go back to your predictive scenario. Duplicate your existing predictive model.

Click on the newly created Model 2:


• Enter the description Forecast May 2021 to December 2021 visits at park level.
• In the Entity field, select Park Name. You ask for one forecasting model to be created for each of the
different national parks, to have more precise predictions for each park. Each predictive model will
correspond to a specific data model (evolution of recreation visits per month at park level) and
deliver the predictions at this level. You will compare this approach with the forecasting done at
national level.
• Enable the option Convert Negative Forecast Values to Zero, as a negative number of visits does not
make sense.
• You can leave the rest of the settings unchanged (with their default values)
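
The effect of the Convert Negative Forecast Values to Zero option can be pictured with the Python sketch below, which clamps each forecast at zero, entity by entity. The park names and values are hypothetical, and the helper is an illustration rather than the product's implementation.

```python
def clamp_negative(forecasts: list[float]) -> list[float]:
    """Replace negative forecast values with zero."""
    return [max(0.0, value) for value in forecasts]

# Hypothetical raw forecasts per entity (park), for illustration only
raw_by_park = {
    "Park A": [120.0, -15.0, 30.0],
    "Park B": [-3.5, 0.0, 42.0],
}

# One forecast series per entity, each cleaned independently
cleaned_by_park = {park: clamp_negative(values) for park, values in raw_by_park.items()}
print(cleaned_by_park)
```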

Once done, click the Train & Forecast button. The training phase will require more time, as you now
train 62 predictive models (one predictive model for each park) on more granular data (so please grab
yourself a coffee, tea, or just a glass of water as you prefer). While the predictive model for each park is
being trained, you can go ahead with the next section of the hands-on. You will come back to the predictive
model later.
Creating a new Planning Version
Let the predictive scenario process and go back to the story (or open the story in a separate tab in your
browser).

Select the table and open the Version Management panel.
Create a private version named Predicted Visits - At Park Level as follows.

Checking the Predictions (in the Predictive Model)
Once the predictive model is trained you should see the following:
• You get one entity per national park
• Top Entities correspond to the 10 most accurate forecasting models
• Bottom Entities correspond to the 10 least accurate forecasting models
• The average expected MAPE is an average across the 62 models. It is quite high because it is skewed
by some entities with an extremely high MAPE.
• Then you get the list of All Entities for each & every national park.
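
The skew mentioned above is a general property of averages: a single extreme value can dominate the mean. The Python sketch below compares the mean and the median on hypothetical per-entity MAPE values (not the real results of this exercise).

```python
from statistics import mean, median

# Hypothetical per-entity MAPE values in percent (not the real results)
mape_by_entity = [8.0, 10.0, 12.0, 15.0, 455.0]

average_mape = mean(mape_by_entity)    # pulled up by the single extreme entity
typical_mape = median(mape_by_entity)  # closer to what most entities achieve

print(average_mape, typical_mape)
```

This is why it is worth looking at the Top Entities, Bottom Entities, and All Entities lists rather than relying on the average alone.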

You can for instance open the entity Yellowstone NP in the Forecast tab to have a look at the
corresponding model. You can see the Forecast and Explanation tabs on the next page.
You can do this for other entities as you see fit.

Next, you will save the predictive forecasts in the version you just created.

Checking the Predictions (in the Story)


Once you are back in the story, you can check the difference between the predictions at national level and
the predictions at park level.
Open the page 2021 Visits (Table).
Add the version Predicted Visits – At Park Level to the Designer in case it is not yet there. If the version
management is still open, close it and go into Edit mode.

The table will look like this.

Now add a new calculation, exactly like you did before. You will add the Actual values for 2021 and the
Predicted Visits – At Park Level, also for 2021.

Rename this calculation as Total 2021 Visits - at Park Level.


Now create a calculation that corresponds to the difference between the two Total 2021 Visits, the one you
created in the first part of the exercise minus the one you created in the second part of the exercise. Rename
the calculated column to Difference as below.

Note the order of the columns in the table might vary.

Next you will create a new Canvas page.

Insert a new chart as follows. Please note you will have to set the dimension & measure so that you can
enable the Time Series chart and use it.

The visualization will look like this.

Now add an input control to the page.

Select Park Name in Dimensions.

In the window Set Filters for Park Name, check All Members.

Expand the input control and rename the page Visits Per Month (At Park Level).
Your page should look as follows – you can now see the predicted visits per national park.

Congratulations – if you have reached this point, you are done with the main part of the hands-on. Well
done! If you still have time / energy, you can go ahead with the optional part of the exercise or save this for
later…

COMPARING PREDICTIONS & REALITY (OPTIONAL)
This is an optional exercise. If you are reading this, congratulations as you want to do more!
Let us do some backtesting!
As per Wikipedia: Backtesting is a term used in modeling to refer to testing a predictive model on historical
data.
Here you will keep it simple:
• You will stop your training set at the end of December 2020, pretending you use a magical time
machine and do not know exactly what will happen in the course of 2021.
• You will ask for 12 predictive forecasts, i.e., January 2021 to December 2021.
• Because you know the actual values for the period January 2021 to April 2021, this gives you the
possibility to evaluate the effective accuracy of the predictive model in real life. This will not be
possible for the period from May 2021 to December 2021, as you do not know the actual
values yet.
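
The backtesting idea described above can be sketched in a few lines of Python: cut the series at the chosen date, forecast from the training part only, then compare the forecasts with the held-out actuals. The series, the forecasts, and the helper name split_for_backtest are all hypothetical.

```python
def split_for_backtest(series: dict[str, float], cutoff: str):
    """Split a month-keyed series into training (<= cutoff) and holdout (> cutoff)."""
    train = {m: v for m, v in series.items() if m <= cutoff}
    holdout = {m: v for m, v in series.items() if m > cutoff}
    return train, holdout

# Hypothetical series; keys are "YYYY-MM", so string order matches date order
series = {"2020-11": 90.0, "2020-12": 80.0, "2021-01": 85.0,
          "2021-02": 88.0, "2021-03": 120.0, "2021-04": 150.0}
train, holdout = split_for_backtest(series, cutoff="2020-12")

# Hypothetical forecasts that a model trained on the training part might produce
forecasts = {"2021-01": 80.0, "2021-02": 80.0, "2021-03": 100.0, "2021-04": 120.0}

# Forecast minus actual: negative values mean the model underestimated
errors = {m: forecasts[m] - holdout[m] for m in holdout}
print(errors)
```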

Creating the Predictive Model


Go back once more to your predictive scenario and duplicate Model 1 again.

• Set the description to Back testing January 2021 to December 2021


• Set the Number of Forecast Periods to 12
• Set the Until field to User-Defined Date / Dec 31, 2020.

Click Train & Forecast.

You can zoom in on the graph Forecast vs. Actual. What differences do you notice? What conclusions can
you draw from this? Are people significantly back in the parks compared to the predictions?
Tip: you can change the visual representation from line chart to table.

You can then see the actual & forecasted values for the first four months of 2021.

As you did in the main part of the hands-on, you can create a new private version to
output the predictive forecasts.
Name this version Predicted Visits – 2021.

Save the predictive forecasts to the version Predicted Visits – 2021.


Go into the story, switch to Edit mode. Add a new page.
Add a new table to the page, configure the table as follows.

Add a calculation, as a percentage difference (see screenshot). Please make sure to select both columns
using the CTRL key of the keyboard for Windows users or the CMD key for Mac users. Select Predicted
Visits – 2021 first and Actual second, otherwise you’ll get a different calculation result!
Compared to reality, did the predictive model overestimate or underestimate what happened in the months
from January to April 2021?
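
The selection order matters because a percentage difference is not symmetric. The Python sketch below assumes the common definition (first minus second, divided by the second); the exact formula applied by the table calculation should be checked in the product, and the values are hypothetical.

```python
def percentage_difference(first: float, second: float) -> float:
    """Relative difference of first versus second, in percent (assumed definition)."""
    if second == 0:
        raise ValueError("undefined when the reference value is zero")
    return 100.0 * (first - second) / abs(second)

predicted, actual = 120.0, 150.0  # hypothetical values

predicted_vs_actual = percentage_difference(predicted, actual)  # negative: underestimate
actual_vs_predicted = percentage_difference(actual, predicted)  # different sign and size

print(predicted_vs_actual, actual_vs_predicted)
```

Under this assumed definition, a negative result with the prediction selected first means the model underestimated reality, and a positive result means it overestimated.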

You are done with the bonus exercise!

Coding Samples
Any software coding or code lines/strings (“Code”) provided in this documentation are only examples and are not intended for use in a production system environment. The Code is only intended to better
explain and visualize the syntax and phrasing rules for certain SAP coding. SAP does not warrant the correctness or completeness of the Code provided herein and SAP shall not be liable for errors or
damages caused by use of the Code, except where such damages were caused by SAP with intent or with gross negligence.

www.sap.com/contactsap

© 2022 SAP SE or an SAP affiliate company. All rights reserved.


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.

The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable
for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements
accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality
mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are
all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation
to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are
cautioned not to place undue reliance on these forward-looking statements, and they should not be relied upon in making purchasing decisions.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. All other product and service names mentioned are the trademarks of their respective companies. See www.sap.com/trademark for additional trademark information and notices.
