
openSAP

A First Step Towards SAP HANA Query Optimization
Week 4 Unit 1

00:00:05 Hello, and welcome to unit one of week four. In this week, we will work on the case studies
using the knowledge from the last units.
00:00:14 We will cover two case studies. The first case study is about understanding data flow and
business logic,
00:00:21 and the second one is more of a deep dive into query processing. Let's have a look at the first case study, which is a sporadic composite OOM case.
00:00:34 The issue is about sporadic composite OOM. For this case study, we have a composite
OOM dump,
00:00:41 business logic explanation, and monitoring view information for M_SQL_PLAN_CACHE,
HOST_SQL_PLAN_CACHE,
00:00:50 and M_EXECUTED_STATEMENTS. The goal of this case study is to find the reason why the OOM occurs.
00:01:01 Before we work on this case study, I will give you a quick overview of the history of this OOM issue. There is a company, Company A, in the food industry.
00:01:11 According to the company, they executed a procedure job, and they know the executed
query string.
00:01:19 The job is to check the material cost, calculate the production cost, and analyze the cost and profit to build the strategy.
00:01:28 And the OOM occurred at the cost calculation step. Actually, several OOMs occurred before the 20th of February, but for this case study,
00:01:38 we only have the composite OOM dump from the 20th of February. So this case study will be handled with the information in this composite OOM dump.
00:01:52 Here, we will work on the initial analysis with the given composite OOM dump. As we discovered last week,
00:02:00 composite OOM is related to query execution and statement memory limit. So as a very first
step, we have to check out the statement memory limit
00:02:13 and find the problematic query. The statement memory limit was 500 gigabytes,
00:02:22 and it means SAP HANA required additional memory, more than 500 gigabytes, at the time
when it executed the problematic query.
00:02:32 But it was not able to allocate further memory due to the statement memory limit, and this resulted in a composite OOM.
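As a side note, here is a minimal sketch of how the statement memory limit could be inspected and set on SAP HANA. The 500-gigabyte value matches this case study; the parameter location is an assumption based on the standard global.ini layout, so verify it against the documentation for your revision.

  -- Check the current per-statement memory limit (value in GB)
  SELECT FILE_NAME, SECTION, KEY, VALUE
    FROM M_INIFILE_CONTENTS
   WHERE SECTION = 'memorymanager'
     AND KEY = 'statement_memory_limit';

  -- Set a 500 GB per-statement limit at system level
  ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
    SET ('memorymanager', 'statement_memory_limit') = '500'
    WITH RECONFIGURE;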
00:02:43 From the OOM dump, we could find the top allocator and check out the problematic queries,
the connection ID, and statement ID.
00:02:55 With the information of connection ID and statement ID, we can search the problematic
query in this OOM dump.
00:03:04 Here, we can use our knowledge of analyzing the OOM dump that we learned last week. Once we have found the thread ID, we look for its parent ID.
00:03:18 Here, the parent ID is 33511. To verify that this original query was being executed when the OOM occurred,
00:03:30 we need to check the parent thread. So we searched the thread of the Parent ID,
00:03:36 and found that this is the top-most hierarchy since there is no parent ID for this thread. And
the query related to the thread is identical to the OOM query that the company executed.
00:03:51 So, we can conclude that this OOM occurred due to the following query. The main query that led to the OOM was at line number 1680
00:04:01 in the procedure AFTER_INSERT_03. We found this query in the top allocator section of the composite OOM dump.
00:04:11 When we encounter an issue, reproducing it is very helpful, but unfortunately, this issue could not be reproduced.
00:04:20 Therefore, we need to start the analysis from this point, where we found the main problematic
query.
00:04:30 Let's make a timeline of this OOM issue. We know that the job was running and the OOM occurred.
00:04:38 The issue occurrence time is 1:28. Unfortunately, this OOM was not reproduced,
00:04:47 and this might be because something has changed, such as the plan or the data. Therefore, we need to understand the data flow in the business logic with the query plan,
00:05:01 and we also need to find out the compilation and execution times, as well as the data statistics. So we can summarize the investigation list as follows.
00:05:15 From the initial analysis, we know that the composite OOM occurred during the query
execution and the main problematic query that led to OOM
00:05:25 is at line number 1680 in the procedure. But unfortunately, this OOM was not reproduced.
00:05:34 So for further analysis, we need to check the query plan from the OOM dump, and if there is a faster case, then we can compare the faster case with this OOM dump.
00:05:47 It is also important to understand the business logic and its procedure definition. Lastly, we
need to check the OOM query’s compilation time and execution time.
00:06:01 That’s the end of unit one of week four. In the next unit, we will continue to work on this
case study.
00:06:09 Thank you for your attention, and looking forward to seeing you there. Bye.

Week 4 Unit 2

00:00:05 Hello and welcome to unit two of week four. Today we will continue to work on case study 1, the sporadic composite OOM issue,
00:00:14 and we will analyze the issue in terms of plan comparison. From the last unit, we made the further investigation list,
00:00:23 which comprises plan comparison with a fast case, understanding the business logic and its data flow,
00:00:30 and checking the compilation and execution times. So in today's unit, we will look into plan comparison,
00:00:39 business logic and its data flow. From the composite OOM dump,
00:00:47 we can find the problematic query and its query plan.
00:00:51 To see it in a better view, you can just copy the plan and paste it into Excel.
00:00:58 Then you will see this query plan. This is the query plan that caused the OOM from the
OOM dump.
00:01:09 When you look at the OPERATOR_NAME column, there is one big COLUMN SEARCH.
00:01:14 And inside the COLUMN SEARCH, there is another big ESX SEARCH,
00:01:19 and there is LEFT OUTER JOIN. With that, we are going to draw the query plan.
00:01:26 We learned how to draw the query optimizer plan in week 2. So using the information in the OPERATOR_NAME column,
00:01:36 we can draw the query optimizer plan. As you can see, the OPERATOR_NAME column
shows the indentation.
00:01:47 So by following the indentation, you can see the order in which the operators are processed.
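Schematically, the indentation of the OPERATOR_NAME column follows the operator hierarchy. The following simplified sketch is one plausible reading of the plan described here, not the literal dump output:

  COLUMN SEARCH
    ESX SEARCH
      LEFT OUTER JOIN        -- first main join
        LEFT OUTER JOIN      -- second main join
          HASH JOIN
          HASH JOIN
        HASH JOIN            -- group of operators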
00:01:56 There are two main LEFT OUTER JOINs, and the first LEFT OUTER JOIN
00:02:00 is a JOIN between another LEFT OUTER JOIN and a group of operators consisting of a HASH JOIN.
00:02:07 The second LEFT OUTER JOIN is a JOIN between two HASH JOINs. Unfortunately, this
issue was not reproduced.
00:02:16 That is, the OOM did not occur for the same query execution. So then we can get the good case,
00:02:24 which is the non-OOM case. As we discovered last week,
00:02:33 comparing the bad case and the good case is another way of analyzing a performance issue. Now we have a plan for the OOM case,
00:02:43 so we can get a plan for the non-OOM case. So Company A reproduced the scenario
00:02:50 by executing the same query as in the OOM case. To compare the good case and the bad case,

00:02:57 we need a plan for the non-OOM case. Using the knowledge from week two - explain plan
part -
00:03:04 we can find out the plan for the non-OOM case. After the query execution,
00:03:10 firstly we need to find out the plan ID of the reproduction scenario in
M_SQL_PLAN_CACHE,
00:03:18 and then use an SQL command to see the stored explain plan. So here, we use these SQL commands.
00:03:27 Since Company A knew the value of STATEMENT_HASH after they executed the query,
00:03:33 they used STATEMENT_HASH in order to find the plan ID of the reproduction scenario.
00:03:41 After they found the plan ID, they ran the following SQL to see a plan of the reproduction
scenario.
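As a sketch, the two steps could look like this; the statement hash and plan ID are placeholders to be filled in from your own system:

  -- Step 1: find the plan ID of the reproduction scenario
  SELECT PLAN_ID, STATEMENT_STRING
    FROM M_SQL_PLAN_CACHE
   WHERE STATEMENT_HASH = '<statement_hash>';

  -- Step 2: explain the stored cached plan and read the result
  EXPLAIN PLAN FOR SQL PLAN CACHE ENTRY <plan_id>;

  SELECT OPERATOR_NAME, OPERATOR_DETAILS, OUTPUT_SIZE
    FROM EXPLAIN_PLAN_TABLE
   ORDER BY OPERATOR_ID;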
00:03:52 After that, we could see this query plan. Like the OOM case,

00:04:00 using the information in the OPERATOR_NAME column, we can draw the following query
optimizer plan.
00:04:07 So we can get an overview of the query plan for the non-OOM case. Let's compare the two cases.
00:04:18 On the left-hand side is the OOM case, which is the bad case, and on the right-hand side is
the non-OOM case, which is the good case.
00:04:26 These plans look very similar, but there is one difference,
00:04:31 which is the FILTER location. When you look at the OOM case,
00:04:36 the two FILTERs are located at the top. However, when you look at the non-OOM case,
00:04:43 the FILTERs are pushed down under the HASH JOIN. When we go back to the query
optimizer tree,
00:04:53 in the OOM case, the FILTER is not pushed down, and it is located after HASH JOIN.
00:05:00 But as you can see, in the non-OOM case,
00:05:03 the FILTER is pushed down under the HASH JOIN. Generally speaking, a JOIN can make the result larger or smaller.
00:05:14 However, a FILTER always makes the result smaller by filtering out records. As you can see in the grey-colored JOIN part on the left-hand side in the OOM case,
00:05:26 the HASH JOIN is a join between TEMP_TABLE A and TEMP_TABLE B. This also holds for the non-OOM case.
00:05:36 But the difference is the FILTER location. So in the OOM case, we can say that a lot of memory
00:05:43 was required at the HASH JOIN part. In this situation, what can be the dominant operator
and possible key reducer?
00:05:54 Let's have a look in detail at the part of HASH JOIN and FILTERs for the OOM case.
00:06:01 There are base tables TEMP_TABLE A and TEMP_TABLE B. And these are processed
with the HASH JOIN.
00:06:12 Then the result from the HASH JOIN is filtered. Again, a FILTER makes the results smaller by filtering out records.
00:06:25 With that, let's find the dominant operator. We can assume that the dominant operator could be the HASH JOIN in the OOM case
00:06:38 because the only difference between these two cases is the FILTER location, and we can
imagine that this HASH JOIN needed lots of memory
00:06:47 to process data before the FILTER. Then the possible key reducer could be this FILTER.
00:06:57 Please recall the concept of possible key reducer. It is usually an ancestor operator of the
dominant operator
00:07:06 that could reduce the results from the dominant operator. But when you look at the non-OOM case,
00:07:13 since there are FILTERs before the top HASH JOIN, the results can be reduced.
00:07:19 And the JOIN was processed between the filtered records; therefore, not much memory might have been required at the HASH JOIN.
00:07:28 So OOM would not happen. So far, using the method of plan comparison,
00:07:36 we found the dominant operator and possible key reducer. Now, we can make a better plan
using an SQL hint
00:07:47 based on the observation from the good case. Let's have a look.
00:07:56 In the OOM case, in order to make a good plan like the non-OOM case, we can think about an SQL hint that moves the operator.
00:08:08 Here, what we want to move is the FILTER. So we want the FILTER to be pushed down under the HASH JOIN.
00:08:19 From the OOM explain plan, we know this step is enumerated by JOIN_THRU_FILTER.
00:08:26 Therefore, as an SQL hint, we can come up with NO_JOIN_THRU_FILTER,

00:08:33 which is the counterpart of the JOIN_THRU_FILTER hint; a sketch of how such a hint could be applied follows below. So after applying this SQL hint,
00:08:42 the query becomes fast. But there is one remaining question
00:08:48 about the different plan generation for the same query. For this, we need to check the data flow of the business logic.
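A hedged sketch of how such a hint could be applied; the table names below are placeholders, since the actual statement is part of the customer's procedure. If the statement text cannot be changed, pinning a statement hint to the statement hash is another option on newer revisions (verify the exact syntax in the documentation):

  -- Hint appended directly to the query (hypothetical query shape)
  SELECT *
    FROM TEMP_TABLE_A a
    LEFT OUTER JOIN TEMP_TABLE_B b ON a.k = b.k
   WHERE a.flag = 'X'
   WITH HINT (NO_JOIN_THRU_FILTER);

  -- Or pinned to the statement without changing the application code
  ALTER SYSTEM ADD STATEMENT HINT (NO_JOIN_THRU_FILTER)
    FOR STATEMENT HASH '<statement_hash>';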
00:08:59 Let's go back to explain plan and check out the data size. Last time, we just checked the
query plan itself.
00:09:08 And now, we will look into the size of each operator. As you can see, in the OOM case,
00:09:17 the TEMP_TABLE size is very small. Its size is just 1.
00:09:22 On the other hand, when you look at the non-OOM case from the reproduction scenario,
00:09:27 the TEMP_TABLEs display different table sizes, which are pretty big.
00:09:33 That means the OOM plan was compiled when the table was small,
00:09:41 and the non-OOM plan was compiled when the table was big. If we apply this to the diagram,
00:09:50 we can see the diagram like this. Here, please be aware that this is the estimated size
00:09:58 at plan compilation, since we only have the explain plan, not a PlanViz trace.
00:10:04 So we can only get the information of the estimated size for each operator. So the
TEMP_TABLE sizes are different.
00:10:14 Why are the estimated table sizes different? Let's check out the query tree of the
procedure,
00:10:21 especially the part where the OOM occurred. The OOM occurred when Company A ran one simple job.
00:10:32 But when we looked into the OOM dump, the actual query that caused the OOM was one part of the procedure, at line number 1680.
00:10:44 We had the definition of the procedure, and this is a diagram that describes the part of the procedure at line number 1680.
00:10:53 Here, TEMP_TABLEs A and B are actually table variables. And as you can see,
00:11:04 there is a target table, ZITEMCOST_A, on the left-hand side of each JOIN. Here, we can get a clue.
00:11:16 So depending on this table size, the result can be changed.
00:11:21 Then we need to check where this target table comes from. Let's find out the data flow of Company A's business logic.
00:11:34 This business logic was given by Company A. When the batch job is started,
00:11:41 the very first step of the job is to truncate all the tables. Then it inserts new values of
material costs.
00:11:50 After that, it calls the procedures. One of the procedures calculates material costs,
00:11:57 and another procedure analyzes the costs. And the OOM procedure was executed at the highlighted step,
00:12:07 and according to the business logic, the previous query results are inserted into the target
table in this step.
00:12:19 With that, we can summarize our findings as follows. First, all the target tables are left tables
of LEFT OUTER JOIN
00:12:29 in the problematic procedure. Second, according to the business logic,
00:12:35 all the target tables are truncated at the beginning. Third, target tables' data is inserted
using the previous query result.
00:12:45 Lastly, depending on the target table size, the TEMP_TABLE size could be 1 or around
800,000 at the compilation step.
00:12:56 That's the end of unit two of week four. In the next unit,

00:13:01 we will continue to work on the case study 1 about issue conclusion. Thank you for your
attention.
00:13:08 Bye.

Week 4 Unit 3

00:00:05 Hello, and welcome to unit three of week four. Today, in the last part of case study one,
00:00:13 we will continue to work on it to find the main root cause of the issue. From the last unit, we could summarize our findings.
00:00:23 First, all the target tables are left tables of LEFT OUTER JOIN in the problematic procedure.
Second, all the target tables are truncated at the beginning in the business logic.

00:00:38 Third, target tables' data are inserted using previous query results. And lastly, depending on
the target table size,
00:00:48 the temp table size could be small or very large at the compilation step. With the findings
from the last unit, we need to check the timeline of each process.
00:01:02 In order to check the query compilation and execution timeline, we can search the
information in
00:01:08 M_SQL_PLAN_CACHE or HOST_SQL_PLAN_CACHE. However, no plan cache entry was found in M_SQL_PLAN_CACHE
00:01:22 for the time of the OOM issue. Therefore, we can use HOST_SQL_PLAN_CACHE instead.
00:01:31 In order to search HOST_SQL_PLAN_CACHE, we need to know at least the statement
string or statement hash.
00:01:41 For this information, we can use another monitoring view called
M_OUT_OF_MEMORY_EVENTS.
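A minimal sketch of such a lookup; the exact column set of this view varies by revision, but TIME and STATEMENT_HASH are the two pieces of information this analysis needs:

  -- Find when the OOM happened and which statement triggered it
  SELECT TIME, STATEMENT_HASH, MEMORY_REQUEST_SIZE
    FROM M_OUT_OF_MEMORY_EVENTS
   ORDER BY TIME DESC;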
00:01:50 When you search the monitoring view M_OUT_OF_MEMORY_EVENTS, you can find the statement hash that caused the OOM,
00:01:58 and you can also check the time of the OOM. So, we have found the statement hash value,

00:02:07 and also we could check the OOM occurrence time. Okay, now we know the statement
hash value,
00:02:16 so we can search HOST_SQL_PLAN_CACHE using this information. As an SQL statement identifier, the statement hash can be used
00:02:29 to retrieve performance data for the same SQL statement from other monitoring views. So here, we search the statement string, statement hash,
00:02:43 last preparation timestamp, last execution timestamp, and is-valid columns from HOST_SQL_PLAN_CACHE with this statement hash value.
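Assembled as a sketch, the search could look like this; HOST_SQL_PLAN_CACHE is a statistics-service view in the _SYS_STATISTICS schema, and the hash is a placeholder:

  SELECT STATEMENT_STRING,
         STATEMENT_HASH,
         LAST_PREPARATION_TIMESTAMP,
         LAST_EXECUTION_TIMESTAMP,
         IS_VALID
    FROM _SYS_STATISTICS.HOST_SQL_PLAN_CACHE
   WHERE STATEMENT_HASH = '<statement_hash>';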
00:02:57 Then you can find this information. If we make the timeline, then it will be like this.
00:03:09 So the OOM query was compiled on the 18th of February around 3pm, and this compiled plan was used for the execution.
00:03:19 And with the business logic, we can assume that there might have been data insertion between the time when the query was compiled
00:03:28 and the time when the query was executed. So the execution was on the 20th of February at 1:27,
00:03:38 and right after the query execution, the OOM occurred. Let's recall the business logic again.
00:03:47 When the batch job is started, all the target tables are truncated, and the procedure is run,
and the results from the previous query
00:03:56 are inserted into the target tables. And let's look at the timeline of the process.
00:04:05 Then, the following plan was compiled on the 18th of February around 3pm, and this plan was used for the execution, and the OOM occurred on the 20th of February.
00:04:18 Here, the unknown point is that the temp_table size is 1 when the query plan was
generated.
00:04:25 That implies that the tables, including the target table, might not have had any data at the compilation step, considering the business logic.

00:04:35 Therefore, we need to check the time when the target table was truncated. And also, we
can build the following assumption.
00:04:47 The OOM plan and the non-OOM plan are different. In the OOM plan, the temp_table size was 1.
00:04:57 And when we check the business logic, the target tables are truncated at the very beginning of the job.
00:05:07 And the target tables are located on the left side of the LEFT OUTER JOIN when we check the query structure of the problematic part of the procedure.
00:05:20 And through the information in HOST_SQL_PLAN_CACHE, we found that the problematic query was compiled
00:05:28 two days before the OOM query execution. Therefore, we can build the following assumption:
00:05:39 what if the target table size was 0 and the plan was compiled with a different table size? So let's assume that the target table size is 0.
00:05:52 If the target table size is 0, then the result of the LEFT OUTER JOIN would be 0, since the
target table is the left table in LEFT OUTER JOIN.
00:06:03 Then, the temp_table size would also be 0. The final result of the LEFT OUTER JOIN would
be 0 again.
00:06:14 So the plan could have been compiled with a different table size. Then it is important to check when the target table was truncated.
00:06:26 In order to check the target table truncation timeline, we can find information in the
monitoring view called M_EXECUTED_STATEMENTS.
00:06:36 This monitoring view provides all executed statements of DDL operations, including TRUNCATE.
00:06:45 So we can use this monitoring view to check truncation time. Then, as you can see, we
could check the truncation time of the target table.
00:06:59 You can also check the statement string, so you can confirm the target table truncation.
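A sketch of such a lookup; the target table name follows the ZITEMCOST_A example from unit two, and the column names are assumptions to be verified against the M_EXECUTED_STATEMENTS documentation for your revision:

  -- Find when the target table was truncated
  SELECT *
    FROM M_EXECUTED_STATEMENTS
   WHERE STATEMENT_STRING LIKE 'TRUNCATE TABLE%ZITEMCOST_A%'
   ORDER BY TIMESTAMP;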
Then the timeline of the whole process would look like this.
00:07:16 When we recall the business logic, the batch job was started, and target tables were
truncated,
00:07:23 and while the problematic part of the procedure was running, the OOM occurred.
00:07:30 So if we draw the timeline, it will look like this. The target table truncation was done on the 18th of February around 3pm,
00:07:41 and the plan was compiled right after the table truncation. That is, we can conclude that the plan was compiled with small table sizes,
00:07:52 and this plan was used for the query execution. And it might not have been appropriate for the table sizes when the query was executed.
00:08:03 Therefore, OOM occurred. To conclude, this OOM issue was due to the timing
00:08:12 of the truncation, the insertion, and the execution. The problematic query was compiled with different data;
00:08:20 therefore, an inefficient plan was generated, and this led to the OOM. Through this case study, we now know that the data size at query compilation
00:08:32 is important for the query execution. Since different data sets and compiled plans
00:08:39 may lead to different JOIN orders or query processing, we need to make sure the query is compiled with the correct data at the correct time.
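One possible mitigation, sketched under the assumption that the plan ID of the stale cache entry is known: invalidate the cached plan after the data has been loaded, so that the next execution compiles against realistic table sizes.

  -- Force recompilation of a specific cached plan
  ALTER SYSTEM RECOMPILE SQL PLAN CACHE ENTRY <plan_id>;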
00:08:50 That's the end of case study one and unit three of week four. In the next unit, we will
work on another case study on query processing.
00:09:01 Thank you for your attention, and looking forward to meeting you in the next unit.

Week 4 Unit 4

00:00:05 Hello, and welcome to unit four of week four. Today, we will work on the second case study

00:00:12 and we will find out the dominant operator and possible key reducer in this case study. The
case study is about performance slowness for one query.
00:00:26 There is a reproducible scenario, so we will reproduce the issue for performance analysis.
00:00:35 And the goal is to collect a useful trace, and using the trace, we need to find a way to
improve the query performance to less than 100ms.
00:00:48 This is the reproduction scenario of the issue. First, you need to create the data in order to
reproduce the issue.
00:00:56 Here, we will use RAND functions to generate data, and we commit and merge delta. After
that, you can run the problematic query.
00:01:07 We used the RAND function, so you might have different data, but it does not matter in this case,
00:01:14 since this function is only used to make a large volume of data. You can find more information about that on the system information page of this course.
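For illustration, a data-generation script of this kind could look as follows; the table layout, row count, and generator are assumptions, since the actual setup script is on the course's system information page:

  -- Hypothetical setup: one column table filled with random values
  CREATE COLUMN TABLE T1 (A INT, B INT);

  INSERT INTO T1
    SELECT GENERATED_PERIOD_START,   -- running integer 0..99999
           TO_INT(RAND() * 1000)     -- random value per row
      FROM SERIES_GENERATE_INTEGER(1, 0, 100000);

  COMMIT;
  MERGE DELTA OF T1;                 -- move the new rows to the main store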
00:01:29 This query only returns 20 rows, but it took around two seconds to retrieve the data. So for this second case study, the goal is to reduce this execution time to under 100ms.
00:01:46 For the performance issue analysis, the first thing to do is to collect PlanViz
00:01:51 in order to understand the execution tree and how data is processed from the table. Since
the PlanViz shows the query execution process in graphical view,
00:02:03 this is the most useful tool to check out the query execution. In order to capture the
visualized plan, you need to select the query,
00:02:13 right-click and choose Visualize Plan, then click Execute. Then you will see this optimized
plan.
00:02:24 You can see the overview of the plan for this problematic query. At the bottom, there are a column search and an analytical search.
00:02:35 From the column search on the left side, 20 rows are returned, and from the analytical search on the right side, around one million rows are returned.
00:02:45 And those results are processed at another column search, which is located at the top of the hierarchy.
00:02:53 And the final result, 20 rows, is generated. Since the logical plan shows the shape of the
query optimizer tree
00:03:04 and it contains structural information, we will check out the logical plan first.
00:03:11 Then, you will have this inner plan. We call the bottom-left column search column search 1,
00:03:19 the bottom-right analytical search column search 2, and the top column search column search 3.
00:03:30 Column search 1 and column search 2 are processed at the bottom, and their results are processed at the top column search.
00:03:42 And we can draw an optimized plan based on the information from the PlanViz. In column
search 1, table T1 is processed at the very beginning,
00:03:55 and the data from T1 is ordered by T1.A in ascending order. After going through the LIMIT
operator, only 20 rows are returned in column search 1.
00:04:08 And in column search 2, firstly, tables T2 and T3 are processed with INNER JOIN. And the result from the JOIN is processed by the GROUP BY operator.
00:04:22 Lastly, in column search 3, those two results from column search 1 and column search 2 are processed with LEFT OUTER JOIN,
00:04:31 and the final result is processed with the ORDER BY operator; a hypothetical SQL shape of this plan is sketched below. With that, let's find out the dominant column search.
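Before doing so, here is a hypothetical SQL shape that matches the plan just described; every table and column name is a placeholder inferred from the diagram, not the course's actual statement:

  SELECT cs1.A, cs2.TOTAL
    FROM (SELECT A                          -- column search 1
            FROM T1
           ORDER BY A
           LIMIT 20) cs1
    LEFT OUTER JOIN                         -- joined in column search 3
         (SELECT T2.C, SUM(T3.V) AS TOTAL   -- column search 2
            FROM T2
           INNER JOIN T3 ON T2.C = T3.C
           GROUP BY T2.C) cs2
      ON cs1.A = cs2.C
   ORDER BY cs1.A;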

00:04:42 With the knowledge from week two - the visualized plan part - we can find the dominant operator by
following the highlighted orange line.
00:04:50 So, we know this analytical search is the dominant column search. As another way, you can
also find the dominant column search
00:05:02 by comparing the time consumed to process it. As you can see, the exclusive time of
analytical search indicates
00:05:14 that this is the dominant column search because it has the longest exclusive time among
the column searches.
00:05:27 And let's check out the result size of each column search. Here, the analytical search
generates more than one million rows,
00:05:36 and it is obvious that it takes a long time to process one million rows compared to the other column searches.
00:05:45 Therefore, we can say this analytical search is the dominant column search that caused the
performance slowness.
00:05:57 Let's go back to the diagram of the query optimizer tree and PlanViz. Now, we will check out its data flow.
00:06:06 So there are three main column searches, and when we check out their processing times and generated data,
00:06:15 the most time-consuming column search is column search 2. Therefore, the dominant column search is the analytical search, which is column search 2.
00:06:34 In order to check out the data flow of the dominant column search, let's check out its inner plan.
00:06:47 Then you will see this query plan. Let's check out the other column searches' inner plans.
00:06:58 Then, as you can see, in column search 1, the result of T1 is processed with the ORDER BY operator, and a limit is applied.
00:07:09 And when you look at column search 3, it joins column search 1 and column search 2,
00:07:17 and it processes ORDER BY at the end. In today's unit, we reproduced the issue
00:07:25 and collected a visualized plan to understand the query structure. With the visualized plan, we
also found the dominant column search.
00:07:36 With that, in the next unit, we will check out the data flow of this plan and find an SQL hint.
Thank you for your attention.
00:07:47 Bye.

Week 4 Unit 5

00:00:02 Hello, and welcome to unit five of week four. In this unit, with the findings from the last unit,
00:00:10 we will continue to work on case study two. From the last unit, we checked out the query structure
00:00:20 and found that column search 2 is the dominant column search. Now, with this diagram,
let's check out how the data is processed.
00:00:32 For better understanding, we use sample data here. So this group of square boxes represents a table.
00:00:41 Let's say there are records A, B, C, D, and E. I drew only five rows here for convenience,
but there are 100,000 rows in this case.
00:00:55 So firstly, 100,000 records are extracted from the table T1. These records are processed
with the ORDER BY operator.
00:01:07 By going through the ORDER BY operator, the records are sorted. After that, the LIMIT operator is applied.
00:01:18 Then, column search 1 returns only 20 rows in total. Therefore, its exclusive time is short.

00:01:28 Now, let's find out how data is processed in column search 2. So there are tables T2 and T3.

00:01:39 Here again, for convenience, I just described the tables with five rows. So there are records
A, B, C, D, and E in table T2,
00:01:54 and there are records 1, 2, 3, 4, and 5 in T3. These tables are joined with an INNER JOIN.
00:02:05 An INNER JOIN can reduce the results. But here, when we check this part in PlanViz,
00:02:11 this INNER JOIN generates huge intermediate results. So when this JOIN is processed, this
column search 2 generates a huge intermediate result.
00:02:26 And this intermediate result is processed with GROUP BY. In the GROUP BY stage, let's assume the first column is the grouping column;
00:02:39 then it groups the rows. Then the following results remain.
00:02:48 However, after this GROUP BY, the result was not reduced. Normally, when GROUP BY is processed, the result is reduced.
00:02:59 However, this GROUP BY still returns lots of records, and this is a bottleneck in the process.
00:03:08 Even though this GROUP BY still returns lots of records, we suspect that the most time-consuming part would be the INNER JOIN,
00:03:17 since it generated huge intermediate results. After column search 1 and 2 are processed,
00:03:25 with these results, LEFT OUTER JOIN is processed. Let's say these two columns are JOIN
keys.
00:03:34 A JOIN key is a column used in the JOIN condition to process the JOIN. So here, in order to process the LEFT OUTER JOIN,
00:03:42 firstly the JOIN key is matched. Since this is LEFT OUTER JOIN, it returns all the rows in
the left table
00:03:55 and all the matching rows found in the right table. In order to check all the possibilities
regarding this issue,
00:04:04 we are checking matched records and non-matched records. Depending on the actual
records, there can be only matched records
00:04:13 or there can be non-matched records. So here, if there are non-matched rows in the right
table,
00:04:24 then null values are returned. So we could know that the intermediate results can be
reduced
00:04:33 by this LEFT OUTER JOIN because the left child has only 20 rows. Therefore, we can tell
this LEFT OUTER JOIN is a possible key reducer,

00:04:47 which is the part that reduces the results and improves performance. But if
there are many matched rows, then the result won't be reduced.
00:05:00 But here, there are only 20 rows matched; therefore, the result is reduced. From last week, you remember that we can try an SQL hint in order to move an operator.
00:05:16 Here, we can try the SQL hint JOIN_THRU_AGGR to move this possible key reducer into column search 2.
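On the hypothetical query shape sketched in the previous unit, the hint would simply be appended at the end of the statement (again a sketch, not the course's actual query):

  SELECT cs1.A, cs2.TOTAL
    FROM (SELECT A FROM T1 ORDER BY A LIMIT 20) cs1
    LEFT OUTER JOIN
         (SELECT T2.C, SUM(T3.V) AS TOTAL
            FROM T2
           INNER JOIN T3 ON T2.C = T3.C
           GROUP BY T2.C) cs2
      ON cs1.A = cs2.C
   ORDER BY cs1.A
   WITH HINT (JOIN_THRU_AGGR, IGNORE_PLAN_CACHE);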
00:05:27 So, let's try the SQL hint JOIN_THRU_AGGR. Here, we specify the SQL hint JOIN_THRU_AGGR,
00:05:36 as well as IGNORE_PLAN_CACHE so that this plan is not stored in the SQL plan cache. Then the execution time is around 1.68 seconds,
00:05:46 which is slightly faster than the original, but it is still not enough to achieve our goal,
00:05:52 which is less than 100ms. To clarify the structure, we can check the inner plan of the column search.
00:06:03 As you can see, the LEFT OUTER JOIN is still there. It has the same structure as before.

00:06:11 The LEFT OUTER JOIN isn't moved under column search 2. Therefore, the hint does not
work.
00:06:20 So today, as a workaround, we tried the SQL hint JOIN_THRU_AGGR, but it did not change the plan.
00:06:29 That's the end of unit five. In this unit, we tried SQL hint as a workaround, but it did not
work.
00:06:39 In the next unit, we will continue to work on case study two and simulate the query to see why the SQL hint does not work.
00:06:48 Thank you for your attention. Bye.

Week 4 Unit 6

00:00:05 Hello and welcome to unit six of week four. In the last unit,
00:00:10 we checked out how data is processed and tried the SQL hint to make the performance
better.
00:00:16 However, the SQL hint JOIN_THRU_AGGR did not work. In today's session, we will
simulate the query processing with the SQL hint.
00:00:30 In the last unit, we tried the SQL hint JOIN_THRU_AGGR but it did not work.
00:00:37 That is, the LEFT OUTER JOIN was not pushed down to column search #2 and GROUP BY
is still located in column search #2.
00:00:47 In unit two of week three, we discovered that a hint may not lead you to the desired plan
00:00:54 because either it is logically impossible to move the operators, or a different plan than the one you want to make is chosen by cost.
00:01:06 So when it is logically impossible, it means that when the optimizer estimates the results
00:01:11 after moving the query operator as the hint requests, there can be a case where different query results are returned.
00:01:19 Therefore, the optimizer does not allow the operator to be moved. And when we checked the plan after JOIN_THRU_AGGR,
00:01:30 it was not changed, so it was the same plan as the original.
00:01:36 So we can assume that the reason why the hint was not applied was the first reason,
00:01:43 which is that moving the operator is logically impossible: there is a chance that the result may change after applying the hint.
00:01:53 Therefore, the optimizer didn't allow the hint to move the operator. So in order to check
whether this is true,
00:02:03 let's try a simulation. As a solution to the performance issue,
00:02:08 we can try to rewrite the query based on the desired plan. So let's imagine the plan shape and simulate the data flow
00:02:17 using the query plan that we want, which means moving the LEFT OUTER JOIN down.
00:02:28 This is the plan that we want to make: pushing down the LEFT OUTER JOIN
00:02:33 and placing GROUP BY after the LEFT OUTER JOIN. Then, one single column search is made based on the column search processing order,
00:02:44 which is table, join, GROUP BY, and ORDER BY. Firstly, there is nothing to do for column search #1,
00:02:51 so column search #1 remains the same. But for column search #2,
00:02:57 since GROUP BY is placed after the join, one big column search can be made by column search availability.
00:03:04 Therefore, there will no longer be three column searches; there will be only two column searches.
00:03:13 This is the plan that we want, and we will simulate the query result using this desired plan.
00:03:21 So, those tables are processed from the bottom. There are only three rows in this example
for convenience.
00:03:30 But in the real case, there would be 20 rows from column search #1. So please keep this in mind.

00:03:39 The black and grey records are the results from column search #1. The data was extracted
from T1,
00:03:47 and the ORDER BY and LIMIT operations are applied. And when you look at the colored
records,
00:03:53 this is the result after INNER JOIN between T2 and T3. Then, with these results,

00:04:02 we will process LEFT OUTER JOIN. Firstly, it checks its join key column.
00:04:10 Then, through the join key, there will be matched records.
00:04:17 Then, as you can see, the matched records are returned,
00:04:20 and the non-matched records are returned as null values,
00:04:25 since it is a LEFT OUTER JOIN. Now, the GROUP BY operation is processed.
00:04:34 Within the GROUP BY operation, there are the grouping operation and the SUM operation.
00:04:43 From the LEFT OUTER JOIN, the non-matched records are returned as null values. And these null values will now be compared with T2.E and T3.H
00:04:55 due to the CASE WHEN clause in the SUM expression. However, a null value cannot be compared with any other value.
00:05:06 Therefore, this CASE WHEN clause would return 0 when it is compared with T2.E and
T3.H.
00:05:16 Then, the SUM operation will contain only 0, and the sum of 0 will be 0.
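This NULL behavior can be verified with a self-contained experiment on the DUMMY table (an illustration of the rule, not the course query):

  -- NULL = 5 evaluates to UNKNOWN, so the CASE falls through to ELSE
  SELECT CASE WHEN CAST(NULL AS INTEGER) = 5 THEN 1 ELSE 0 END AS case_result, -- returns 0
         SUM(CAST(NULL AS INTEGER)) AS sum_of_null                             -- returns NULL
    FROM DUMMY;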
00:05:27 Therefore, we can get these results. So, let's compare the result from the desired plan
00:05:36 and the result from the original plan. As you can see,
00:05:40 the final result is different from that of the original plan.
00:05:45 If the hint is applied as we want, LEFT OUTER JOIN is processed first
00:05:51 and the GROUP BY is processed after the join, therefore it returns 0.
00:05:57 On the other hand, when you look at the original query and its processing,
00:06:02 the aggregation is processed first and the LEFT OUTER JOIN is applied.
00:06:07 Therefore, it returns the null value. So, GROUP BY cannot be moved
00:06:14 since it changes the result. If it returns the same result,
00:06:18 then it may be possible to move the GROUP BY. That was the end of unit six of week four.

00:06:26 In this unit, we simulated the query result with the SQL hint and found the reason why the hint was not applied.
00:06:35 That is because there is a chance that it would return different query results. In the next unit,
00:06:42 we will find out the solution for this case study. Thank you for your attention
00:06:48 and looking forward to meeting you there. Bye.

Week 4 Unit 7

00:00:05 Hello and welcome to unit seven, the last unit of this course week,
00:00:09 and the last unit of this openSAP course. To continue on case study two,
00:00:16 we will find out the solution in this unit. In unit four of week four,
00:00:24 we found that the dominant column search is column search #2. And here, the right table of the LEFT OUTER JOIN in column search #3 is big.
00:00:34 Actually, the data comes from the left table by matching the results in the right table. If there are no matched results,
00:00:42 it only brings the results of the left table. In this case,
00:00:48 "What if we make matched records to be processed in column search #2 in order to reduce
the number of records to be materialized in column search #2?"
00:01:03 If the matched results come from column search #2 first, before processing LEFT OUTER
JOIN,
00:01:10 then we assume that the result size of column search #2 would be small and the processing
time would be fast.
00:01:21 Then, let's see how we can make the matched records. We can use INNER JOIN.
00:01:30 By adding INNER JOIN, we can get the matched records from column search #2 before
LEFT OUTER JOIN,
00:01:37 then the result size of column search #2 would be reduced. Therefore, the record size that
is processed in LEFT OUTER JOIN would be smaller.
00:01:49 When we look at the matched result, since this is LEFT OUTER JOIN,
00:01:53 the matched records come from INNER JOIN with T1. Therefore, as a join key,
00:02:00 we can use T1.A = T4.C. If we make an additional INNER JOIN within column search #2,
00:02:09 then we can get the matched records first, before they are processed by the LEFT OUTER JOIN.
00:02:17 Therefore, we can rewrite the query. We know that LEFT OUTER JOIN is a possible key
reducer
00:02:23 and the matched records after LEFT OUTER JOIN are small. So we want these matched
records to be processed earlier than LEFT OUTER JOIN.
00:02:33 That is, in a subquery. As we checked in the previous slides,
00:02:39 we can find the matched records with INNER JOIN and we want to put this INNER JOIN
into column search #2.
00:02:47 And in this query, column search #2 is the subquery part. Therefore, we can add the INNER JOIN within the subquery, as sketched below.
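A hedged sketch of such a rewrite, again using the hypothetical query shape from unit four; note that the extra INNER JOIN is only safe if it does not duplicate rows inside the aggregation, so here we assume the join key T1.A is unique:

  SELECT cs1.A, cs2.TOTAL
    FROM (SELECT A FROM T1 ORDER BY A LIMIT 20) cs1
    LEFT OUTER JOIN
         (SELECT T2.C, SUM(T3.V) AS TOTAL
            FROM T2
           INNER JOIN T3 ON T2.C = T3.C
           INNER JOIN T1 ON T1.A = T2.C   -- added: keep only keys that can match
           GROUP BY T2.C) cs2
      ON cs1.A = cs2.C
   ORDER BY cs1.A;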
00:02:59 So we can rewrite the query along these lines. By making an additional INNER JOIN,
00:03:06 this join will find the matched records first. Then we can reduce the intermediate results to
process this query,
00:03:14 and this results in fast execution. If we add the INNER JOIN in column search #2,
00:03:23 as you can see, we can get this query tree.
00:03:28 When you check out the PlanViz, you can find the result size of column search #2 is now
reduced.
00:03:38 Also, the execution time is reduced to under 100 milliseconds: it is now 11 milliseconds.
00:03:47 And the inner logical plan also shows that the INNER JOIN is added in column search #2. Finally, we got the desired plan,
00:03:57 and the execution time was reduced by rewriting the query. Therefore, adding the INNER JOIN in column search #2
00:04:07 reduced the intermediate results and execution time. With this, we've come to the end of
unit seven

00:04:15 and the end of this openSAP course. I would like to thank my colleagues Jinyeon and
Sukhyun
00:04:23 who helped to prepare and present this course. We hope you enjoyed learning with us.
00:04:32 Thank you for your attention, and we wish you now good luck for the final exam.
00:04:38 Thank you and goodbye.

© 2020 SAP SE or an SAP affiliate company. All rights reserved.