
Parallel Processing in ABAP

Every software product and programming language has limitations beyond which a developer cannot achieve the expected output. Luckily, there are usually indirect ways or workarounds to get closer to the desired result.

I had a tough time discussing performance expectations for the SAP system and its runtime with our customer, because their calculations were unreasonable. They repeatedly compared SAP with their small non-SAP systems, each built around a single business process. The expectation was typically this: if a sales order can be created in 1 second in a file-based legacy or non-SAP system, the same should be possible in SAP ERP. It is sometimes difficult to convey that not everything can be done in a fraction of a second, because an ERP system is logically and technically tied to many interdependent components. Customers are usually aware of this; in fact it is the main reason many large customers choose SAP: they get functionality from Planning to HR from a single powerful box. Still, the business also demands speed and performance.

In our case, the requirement was to create about 500 outbound deliveries for sales orders within 1 hour, because the invoices were created in the customer's non-SAP system and their trucks would be waiting outside the warehouse for the invoice printouts. We initially processed everything sequentially, and the total response time was far from the target. Even after fine-tuning the code, we were sure we had reached the limit of that approach. Finally, we decided to use parallel processing with asynchronous RFC (aRFC). With this method, a single piece of functionality can run in multiple work processes at the same time within one ABAP system.

So if a program takes 1 minute to create 1 sales order and 20 minutes to create 20 sales orders, you can instead process it in parallel and get the same result (20 sales orders) in about 4 minutes by distributing the 20 sales orders equally across 5 processes that run at the same time.

Parallel processing is explained very well in the SAP Help; some documents from SDN also helped us configure it properly at the start. There were also suggestions for system-level configuration so that the load is distributed evenly across different application servers. However, we cannot always expect customers to have a rich system landscape with many servers, so the program itself must also be designed to achieve the best possible results.

Coming back to our scenario: our first tests were successful and the results were close to the targets, but we still could not meet the exact response times. The real challenge was not making the program capable of parallel processing, but deciding dynamically how to run the work processes simultaneously. We had to keep two things in mind. First, the number of work processes should vary dynamically with the load on the system or the number of users working in it. Second, the program must handle locking when the same data (for example, the same sales order in different deliveries) conflicts with other running processes, which might hold a lock for a long time.

I describe three programmatic flavors of parallel processing here, so that you can decide which one suits your situation. All three can also be combined.

Before going into details, some common steps:

Define the RFC group in RZ12 (if needed), dedicated to our program. It should be configured so that it does not occupy too many work processes in the system and can run on different application servers if resources are short. This is generally done by a Basis person.
* Inside the program, initialize the group and see how many work
* processes are free, then divide the data accordingly
CALL FUNCTION 'SPBT_INITIALIZE'
  EXPORTING
    group_name   = 'RFC_GROUP'
  IMPORTING
    max_pbt_wps  = lv_max_wps
    free_pbt_wps = lv_free_wps.

Optional: a second-level check against a Z customizing table that defines how many processes our program should use for distributing the data. This table is generally maintained by business users. If they find that the program is being run for a very small amount of data while many users are working in the system, or that the amount of data is very high and more work processes are needed for faster results, they can change the parameter in the customizing table.
SELECT SINGLE tasks FROM z_cust_tasks INTO lv_tasks
  WHERE rfcgroup = 'RFC_GROUP'. "just a field
* If the user has set more processes than are currently free in the system
IF lv_tasks > lv_free_wps.
  lv_tasks = lv_free_wps.
ENDIF.
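The common steps can be put together roughly as follows. This is only a sketch under assumptions: the group name RFC_GROUP, the customizing table z_cust_tasks and its fields tasks and rfcgroup are the illustrative names used in this article and must exist in your system, and the fallback to the number of free work processes when no customizing entry is found is my own assumption.

```abap
REPORT z_parallel_init_sketch.

CONSTANTS gc_group TYPE rzlli_apcl VALUE 'RFC_GROUP'.

DATA: lv_max_wps  TYPE i,
      lv_free_wps TYPE i,
      lv_tasks    TYPE i.

START-OF-SELECTION.
* Initialize the RFC group and ask how many work processes are free
  CALL FUNCTION 'SPBT_INITIALIZE'
    EXPORTING
      group_name                     = gc_group
    IMPORTING
      max_pbt_wps                    = lv_max_wps
      free_pbt_wps                   = lv_free_wps
    EXCEPTIONS
      invalid_group_name             = 1
      internal_error                 = 2
      pbt_env_already_initialized    = 3
      currently_no_resources_avail   = 4
      no_pbt_resources_found         = 5
      cant_init_different_pbt_groups = 6
      OTHERS                         = 7.
  IF sy-subrc <> 0.
    WRITE: / 'RFC group could not be initialized, sy-subrc =', sy-subrc.
  ELSE.
*   z_cust_tasks is the hypothetical customizing table from the text
    SELECT SINGLE tasks FROM z_cust_tasks INTO lv_tasks
      WHERE rfcgroup = gc_group.
    IF sy-subrc <> 0 OR lv_tasks < 1.
*     Assumption: without a customizing entry, use all free processes
      lv_tasks = lv_free_wps.
    ENDIF.
*   Never plan more tasks than work processes are currently free
    IF lv_tasks > lv_free_wps.
      lv_tasks = lv_free_wps.
    ENDIF.
    WRITE: / 'Tasks to start:', lv_tasks.
  ENDIF.
```

Checking sy-subrc after SPBT_INITIALIZE matters because lv_free_wps is only meaningful when the call succeeded.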

Now we can move on to the actual aRFC call that achieves the parallel processing. In the examples below, the sales order is the object to be distributed.

1st Simple Pattern - Divide and Rule :


Divide the objects/lines (for example, sales orders) into packets of equal size and distribute these packets to the available tasks/processes. In this example, if we have 20 sales orders and know that 5 work processes are free, we distribute 20/5 = 4 sales orders to each task. So the ABAP code will be something like:
* Check how many lines are to be processed
DESCRIBE TABLE lt_sales_orders LINES lv_lines.
* Decide how many lines (sales orders) to submit in one task/process
lv_packet_size = lv_lines / lv_tasks.
* Collect the lines for each process based on the packet size
lv_from = 1.               "initially, for the first packet
lv_to   = lv_packet_size.  "for the first packet
* First packet
LOOP AT lt_sales_orders FROM lv_from TO lv_to.
  APPEND lt_sales_orders TO lt_so_for_one_packet.
ENDLOOP.
DO.
* The FM creates deliveries for the input sales orders;
* it is processed in a separate task via aRFC
  CALL FUNCTION 'Z_DELIVERIES_IN_PARALLEL'
    STARTING NEW TASK gv_taskname
    DESTINATION IN GROUP 'RFC_GROUP'
    PERFORMING task_return_deliveries ON END OF TASK
    TABLES
      sales_orders          = lt_so_for_one_packet
    EXCEPTIONS
      resource_failure      = 1
      system_failure        = 2
      communication_failure = 3
      OTHERS                = 4.
  CASE sy-subrc.
    WHEN 0.
*     Administration of asynchronous tasks
      gv_taskname   = gv_taskname + 1.
      gv_send_tasks = gv_send_tasks + 1.
      REFRESH lt_so_for_one_packet.
*     Take the next packet of lines for processing in the next task
      lv_from = lv_to + 1.
      lv_to   = lv_from + lv_packet_size - 1.
      LOOP AT lt_sales_orders FROM lv_from TO lv_to.
        APPEND lt_sales_orders TO lt_so_for_one_packet.
      ENDLOOP.
      IF sy-subrc NE 0.
        EXIT. "all lines are processed: leave the DO loop
      ENDIF.
    WHEN 1 OR 2 OR 3.
*     Wait for the running tasks to return, but not longer than one
*     minute, then retry the same packet. gv_recv_tasks is incremented
*     in the return subroutine, which is called automatically at the
*     end of each submitted task
      lv_seconds = 0.
      DO.
        WAIT UNTIL gv_recv_tasks = gv_send_tasks UP TO 1 SECONDS.
        IF sy-subrc = 0.
          EXIT.
        ENDIF.
        lv_seconds = lv_seconds + 1.
        IF lv_seconds >= 60.
*         Report an error or wait longer, then leave the wait loop
          EXIT.
        ENDIF.
      ENDDO.
    WHEN OTHERS.
*     Unexpected error: report it and stop sending
      EXIT.
  ENDCASE.
ENDDO.
* Wait until all submitted tasks have returned their results
WAIT UNTIL gv_recv_tasks >= gv_send_tasks.
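The division lv_lines / lv_tasks works out evenly for 20 sales orders and 5 tasks. When the numbers do not divide evenly, assigning the result of / to an integer field rounds it (22 / 5 becomes 4), which produces six packets for five tasks. The following sketch, with hypothetical sample values, rounds the packet size up instead so that the number of packets never exceeds the number of tasks:

```abap
REPORT z_packet_size_sketch.

DATA: lv_lines       TYPE i VALUE 22,  "hypothetical number of sales orders
      lv_tasks       TYPE i VALUE 5,   "hypothetical number of free tasks
      lv_packet_size TYPE i,
      lv_from        TYPE i,
      lv_to          TYPE i.

START-OF-SELECTION.
* Guard against a division by zero when no task is available
  IF lv_tasks < 1.
    lv_tasks = 1.
  ENDIF.

* Ceiling division: round the packet size up, not to the nearest integer
  lv_packet_size = ( lv_lines + lv_tasks - 1 ) DIV lv_tasks.

* Print the line range of every packet; the last one may be smaller
  lv_from = 1.
  WHILE lv_from <= lv_lines.
    lv_to = lv_from + lv_packet_size - 1.
    IF lv_to > lv_lines.
      lv_to = lv_lines.
    ENDIF.
    WRITE: / 'Packet lines', lv_from, 'to', lv_to.
    lv_from = lv_to + 1.
  ENDWHILE.
```

With these values the packet size is 5, so the last packet simply carries the two remaining lines.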

This is a simple approach and should be used when you are sure that every parallel task will take more or less the same time. The packet size is almost the same for every task. The advantage is that with 5 parallel tasks there will always be exactly 5 RFC connections (as the RFC FM is called 5 times). The disadvantage shows when the tasks finish at different times, for example task 3 first, then the 2nd, the 1st and finally the 5th: the program has to wait until all 5 work processes have returned to the caller, because the subsequent logic depends on their results. This might look like:

[Figure: tasks finishing at different times]

2nd Pattern - Arrange and Submit


A small piece of ABAP code added to the above approach can help us divide the packets among the parallel processes in a smarter way. How well this works depends on how far the actual business functionality and related factors support arranging the packets. For example, in the above case, one outbound delivery with 200 items will of course take more processing time than a delivery with 2 items. So if I make sure that the packet with 200 items is submitted first, and the remaining smaller packets are grouped together in subsequent tasks, I will probably achieve better results.

So the ABAP code (a small algorithm combining the highest and lowest item counts) will now be:
* lt_total_items_desc contains each sales order and its number of
* items, arranged in descending order
SORT lt_total_items_desc BY no_items DESCENDING.
* Make sure that the same lines/sales orders of lt_total_items_desc
* are not present in lt_total_items_asc
lv_index = lv_tasks + 1.
LOOP AT lt_total_items_desc FROM lv_index.
  APPEND lt_total_items_desc TO lt_total_items_asc.
ENDLOOP.
* Arrange this table in ascending order, so that the sales order with
* the lowest number of items is in the first line, and so on
SORT lt_total_items_asc BY no_items ASCENDING.
* Preparation for the first packet
lv_index_desc = 1.
lv_index_asc  = 1.
READ TABLE lt_total_items_desc INDEX lv_index_desc.
* lr_range_vbeln is a range table: fill sign and option once, they
* stay in the header line after REFRESH
lr_range_vbeln-sign   = 'I'.
lr_range_vbeln-option = 'EQ'.
lr_range_vbeln-low    = lt_total_items_desc-vbeln.
APPEND lr_range_vbeln.
* This is the sales order with the highest number of items; it is
* submitted first and serves as the base for sizing the remaining packets
lv_max_count = lt_total_items_desc-no_items.
DO lv_tasks TIMES.
  LOOP AT lt_sales_orders WHERE vbeln IN lr_range_vbeln.
    APPEND lt_sales_orders TO lt_so_for_one_packet.
  ENDLOOP.
* ------> call the aRFC FM, exporting the sales orders of the current packet
  IF sy-subrc EQ 0.
*   Now combine the lines of the input tables in such a way that
*   every packet gets the best mix of processing load
    REFRESH: lr_range_vbeln, lt_so_for_one_packet.
    CLEAR: lv_count_desc, lv_count_asc.
*   Index of the next-highest sales order
    lv_index_desc = lv_index_desc + 1.
    READ TABLE lt_total_items_desc INDEX lv_index_desc.
    lv_count_desc = lt_total_items_desc-no_items.
    lr_range_vbeln-low = lt_total_items_desc-vbeln.
    APPEND lr_range_vbeln.
*   Take the next-lowest sales orders and combine them with the sales
*   order above until the highest count is reached
    LOOP AT lt_total_items_asc FROM lv_index_asc.
      lv_count_asc = lv_count_asc + lt_total_items_asc-no_items.
*     If the next-higher plus the next-lower sales orders exceed the
*     highest number of items, submit the combination collected so far
      IF ( lv_count_desc + lv_count_asc ) > lv_max_count.
        EXIT. "exit from loop
      ENDIF.
      lv_index_asc = lv_index_asc + 1.
      lr_range_vbeln-low = lt_total_items_asc-vbeln.
      APPEND lr_range_vbeln.
    ENDLOOP.
  ENDIF.
ENDDO.
* Sales orders still left in lt_total_items_asc at this point have not
* been assigned to a packet and must be submitted separately

Although the above example is very scenario dependent and uses the number of items in a sales order as its baseline, a similar algorithm can generally be applied to get the best mix and match for the packets in other scenarios as well.

Processing of the parallel tasks after the above arrangement:

[Figure: processing of the parallel tasks after the above arrangement]
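As one possible variation of the mix-and-match idea, the sketch below uses a different, well-known heuristic rather than the algorithm shown above: sort the orders by item count in descending order and always give the next order to the packet with the smallest load so far. The table layouts, the sample orders and the number of tasks are hypothetical, and the item count is only assumed to be a fair estimate of processing time.

```abap
REPORT z_balance_packets_sketch.

TYPES: BEGIN OF ty_order,
         vbeln    TYPE c LENGTH 10,
         no_items TYPE i,
         packet   TYPE i,
       END OF ty_order,
       BEGIN OF ty_load,
         packet TYPE i,
         items  TYPE i,
       END OF ty_load.

DATA: lt_orders TYPE STANDARD TABLE OF ty_order,
      ls_order  TYPE ty_order,
      lt_load   TYPE STANDARD TABLE OF ty_load,
      ls_load   TYPE ty_load,
      lv_tasks  TYPE i VALUE 3,
      lv_tabix  TYPE sy-tabix.

START-OF-SELECTION.
* Hypothetical sample data: sales order number and item count
  ls_order-vbeln = '0000000001'. ls_order-no_items = 200. APPEND ls_order TO lt_orders.
  ls_order-vbeln = '0000000002'. ls_order-no_items = 2.   APPEND ls_order TO lt_orders.
  ls_order-vbeln = '0000000003'. ls_order-no_items = 120. APPEND ls_order TO lt_orders.
  ls_order-vbeln = '0000000004'. ls_order-no_items = 80.  APPEND ls_order TO lt_orders.
  ls_order-vbeln = '0000000005'. ls_order-no_items = 40.  APPEND ls_order TO lt_orders.

* One load counter per packet (one packet per task)
  DO lv_tasks TIMES.
    ls_load-packet = sy-index.
    ls_load-items  = 0.
    APPEND ls_load TO lt_load.
  ENDDO.

* Largest orders first
  SORT lt_orders BY no_items DESCENDING.

  LOOP AT lt_orders INTO ls_order.
    lv_tabix = sy-tabix.
*   Bring the least loaded packet to the first line
    SORT lt_load BY items ASCENDING packet ASCENDING.
    READ TABLE lt_load INTO ls_load INDEX 1.
*   Assign the order to that packet and add its items to the load
    ls_order-packet = ls_load-packet.
    MODIFY lt_orders FROM ls_order INDEX lv_tabix TRANSPORTING packet.
    ls_load-items = ls_load-items + ls_order-no_items.
    MODIFY lt_load FROM ls_load INDEX 1.
  ENDLOOP.

* Show which order goes into which packet
  SORT lt_orders BY packet vbeln.
  LOOP AT lt_orders INTO ls_order.
    WRITE: / ls_order-packet, ls_order-vbeln, ls_order-no_items.
  ENDLOOP.
```

Unlike the code above, this variant assigns every order to some packet, so nothing is left over after the last task.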

3rd Pattern - First Come, Second Serve Basis


In the above case, we had a criterion for determining the number of lines to submit to each parallel task. However, this may not be applicable to every business scenario, and an improper arrangement of lines can even lead to a longer runtime than expected. So in this approach we do not collect multiple lines for one task or one aRFC FM call. Instead, we submit a single line/object to the aRFC FM and submit the next one as soon as any of the already submitted tasks returns. For example, we initially submit 5 sales orders in 5 consecutive parallel tasks, and as soon as any one of them has been processed, we submit the 6th one under the name of the task that has just finished. This continues until all sales orders are processed. Remember that we still have just 5 free work processes and have to manage with this number to process 20 sales orders.
LOOP AT lt_sales_orders.
* One packet = one sales order only
  APPEND lt_sales_orders TO lt_so_for_one_packet.
* Call the aRFC FM for a single object (sales order) only
  CALL FUNCTION 'Z_DELIVERIES_IN_PARALLEL'
    STARTING NEW TASK gv_taskname
    DESTINATION IN GROUP 'RFC_GROUP'
    PERFORMING task_return_deliveries ON END OF TASK
    TABLES
      sales_orders          = lt_so_for_one_packet "only one line
    EXCEPTIONS
      resource_failure      = 1
      system_failure        = 2
      communication_failure = 3
      OTHERS                = 4.
  CASE sy-subrc.
    WHEN 0.
*     Administration of asynchronous tasks:
*     gt_tasklist contains the name and status of every task
      CLEAR gs_tasklist.
      gs_tasklist-taskname = gv_taskname.
      APPEND gs_tasklist TO gt_tasklist.
      gv_send_tasks = gv_send_tasks + 1.
      REFRESH lt_so_for_one_packet.
*     In the first round every task gets a new name; after that the
*     task name is taken from an old, finished task (see below)
      IF lv_tasks > 0.
        gv_taskname = gv_taskname + 1.
        lv_tasks = lv_tasks - 1.
      ENDIF.
    WHEN 1 OR 2 OR 3.
*     Wait until all results are back, but not longer than one minute.
*     The unsent sales order stays in lt_so_for_one_packet and is
*     sent together with the next one
      WAIT UNTIL gv_recv_tasks = gv_send_tasks UP TO 60 SECONDS.
  ENDCASE.
* Example: if the first 5 parallel processes are already started, wait
* for any one of them to finish and submit the next packet to it
  IF lv_tasks = 0.
*   gv_received is set in the return subroutine as soon as any
*   submitted task has returned (see below)
    WAIT UNTIL gv_received >= 1.
    READ TABLE gt_tasklist INTO gs_tasklist WITH KEY processed = 'F'.
    IF sy-subrc EQ 0.
*     Keep the same name for the new task
      gv_taskname = gs_tasklist-taskname.
      DELETE gt_tasklist INDEX sy-tabix.
*     Adjust the count of received tasks, as we are sending a new task
      gv_received = gv_received - 1.
    ENDIF.
  ENDIF.
ENDLOOP.
* Wait until the last submitted tasks have returned as well
WAIT UNTIL gv_recv_tasks >= gv_send_tasks.

The subroutine task_return_deliveries might look as follows:


FORM task_return_deliveries USING uv_taskname.
  ....
  RECEIVE RESULTS FROM FUNCTION 'Z_DELIVERIES_IN_PARALLEL' ...
* gv_recv_tasks holds the total number of received packets
  gv_recv_tasks = gv_recv_tasks + 1. "receiving data
* gv_received is adjusted every time a new packet is sent in place
* of an old one
  gv_received = gv_received + 1.
  READ TABLE gt_tasklist INTO gs_tasklist WITH KEY taskname = uv_taskname.
  IF sy-subrc = 0. "register data
    gs_tasklist-processed = 'F'. "finished
    MODIFY gt_tasklist FROM gs_tasklist INDEX sy-tabix.
  ENDIF.
ENDFORM.

The obvious advantage of this approach is that every work process is properly utilized and the load is shared equally. However, the number of RFC connections is higher in this case, because aRFC is called for every line/object.

Hope these examples will help in writing better programs to deal with parallel processing.

Comments:

How did you handle locking errors?

In this particular scenario, the same sales order could belong to multiple deliveries. The coding had to be adjusted a little to make sure that the same or common sales orders go into the same thread, so that two threads running in parallel do not fight for a lock. Since this still cannot be guaranteed 100%, we also introduced wait logic: we check beforehand (before the delivery program would raise the error) whether a lock exists, and if so, we wait for some time (the waiting time is again customizable). If the wait for the lock to be released exceeds the limit set in customizing, we raise the error. So basically the aim was to avoid locking issues. I would say locking is one of the main criteria to consider from the very beginning before going for parallel processing.
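The wait logic described in this answer could look roughly like the sketch below. It is not the original coding: it assumes that the standard sales order lock object EVVBAKE (function modules ENQUEUE_EVVBAKE and DEQUEUE_EVVBAKE) is the lock in question, and it takes the maximum waiting time from a selection parameter instead of a customizing table. The lock is probed by trying to set it and releasing it again immediately.

```abap
REPORT z_lock_wait_sketch.

PARAMETERS: p_vbeln TYPE vbeln_va OBLIGATORY,
            p_wait  TYPE i DEFAULT 30.   "maximum wait in seconds

DATA: lv_waited TYPE i,
      lv_free   TYPE c LENGTH 1.

START-OF-SELECTION.
  DO.
*   Try to set the lock; foreign_lock means another process holds it
    CALL FUNCTION 'ENQUEUE_EVVBAKE'
      EXPORTING
        vbeln          = p_vbeln
      EXCEPTIONS
        foreign_lock   = 1
        system_failure = 2
        OTHERS         = 3.
    IF sy-subrc = 0.
*     The order was free: release the probe lock and stop waiting
      CALL FUNCTION 'DEQUEUE_EVVBAKE'
        EXPORTING
          vbeln = p_vbeln.
      lv_free = 'X'.
      EXIT.
    ENDIF.
*   Give up once the configured waiting time is used up
    IF lv_waited >= p_wait.
      EXIT.
    ENDIF.
    WAIT UP TO 1 SECONDS.
    lv_waited = lv_waited + 1.
  ENDDO.

  IF lv_free = 'X'.
    WRITE: / 'Sales order is not locked:', p_vbeln.
  ELSE.
    WRITE: / 'Sales order still locked after waiting:', p_vbeln.
  ENDIF.
```

Such a check only narrows the window for a lock conflict; another process can still take the lock between the check and the actual delivery creation, which is why the packets should be arranged to keep common sales orders in one thread in the first place.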

Parallel Processing in ABAP


Every ABAPer has surely run into performance-tuning issues. Usually we rely on the familiar optimization methods. This article covers parallel processing in ABAP with asynchronous RFC function modules, a slightly different perspective on performance tuning. The idea is to call a remote-capable function module asynchronously through the RFC interface. To understand the concept, we take a simple example in which we use BAPI_MATERIAL_GET_DETAIL to fetch the material description for a material number, and then try to optimize the performance using parallel processing. The program WITHOUT any parallel processing looks like this:
_____________________________________________________________________

REPORT zmaterial_without_parallel.

TABLES: mara.

DATA: bapimatdoa TYPE bapimatdoa,
      bapireturn TYPE bapireturn.

SELECT-OPTIONS s_matnr FOR mara-matnr.

LOOP AT s_matnr.
  CALL FUNCTION 'BAPI_MATERIAL_GET_DETAIL'
    EXPORTING
      material              = s_matnr-low
    IMPORTING
      material_general_data = bapimatdoa.
  WRITE: bapimatdoa-matl_desc.
ENDLOOP.

_____________________________________________________________________
Here, during each loop pass, control waits for the function module to return its value; only then does the loop continue with the next record. In this case only one work process is busy executing your program; other work processes are not used even if they are sitting idle.

Now for the parallel version. Let us look at the program first and then go through it in detail.
_____________________________________________________________________

REPORT zmaterial_display_parallel.

TABLES: mara.

DATA: bapimatdoa TYPE bapimatdoa,
      bapireturn TYPE bapireturn.

DATA: system      TYPE rzlli_apcl,
      taskname(8) TYPE c,
      index(3)    TYPE c,
      snd_jobs    TYPE i,
      rcv_jobs    TYPE i,
      exc_flag    TYPE i,
      mess        TYPE c LENGTH 80.

TYPES: BEGIN OF type_material,
         desc TYPE maktx,
       END OF type_material.

DATA: material TYPE TABLE OF type_material WITH HEADER LINE.

DATA: functioncall1(1) TYPE c.
CONSTANTS: done(1) TYPE c VALUE 'X'.

SELECT-OPTIONS s_matnr FOR mara-matnr.

system = 'parallel_generators'. "RFC server group

LOOP AT s_matnr.
  index = sy-tabix.
  CONCATENATE 'Task' index INTO taskname. "generate a unique task name
  CALL FUNCTION 'BAPI_MATERIAL_GET_DETAIL'
    STARTING NEW TASK taskname
    DESTINATION IN GROUP system
    PERFORMING set_function1_done ON END OF TASK
    EXPORTING
      material              = s_matnr-low
    EXCEPTIONS
      system_failure        = 1 MESSAGE mess
      communication_failure = 2 MESSAGE mess
      resource_failure      = 3.
  CASE sy-subrc.
    WHEN 0.
      snd_jobs = snd_jobs + 1.
    WHEN 1 OR 2.
      MESSAGE mess TYPE 'I'.
    WHEN 3.
      IF snd_jobs >= 1 AND exc_flag = 0.
        exc_flag = 1.
        WAIT UNTIL rcv_jobs >= snd_jobs
             UP TO 5 SECONDS.
      ENDIF.
      IF sy-subrc = 0.
        exc_flag = 0.
      ELSE.
        MESSAGE 'Resource failure' TYPE 'I'.
      ENDIF.
    WHEN OTHERS.
      MESSAGE 'Other error' TYPE 'I'.
  ENDCASE.
ENDLOOP.

* Wait until all asynchronous calls have returned
WAIT UNTIL rcv_jobs >= snd_jobs.

LOOP AT material.
  WRITE: material-desc.
ENDLOOP.

* Callback routine, executed when an asynchronous task has finished
FORM set_function1_done USING taskname.
  rcv_jobs = rcv_jobs + 1.
  RECEIVE RESULTS FROM FUNCTION 'BAPI_MATERIAL_GET_DETAIL'
    IMPORTING
      material_general_data = bapimatdoa
      return                = bapireturn.
  functioncall1 = done.
  material-desc = bapimatdoa-matl_desc.
  APPEND material.
ENDFORM.

_____________________________________________________________________
Things to keep in mind before we start coding:

RFC server group

For group, you must specify a data object of the type RZLLI_APCL from the ABAP Dictionary. This is usually one of the RFC server groups created in transaction RZ12; in our case it is parallel_generators. For each asynchronous RFC in which the group is specified, the most suitable application server is determined automatically, and the called function module is executed on it. Below are the values configured for the server group we are using.

[Figure: values configured in RZ12 for the server group]

SYNTAX:

CALL FUNCTION func STARTING NEW TASK task
  [DESTINATION {dest|{IN GROUP {group|DEFAULT}}}]
  parameter_list
  [{PERFORMING subr}|{CALLING meth} ON END OF TASK].

With this statement, you instruct the SAP system to process function module calls in parallel. Typically, you place it in a loop that divides the data to be processed into work packets. The calling program continues after the CALL FUNCTION statement as soon as the remotely called function has been started in the target system, without waiting for its processing to finish.

CALL FUNCTION 'BAPI_MATERIAL_GET_DETAIL' STARTING NEW TASK taskname ... starts the function module under a distinct task name in a separate work process. We have defined the subroutine set_function1_done as the callback routine; it is executed in the calling program after the asynchronously called function module has finished. For subr, you must directly specify a subroutine of the same program. For meth, you can enter the same specifications as for the general method call.
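The call-and-callback cycle can be tried out in isolation with a minimal sketch like the one below. It is an illustration under assumptions, not part of the material program: it calls the standard function module RFC_SYSTEM_INFO once, uses the DEFAULT server group instead of a dedicated one, and the task name INFO1 and all variable names are hypothetical. It also handles the exceptions of RECEIVE RESULTS, which the listing above leaves out.

```abap
REPORT z_arfc_callback_sketch.

DATA: gs_info     TYPE rfcsi,
      gv_mess     TYPE c LENGTH 80,
      gv_snd_jobs TYPE i,
      gv_rcv_jobs TYPE i.

START-OF-SELECTION.
* Start the function module asynchronously in its own task
  CALL FUNCTION 'RFC_SYSTEM_INFO'
    STARTING NEW TASK 'INFO1'
    DESTINATION IN GROUP DEFAULT
    PERFORMING rfc_info_done ON END OF TASK
    EXCEPTIONS
      system_failure        = 1 MESSAGE gv_mess
      communication_failure = 2 MESSAGE gv_mess
      resource_failure      = 3.
  IF sy-subrc = 0.
    gv_snd_jobs = gv_snd_jobs + 1.
  ELSE.
    WRITE: / 'Task could not be started:', gv_mess.
  ENDIF.

* The program continues immediately; the callback is processed
* while the program waits here
  WAIT UNTIL gv_rcv_jobs >= gv_snd_jobs.
  WRITE: / 'Answer received from host:', gs_info-rfchost.

* Callback routine: fetch the results of the finished task
FORM rfc_info_done USING uv_taskname TYPE clike.
  gv_rcv_jobs = gv_rcv_jobs + 1.
  RECEIVE RESULTS FROM FUNCTION 'RFC_SYSTEM_INFO'
    IMPORTING
      rfcsi_export          = gs_info
    EXCEPTIONS
      system_failure        = 1 MESSAGE gv_mess
      communication_failure = 2 MESSAGE gv_mess
      OTHERS                = 3.
  IF sy-subrc <> 0.
*   The task ended with an error: do not use a stale result
    CLEAR gs_info.
  ENDIF.
ENDFORM.
```

Counting the received task before RECEIVE RESULTS, as here, makes sure the waiting program is released even when the task itself failed.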

The statement WAIT UNTIL rcv_jobs >= snd_jobs makes sure that all asynchronous RFC calls have completed, after which we can continue with the remaining logic of the program.

Result: The runtime analyses of both programs are given below. Both programs were executed with 200 material numbers as input.

[Figure: runtime analysis with parallel processing]

[Figure: runtime analysis without parallel processing]

Clearly, with 200 materials as input, the program with parallel processing is around 50% more efficient than the sequential program.

Next time you face a similar scenario, don't feel helpless: come back, read this again, and surprise your functional tracker with the improved performance.