Implementing Parallel Processing in PeopleSoft - Part I/III

What is parallel processing?

You might have seen batch jobs that update or create millions of rows and take a long time
to finish. Setting aside bad design and poorly written SQL, they take that long because of
the sheer amount of data to be processed. Parallel processing is a method that can cut the
run time of such a job substantially, sometimes to a matter of minutes. The idea is to
divide the data into smaller logical sets and run the job for every set at the same time.
PeopleSoft supports parallel processing in Application Engine programs by means of
temporary tables and a built-in mechanism for launching multiple instances.

Suppose I have a process that updates the salary records of all my employees (say
100,000). The employees belong to 5 different business units (BU), with 20,000 employees
in each business unit.

If I run the update process for all the employees without parallel processing, the job has
to update every employee's record, and say it finishes in 10 minutes. What if I divide my
employees by BU and run one instance of the same program for each BU simultaneously? Each
individual instance will take about 2 minutes, and since all the instances run at the same
time, the whole set of records will be finished in about 2 minutes. So I have reduced my
processing time from 10 minutes to 2 minutes (in a real scenario it may take more than 2
minutes, but it will still be better than the initial 10 minutes). The following diagram
illustrates running the process in parallel.

There will be a delay between the actual processing start of the first instance and that
of the second instance (and the same holds for every later instance). This is because each
instance locks the data it is going to process in the base table. While one instance is
updating the base table, the next one has to wait until the database releases it.

Similarly, there will be a delay when the instances write the updated data back to the
base table. But this delay is negligible compared to the overall gain in performance.
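
As a rough illustration of the idea (this is not PeopleSoft code), the sketch below splits
a made-up employee list by business unit and runs the same routine for each set at the
same time. The data, the function names partition_by_bu and raise_salaries, and the flat
10 percent raise are all assumptions made for the example. In PeopleSoft the parallel
instances would be separate Application Engine runs; the Python threads here only mimic
the structure and are not meant to show a real speed-up.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: (employee_id, business_unit, salary) spread over 5 BUs.
employees = [(i, f"BU{i % 5 + 1}", 1000.0) for i in range(100)]


def partition_by_bu(rows):
    """Group the rows into one logical set per business unit."""
    groups = defaultdict(list)
    for emp_id, bu, salary in rows:
        groups[bu].append((emp_id, bu, salary))
    return groups


def raise_salaries(rows, pct=10.0):
    """Stand-in for one process instance: apply a flat raise to its own set."""
    return [(emp_id, bu, salary * (1 + pct / 100)) for emp_id, bu, salary in rows]


groups = partition_by_bu(employees)

# One worker per business unit, all started together.
with ThreadPoolExecutor(max_workers=len(groups)) as pool:
    results = list(pool.map(raise_salaries, groups.values()))

# Merge the per-BU results back into a single list.
updated = [row for chunk in results for row in chunk]
```

Each worker only ever sees the rows of its own business unit, which is the property the
temporary tables provide in the real implementation.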

Implementing Parallel Processing in PeopleSoft - Part II/III

When do we need to implement parallel processing?

After the previous part you might have wondered how splitting the process improves
performance. If I have only one update statement and I already use set-based processing
for it, where is the performance gain? Running a process in parallel is not always a good
idea, and sometimes it even makes performance worse. In the salary example above, a single
set-based update would simply become 5 SQL statements instead of one, which is extra
overhead for the server. So when do I need to make a process parallel? Below are some
scenarios in which you can think of introducing parallel processing.

   1.       My process updates or creates millions of rows in a single run.

   2.       There is a possibility that multiple users will run my process at the same time for
the same transaction. This can happen if the same process is available in both batch and
online mode. In that case, if one person runs it in batch mode and another runs it online
for the same transaction, one of the processes may error out or update the tables with
wrong data. It can also happen that two users run the same process with the same run
control parameters at the same time.

   3.       The transaction data to be processed is present in multiple tables, and I do the
processing by importing the relevant data into an intermediate (temporary) table. This
applies to almost every real process that does bulk processing. The data required for
processing may be scattered across different tables. I then need to query each individual
table, select the relevant data, put it into a common temporary table, and do the
processing from there. In the salary example, this scenario comes up if you are increasing
the salary of your employees based on different rules, such as (a) the percentage increase
depends on designation, (b) the percentage increase depends on experience, (c) the
percentage increase depends on performance rating, and so on.

   4.       You are doing row-by-row processing in your Application Engine program. There can
be scenarios where you cannot do all your processing in a set-based manner. In such cases
implementing parallel processing is the best option. Since the time required for
processing is directly tied to the number of rows, the more rows you have to process, the
more time it is going to take. So divide your data into logical sets and run the process in
parallel. This reduces the number of rows for each individual process instance, and the
processing time goes down with it.
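
To make scenario 3 a little more concrete, here is a minimal sketch that uses SQLite in
place of the PeopleSoft database. It gathers data from two source tables into one staging
table keyed by process instance and then applies the raise rules with a single set-based
statement. The table names (emp_job, emp_rating, sal_tmp), the instance number 101 and the
percentages are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_job    (emplid TEXT PRIMARY KEY, designation TEXT, salary REAL);
    CREATE TABLE emp_rating (emplid TEXT PRIMARY KEY, rating INTEGER);
    -- Staging table with the process instance as part of the key.
    CREATE TABLE sal_tmp (
        process_instance INTEGER,
        emplid TEXT,
        designation TEXT,
        rating INTEGER,
        salary REAL,
        pct REAL DEFAULT 0,
        PRIMARY KEY (process_instance, emplid)
    );
    INSERT INTO emp_job VALUES ('E1', 'MANAGER', 5000), ('E2', 'ANALYST', 3000);
    INSERT INTO emp_rating VALUES ('E1', 4), ('E2', 5);
""")

process_instance = 101  # hypothetical instance number for this run

# One statement pulls the scattered data into the staging table.
conn.execute(
    """
    INSERT INTO sal_tmp (process_instance, emplid, designation, rating, salary)
    SELECT ?, j.emplid, j.designation, r.rating, j.salary
    FROM emp_job j
    JOIN emp_rating r ON r.emplid = j.emplid
    """,
    (process_instance,),
)

# The rules are applied inside the database; the percentages are made up.
conn.execute(
    """
    UPDATE sal_tmp
    SET pct = (CASE designation WHEN 'MANAGER' THEN 5 ELSE 3 END)
            + (CASE WHEN rating >= 4 THEN 2 ELSE 0 END)
    WHERE process_instance = ?
    """,
    (process_instance,),
)

staged = conn.execute(
    "SELECT emplid, pct FROM sal_tmp WHERE process_instance = ? ORDER BY emplid",
    (process_instance,),
).fetchall()
```

Because every row carries the process instance, two runs can share the same staging table
layout without touching each other's rows.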

Implementing Parallel Processing in PeopleSoft - Part III/III

How to implement parallel processing?

Implementing parallel processing in PeopleSoft is a simple task if you follow all the
steps systematically. Below is a generic guideline for implementing parallel processing in
PeopleSoft.

   1.       Include locking fields in the base table – The main (driver) table for your
transaction should contain a field that marks its rows as being in use by an Application
Engine program. The best way to do so is to add the field PROCESS_INSTANCE to your base
table. If you also allow users to run the same process in online mode, it is recommended
that you add one more field, PROCESSED_FLAG. The significance of these fields is explained
in the following steps.

   2.       Create temporary tables – Temporary tables are the heart of PeopleSoft parallel
processing. It is recommended that you do all temporary data collection and manipulation
inside a table. This avoids transferring data out of the database and makes the processing
faster. The temporary tables should contain PROCESS_INSTANCE as one of the keys. The usual
naming convention is to end the table name with _TAO or _TMP. The record type should be
set to Temporary Table.

   3.       Add the temporary tables to the Application Engine program – Go to the Temp Tables
tab of the Application Engine program properties and add all the temp tables created for
your process. Also provide the Instance Count on this tab. The instance count is the number
of processes you need to run in parallel in batch mode. The instance count for online mode
is set on the PeopleTools Options page (PeopleTools > Utilities > Administration >
PeopleTools Options). There is another option on the Temp Tables tab called "Runtime". If
you select "Continue", the program will use the shared base temp table when the total
number of parallel instances exceeds the count you have specified. If you select "Abort",
the program will abort in that situation.

   4.       Build the temp tables – You need to build the temp tables only after assigning them
to the Application Engine program. When you build a table, it generates as many copies of
the table as the number of instances you have specified (online + batch), the maximum being
99. The name of each copy ends with its instance number: TAO1, TAO2, and so on.

   5.       Locking logic – Now you are done with the configuration and need to write the code
to suit parallel processing. The locking logic is an important part of it. If two processes
are initiated for the same transactions at the same time, both will process the
transactions, and at the end both will try to write the same data back to the tables. As a
result, an insert will make one of the processes error out, and an update will leave the
tables with wrong data. To overcome this situation we use locking logic, which is why we
created the fields in step 1. At the beginning of the program, write an update statement
like the one below to stamp the base table with the process instance of the current
process.

UPDATE PS_BASE_TABLE SET PROCESS_INSTANCE = %ProcessInstance
WHERE PROCESS_INSTANCE = 0

So when a second program tries to work on this transaction, it sees that the transaction
is in use by another Application Engine program and leaves it alone, which avoids
duplicate processing and the errors that come with it. Now comes another scenario: if
you are also running the process in online mode, the process instance there will
always be zero, and the previous SQL will not help you. In such cases we need to use
the second field added in step 1, and the SQL needs to be modified as below.

UPDATE PS_BASE_TABLE SET PROCESS_INSTANCE = %ProcessInstance,
PROCESSED_FLAG = 'Y' WHERE PROCESS_INSTANCE = 0 AND PROCESSED_FLAG = 'N'

   6.       Pull data into the temp tables and do the processing – Now you can start pulling
the data from the driver table into the temp tables based on the process instance and then
do the processing.

   7.       Use %Table() meta-SQL – When you do all the processing with temp tables, always
make sure that you wrap your temporary record name in the %Table() meta-SQL. It
automatically resolves to the table name of the current instance. For example, if your
current instance is 6, then the SQL below resolves to UPDATE PS_SAL_TAO6 SET SAL = SAL*10.
UPDATE %Table(SAL_TAO) SET SAL = SAL*10

   8.       Unlock the base table when done with processing – Once you are done with the
processing, unlock the base table by setting PROCESS_INSTANCE back to 0. Otherwise those
transactions will never be picked up by any process run at a later time. If your business
logic requires a transaction to be processed only once, you can skip this step.

   9.       Avoid truncating tables – Most people are used to truncating the temporary tables
as the last step of the processing to clear out the temporary data. You can now skip this
step, because in the latest versions of PeopleTools it is taken care of internally. Once
your program ends, Application Engine automatically issues the truncate statements.
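
The flow of steps 5 to 8 can be tried out end to end with the small sketch below, which
again uses SQLite in place of the PeopleSoft database. It only approximates the behaviour
described above: the table names (base_table, sal_tao1, sal_tao2), the instance numbers 101
and 102, and the helper temp_table(), a hypothetical stand-in for what %Table(SAL_TAO) does,
are assumptions for the example. The two runs execute one after the other on a single
connection, so the sketch shows the claiming logic and not real concurrency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_table (
        emplid TEXT PRIMARY KEY,
        sal REAL,
        process_instance INTEGER DEFAULT 0  -- 0 means not claimed by any run
    );
    -- Two copies of the temp table, standing in for PS_SAL_TAO1 and PS_SAL_TAO2.
    CREATE TABLE sal_tao1 (process_instance INTEGER, emplid TEXT, sal REAL);
    CREATE TABLE sal_tao2 (process_instance INTEGER, emplid TEXT, sal REAL);
    INSERT INTO base_table (emplid, sal) VALUES ('E1', 100), ('E2', 200);
""")


def temp_table(instance_no):
    """Hypothetical stand-in for %Table(SAL_TAO): name this run's table copy."""
    return f"sal_tao{instance_no}"


def claim_and_stage(process_instance, instance_no):
    """Steps 5 and 6: claim unclaimed rows, then copy them to the temp table."""
    claimed = conn.execute(
        "UPDATE base_table SET process_instance = ? WHERE process_instance = 0",
        (process_instance,),
    ).rowcount
    conn.execute(
        f"INSERT INTO {temp_table(instance_no)} "
        "SELECT process_instance, emplid, sal FROM base_table "
        "WHERE process_instance = ?",
        (process_instance,),
    )
    return claimed


def process_and_unlock(process_instance, instance_no):
    """Steps 7 and 8: work in the temp table, write back, release the rows."""
    tmp = temp_table(instance_no)
    conn.execute(
        f"UPDATE {tmp} SET sal = sal * 10 WHERE process_instance = ?",
        (process_instance,),
    )
    conn.execute(
        f"UPDATE base_table SET process_instance = 0, "
        f"sal = (SELECT t.sal FROM {tmp} t WHERE t.emplid = base_table.emplid "
        f"AND t.process_instance = ?) WHERE process_instance = ?",
        (process_instance, process_instance),
    )


first_claimed = claim_and_stage(101, 1)   # the first run claims both rows
second_claimed = claim_and_stage(102, 2)  # the second run finds nothing to claim
process_and_unlock(101, 1)
process_and_unlock(102, 2)

final_rows = conn.execute(
    "SELECT emplid, sal, process_instance FROM base_table ORDER BY emplid"
).fetchall()
```

The second run claims zero rows, so each transaction is processed exactly once, and the
base table ends up unlocked again with PROCESS_INSTANCE set back to 0.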

I am sure that if you follow these steps carefully while creating your program, you can
easily build your process to work in parallel.
