
ASM

OnDemand Concurrent Manager Analysis Programs


Author: Martin Fitzgerald
Creation Date: Friday, June 12 2009
Last Updated: Monday, June 15 2009
Version: 1.0
Approvals:

Table of Contents

Concurrent Manager Analysis
    Structure
Installation
    Parameters
    Actions
        PLSQL tasks
        SQLPLUS tasks
    Execution Frequency
Job Execution Reports
    Common Concepts
        Job Execution flow
            Execution Time vs Actual Runtime
        Common Filter Parameters
        Switching Data Store
    OOD: Concurrent Program Execution Summary
        Parameters
        Purpose
    OOD: Program execution counts over time
        Parameters
        Purpose
    OOD: Detailed Program runtime history
        Parameters
        Purpose
Manager Execution Reports
    OOD: Show detailed Manager activity
        Parameters
        Example Report
            Report Heading section
            Queue Heading section
            Worker Detail section
            Manager summary section
Known Bugs

Concurrent Manager Analysis


Oracle OnDemand has a number of tools to analyze Concurrent Manager performance. Unfortunately most of these tools have 2 main drawbacks:

1. For the most part the tools are only accessible by OnDemand personnel. While that helps us look at what happened in an environment, it can leave a customer frustrated if they just want to understand what is happening on their own environment.
2. For the most part the historical data is summarized, and it can be difficult to get the detail needed to compare points in time across long periods.

To resolve these issues, and to make it easier in general to understand what is happening on any given environment, we have provided a set of basic reports. At present these reports are generated purely by SQL*Plus. The scripts behind them are the key scripts we use to understand the performance of the Concurrent Manager subsystem, as distinct from the performance of the jobs executed by the Concurrent Manager.

Structure
In order to use these reports we rely on some tables to store the historical data. The manner in which these tables are populated is described later on. If these tables do not exist, none of the reports will work. There are 7 tables associated with the historical data store:

OOD_CONCURRENT_REQUESTS - This table holds much of the information originally contained in the FND_CONCURRENT_REQUESTS table. To keep space usage low and make access to the table faster, only a small subset of the columns is pulled over. These are mostly date columns, but we also capture a few attributes of the job that describe its unique characteristics.

OOD_CONCURRENT_PROCESSES - This table holds the historical data for the FND_CONCURRENT_PROCESSES table.

OOD_CONCURRENT_QUEUES - This table holds the historical data for the FND_CONCURRENT_QUEUES table. As time goes by queues can be created and deleted. By keeping a secondary copy we keep the connection between jobs that ran and the queue they ran in, even if the queue no longer exists.

OOD_CONCURRENT_QUEUE_SIZE - This table holds the historical data for the FND_CONCURRENT_QUEUE_SIZE table, which provides the number of workers, the poll time and the workshifts for a queue. Using this data we can keep track of when worker targets were changed.

OOD_CONC_REQ_SETS - It turns out the start and end times of a job are not the real story. This table tries to capture the real run time of a job, and the reports use that figure.

OOD_CONCURRENT_QUEUE_CONTENT - This table holds the historical data for the FND_CONCURRENT_QUEUE_CONTENT table, which defines the includes and excludes of all the queues.

OOD_CONC_POP - This table holds the dates on which the historical tables were last updated.

The basic idea is that each key table holding live data has a historical counterpart, so once the historical tables are populated most scripts only need to change the table prefix from FND to OOD to view the historical information.

Installation
To install, follow the directions below. For ease of use, just paste this into an SR when asking for a CRT:
______________________________________________________________________
To install the reports:

1. cd into the newly created cm_jobs directory:
   o cd /autofs/upgrade/OHSPERF/OOD_CMJOBS/cm_jobs
2. Run the install.ksh script. You will need to provide the apps password when prompted:
   o ksh install.ksh
3. Once installed, the reports cannot be used until the OOD: Create and Update Historical CM tables job has been executed. To do this, log in as System Administrator and run the job from the SRS screen. Alternatively, log in to the database via sqlplus as the APPS user and run the ood_ctables script:
   o sqlplus apps/password @ood_ctables 700

Once these steps are completed the reports are installed and ready for use.
______________________________________________________________________

Managing the Historical Data

Because only a reduced column set is taken from the fnd_concurrent_requests table, the historical data on most systems should be quite small. All activities for managing the space taken are handled by the concurrent program OOD: Create and Update Historical CM tables. This program has the following characteristics:

Parameters
Historical Days to Keep - This parameter is used to delete entries older than a given number of days. At present the default is 800, which means we keep just over 2 years' worth of history.

Actions
This program is split into 2 sections: a PLSQL procedure that manages the tables, and a couple of SQL statements that report on the contents.

PLSQL tasks
For each table explicitly listed above this program performs the following actions in sequence:

1. create_table - If the table does not currently exist, it is created.
2. create_indexes - We create a number of indexes (especially on the OOD_CONCURRENT_REQUESTS table) based on the access patterns of the provided reports. Each time the program is run we check that all expected indexes still exist.
3. update_table - We then update the contents with the current data from the live tables.
4. purge_table - Using the parameter provided on submission, the program deletes older entries.
5. analyze_table - If any of the previous actions resulted in a structure change (a new table or a new index), the table is analyzed.
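The five-step sequence can be sketched in a few lines of Python. This is only an illustration of the control flow, not the actual PLSQL procedure: the HistoryTable class and the maintain function are hypothetical names, the table is modelled in memory rather than in a database, and synonym creation is left out.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryTable:
    """In-memory stand-in for one OOD_* history table (hypothetical model)."""
    name: str
    expected_indexes: list
    exists: bool = False
    indexes: set = field(default_factory=set)


def maintain(table):
    """Return, in order, the actions one run would log for this table."""
    actions = []
    structure_changed = False

    if not table.exists:                          # 1. create_table
        table.exists = True
        structure_changed = True
        actions.append(f"Create table : {table.name}")

    for index in table.expected_indexes:          # 2. create_indexes
        if index not in table.indexes:
            table.indexes.add(index)
            structure_changed = True
            actions.append(f"Create index : {index} on {table.name}")

    actions.append(f"Updating table : {table.name}")   # 3. update_table
    actions.append(f"Purging table : {table.name}")    # 4. purge_table

    if structure_changed:                         # 5. analyze_table
        actions.append(f"Analyze table : {table.name}")
    return actions


queues = HistoryTable("OOD_CONCURRENT_QUEUES", ["ood_concurrent_queues_N1"])
first_run = maintain(queues)    # creates, updates, purges, analyzes
second_run = maintain(queues)   # only updates and purges
print("\n".join(first_run))
print("\n".join(second_run))
```

The point of the structure_changed flag is that the analyze step is skipped on routine runs, which is why a second execution logs only the update and purge actions.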

SQLPLUS tasks
To help manage how much space we use for these tables, the program also reports on the size of the tables and their contents. The log first lists the PLSQL actions that were performed.

o An initial execution will look similar to the following:

    14:40:39 Create table   : OOD_CONCURRENT_REQUESTS
    14:40:45 Create synonym : OOD_CONCURRENT_REQUESTS
    14:40:45 Create index   : ood_concurrent_requests_U1 on OOD_CONCURRENT_REQUESTS
    14:40:47 Create index   : ood_concurrent_requests_n3 on OOD_CONCURRENT_REQUESTS
    14:40:48 Create index   : ood_concurrent_requests_n6 on OOD_CONCURRENT_REQUESTS
    14:40:49 Create index   : ood_concurrent_requests_n90 on OOD_CONCURRENT_REQUESTS
    14:40:49 Create index   : ood_concurrent_requests_n91 on OOD_CONCURRENT_REQUESTS
    14:40:50 Create index   : ood_concurrent_requests_n96 on OOD_CONCURRENT_REQUESTS
    14:40:51 Create index   : ood_concurrent_requests_n97 on OOD_CONCURRENT_REQUESTS
    14:40:52 Create index   : ood_concurrent_requests_n98 on OOD_CONCURRENT_REQUESTS
    14:40:52 Create index   : ood_concurrent_requests_n99 on OOD_CONCURRENT_REQUESTS
    14:40:53 Updating table : OOD_CONCURRENT_REQUESTS
    14:40:55 Purging table  : OOD_CONCURRENT_REQUESTS
    14:40:55 Analyze table  : OOD_CONCURRENT_REQUESTS
    14:41:21 Create table   : CONCURRENT_PROCESSES
    14:41:21 Create synonym : OOD_CONCURRENT_PROCESSES
    14:41:21 Create index   : ood_concurrent_processes_N1 on OOD_CONCURRENT_PROCESSES
    14:41:21 Create index   : ood_concurrent_processes_N2 on OOD_CONCURRENT_PROCESSES
    14:41:21 Create index   : ood_concurrent_processes_u1 on OOD_CONCURRENT_PROCESSES
    14:41:21 Updating table : OOD_CONCURRENT_PROCESSES
    14:41:21 Purging table  : OOD_CONCURRENT_PROCESSES
    14:41:21 Analyze table  : OOD_CONCURRENT_PROCESSES
    14:41:22 Create table   : CONCURRENT_QUEUES
    14:41:22 Create synonym : OOD_CONCURRENT_QUEUES
    14:41:22 Create index   : ood_concurrent_queues_N1 on OOD_CONCURRENT_QUEUES
    14:41:22 Updating table : OOD_CONCURRENT_QUEUES
    14:41:22 Purging table  : OOD_CONCURRENT_QUEUES
    14:41:22 Analyze table  : OOD_CONCURRENT_QUEUES
    14:41:22 Create table   : ood_conc_req_sets
    14:41:22 Create synonym : OOD_CONC_REQ_SETS
    14:41:22 Create index   : ood_conc_req_sets_u1 on OOD_CONC_REQ_SETS
    14:41:22 Create index   : ood_conc_req_sets_n1 on OOD_CONC_REQ_SETS
    14:41:22 Updating table : OOD_CONC_REQ_SETS
    14:41:23 Purging table  : OOD_CONC_REQ_SETS
    14:41:23 Analyze table  : OOD_CONC_REQ_SETS
    14:41:24 Create index   : ood_concurrent_queue_size_n1 on OOD_CONCURRENT_QUEUE_SIZE
    14:41:24 Updating table : OOD_CONCURRENT_QUEUE_SIZE
    14:41:24 Purging table  : OOD_CONCURRENT_QUEUE_SIZE
    14:41:24 Analyze table  : OOD_CONCURRENT_QUEUE_SIZE
    14:41:24 Create index   : ood_concurrent_queue_cont_n1 on OOD_CONCURRENT_QUEUE_CONTENT
    14:41:24 Updating table : OOD_CONCURRENT_QUEUE_CONTENT
    14:41:24 Purging table  : OOD_CONCURRENT_QUEUE_CONTENT
    14:41:24 Analyze table  : OOD_CONCURRENT_QUEUE_CONTENT
    14:41:24 Create synonym : OOD_CONC_POP
    14:41:24 Create index   : ood_conc_pop_n1 on OOD_CONC_POP
    14:41:24 Purging table  : OOD_CONC_POP
    14:41:24 Analyze table  : OOD_CONC_POP

o A subsequent execution would look like this:

    23:37:22 Updating table : OOD_CONCURRENT_REQUESTS
    23:37:40 Purging table  : OOD_CONCURRENT_REQUESTS
    23:37:40 Updating table : OOD_CONCURRENT_PROCESSES
    23:37:40 Purging table  : OOD_CONCURRENT_PROCESSES
    23:37:41 Updating table : OOD_CONCURRENT_QUEUES
    23:37:41 Purging table  : OOD_CONCURRENT_QUEUES
    23:37:41 Updating table : OOD_CONC_REQ_SETS
    23:38:11 Purging table  : OOD_CONC_REQ_SETS
    23:38:12 Updating table : OOD_CONCURRENT_QUEUE_SIZE
    23:38:12 Purging table  : OOD_CONCURRENT_QUEUE_SIZE
    23:38:12 Updating table : OOD_CONCURRENT_QUEUE_CONTENT
    23:38:12 Purging table  : OOD_CONCURRENT_QUEUE_CONTENT
    23:38:12 Purging table  : OOD_CONC_POP

Note that in the second execution we did not have to create the indexes or analyze the tables.

The SQL*Plus section then produces two reports:

o Table sizes - A script reports the size of each table. This requires accurate statistics, so the figures may be inaccurate if statistics were not gathered recently.
o Table contents - A count of the records in the largest of the tables (OOD_CONCURRENT_REQUESTS) on a day by day basis.

An example of the table size output is below:

Sizes are in MB

OWNER    TABLE_NAME                        LOGICAL   PHYSICAL DIFFERENCE   NUM_ROWS
-------- ------------------------------ ---------- ---------- ---------- ----------
APPLSYS  OOD_CONCURRENT_QUEUE_SIZE             .02        .10        .08        201
APPLSYS  OOD_CONCURRENT_QUEUES                 .01        .10        .09         83
APPLSYS  OOD_CONCURRENT_QUEUE_CONTENT          .01        .10        .09        169
APPLSYS  OOD_CONC_POP                          .00        .13        .12          2
APPLSYS  OOD_CONCURRENT_PROCESSES              .75        .92        .17       3495
APPLSYS  OOD_CONC_REQ_SETS                     .94       1.30        .35      19818
APPLSYS  OOD_CONCURRENT_REQUESTS             52.16      63.15      10.99     372033
                                        ---------- ----------
sum                                          53.89      65.80

These figures show that we are using about 66MB of space. The last step in the program is to review how many records are in the DB on a day by day basis. This also gives a basic measure of whether the number of jobs is increasing over time. Example output is below:

          Number of Processing
DATES          Jobs      Hours
--------- --------- ----------
22-MAY-09     19426        419
23-MAY-09      9045        186
24-MAY-09      6559        199
25-MAY-09      7029        118
26-MAY-09     22005        223
27-MAY-09     21198        203
28-MAY-09     21749        333
29-MAY-09     19846        356
30-MAY-09      4954        298
31-MAY-09      8205        625
01-JUN-09     17249        356
02-JUN-09     23811        404
03-JUN-09     20586        211
04-JUN-09     20596        234
05-JUN-09     18288        272
06-JUN-09      8302        281
07-JUN-09      7325        221
08-JUN-09     21082        204
09-JUN-09     20192        249
10-JUN-09     30229        249
11-JUN-09     29758        241

From this information I can see we are running about 20,000 jobs a day, although the last 2 days showed a significant increase, going all the way to 30,000. The number of processing hours shows how many hours it would take to run all the jobs single threaded (i.e. it is the combined run time of all jobs).
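As a small worked example of the "processing hours" figure, the Python sketch below sums individual job runtimes into hours and, going the other way, derives an approximate average runtime per job from one row of the output. The function names are illustrative only, and the three sample runtimes are made up; the 11-JUN-09 figures (29758 jobs, 241 hours) are taken from the output above and are already rounded, so the average is approximate.

```python
def processing_hours(runtimes_in_seconds):
    """Combined run time of all jobs, i.e. the single-threaded duration."""
    return sum(runtimes_in_seconds) / 3600.0


def average_runtime_seconds(job_count, hours):
    """Approximate mean runtime per job from a day's totals."""
    return hours * 3600.0 / job_count


# Three hypothetical jobs of 30 minutes, 90 minutes and 2 hours.
sample_total = processing_hours([1800, 5400, 7200])

# 11-JUN-09 row: 29758 jobs in 241 processing hours.
june_11_average = average_runtime_seconds(29758, 241)

print(sample_total)
print(round(june_11_average, 1))
```

On the 11-JUN-09 figures this works out to roughly 29 seconds per job, which suggests a workload dominated by short requests.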

Execution Frequency
Since this program manages the population and deletion of records in the historical data store, it is important that it be run frequently. At present I recommend executing it nightly.

Job Execution Reports


Once installed there will be 3 Job Execution reports available:

o OOD: Concurrent Program Execution Summary
o OOD: Program execution counts over time
o OOD: Detailed Program runtime history

This section of the document first goes over some common concepts used by these reports and then goes into the specifics of each report. When viewing any of these reports use the View Output button. The output may exist in both the log and out files, however only the output shown by the View Output button is properly formatted.

Common Concepts
Job Execution flow
Before analyzing these reports it is important to understand the flow of how a job begins executing.

1. Job Submission - The time a job is submitted into the system is recorded in the FND_CONCURRENT_REQUESTS table as the REQUEST_DATE. The requested_start_date of a job request can actually be in the past. For example, a third party system could submit a job at 11am Jan 2nd 2009 with a requested_start_date of 11am Jan 1st 2009. To reconcile this we take the greater of these 2 dates as the effective requested_start_date.
2. Once a job is submitted it has a phase code of P. If there are any incompatibilities between that job and any other job on the system it has a status code of Q (Standby); otherwise it has a status code of I (Pending normal).
3. When the sysdate is greater than the requested_start_date one of 2 things happens:
   a. If the status is Q the job remains in standby mode until the Conflict Resolution Manager (CRM) releases it. When the CRM releases the job it updates the CRM_RELEASE_DATE column to the current time and the job goes into Pending normal mode.
   b. If the job is already in Pending normal mode it is available to be picked up by a running manager.
4. When a worker wakes up to poll it checks whether there are any jobs waiting to run. If so, it locks the record, moves the job to the Running phase and begins executing it.
5. When the job completes the manager takes care of any cleanup work (printing, rescheduling) and completes the job by setting it to the Complete phase.

In general then we are trying to track and understand the times associated with the following states:

Status Code  Status meaning  Common Interpretation
Q            Standby         Incompatible
I            Pending         Waiting
R            Running         Executing

All the reports break out the times by these 3 states.
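The flow above can be turned into a small calculation. The Python sketch below takes the later of the two request dates as the effective start and then splits a job's life into standby, pending and running time. It is an illustration under assumptions: the function name and the exact boundaries between phases are not taken from the report scripts, and the requested_start_date in the example is a made-up value chosen to lie in the past. The other timestamps are those of request 52065970 in the example manager report later in this document.

```python
from datetime import datetime


def phase_seconds(request_date, requested_start_date, crm_release_date,
                  actual_start_date, actual_completion_date):
    """Split a job's life into standby, pending and running seconds."""
    # The effective requested start is the later of the two dates.
    effective_start = max(request_date, requested_start_date)

    if crm_release_date is None:
        # No incompatibilities: the job never sat in standby.
        standby = 0.0
        pending_from = effective_start
    else:
        standby = max(0.0, (crm_release_date - effective_start).total_seconds())
        pending_from = max(crm_release_date, effective_start)

    pending = max(0.0, (actual_start_date - pending_from).total_seconds())
    running = (actual_completion_date - actual_start_date).total_seconds()
    return {"standby": standby, "pending": pending, "running": running}


def at(hour, minute, second):
    """Timestamp on 14 June 2009, the day of the example report."""
    return datetime(2009, 6, 14, hour, minute, second)


times = phase_seconds(
    request_date=at(15, 19, 26),
    requested_start_date=at(15, 0, 0),   # hypothetical: earlier than submission
    crm_release_date=at(15, 19, 47),
    actual_start_date=at(15, 20, 3),
    actual_completion_date=at(15, 21, 49),
)
print(times)
```

For this request the job spent 21 seconds in standby, 16 seconds pending and 106 seconds running.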

Execution Time vs Actual Runtime

The main issue with execution time is that it often does not represent the load a job puts on the system. As an example, take a job which executes 10 child processes. The master program submits a child and waits for it to complete, then submits the next one. Overall the master program's load on the system is tiny, perhaps accounting for a few seconds of runtime. However the actual completion time may be hours later than the actual start time. To reconcile this we have created the OOD_CONC_REQ_SETS table, which tries to capture the real load runtime of a job instead of the recorded runtime. Where possible we use this metric to account for job load rather than just the completion time minus the start time.
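To make the distinction concrete, here is one possible way to estimate a master program's own load: take its elapsed time and subtract the time during which it was merely waiting on children. This document does not describe how OOD_CONC_REQ_SETS actually derives its figure, so the approach, the function name and the sample numbers below are all assumptions made for illustration.

```python
def own_load_seconds(parent, children):
    """Elapsed time of the parent minus time covered by its children.

    parent is a (start, end) pair and children a list of (start, end)
    pairs, all in seconds. Overlapping children are only counted once.
    """
    start, end = parent
    covered = 0
    cursor = start  # everything before the cursor is already accounted for
    for child_start, child_end in sorted(children):
        child_start = max(child_start, cursor)
        child_end = min(child_end, end)
        if child_end > child_start:
            covered += child_end - child_start
            cursor = child_end
    return (end - start) - covered


# Hypothetical master job running for 2 hours (7200 s) that launches
# 10 children one after another, each taking 700 s with 20 s between them.
master = (0, 7200)
child_runs = [(10 + 720 * i, 710 + 720 * i) for i in range(10)]

recorded_runtime = master[1] - master[0]
real_load = own_load_seconds(master, child_runs)
print(recorded_runtime, real_load)
```

In this made-up case the recorded runtime is 7200 seconds but the master itself accounts for only 200 seconds, which is the kind of gap the paragraph above describes.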

Common Filter Parameters


These 3 jobs share a number of parameters that can be used to filter the selection criteria.

Job Filters - These filters on the job being executed are useful for getting a handle on the most used jobs, applications, queues or middle tier CM nodes in the system.
o Program Short Name - Find the short name from the Concurrent Program Define screen.
o Application Short Name - Program Application short name (for example AR, ONT). Remember that GL is actually SQLGL and AP is SQLAP.
o Concurrent Queue - This is the Concurrent Manager short name as seen in the second field of the Concurrent Manager Define screen.
o Hostname - Concurrent Manager host name. Only relevant in a PCP configuration.

Job execution times - The default for the following 3 parameters is 0. Essentially these are used to pick out jobs which were in conflict for a long time (Standby time), waiting for a worker for a long time (Pending time) or running for a long time (Running time). Times are in whole minutes.
o Min Standby Time
o Min Wait Time
o Min Run Time

Timeframe - Start and end days between which the reports look for jobs. Starting Day begins at 00:00:00 of the specified day and Ending Day ends at 23:59:59 of the specified day.
o Starting Day
o Ending Day

Business Day definition - Most of the time we are most concerned about jobs running during the business day. To exclude jobs running in the early morning, set the Start of Business Day to a more appropriate time (for example 08:00:00). Similarly, to exclude jobs run at night, bring the End of Business Day earlier, to say 5pm (17:00:00).
o Start of Business Day
o End of Business Day
o Exclude Weekends - Excluding weekends may also make sense. Setting this to Y excludes jobs which ran on Saturday and Sunday.

Submitting User filters
o Application User Name - Select only a specific Application user's jobs.
o Application Resp Name - Select only those jobs submitted from a specific Responsibility.
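The Timeframe and Business Day parameters combine as a simple predicate on a job's timestamp. The Python sketch below shows the idea. The function name is hypothetical, and it assumes the filter is applied to the time the job ran; which timestamp the reports actually test is not stated in this document.

```python
from datetime import date, datetime, time


def in_scope(ran_at, starting_day, ending_day,
             start_of_business_day=time(0, 0, 0),
             end_of_business_day=time(23, 59, 59),
             exclude_weekends=False):
    """Return True if a job timestamp passes the timeframe filters."""
    # Timeframe: 00:00:00 of the starting day to 23:59:59 of the ending day.
    window_start = datetime.combine(starting_day, time(0, 0, 0))
    window_end = datetime.combine(ending_day, time(23, 59, 59))
    if not window_start <= ran_at <= window_end:
        return False
    # Business day: keep only jobs inside the daily time-of-day range.
    if not start_of_business_day <= ran_at.time() <= end_of_business_day:
        return False
    # weekday() is 5 for Saturday and 6 for Sunday.
    if exclude_weekends and ran_at.weekday() >= 5:
        return False
    return True


business = dict(starting_day=date(2009, 6, 8), ending_day=date(2009, 6, 14),
                start_of_business_day=time(8, 0, 0),
                end_of_business_day=time(17, 0, 0),
                exclude_weekends=True)

friday_morning = in_scope(datetime(2009, 6, 12, 9, 30, 0), **business)
friday_early = in_scope(datetime(2009, 6, 12, 7, 0, 0), **business)
sunday_afternoon = in_scope(datetime(2009, 6, 14, 15, 19, 26), **business)
print(friday_morning, friday_early, sunday_afternoon)
```

With an 08:00:00 to 17:00:00 business day and weekends excluded, only the Friday 09:30 job is kept; the 07:00 job falls before the business day and 14 June 2009 was a Sunday.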

Switching Data Store


All the reports include the following flag:

Live or Historical Tables - By default the reports run against the historical data store populated by the job OOD: Create and Update Historical CM tables. Switching this to Live makes the report run against the live FND tables instead.

OOD: Concurrent Program Execution Summary
Parameters
Besides the filters mentioned in the Common Concepts section, the following parameters are used:

This report lets you provide 3 column definitions, which are used both as the first 3 columns of the report and as its group by clause. The available choices are:
o QUEUE - Concurrent manager queue short name
o Application - Program Application short name
o Hostname - Concurrent Manager host name
o Responsibility - The Responsibility of the submitting user
o Username - The Username of the submitting user
o Short_Program_name - Concurrent program name
o Full_program_name - User familiar concurrent program name
o NONE - No filter on this column.
Each of the first 3 columns can be any of the above values.

The other unique parameter is the Order By parameter. This lets you sort the report either by the number of jobs run or by the load the jobs represented on the system.
o JOBCOUNT - Orders the report by number of executions
o JOBLOAD - Orders the report by summed runtime.
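The effect of the three grouping columns and the Order By parameter can be illustrated with a small Python sketch. It is not the report's SQL: the summarize function, the job dictionaries and the user names are hypothetical, and sorting in descending order is an assumption.

```python
from collections import defaultdict


def summarize(jobs, columns, order_by="JOBCOUNT"):
    """Group jobs by the chosen columns and rank the groups."""
    group_columns = [c for c in columns if c != "NONE"]  # NONE adds no grouping
    totals = defaultdict(lambda: [0, 0.0])               # key -> [count, load]
    for job in jobs:
        key = tuple(job[c] for c in group_columns)
        totals[key][0] += 1
        totals[key][1] += job["runtime_seconds"]
    rows = [(key, count, load) for key, (count, load) in totals.items()]
    sort_index = 1 if order_by == "JOBCOUNT" else 2      # count or summed runtime
    return sorted(rows, key=lambda row: row[sort_index], reverse=True)


# Hypothetical data: one long GL job and three short AP jobs.
jobs = [
    {"Application": "SQLGL", "Username": "ALICE", "runtime_seconds": 3600},
    {"Application": "SQLAP", "Username": "BOB", "runtime_seconds": 10},
    {"Application": "SQLAP", "Username": "BOB", "runtime_seconds": 10},
    {"Application": "SQLAP", "Username": "BOB", "runtime_seconds": 10},
]

by_count = summarize(jobs, ["Application", "Username", "NONE"], "JOBCOUNT")
by_load = summarize(jobs, ["Application", "Username", "NONE"], "JOBLOAD")
print(by_count[0])
print(by_load[0])
```

The two orderings can disagree: here BOB's AP jobs lead by JOBCOUNT, while ALICE's single GL job leads by JOBLOAD.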

Purpose
The point of this report is to try to answer some potentially complex questions. For example:

1. What job is run out of the GL application the most during the day?
2. Who has been submitting all those FSG reports?
3. What machine is getting the most load?

Given the common filters and the addition of 3 unique groupings, it should be fairly easy to drill down into which jobs, users and responsibilities are causing the most load on the environment.

OOD: Program execution counts over time
Parameters
Besides the filters mentioned in the Common Concepts section, the following parameter is used:

Interval - The report displays one record per interval, detailing the job performance for any selected jobs within that interval. Available options are:
o HOURS - The default value. Shows job performance by hour
o MINUTES - Shows job performance by minute
o DAYS - A day by day comparison
o WEEKS - A week by week comparison using the week number of the year
o MONTHS - A month by month comparison
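Conceptually, the Interval parameter assigns each job to a time bucket and then reports per bucket. The Python sketch below shows one way to compute the bucket label. The function name and label formats are illustrative, and the WEEKS case uses the ISO week number as an assumption; the report's own week numbering may differ.

```python
from collections import Counter
from datetime import datetime


def bucket(timestamp, interval="HOURS"):
    """Return the label of the interval a timestamp falls into."""
    if interval == "MINUTES":
        return timestamp.strftime("%Y-%m-%d %H:%M")
    if interval == "HOURS":
        return timestamp.strftime("%Y-%m-%d %H:00")
    if interval == "DAYS":
        return timestamp.strftime("%Y-%m-%d")
    if interval == "WEEKS":
        iso_year, iso_week, _ = timestamp.isocalendar()  # assumption: ISO weeks
        return f"{iso_year}-W{iso_week:02d}"
    if interval == "MONTHS":
        return timestamp.strftime("%Y-%m")
    raise ValueError(f"unknown interval: {interval}")


# Three request times from 14 June 2009.
requests = [
    datetime(2009, 6, 14, 15, 11, 34),
    datetime(2009, 6, 14, 15, 18, 54),
    datetime(2009, 6, 14, 15, 19, 26),
]

jobs_per_hour = Counter(bucket(r, "HOURS") for r in requests)
jobs_per_minute = Counter(bucket(r, "MINUTES") for r in requests)
print(jobs_per_hour)
print(jobs_per_minute)
```

A coarser interval collapses the three requests into one record, while MINUTES keeps them apart, which is the trade-off between trend reporting and detail.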

Purpose
This report can be used to compare execution times over specific time periods. It is useful for determining execution rates and the load incurred by jobs, and how these have changed over long periods of time.

OOD: Detailed Program runtime history
Parameters
Besides the filters mentioned in the Common Concepts section, the following parameter is used:

REQUEST_ID - If you know the specific request ID you are interested in, you can specify it here.

Purpose
This report lists, line by line, all the jobs that meet the filter criteria. Included in the report are:
o Requested start date
o CRM Release date
o Actual start date
o Actual completion date
In addition the times between each milestone are displayed. This report is more useful for analyzing the details of specific jobs than for aggregating them, where the averages can distort the analysis. If the REQUEST_ID parameter is specified the report displays the request ID and all its child processes in an indented fashion. For example the following report shows all the child processes of request ID 52065923:

REQUEST_ID           QUE_NAME        CONCURRENT_PROG USER_NAME  REQUEST_DATE ...
-------------------- --------------- --------------- ---------- ------------------...
52065923             CUST_MRP        MRCNSP          PPCBATCH   09-06-14 15:11:34...
52065967             CUST_MRP        MRCMON          PPCBATCH   09-06-14 15:18:54...
52065968             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:26...
52065974             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
52065975             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
52065976             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
52065977             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
52065978             CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
52065969             CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
52065970             CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
52065971             CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
52065972             CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
52065973             CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
52065994             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:17...
52065996             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:37...
52065997             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:42...
52065998             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:42...
52066000             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:46...
52066001             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:49...
52066002             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:00...
52066003             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:15...
52066004             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:15...
52066005             CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:16...
52066006             CUST_MRP        MRCNEW          PPCBATCH   09-06-14 15:22:16...
52066011             CUST_MRP        MRPEXPWF        PPCBATCH   09-06-14 15:22:58...
52066012             CUST_MRP        MRPAUREL        PPCBATCH   09-06-14 15:22:58...
52066010             CUST_MRP        MRCEAP          PPCBATCH   09-06-14 15:22:54...

Using this option we can see that job 52065923 kicked off job 52065967, which in turn kicked off job 52065968. The indentation lets us see the parent, sibling and child relationships between these processes.
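The indented display is a depth-first walk of the parent and child relationships between requests. The Python sketch below illustrates it with only the two relationships stated above (52065923 started 52065967, which started 52065968). The render_tree function, the parent_of mapping and the two-space indent are hypothetical choices, not the report's implementation.

```python
def render_tree(parent_of, root, indent="  "):
    """Return request IDs as indented lines, children under their parent."""
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    lines = []

    def walk(request_id, depth):
        lines.append(f"{indent * depth}{request_id}")
        for child in sorted(children.get(request_id, [])):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


# Child request ID -> parent request ID.
parent_of = {52065967: 52065923, 52065968: 52065967}

tree_lines = render_tree(parent_of, 52065923)
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one more generation below the request ID given as the parameter.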

Manager Execution Reports


The following report is more about the execution of the concurrent manager rather than the jobs within the concurrent manager. The report name is OOD: Show detailed Manager activity and the output is a somewhat graphical depiction of the activity of the manager during a short period of time (can be between 1 and 120 minutes).

OOD: Show detailed Manager activity
Parameters
There are only a few parameters used by this program:

Starting Date and Time - Start time for when the analysis should begin.
Reporting time in mins - Amount of time to report on. Max is 120 mins (2 hours).
Queue Name to display - Setting this to something other than % limits the program to only those queues that match the filter.
Live or Historical Tables - Setting this to LIVE switches the program to look at the live FND tables rather than the historical data store.

Example Report
Below is an example report, followed by an explanation of how to interpret it (the report is from an actual customer, however the job names have been replaced):
Looking at jobs submitted between 06/14/2009 15:18:00 and 06/14/2009 15:33:00
*** Process Flags - Running=*, Pending=-, Standby=_, Paused=~
*** Running...but sleeping=z
*** Manager Flags - Running=*, Idle =.

/* Processing requests for concurrent queue: CUST_MRP                 |               |
Manager                            Request  CRM      Actual   Actual  |  2         3  |
Proc.  Request-Program short name  Start    Release  Start    End     |890123456789012|
------ --------------------------- -------- -------- -------- --------|---------------|
298663 52065923-MRCNSP             15:11:34 15:18:47 15:18:54 15:22:55|zzzz*          |
     | 52066010-MRCEAP             15:22:54          15:22:55 15:22:55|    *          |
     | 52066012-MRPAUREL           15:22:58 15:23:19 15:23:25 15:23:25|    _*         |
     | --- Mgr Process(1298663) 06/07/09 09:13:32                     |zzzz**.........|
     \___________________________________
298664 52065931-MRCNSW             15:12:46 15:13:14 15:13:21 15:18:01|*              |
     | 52065959-MRCSLD             15:18:01          15:18:01 15:18:01|*              |
     | 52065977-MRCSDW             15:19:27          15:19:32 15:19:33| *             |
     | 52065970-MRCNSW             15:19:26 15:19:47 15:20:03 15:21:49| -**           |
     | 52066000-MRCSLD             15:21:46          15:21:49 15:21:49|   *           |
     | 52066001-MRCSLD             15:21:49          15:21:49 15:21:50|   *           |
     | 52066006-MRCNEW             15:22:16 15:22:18 15:22:20 15:22:58|    *          |
     | 52066011-MRPEXPWF           15:22:58          15:22:58 15:23:28|    **         |
     | --- Mgr Process(1298664) 06/07/09 09:13:32                     |******.........|
     \___________________________________
298666 52065919-MRCNSP             15:11:24 15:12:13 15:12:16 15:18:28|*              |
     | 52065962-MRCEAP             15:18:27          15:18:28 15:18:29|*              |
     | 52065974-MRCSDW             15:19:27          15:19:29 15:19:33| *             |
     | 52065978-MRCSDW             15:19:27          15:19:33 15:19:33| *             |
     | 52065971-MRCNSW             15:19:26 15:19:47 15:20:03 15:22:16| -***          |
     | 52066005-MRCSLD             15:22:16          15:22:16 15:22:17|    *          |
     | --- Mgr Process(1298666) 06/07/09 09:13:32                     |*****..........|
     \___________________________________
298667 52065927-MRCMON             15:12:16 15:12:44 15:12:46 15:18:06|*              |
298669 52065960-MRCNEW             15:18:01 15:18:16 15:18:23 15:18:29|*              |
     | 52065963-MRPEXPWF           15:18:29          15:18:29 15:19:03|**             |
     | 52065994-MRCSLD             15:21:17          15:21:34 15:21:36|   *           |
     | --- Mgr Process(1298669) 06/07/09 09:13:32                     |**.*...........|
     \___________________________________
298673 52065967-MRCMON             15:18:54 15:19:17 15:19:25 15:22:17|_zzz*          |
298676 52065968-MRCSDW             15:19:26          15:19:26 15:19:43| *             |
     | 52065997-MRCSLD             15:21:42          15:21:43 15:21:44|   *           |
     | 52065998-MRCSLD             15:21:42          15:21:44 15:21:45|   *           |
     | 52066003-MRCSLD             15:22:15          15:22:15 15:22:15|    *          |
     | 52066004-MRCSLD             15:22:15          15:22:16 15:22:16|    *          |
     | --- Mgr Process(1298676) 06/07/09 09:13:32                     |.*.**..........|
     \___________________________________
298677 52065976-MRCSDW             15:19:27          15:19:31 15:19:34| *             |
     | 52065973-MRCNSW             15:19:26 15:19:47 15:20:04 15:22:16| -***          |
     | --- Mgr Process(1298677) 06/07/09 09:13:32                     |.****..........|
     \___________________________________
298680 52065969-MRCNSW             15:19:26 15:19:47 15:19:55 15:22:01| ****          |
     | 52066002-MRCSLD             15:22:00          15:22:01 15:22:01|    *          |
     | --- Mgr Process(1298680) 06/07/09 09:13:32                     |.****..........|
     \___________________________________
298681 52065996-MRCSLD             15:21:37          15:21:38 15:21:39|   *           |
298682 52065975-MRCSDW             15:19:27          15:19:31 15:19:33| *             |
     | 52065972-MRCNSW             15:19:26 15:19:47 15:20:03 15:21:53| -**           |
     | 52065964-MRPAUREL           15:18:29 15:23:19 15:23:23 15:23:25|_____*         |
     | --- Mgr Process(1298682) 06/07/09 09:13:32                     |.***.*.........|
     \___________________________________

****** Manager Summary for CUST_MRP
     | Process    Jobs Busy%
     | -------- ------ -----
     | 1298663       3  26.8 06/07/09 09:13:32                        |zzzz**.........|
     | 1298664       8 19.73 06/07/09 09:13:32                        |******.........|
     | 1298665       0     0 06/07/09 09:13:32                        |...............|
     | 1298666       6 18.67 06/07/09 09:13:32                        |*****..........|
     | 1298667       1   .67 06/07/09 09:13:32                        |*..............|
     | 1298668       0     0 06/07/09 09:13:32                        |...............|
     | 1298669       3  4.67 06/07/09 09:13:32                        |**.*...........|
     | 1298670       0     0 06/07/09 09:13:32                        |...............|
     | 1298671       0     0 06/07/09 09:13:32                        |...............|
     | 1298672       0     0 06/07/09 09:13:32                        |...............|
     | 1298673       1 19.13 06/07/09 09:13:32                        |.zzz*..........|
     | 1298674       0     0 06/07/09 09:13:32                        |...............|
     | 1298675       0     0 06/07/09 09:13:32                        |...............|
     | 1298676       5  2.13 06/07/09 09:13:32                        |.*.**..........|
     | 1298677       2    15 06/07/09 09:13:32                        |.****..........|
     | 1298678       0     0 06/07/09 09:13:32                        |...............|
     | 1298679       0     0 06/07/09 09:13:32                        |...............|
     | 1298680       2    14 06/07/09 09:13:32                        |.****..........|
     | 1298681       1   .13 06/07/09 09:13:32                        |...*...........|
     | 1298682       3  12.6 06/07/09 09:13:32                        |.***.*.........|
     \___________________________________

Report Heading section


Looking at jobs submitted between 06/14/2009 15:18:00 and 06/14/2009 15:33:00
*** Process Flags - Running=*, Pending=-, Standby=_, Paused=~
*** Running...but sleeping=z
*** Manager Flags - Running=*, Idle =.

The heading section shows the start and end time of the report, which should match the criteria provided at submission time. In addition it lists the characters used to represent the various states a program can be in:

Running - The process is being executed by the concurrent manager.
Pending - The process is waiting for a manager to run the job.
Standby - The process is in conflict with another job and cannot run.
Paused - The job has submitted a child process and is now paused, waiting for the child process to complete.
Running but sleeping - The job has submitted a child process but did not go into the paused state. Instead it is listed as Running by the concurrent manager and is taking up a slot in the queue. The job is actually executing a sleep command as it waits for the child process to complete.

The queue is started with a number of workers, and each worker can be either executing a job or waiting to execute a job. The flags representing these 2 states are also displayed:

Running - The manager is actually processing a job.
Idle - The manager has no work to do and is not processing any jobs.
Queue Heading section
/* Processing requests for concurrent queue: CUST_MRP                 |               |
Manager                            Request  CRM      Actual   Actual  |  2         3  |
Proc.  Request-Program short name  Start    Release  Start    End     |890123456789012|
------ --------------------------- -------- -------- -------- --------|---------------|

Each queue has its own heading section. In this case the section introduces the activity of the CUST_MRP queue. The columns are as follows:
Manager Proc. - A queue spawns several workers to deal with the jobs in the queue, and each worker has a unique Manager Process ID. Tracking the performance of a queue therefore means tracking the performance of each worker and making sure that workers pick up jobs quickly. For this reason the report sorts jobs by the worker process that executed them.
Request-Program short name - This column provides the request ID and a substring of the name of the job executed by the manager process.
The next 4 columns provide the key milestones of a job:
o Request Start - Greater of REQUEST_DATE and REQUESTED_START_DATE
o CRM Release - May be blank if there are no incompatibilities; otherwise it contains the time the CRM released the job to run
o Actual Start - Actual start time
o Actual End - Actual completion time
The last set of columns represents time in minutes. Since this report was run for a 15 minute period, there are 15 columns in this section, one for each minute of the report. At every 10 minute interval a marker indicates whether it is 10, 20, 30, 40 or 50 minutes past the hour. In this case the report spanned both the 20 minute and 30 minute marks. A report run against 120 minutes would be 120 columns wide for this section alone; for that reason the report is limited to a maximum of 120 minutes.
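The minute ruler in the queue heading can be reproduced with a few lines of Python. This is only an illustrative sketch, not the report's actual code: the function name minute_ruler is hypothetical, and leaving the marker blank at the top of the hour is an assumption based on the statement that only 10 through 50 are marked.

```python
from datetime import datetime, timedelta

def minute_ruler(start: datetime, minutes: int) -> tuple[str, str]:
    """Return (tens_line, digits_line) for a report window of `minutes` columns."""
    tens, digits = [], []
    for i in range(minutes):
        minute = (start + timedelta(minutes=i)).minute
        # Bottom line: last digit of the clock minute for this column.
        digits.append(str(minute % 10))
        # Top line: tens digit, shown only at 10, 20, 30, 40 and 50.
        is_mark = minute % 10 == 0 and minute != 0
        tens.append(str(minute // 10) if is_mark else " ")
    return "".join(tens), "".join(digits)

# The sample report window: 15 minutes starting at 15:18.
tens, digits = minute_ruler(datetime(2009, 6, 14, 15, 18), 15)
print("|" + tens + "|")
print("|" + digits + "|")
```

For the sample window the bottom line reads 890123456789012, with 2 and 3 sitting above the zeros for 15:20 and 15:30.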

Worker Detail section


298663 52065923-MRCNSP 15:11:34 15:18:47 15:18:54 15:22:55|zzzz* |
| 52066010-MRCEAP 15:22:54 15:22:55 15:22:55| * |
| 52066012-MRPAUREL 15:22:58 15:23:19 15:23:25 15:23:25| _* |
| --- Mgr Process(1298663) 06/07/09 09:13:32 |zzzz**.........|
\___________________________________

Each worker is displayed in numerical order.
LINE1 - The first worker in the list ran 3 jobs. The first job spent nearly all of its time running child processes: it was in Running but Sleeping mode for most of the window (as denoted by the z character). Review the details of the report example in the previous section to see what it was waiting on.
LINE2 - The second job, MRCEAP, is not incompatible with anything, so the CRM Release column is blank. The time between its requested start and its actual start is 1 second. Each subsequent line executed by the same worker has a | instead of the worker number to indicate that it is part of the same worker flow.
LINE3 - The 3rd job does have incompatibilities defined and had to wait. Since it was requested to run at 15:22:58, it falls into Standby for the minute designated 22. It was released 21 seconds later, but based on when it was submitted and released it shows up with an _ to designate standby mode. It also ran in less than a second; even so, it is shown as having run during that minute. Clearly the graph is not exact. The intent is to give some representation of how busy a queue is and how jobs reacted to that activity. The detailed milestone times are supplied to give context to the graphical representation. As can be seen, many fast jobs can run in the same minute.
LINE4 - The next line sums up the activity of the worker that ran these jobs. It shows when the worker started and, if it has exited, its end time as well. It also pulls together the statuses from the individual jobs so it is clear how busy the worker was.
LINE5 - The last line of each worker is a separator line that marks the end of the worker.

It is important to note that not every worker will be represented. If a worker ran no jobs, it is not displayed in this section. In addition, if a worker ran only 1 job, no summary line is shown for it. For example, the following section shows worker 298667. Since it ran only a single job, the report skips the summary line and goes straight to the next worker.
298667 52065927-MRCMON 15:12:16 15:12:44 15:12:46 15:18:06|* |
298669 52065960-MRCNEW 15:18:01 15:18:16 15:18:23 15:18:29|* |
| 52065963-MRPEXPWF 15:18:29 15:18:29 15:19:03|** |

In these cases the summary line would equal the worker line so there is no need to show the summary line separately.
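The way a job's milestones turn into per-minute flags can be sketched as follows. This is a rough illustration only, not the report's logic: the function job_graph and the precedence Running over Pending over Standby within a minute are assumptions chosen because they reproduce the sample lines, and the z (running but sleeping) and ~ (paused) flags are left out because they depend on child-process data not shown here.

```python
from datetime import datetime, timedelta
from typing import Optional

def job_graph(window_start: datetime, minutes: int,
              request_start: datetime, crm_release: Optional[datetime],
              actual_start: datetime, actual_end: datetime) -> str:
    """Return one flag character per minute of the report window."""
    # Assumption: with no CRM release time the job is Pending from request start.
    pending_from = crm_release or request_start
    cells = []
    for i in range(minutes):
        lo = window_start + timedelta(minutes=i)
        hi = lo + timedelta(minutes=1)

        def touches(a: datetime, b: datetime) -> bool:
            # True when the closed interval [a, b] touches the minute [lo, hi).
            return a < hi and b >= lo

        if touches(actual_start, actual_end):
            cells.append("*")          # Running wins, even for sub-second jobs
        elif touches(pending_from, actual_start):
            cells.append("-")          # Pending: waiting for a manager
        elif crm_release is not None and touches(request_start, crm_release):
            cells.append("_")          # Standby: held until the CRM releases it
        else:
            cells.append(" ")
    return "".join(cells)

def t(hms: str) -> datetime:
    """Hypothetical helper: a clock time on the sample report date."""
    return datetime.strptime("2009-06-14 " + hms, "%Y-%m-%d %H:%M:%S")

# Job 52066012-MRPAUREL from the worker detail sample.
line = job_graph(t("15:18:00"), 15,
                 t("15:22:58"), t("15:23:19"), t("15:23:25"), t("15:23:25"))
print("|" + line + "|")
```

With these inputs the job shows _ in the column for minute 22 and * in the column for minute 23, which matches the description of LINE3: a job that ran for less than a second still occupies a full column.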

Manager summary section


****** Process Jobs Busy% Manager Summary for CUST_MRP |
-------- ------ -----|
1298663 3 26.8 06/07/09 09:13:32 |zzzz**.........|
| 1298664 8 19.73 06/07/09 09:13:32 |******.........|
| 1298665 0 0 06/07/09 09:13:32 |...............|
| 1298666 6 18.67 06/07/09 09:13:32 |*****..........|
| 1298667 1 .67 06/07/09 09:13:32 |*..............|
| 1298668 0 0 06/07/09 09:13:32 |...............|
| 1298669 3 4.67 06/07/09 09:13:32 |**.*...........|
| 1298670 0 0 06/07/09 09:13:32 |...............|
| 1298671 0 0 06/07/09 09:13:32 |...............|
| 1298672 0 0 06/07/09 09:13:32 |...............|
| 1298673 1 19.13 06/07/09 09:13:32 |.zzz*..........|
| 1298674 0 0 06/07/09 09:13:32 |...............|
| 1298675 0 0 06/07/09 09:13:32 |...............|
| 1298676 5 2.13 06/07/09 09:13:32 |.*.**..........|
| 1298677 2 15 06/07/09 09:13:32 |.****..........|
| 1298678 0 0 06/07/09 09:13:32 |...............|
| 1298679 0 0 06/07/09 09:13:32 |...............|
| 1298680 2 14 06/07/09 09:13:32 |.****..........|
| 1298681 1 .13 06/07/09 09:13:32 |...*...........|
| 1298682 3 12.6 06/07/09 09:13:32 |.***.*.........|
\___________________________________

The manager summary section provides an indication of the activity of the workers in the queue. Each line has the following columns:
Process - The unique worker ID as described before.
Jobs - The number of jobs the worker processed in the timeframe specified.
Busy% - The share of the total time available in the timeframe during which the worker was actually running a job. This is useful to see if the queue is maxed out. As mentioned earlier, a * in every column may only mean that 1 second of every minute was used; Busy% is a far more accurate gauge of the amount of time the worker spent doing work.
The start and end times of the queue are then provided. The last column repeats the summary graph shown earlier when the worker was originally displayed.
Note that in this section even workers that never ran a job are shown. This makes it easier to see whether capacity was available during the specified times.
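As a worked example of Busy%, the sketch below recomputes the figure for three workers from the Actual Start and Actual End times in the sample report. It is an illustration under stated assumptions, not the script's own calculation: the function busy_percent is hypothetical, and rounding each job's runtime to two decimal places of a minute before summing is inferred only because it reproduces the sample values (for example 12.6 rather than 12.67 for worker 1298682).

```python
from datetime import datetime

def t(hms: str) -> datetime:
    """Hypothetical helper: a clock time on the sample report date."""
    return datetime.strptime("2009-06-14 " + hms, "%Y-%m-%d %H:%M:%S")

def busy_percent(jobs, window_minutes: int) -> float:
    """jobs: list of (actual_start, actual_end) pairs for one worker."""
    # Assumption: each runtime is rounded to 0.01 minute before summing.
    busy_minutes = sum(round((end - start).total_seconds() / 60, 2)
                       for start, end in jobs)
    return round(busy_minutes / window_minutes * 100, 2)

# Actual Start / Actual End pairs taken from the worker detail samples.
worker_1298682 = [(t("15:19:31"), t("15:19:33")),
                  (t("15:20:03"), t("15:21:53")),
                  (t("15:23:23"), t("15:23:25"))]
worker_1298677 = [(t("15:19:31"), t("15:19:34")),
                  (t("15:20:04"), t("15:22:16"))]
worker_1298663 = [(t("15:18:54"), t("15:22:55")),
                  (t("15:22:55"), t("15:22:55")),
                  (t("15:23:25"), t("15:23:25"))]

for name, jobs in [("1298682", worker_1298682),
                   ("1298677", worker_1298677),
                   ("1298663", worker_1298663)]:
    print(name, busy_percent(jobs, 15))
```

Under these assumptions the three workers come out at 12.6, 15.0 and 26.8, in line with the Busy% column of the sample. Worker 1298663 also shows that time spent running but sleeping still counts as busy, since the worker's slot is occupied.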

Known Bugs
These scripts mine the job history tables to show how the concurrent manager is performing. However, there are known issues in the data:
Some jobs seem to resubmit themselves using the same request ID. This is an anomaly of the job and is hard to account for.
Outages and system downtimes cause huge spikes in both Standby time and Pending time.
Some of the logic that determines actual start time and job load may need to be refined. It seems to work well, but it is convoluted and there may be conditions where it does not produce the right values.