Measuring End-to-End Response Time Through ALL The Tiers

larry.klein@hotsos.com
07 March 2005

www.hotsos.com

Copyright © 1999–2005 by Hotsos Enterprises,

Slide 1

Agenda
• A Case Study
• Lessons Learned
• Generalizing a Method for End to End Performance Optimization


Slide 2

A Case Study Client XYZ
• Insurance company
• Custom application
  – under development for two years
  – nighttime batch load of transaction data
  – daytime online edits, reviews, approvals of transaction data
  – in “pilot production” phase for 10% of total online user base
    • some users in geographically distributed corporate offices
    • other users work from home via dialup or broadband VPN
• Online response times unacceptable
  – “everything is slow”
  – editors and reviewers are clerical and paid by “piece work”

• “Please help us fix our Oracle Database”

Slide 3

You should diagnose Oracle performance problems like you would diagnose any other performance problem.
1. “What’s the most important thing to be working on?”
   – Business priority
2. “Where did the system spend my time?”
   – Resource profile
   – Which SQL is responsible?
   – Which SQL to blame for time spent between db calls?
3. “What’s the most effective way to fix it?”
   – Eliminate waste, eliminate competition
   – Stop when “fixing” isn’t worth it
4. Repeat
Source: Millsap, C.; Holt, J. 2003. Optimizing Oracle Performance.

Slide 4

I arrived on Monday. Client promised to pull together Transaction Information by Wednesday. Meanwhile…

[Diagram: a lone Oracle Database box and a large “?” — the only known component of the environment]

Slide 5

Analysis Start with the Oracle Database
Why start with Oracle?
• Single element in common across customers’ environments
• Very well instrumented
• All I knew about Client XYZ’s environment
Here’s what I did:
• v$session showed 200 JDBC Thin Client sessions – Connection Pool!
• v$sess_io showed relatively “even” session usage
• picked 5 out of 200 sessions
• dbms_system.set_ev(sid, serial#, 10046, 12, '')
• captured and analyzed trace files


Slide 6

Trace File Analysis, Recommendations, Results
• “SQL*Net message from client” predominated
• “CPU service” for a handful of SQL statements taking one or a few seconds each
• Each SQL statement preceded by a dbms_describe
With much iterative, cut/paste testing:
• ran dbms_stats at 30%
• one new index
• one code change
• one hint
• individual SQL statements now subsecond
• fixes implemented
• made notes
  – to pursue reason why/possible elimination of describe
  – to review init.ora CBO parameters
    • optimizer_index_cost_adj = 10 not 100
    • optimizer_index_caching = 80 not 0
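The trace analysis above — summing elapsed time per wait event to see what predominates — can be sketched in a few lines. A minimal, hypothetical example, assuming the Oracle 9i-style 10046 trace layout in which WAIT lines report `ela` in microseconds (the sample lines are invented, not from the client’s traces):

```python
import re
from collections import defaultdict

# Matches 10046 WAIT lines such as:
#   WAIT #3: nam='SQL*Net message from client' ela= 900000 ...
WAIT_RE = re.compile(r"WAIT #\d+: nam='([^']+)' ela=\s*(\d+)")

def resource_profile(trace_lines):
    """Total elapsed microseconds per wait event, largest contributor first."""
    totals = defaultdict(int)
    for line in trace_lines:
        m = WAIT_RE.match(line)
        if m:
            totals[m.group(1)] += int(m.group(2))
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Invented sample lines standing in for a captured trace file
sample = [
    "WAIT #1: nam='SQL*Net message from client' ela= 900000 p1=1650815232",
    "WAIT #1: nam='db file sequential read' ela= 1200 p1=23",
    "WAIT #1: nam='SQL*Net message from client' ela= 450000 p1=1650815232",
]
for event, us in resource_profile(sample):
    print(f"{event:35s} {us / 1e6:8.3f} s")
```

With real traces, the event at the top of this list is where to look first — here, as on the slide, time between db calls would dominate.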


Slide 7

Meanwhile, Client Feedback on Thursday then again on Friday after Fixes
User Action | Response Time Before Fixes (seconds) | Response Time After Fixes (seconds) | Response Time Goal (seconds)
A           | 31                                   | 19                                  | 5
B           | 29                                   | 18                                  | 5
C           | 6                                    | 6                                   | 5
…           | …                                    | …                                   | 5

Clearly, not enough Progress!!!

Slide 8

Now What? What’s Happening End to End?

[Diagram: between the User and the Oracle Database lies a large “?” — the time the database sees only as “SQL*Net message from client”]


Slide 9

The End to End Question is easier in the IBM Mainframe world
[Diagram: Graphical Command Center attached to the Mainframe]

“The status light is red for the Catalog Center operation in Wichita because Carol Smith’s Order Entry transaction from there is taking 3.92 seconds which exceeds the SLA of 2.0 seconds; drilling down on the red status shows her online transaction waiting on the database, which is suffering from high I/O contention on file 23, due to large batch job XYZ that’s running right now at an inappropriate time of day.”

IBM Designed and Built Performance Considerations throughout its End to End Technology Stack

Slide 10

In the Open Systems World Our Strength is our Weakness…
• Lower cost of computing
• Robust solutions from multiple vendors
• No common or dominant performance management architecture across the end to end technology stack of multiple vendors

How can I figure out the End to End Here?


Slide 11

Hotsos Method R Reprised
1. “What’s the most important thing to be working on?”
   – Business priority
2. “Where did the system spend my time?”
   – Resource profile
   – Which SQL is responsible?
   – Which SQL to blame for time spent between db calls?
3. “What’s the most effective way to fix it?”
   – Eliminate waste, eliminate competition
   – Stop when “fixing” isn’t worth it
4. Repeat
Source: Millsap, C.; Holt, J. 2003. Optimizing Oracle Performance.

Slide 12

Generalizing Method R for End to End Performance Optimization
• Identify and Prioritize Important Transactions
• Determine Application Architecture
• Establish Instrumentation Plan
• Conduct Tests or Probe Production
• Measure Activity
• Compile Results
• Identify Opportunities to Optimize


Slide 13

Client XYZ’s Application Architecture After Many Conversations with Many “Silo” Teams
[Diagram: User PC running a .Net client → HTTP → Transaction Gateway. From the Gateway, SOAP calls flow both to a 3rd Party Workflow Product (interface, connection pool, workflow engine) that calls Oracle, and to an MQ Interface → MQ → CICS/Cobol]

Slide 14

How to Instrument this End to End?
Key for “Logging”
• Existing, custom written
• Existing, vendor provided

[Diagram: the same architecture annotated with measurement points — Stopwatch at the User PC, Sniffer on the HTTP link, .out file at the Transaction Gateway, .lis and .log files at the workflow interface and engine, 10046 traces in Oracle, and a .log file at the MQ Interface]


Slide 15

Log File Attributes, Issues, and Demonstration Tests
Attributes
• Timestamps – start, end, duration (calculated)
• Transaction/Step name – current process, called process

Issues
• Awareness – Client’s Production Support team didn’t know that logs were available or how to enable them
  – Original architects no longer involved with project
  – “Custom” logs built by silo teams for their own purposes but not widely known to other teams
  – “Standard” logs supplied by vendors but not widely known by project team
• Inconsistent timestamp granularity across log files

Demonstration Tests – goal: collate activities cross-platform by timestamp
• Test 1: Fail – UNIX servers not time synched
• Test 2: Fail – Windows servers not time synched
• Test 3: Fail – logging filesystems filled up
• Test 4: Fail – Sniffer buffer filled up
• Test 5: Success
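Once per-server clock skew is measured, the collation that finally succeeded can be sketched as a sort over skew-corrected timestamps. The server names, offsets, and log entries below are hypothetical stand-ins for the client’s real logs:

```python
from datetime import datetime, timedelta

# Hypothetical measured skew per server (e.g. probed against one reference
# clock before the test); real environments need these measured, not guessed.
clock_offset = {
    "websrv": timedelta(seconds=-4),
    "wfsrv":  timedelta(seconds=+11),
    "dbsrv":  timedelta(0),
}

def collate(entries):
    """entries: (server, raw timestamp, message); returns skew-corrected order."""
    corrected = [(ts + clock_offset[srv], srv, msg) for srv, ts, msg in entries]
    return sorted(corrected)

# Invented entries: raw timestamps are out of order until skew is applied
entries = [
    ("wfsrv",  datetime(2005, 3, 7, 10, 0, 1),  "Workflow Wd start"),
    ("websrv", datetime(2005, 3, 7, 10, 0, 14), "SOAP Call 1 sent"),
    ("dbsrv",  datetime(2005, 3, 7, 10, 0, 13), "Oracle proc xLKUP"),
]
for ts, srv, msg in collate(entries):
    print(ts, srv, msg)
```

The corrected order (SOAP call, then workflow, then database procedure) is what makes the cross-tier transaction flow on the following slides readable.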


Slide 16

The Logs were Helpful To Reveal Discrete Actions per Architecture Component
.Net Client User Actions: A, B, C, …
Trans Gtwy SOAP Calls: S1, S2, S3, …
Called Workflows: Wa, Wb, Wc, …
Called Oracle Procedures: GET…, DO…, xyzLKUP…, …


Slide 17

The Logs when Collated by Timestamp were Helpful To Map the User Action/Transaction Flow
• User Action A
  – SOAP Call 1
    • Workflow Wd
      – Oracle Proc xLKUP
      – Oracle Proc GET…
  – SOAP Call 2
    • Workflow Wa
      – Oracle Proc jLKUP
• User Action B
  – SOAP Call 7
    • Workflow Wz
      – Oracle Proc jLKUP
      – MQ Call 23
  – SOAP Call 9
    • Workflow Wz
      – Oracle Proc Do…
      – MQ Call 17
  – SOAP Call 8
    • Workflow Wg
      – Oracle Proc yLKUP
  – SOAP Call 1
    • Workflow Wd
      – Oracle Proc xLKUP
      – Oracle Proc GET…


Slide 18

The Logs were Helpful To Create a Resource Profile
[Diagram: resource profile for User Action “A” overlaid on the architecture — the measured times (0.08, 8.70, 18.00, and 9.00 seconds) attributed to the .Net client/HTTP hop, the Gateway/SOAP tier, the workflow engine/Oracle tier, and the MQ/CICS tier]

Slide 19

The Logs were Helpful To Reveal “Too Much Work” and Other Issues
SOAP Call | Type   | Count | % of Total Count | Sum (msec) | % of Total Sum | Avg (msec)
S1        | Lookup |    24 |           18.18% |   73980.00 |         21.36% |   3082.50
LOGIN     | Other  |     1 |            0.76% |    3213.00 |          0.93% |   3213.00
S2        | Lookup |    81 |           61.36% |  185327.00 |         53.51% |   2287.99
S3        | Other  |    14 |           10.61% |   44598.00 |         12.88% |   3185.57
S4        | Other  |    12 |            9.09% |   39222.00 |         11.32% |   3268.50
Totals:   |        |   132 |          100.00% |  346340.00 |        100.00% |   2623.79

Slide 20
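A profile table like the one above can be compiled mechanically from raw (call, milliseconds) log events: count, sum, average, and percentage of total per call, sorted by total time descending. A sketch with invented data, not the slide’s actual measurements:

```python
# Compile a per-call profile from raw (call, msec) log events.
# Call names echo the slide's S1/S2 but the numbers are illustrative.
def profile(events):
    """Per call: (name, count, sum_ms, avg_ms, pct_of_total), by sum desc."""
    out = {}
    for call, msec in events:
        c, s = out.get(call, (0, 0.0))
        out[call] = (c + 1, s + msec)
    total = sum(s for _, s in out.values())
    return [(call, c, s, s / c, 100.0 * s / total)
            for call, (c, s) in sorted(out.items(),
                                       key=lambda kv: kv[1][1], reverse=True)]

events = [("S1", 3000.0), ("S1", 3165.0), ("S2", 2300.0), ("S2", 2276.0)]
for call, cnt, total_ms, avg_ms, pct in profile(events):
    print(f"{call}: n={cnt} sum={total_ms:.0f}ms avg={avg_ms:.1f}ms {pct:.2f}%")
```

Sorting by total time, not average, is what exposed “too much work” — S2’s individual calls were the cheapest but its 81 executions dominated the total.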


Client XYZ Status
• “Thanks – this is the best End to End View we’ve ever had!”
• Using logging on an ongoing basis
• Collapsing multiple, ongoing static lookups
  – one static lookup at login
  – cache lookups in .Net client
  – realize tradeoff of one slightly longer login for many 6-8 second savings per edit or review throughout the day
• Reevaluating current Transaction Gateway Usage (XML Lovefest)
  – currently maps .Net client inbound SOAP/XML to an alternative SOAP/XML format before passing downstream to Workflow or MQ; receivers need to reparse for their own purposes
  – substantial latency/cpu consumption from SOAP/XML parsing and manipulation
  – considering simpler non-XML map for messages “once in glass house” to eliminate redundant construction/parsing overhead
• Reevaluating current Workflow Engine Usage
  – currently used as “database driver” but each workflow = 1000+ steps around a db call
  – considering using only when “flow” is required, otherwise call db from Gateway


Slide 21

Lessons Learned If you are an Oracle Performance Analyst…
• Current methods can quickly isolate problems to be inside or upstream of the database
• If the problems are “not Oracle”, you can still add HUGE value
  – you already are successful within an existing method framework
  – you are a big picture thinker
  – you know what questions need answers
  – you have the confidence to lead
  – you can be an objective facilitator across many people and technologies
  – you can dig out the details from simple log file analysis
• You should check your eyeglass prescription before reviewing logs
• You need to expand your past successes with the Generalized Method R for End to End Optimization


Slide 22

Lessons Learned If you are Building or Managing an Application
• Performance needs to be architected into an Application by Design
• Performance needs to be a “mindset” throughout a Project Life Cycle, not an afterthought months or years later
• When in “Design Doubt,” Performance Proofs of Concept can channel development efforts in a positive and efficient direction
• Performance methods and tools need to be
  – well-tested
  – documented
  – implemented
• You need to Employ the Generalized Method R for End to End Performance Optimization


Slide 23

The Generalized Method R for End to End Performance Optimization
• Identify and Prioritize Important Transactions
• Determine Application Architecture
• Establish Instrumentation Plan
• Conduct Tests or Probe Production
• Measure Activity
• Compile Results
• Identify Opportunities to Optimize


Slide 24

Identify and Prioritize Important Transactions
• Critical to the business
• Priority order
• Use case
  – user
  – location
  – navigation path
  – data
• Current and desired response time
• Improvement’s value to the business


Slide 25

Determine Application Architecture
[Diagram: User → Trans → Web Server → Service Request → Middle Tier → DB Call → Oracle Database]
• Identify the major technology building blocks through which the business transactions flow
• Identify the service calls/interfaces among the blocks
• Construct a “sequence diagram” template to map the blocks:
  User Time | Unacct Time | Web Time | Unacct Time | Middle Tier Time | Unacct Time | Database Time
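The “Unacct Time” boxes in the template are derived rather than measured: the end-user total minus the sum of what each tier’s instrumentation accounts for. A trivial sketch of that subtraction (tier names and numbers are illustrative only):

```python
# Derive "unaccounted-for" time for one transaction from the end-user
# stopwatch total and each tier's measured contribution.
def unaccounted(total_s, tier_times):
    """tier_times: {tier: seconds measured in that tier's logs/traces}."""
    return total_s - sum(tier_times.values())

# Illustrative numbers: 19 s stopwatch time, three instrumented tiers
tiers = {"web": 0.4, "middle": 8.7, "database": 6.1}
print(f"unaccounted: {unaccounted(19.0, tiers):.1f} s")
```

If the unaccounted-for remainder is large, the instrumentation plan needs refinement and another test — that is the decision point on the Compile Results slide.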


Slide 26

Establish Instrumentation Plan
For each transaction,
• establish how to measure end-user time
• establish how to measure response time consumed by each building block in the architecture
[Diagram: the sequence User → Web Server → Middle Tier → Oracle Database annotated with measurement options — Stopwatch or Sniffer at the User, a log file at the Web Server, a log file at the Middle Tier, and traces at the Database]
Slide 27


Conduct Tests or Probe Production
• Define goals
• Construct a capture plan
  – How/who will drive transactions?
  – How/who will support the logging?
• Prepare to execute the plan
  – Time synch all platforms
  – Enough space for logging?
  – “Skinny down” test connection pool
• Execute the plan
• Measure activity
• Preserve the results, outputs
Slide 28


Measure Activity

Phase  | Steps
Before | Enable data collection for each relevant block
During | Confirm that data are being collected properly
After  | Turn off data collection; capture the results


Slide 29

Compile Results
• Use the Sequence Diagram Template
• Sift through the collected data
• Key individual event data into the template
• Record or calculate total “accounted for” time
• Derive “unaccounted for” time
• Determine if “unaccounted for” requires refinement and a new test


Slide 30

Identify Opportunities to Optimize
• Review sequence diagrams
• Construct resource profiles
• In descending order by time consumption
  – identify the component consuming the most time
  – evaluate optimization opportunities
    • eliminating unnecessary service calls
    • combining many service calls into fewer
    • tuning individual service calls or actions


Slide 31

References
Millsap, C.; Holt, J. 2003. Optimizing Oracle Performance. Sebastopol CA: O’Reilly. This book provides a full description of Method R, a detailed reference for Oracle’s extended SQL trace facility, an introduction to queueing theory for the Oracle practitioner, and a set of worked performance improvement example cases.


Slide 32
