Quick Reference
Initial data to collect
Windows Event Log – Application and System
o Any unusual error messages, for example, degraded components or errors/warnings
that coincide with the issue?
SQL Server Error Log
o Any unusual errors or warnings?
sys.dm_os_waiting_tasks
o Who is currently waiting? What are they waiting on?
sys.dm_exec_connections, sys.dm_exec_requests, sys.dm_exec_sessions
o Who is currently running? What are they running?
sys.dm_os_wait_stats
o What share of total wait time does each wait type account for since the stats were last cleared manually or the instance was last restarted?
Ask whether this is a virtual machine or a physical server
o If it is virtual, you need to understand the following:
o Resources allocated to the guest?
vCPUs
Memory
o Provisioning at the host level?
# of guests at the host level
Resources provisioned to those guests
o Any restrictions / limits by CPU and Memory?
o Access to vCenter stats?
Because the in-guest % Processor Time counter does not represent physical hardware resource
consumption, we’ll want to see VM Processor\% Processor Time instead
o General methods for monitoring VMware include esxtop (host level for admins), Virtual
Center and in-guest performance counters
I’m going to assume that DBAs won’t have access to esxtop, so I would
recommend requesting “Read Only” access for Virtual Center to allow the ability
to view the state of a VM.
Within Virtual Center we’ll want to see “CPU ready” summation values (see the following KB
for converting summation values to percentages:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2002181)
For VMware vSphere ESX 4 and higher, the guest counters should be built-in
with no additional permissions needed. Noteworthy counters include:
% Processor Time (at host level)
Host Processor Speed (MHz)
Limit (MHz)
Reservation (MHz)
Memory Limit (MB)
Memory Reservation (MB)
Memory Ballooned (MB), which should never show non-zero values
Memory Swapped (MB) – another one that should always be 0
o VMware host power settings (should be “high performance”)
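The “CPU ready” conversion mentioned above can be sketched as follows. This is a minimal sketch, assuming the formula from the VMware KB referenced earlier (summation value in milliseconds divided by the chart update interval); the function and dictionary names are hypothetical, and the interval values are as I recall them and should be checked against the KB before relying on them.

```python
# Chart update intervals in seconds for the Virtual Center performance charts.
# ASSUMPTION: values recalled from the VMware KB cited above; verify them there.
CHART_INTERVAL_SECONDS = {
    "realtime": 20,
    "past_day": 300,
    "past_week": 1800,
    "past_month": 7200,
    "past_year": 86400,
}


def cpu_ready_percent(summation_ms, chart="realtime"):
    """Convert a CPU ready summation value (ms) into a percentage of the interval."""
    interval_ms = CHART_INTERVAL_SECONDS[chart] * 1000
    return summation_ms * 100.0 / interval_ms


# Hypothetical reading: 1000 ms of ready time in one realtime (20 s) sample.
print(f"{cpu_ready_percent(1000):.1f}%")
```

The summation value accumulates across the sampling interval, so the same raw number means very different things on a realtime chart and on a past-day chart; that is why the chart type is a parameter here.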
Look at running traces (sys.traces) or XE sessions
o Anything running that is non-standard? Be sure to look out for “observer overhead”
Perfmon stats (most recent counter logs)
o Use these numbers to look for non-standard values or values that deviate from the
baseline
Top resource consumers from sys.dm_exec_query_stats
o Who are the top consumers of I/O and CPU?
o Who are the top consumers if you group by query_hash? (SQL Server 2008 and later)
SQL Server instance settings from sys.configurations
o Any non-default or non-standard settings?
o Any changes since the last check?
Database settings from sys.databases
o Any non-default or non-standard settings?
“Auto” options misconfigured?
Shrink enabled?
Uncommon non-default values?
High Virtual Log File counts?
File placement (any files on the C: drive)?
Default tempdb single-file?
Multiple transaction log files? (There is no good long-term reason to have more than one.)
Database compatibility level not current?
Auto-growth by percent instead of by fixed size?
Page verification not set to CHECKSUM?
o Any changes since the last check?
Is Resource Governor configured or any of the RG defaults modified?
What are the OS/BIOS power settings? (should be “high performance”)
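The wait percentage question under sys.dm_os_wait_stats comes down to each wait type's share of the total accumulated wait time. Here is a minimal sketch of that arithmetic, assuming the rows have already been fetched as (wait_type, wait_time_ms) pairs. The function name and the sample figures are made up for illustration, and in practice benign background wait types are usually filtered out before computing shares.

```python
def wait_percentages(wait_stats):
    """Return (wait_type, percent_of_total) pairs, largest share first."""
    total_ms = sum(ms for _, ms in wait_stats)
    if total_ms == 0:
        return []
    ranked = sorted(wait_stats, key=lambda row: row[1], reverse=True)
    return [(wait_type, ms * 100.0 / total_ms) for wait_type, ms in ranked]


# Hypothetical sample rows, not taken from a real server.
sample = [("CXPACKET", 3000), ("PAGEIOLATCH_SH", 6000), ("WRITELOG", 1000)]
for wait_type, pct in wait_percentages(sample):
    print(f"{wait_type:<16}{pct:5.1f}%")
```

Because the counters are cumulative since the last clear or restart, comparing two snapshots taken some time apart gives a better picture of the current workload than a single reading.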
Recommended reading
This reference is intended to summarize a very broad subject. Therefore, at minimum, it is
recommended you familiarize yourself with the following freely downloadable references:
Patterns
The following table is adapted, modified and expanded from two original sources: “SQL Server: Common
Performance Issue Patterns” (Pluralsight course) and “Performance Guidance for SQL Server in Windows
Azure Virtual Machines” (SQL CAT Whitepaper):
Symptoms:
o Signs of significant I/O (shared symptoms)
o Pattern may involve specific errors in conjunction with degradation:
o Error 701 “There is insufficient system memory to run this query.”
o Error “A significant part of sql server process memory has been paged out.”
o Error “Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE”
What to check:
o Any query hints being used?
o Any high cost sort operations? And if so, can the sorts be avoided?
o Check for Hash Match operations – are they appropriate given the estimated vs. actual row counts?
o If a 32-bit system, is this virtual memory pressure?
o Determine if this is external memory pressure: Heavy non-SQL activities? Collocated instances? Faulty drivers?
Symptoms:
o Other (lower cost) queries still running fine concurrently
o The failing query may run fine in quiet periods, but concurrently seeing issues
o Seeing RESOURCE_SEMAPHORE wait type
o Seeing Memory Grants Pending counter at non-zero values and output from sys.dm_exec_query_memory_grants
What to check:
o Estimated memory request may be distorted due to cardinality estimate issues
o Look for sort & hash operations
o Test with and without parallelism (decreasing can reduce memory requirements)
o Look for other high memory consuming queries to tune
o Using Resource Governor? Check for misconfigurations
Symptoms:
o You see lock escalation events reported in SQL Trace or via Extended Events
What to check:
o Can you break it up into smaller sets of operations?
o Missing or disused indexes? Bookmark lookups?
The Profiler SP:Recompile/SQL:StmtRecompile EventSubClass can be used to validate the reasons behind recompiles
o Schema, statistics, SET options, temp table changes, hints
Tempdb allocation page contention
Symptoms:
o Slow performance on workloads
o Latch waits on tempdb resources like 2:1:1 (first PFS page) & 2:1:3 (first SGAM page)
o You have query workloads that heavily rely on temporary tables
o Query plans may show spools / sorts
o SET STATISTICS IO worktables
What to check:
o Validate number of equally sized tempdb data files
o CSS guidance: <= 8 cores, #files = #cores; > 8 cores, #files = 8 (add in 4-file chunks / monitor for contention)
o Is TF 1118 enabled for SGAM contention?
o Can the workloads be further optimized to reduce reliance on tempdb?
Tempdb Row Versioning
Symptoms:
o Query degradation in association with any of the following: row versioning isolation levels, online index operations, triggers
What to check:
o Check Tempdb DMVs
o Is tempdb appropriately sized?
o Is tempdb collocated with other high I/O consumers?
Tempdb I/O Issues
Symptoms:
o Tempdb usage is high and it is collocated with other databases
o Other databases also have high I/O demands
o Tempdb usage could be a combination of user objects, row versioning activity and workspace memory
What to check:
o Start with troubleshooting the workloads first
o Don’t resort to an I/O path solution until you’ve eliminated opportunities for workload optimization
o When in doubt, try to isolate tempdb from other I/O activity
o But make sure that the I/O path is sufficient to meet I/O demands
Tempdb Query Workspace Overhead
Symptoms:
o Slow performance associated with the following query operations: hash joins, aggregates, spools, cursors
What to check:
o Start with troubleshooting the workloads first: missing indexes, cardinality estimate issues, unnecessary sorts
Query Plan Quality issues
Symptoms:
o You are evaluating a set of query execution plans which are associated with poorly performing queries
o Performance may be good sometimes and bad other times (not predictable)
o Row estimated versus actual is significantly skewed (for example, “1,000,000” rows estimated, but actually only “1” row, or vice versa)
What to check:
o Key areas to validate: execution plan; associated statistics, indexing; query construction
o Look for constructs that can impact plan quality and proper cardinality estimates: table variables, TVFs, MSTVFs, scalar functions, modifying in-flight variables, data-type conversions, wrapping indexed columns in expressions, missing indexes, missing or stale statistics
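The tempdb data file guidance listed under the tempdb allocation page contention pattern (one file per core up to 8 cores, then 8 files) can be written out as a small helper. This is only a sketch of the starting point stated in the table; the function name is hypothetical, and the later step of adding files in 4-file chunks while monitoring for contention is a judgment call that is deliberately not automated here.

```python
def starting_tempdb_data_files(cores):
    """Starting number of equally sized tempdb data files for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # <= 8 cores: one file per core; > 8 cores: start at 8 files.
    return min(cores, 8)


for cores in (4, 8, 16):
    print(cores, "cores ->", starting_tempdb_data_files(cores), "data files")
```

If latch waits on the allocation pages persist at 8 files, the guidance in the table is to add 4 more files at a time and re-check, rather than jumping straight to one file per core.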