
SQL Server: Part 1 : Approaching Database Server Performance Issues

When you work as a DBA, many people will approach you with a complaint like "The application is
taking ages to load the data on a page; could you please check whether something is going wrong
with the database server?" There might be hundreds of other reasons for the slowness of the page.
It might be a problem with the application server, network issues, a really bad implementation, or
a problem with the database server caused by a huge report or job running at that moment. Whatever
the issue is, the database gets the blame first. It is then our responsibility to cross-check the
server state.

Let us discuss how we can approach this issue. I use the following scripts to diagnose the problem.
The first script that I run against the server is given below:
SELECT
parent_node_id AS Node_Id,
COUNT(*) AS [No.of CPU In the NUMA],
SUM(COUNT(*)) OVER() AS [Total No. of CPU],
SUM(runnable_tasks_count ) AS [Runnable Task Count],
SUM(pending_disk_io_count) AS [Pending disk I/O count],
SUM(work_queue_count) AS [Work queue count]
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
GROUP BY parent_node_id

This gives the following information:

• The number of records in the output equals the number of NUMA nodes (if only one record is
returned, it is not a NUMA-supported server).
• Node_id : NUMA node id. It can be mapped to the Node_id column of the later scripts.
• No.of CPU In the NUMA : Number of CPUs (schedulers) assigned to the specific NUMA node.
• Total No. of CPU : Total number of CPUs available in the server. If you have set the affinity
mask, the total number of CPUs assigned to this instance.
• Runnable Task Count : Number of workers, with tasks assigned to them, that are waiting to be
scheduled on the runnable queue. In short, the number of requests in the runnable queue. To
understand more about the runnable queue, read my earlier post.
• Pending disk I/O count : Number of pending I/Os that are waiting to be completed. Each
scheduler has a list of pending I/Os that are checked, every time there is a context switch, to
determine whether they have been completed. The count is incremented when the request is
inserted and decremented when the request is completed.
• Work queue count : Number of tasks in the pending queue. These tasks are waiting for a
worker to pick them up.

I have scheduled this script to store the output of this query in a table every 10 minutes for two
days. That gives baseline data about what is normal in your environment. In my environment, people
start complaining once the Runnable Task Count of most of the nodes goes beyond 10 consistently.
In a normal scenario, the Runnable Task Count is always below 10 on each node, and I have never
seen a value greater than 0 in the Work queue count field. This gives a picture of the current
state of the system. If the output of this step is normal, we are safe to an extent: the slow
response might be caused by something beyond our control, or by blocking, in which case the slow
response affects only a couple of screens (sessions) and not the entire system.
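
The baseline collection described above can be sketched as follows. This is only a minimal sketch: the table name dbo.SchedulerBaseline and its column names are hypothetical, and the scheduling itself (for example a SQL Server Agent job that runs the INSERT every 10 minutes) is assumed rather than shown.

```sql
-- Hypothetical baseline table; the name and columns are illustrative only.
CREATE TABLE dbo.SchedulerBaseline
(
    capture_time          DATETIME NOT NULL DEFAULT (GETDATE()),
    node_id               INT      NOT NULL,
    cpu_count             INT      NOT NULL,
    runnable_tasks_count  INT      NOT NULL,
    pending_disk_io_count INT      NOT NULL,
    work_queue_count      BIGINT   NOT NULL
);
GO

-- Run this statement on a schedule (assumed: every 10 minutes for two days).
-- capture_time is filled in by the column default.
INSERT INTO dbo.SchedulerBaseline
    (node_id, cpu_count, runnable_tasks_count, pending_disk_io_count, work_queue_count)
SELECT parent_node_id,
       COUNT(*),
       SUM(runnable_tasks_count),
       SUM(pending_disk_io_count),
       SUM(work_queue_count)
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
GROUP BY parent_node_id;
GO

-- Summarise the baseline per NUMA node to see what "normal" looks like.
SELECT node_id,
       MAX(runnable_tasks_count)                        AS max_runnable,
       AVG(CAST(runnable_tasks_count AS DECIMAL(10,2))) AS avg_runnable,
       MAX(work_queue_count)                            AS max_work_queue
FROM dbo.SchedulerBaseline
GROUP BY node_id
ORDER BY node_id;
```

Comparing a live reading against max_runnable and avg_runnable from such a table is one way to decide whether the current Runnable Task Count is unusual for your environment.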

SQL Server: Part 2 : Approaching Database Server Performance Issues

In Part 1, we saw how to quickly check the runnable tasks and pending I/O on a SQL Server
instance. That script is very lightweight: it returns a result even if the server is under
pressure and gives the overall state of the server at that moment.

The next step (Step 2) in my way of diagnosing is to check the sessions that are waiting for any
resource. The script below will help us. This query requires a function as a prerequisite, which
helps us display the SQL Server Agent job name if the session was started by SQL Server Agent.

/*****************************************************************************************
PREREQUISITE FUNCTION
*****************************************************************************************/
USE MASTER
GO
-- Converts a hex string (with or without a leading '0x') to its binary value
CREATE FUNCTION ConvertStringToBinary ( @hexstring VARCHAR(100) )
RETURNS BINARY(34) AS
BEGIN
RETURN (SELECT CAST('' AS XML).value(
'xs:hexBinary(substring(sql:variable("@hexstring"), sql:column("t.pos")))', 'varbinary(max)')
FROM (SELECT CASE SUBSTRING(@hexstring, 1, 2) WHEN '0x' THEN 3 ELSE 0 END) AS t(pos))
END
GO
/*****************************************************************************************
STEP 2: List the sessions which are currently waiting for a resource
*****************************************************************************************/
SELECT node.parent_node_id AS Node_id,
es.HOST_NAME,
es.Login_name,
CASE WHEN es.program_name LIKE '%SQLAgent - TSQL JobStep%' THEN
(
SELECT 'SQL AGENT JOB: '+name FROM msdb..sysjobs WHERE job_id=
MASTER.DBO.ConvertStringToBinary
(LTRIM(RTRIM((SUBSTRING(es.program_name,CHARINDEX('(job',es.program_name,0)+4,35)))))
)
ELSE es.program_name END AS [Program Name] ,
DB_NAME(er.database_id) AS DatabaseName,
er.session_id,
wt.blocking_session_id,
wt.wait_duration_ms,
wt.wait_type,
wt.NoThread ,
er.command,
er.status,
er.wait_resource,
er.open_transaction_count,
er.cpu_time,
er.total_elapsed_time AS ElapsedTime_ms,
er.percent_complete ,
er.reads,
er.writes,
er.logical_reads,
wlgrp.name AS ResoursePool ,
SUBSTRING (sqltxt.TEXT,(er.statement_start_offset/2) + 1,
((CASE WHEN er.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), sqltxt.TEXT)) * 2
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS [Individual Query],
sqltxt.TEXT AS [Batch Query]
FROM (SELECT session_id, SUM(wait_duration_ms) AS wait_duration_ms, wait_type,
             blocking_session_id, COUNT(*) AS NoThread
      FROM sys.dm_os_waiting_tasks
      GROUP BY session_id, wait_type, blocking_session_id) wt
INNER JOIN sys.dm_exec_requests er ON wt.session_id = er.session_id
INNER JOIN sys.dm_exec_sessions es ON es.session_id = er.session_id
INNER JOIN sys.dm_resource_governor_workload_groups wlgrp ON wlgrp.group_id = er.group_id
INNER JOIN (SELECT os.parent_node_id, osw.task_address
            FROM sys.dm_os_schedulers os
            INNER JOIN sys.dm_os_workers osw ON os.scheduler_address = osw.scheduler_address
            WHERE os.status = 'VISIBLE ONLINE'
            GROUP BY os.parent_node_id, osw.task_address) node
    ON node.task_address = er.task_address
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS sqltxt
WHERE er.sql_handle IS NOT NULL
  AND wt.wait_type NOT IN ('WAITFOR', 'BROKER_RECEIVE_WAITFOR')
GO

The descriptions of the columns in the result are given below.

Column Name : Description

Node Id : NUMA node id. It can be mapped to the node id of the scheduler query.
Host_Name : Name of the computer from which the connection originated.
Login Name : Login used by the session to connect to the database server.
Program Name : Name of the program/application using this session. You can set the application
name in the connection string. If this session is part of a SQL Server Agent job, it shows the
job name.
Database Name : Current database of the session.
Session Id : The session id.
Blocking Session id : Id of the session that is blocking this request.
wait_duration_ms : Total wait time for this wait type, in milliseconds. This time is inclusive
of signal wait time.
wait_type : Name of the wait type, such as SLEEP_TASK, CXPACKET, etc.
No of Thread : Number of waiting threads (tasks) of this session for the given wait type. It is
greater than 1 when the session is executing in parallel.
Command : Identifies the current type of command that is being processed, such as SELECT,
INSERT, UPDATE, DELETE, etc.
Status : Status of the request. This can be one of the following: Background, Running, Runnable,
Sleeping and Suspended.
Wait Resource : Resource for which the request is currently waiting.
Open Transaction count : Number of transactions that are open for this request.
Cpu Time : CPU time, in milliseconds, that is used by the request.
Total Elapsed Time : Total time elapsed, in milliseconds, since the request arrived.
Percent_Complete : Percentage of work completed for certain operations such as backup, restore,
rollback, etc.
Reads : Number of reads performed by this request.
Writes : Number of writes performed by this request.
logical_reads : Number of logical reads performed by this request.
ResoursePool : Name of the Resource Governor workload group of the request.
Individual Query : Current statement of the batch running on this session.
Batch Query : Current batch (procedure/set of SQL statements) running on this session.
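
The job-name lookup behind the Program Name column can be tried in isolation. The sketch below uses a made-up program_name value in the format the Step 2 query expects, and extracts the hex job id with the same expression. As an alternative to the helper function, it also converts the hex string with CONVERT style 1; this assumes SQL Server 2008 or later and is not part of the original scripts.

```sql
-- Hypothetical program_name value; the job id below is invented for illustration.
DECLARE @program_name NVARCHAR(128) =
    N'SQLAgent - TSQL JobStep (Job 0x0123456789ABCDEF0123456789ABCDEF : Step 1)';

-- Same expression as in the Step 2 query: skip '(Job', take the next 35 characters
-- (one space, then '0x' and 32 hex digits) and trim the space.
-- Note: matching '(job' against '(Job' assumes a case-insensitive collation.
DECLARE @hexstring VARCHAR(100) =
    LTRIM(RTRIM(SUBSTRING(@program_name, CHARINDEX('(job', @program_name, 0) + 4, 35)));

SELECT @hexstring AS job_id_hex,
       -- Assumed alternative to ConvertStringToBinary: CONVERT with style 1
       -- parses a '0x'-prefixed hex string into binary.
       CONVERT(UNIQUEIDENTIFIER, CONVERT(BINARY(16), @hexstring, 1)) AS job_id;
```

The resulting job_id value is what gets compared with msdb..sysjobs.job_id to find the job name.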

If there is a session with a very long wait_duration_ms that is not blocked by any other session
and does not go away from the list in subsequent executions of the same query, I look into the
program name, host name, login name and the statement that is running, which gives me an idea
about the session. Based on all this information, I might decide to kill that session and look
into the implementation of that SQL batch. If the session is blocked, I look into the blocking
session using a different script, which I will share later. (Refer this post)
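
If you do decide to kill a session, a minimal sketch of the sequence might look like this. The session id 53 is hypothetical; take the real one from the session_id column of the Step 2 output.

```sql
-- 53 is a hypothetical session id taken from the Step 2 output.
-- Confirm who owns the session before acting on it.
SELECT session_id, host_name, login_name, program_name, status, last_request_start_time
FROM sys.dm_exec_sessions
WHERE session_id = 53;

-- KILL takes a literal session id, not a variable.
KILL 53;

-- Optional: report the rollback progress of the session that is being killed.
KILL 53 WITH STATUSONLY;
```

Keep in mind that killing a session rolls back its open transaction, which can itself take a long time for a large batch.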

The next step (Step 3) is to list all sessions that are currently running on the server. I use the
query below to do that.

/*****************************************************************************************
STEP 3: List the sessions which are currently waiting/running
*****************************************************************************************/
SELECT node.parent_node_id AS Node_id,
es.HOST_NAME,
es.login_name,
CASE WHEN es.program_name LIKE '%SQLAgent - TSQL JobStep%' THEN
(SELECT 'SQL AGENT JOB: '+name FROM msdb..sysjobs WHERE
job_id=MASTER.DBO.ConvertStringToBinary
(LTRIM(RTRIM((SUBSTRING(es.program_name,CHARINDEX('(job',es.program_name,0)+4,35)))))
) ELSE es.program_name END AS program_name ,
DB_NAME(er.database_id) AS DatabaseName,
er.session_id,
wt.blocking_session_id,
wt.wait_duration_ms,
wt.wait_type,
wt.NoThread ,
er.command,
er.status,
er.wait_resource,
er.open_transaction_count,
er.cpu_time,
er.total_elapsed_time AS ElapsedTime_ms,
er.percent_complete ,
er.reads,er.writes,er.logical_reads,
wlgrp.name AS ResoursePool ,
SUBSTRING (sqltxt.TEXT,(er.statement_start_offset/2) + 1,
((CASE WHEN er.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), sqltxt.TEXT)) * 2
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS [Individual Query],
sqltxt.TEXT AS [Batch Query]
FROM sys.dm_exec_requests er
INNER JOIN sys.dm_exec_sessions es ON es.session_id = er.session_id
INNER JOIN sys.dm_resource_governor_workload_groups wlgrp ON wlgrp.group_id = er.group_id
INNER JOIN (SELECT os.parent_node_id, osw.task_address
            FROM sys.dm_os_schedulers os
            INNER JOIN sys.dm_os_workers osw ON os.scheduler_address = osw.scheduler_address
            WHERE os.status = 'VISIBLE ONLINE'
            GROUP BY os.parent_node_id, osw.task_address) node
    ON node.task_address = er.task_address
LEFT JOIN (SELECT session_id, SUM(wait_duration_ms) AS wait_duration_ms, wait_type,
                  blocking_session_id, COUNT(*) AS NoThread
           FROM sys.dm_os_waiting_tasks
           GROUP BY session_id, wait_type, blocking_session_id) wt
    ON wt.session_id = er.session_id
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS sqltxt
WHERE er.sql_handle IS NOT NULL
  AND ISNULL(wt.wait_type, '') NOT IN ('WAITFOR', 'BROKER_RECEIVE_WAITFOR')
ORDER BY er.total_elapsed_time DESC
GO

The columns are the same as those discussed in Step 2. I analyse the sessions with the highest
total_elapsed_time and take appropriate action, such as killing the session and looking into
the implementation. In most scenarios (where the server was running perfectly but all of a
sudden came to a standstill), I am able to fix the issue by following these steps. In the next
part, let us discuss blocking sessions and sessions with an open transaction that are not active.