The information contained in this document represents the current view of Microsoft
Corporation on the issues discussed as of the date of publication. Because
Microsoft must respond to changing market conditions, it should not be interpreted
to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the
accuracy of any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO
WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE
INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user.
Without limiting the rights under copyright, no part of this document may be
reproduced, stored in or introduced into a retrieval system, or transmitted in any
form or by any means (electronic, mechanical, photocopying, recording, or
otherwise), or for any purpose, without the express written permission of Microsoft
Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other
intellectual property rights covering subject matter in this document. Except as
expressly provided in any written license agreement from Microsoft, the furnishing
of this document does not give you any license to these patents, trademarks,
copyrights, or other intellectual property.
Introduction
Because I/O requests can be canceled asynchronously, IRP cancellation is difficult
to handle correctly in all cases. Cancel logic problems can remain dormant for long
periods of time because the code paths are rarely exercised. This paper describes
IRP cancellation in several contexts and shows the correct techniques for canceling
I/O requests.
The information in this paper applies to drivers for the Microsoft® Windows® family
of operating systems. Driver writers can use this information to prevent driver errors
that are caused by faulty cancel logic.
Drivers that queue IRPs can implement cancellation in one of three ways:
• Use the system-wide cancel spin lock and include a Cancel routine.
• Use a driver-supplied locking mechanism and include a Cancel routine.
• Use the cancel-safe IRP queuing mechanism (IoCsqXxx). A Cancel routine
  is not required.
All new drivers should use the IoCsqXxx routines, and existing drivers should be
revised to use the IoCsqXxx routines whenever possible. This paper provides
information on all three methods for the benefit of driver writers who maintain
existing drivers for which revising the cancellation logic is not feasible.
Drivers that merely forward IRPs to a lower-level driver must neither include a
Cancel routine nor use the IoCsqXxx mechanism. The lower-level driver is
responsible for canceling the IRP.
The I/O Manager always holds the cancel spin lock when it calls a driver’s Cancel
routine. Consequently, if the Cancel routine acquires a second lock, a deadlock
could occur. For example, such a deadlock occurs when another thread within the
driver acquires a driver-created lock and then attempts to acquire the cancel spin
lock.
For drivers that perform I/O infrequently, use of the single system-wide cancel spin
lock is generally adequate. Drivers that perform I/O frequently, however, should not
use this technique. The single spin lock may become a bottleneck, thus causing
driver throughput to suffer, particularly on multiprocessor systems. Instead, such
drivers should use driver-supplied locks and the cancel-safe IRP queuing routines,
which are described in “Cancel-Safe IRP Queues.”
A driver that uses the cancel spin lock must follow these guidelines:
• Mark the IRP pending while holding the cancel spin lock.
• Complete the IRP with STATUS_CANCELLED if Irp->Cancel is set.
• Dequeue the IRP while holding the cancel spin lock.
• In the Cancel routine, remove the IRP from the pending structure while
  holding the cancel spin lock.
• Do not acquire another lock while holding the cancel spin lock. After the
  driver releases the cancel spin lock, it can acquire another lock if necessary.
By following these guidelines, a driver can avoid the race conditions that often
cause problems during IRP cancellation.
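The guidelines above can be illustrated with a small user-mode model. This is only a sketch, not driver code: ModelIrp, queue_irp, dequeue_irp, and model_io_cancel are hypothetical names, a boolean stands in for the system-wide cancel spin lock, and model_io_cancel imitates in simplified form what the I/O Manager does when it cancels an IRP (set the Cancel flag, clear the Cancel routine, and call the routine with the lock held). The assertions in complete_irp encode two of the rules: an IRP is completed exactly once, and never while the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelIrp ModelIrp;
typedef void (*CancelFn)(ModelIrp *);

struct ModelIrp {
    bool      cancel;          /* stands in for Irp->Cancel */
    bool      pending;         /* set where a driver would mark the IRP pending */
    bool      completed;
    int       status;
    CancelFn  cancel_routine;
    ModelIrp *next;            /* link in the pending structure */
};

enum { MODEL_PENDING = 1, MODEL_CANCELLED = 2 };

static bool      g_cancel_lock_held;   /* models the cancel spin lock */
static ModelIrp *g_pending;            /* the pending structure */

void acquire_cancel_lock(void) { assert(!g_cancel_lock_held); g_cancel_lock_held = true; }
void release_cancel_lock(void) { assert(g_cancel_lock_held);  g_cancel_lock_held = false; }

void complete_irp(ModelIrp *irp, int status)
{
    assert(!g_cancel_lock_held);   /* never complete while holding the lock */
    assert(!irp->completed);       /* an IRP is completed exactly once */
    irp->status = status;
    irp->completed = true;
}

void unlink_irp(ModelIrp *irp)
{
    ModelIrp **pp = &g_pending;
    while (*pp != NULL && *pp != irp)
        pp = &(*pp)->next;
    if (*pp != NULL) {
        *pp = irp->next;
        irp->next = NULL;
    }
}

/* Cancel routine: entered with the lock held; it must release the lock. */
void model_cancel_routine(ModelIrp *irp)
{
    unlink_irp(irp);               /* remove while still holding the lock */
    release_cancel_lock();
    complete_irp(irp, MODEL_CANCELLED);
}

int queue_irp(ModelIrp *irp)
{
    int status;
    acquire_cancel_lock();
    if (irp->cancel) {             /* canceled before we took the lock */
        status = MODEL_CANCELLED;
    } else {
        irp->pending = true;       /* mark pending under the lock */
        irp->cancel_routine = model_cancel_routine;
        irp->next = g_pending;     /* head insertion; order is not the point */
        g_pending = irp;
        status = MODEL_PENDING;
    }
    release_cancel_lock();
    if (status != MODEL_PENDING)
        complete_irp(irp, status); /* complete only after releasing the lock */
    return status;
}

ModelIrp *dequeue_irp(void)
{
    ModelIrp *irp;
    acquire_cancel_lock();
    irp = g_pending;
    if (irp != NULL) {
        g_pending = irp->next;
        irp->next = NULL;
        irp->cancel_routine = NULL; /* cancel routine cannot be running here */
    }
    release_cancel_lock();
    return irp;
}

/* Simplified stand-in for what the I/O Manager does on cancellation. */
void model_io_cancel(ModelIrp *irp)
{
    CancelFn fn;
    acquire_cancel_lock();
    irp->cancel = true;
    fn = irp->cancel_routine;
    irp->cancel_routine = NULL;
    if (fn != NULL)
        fn(irp);                   /* the routine releases the lock */
    else
        release_cancel_lock();
}
```

In this model, an IRP that is canceled while queued is removed and completed by the Cancel routine, so a later dequeue_irp call never sees it.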
Example
The following sections show sample code that holds the cancel spin lock while it
queues a pending IRP, removes the IRP from the queue, and cancels the IRP. The
sample code demonstrates the correct techniques to avoid race conditions and
deadlocks. However, some error-handling code has been removed for brevity. The
circled numbers in the sample code correspond to the numbered lists that follow the
samples.
if (status != STATUS_PENDING) {
    Irp->IoStatus.Status = status;
    IoCompleteRequest( Irp, IO_NO_INCREMENT );
}
The following notes refer to the circled numbers in the sample code:
1. Acquire the cancel spin lock to protect the pending structure. While this thread
holds the cancel spin lock, no other thread can begin IRP cancellation or modify
the pending structure.
2. Check the Cancel flag in the IRP to determine whether the IRP was canceled
before this thread acquired the cancel spin lock.
3. The IRP was canceled already, so release the cancel spin lock, and then
complete the IRP.
4. The IRP has not been canceled, so mark it pending.
5. Set the Cancel routine. While the current thread holds the cancel spin lock, no
other thread can cancel the IRP. Note that the sample code tests the Cancel
flag in step 2, before it sets the Cancel routine. If the sample code instead set
the Cancel routine before it tested the flag, and if the IRP was already canceled,
it might be necessary to remove the Cancel routine.
6. Queue the IRP by using a local routine (not shown) named PendIrpInStructure.
Set the status to STATUS_PENDING to signal to subsequent code not to
complete the IRP.
7. Release the cancel spin lock.
8. Complete the IRP if it was already canceled in step 2.
The following notes refer to the circled numbers in the sample code:
1. If the IRP is canceled before this thread acquires the spin lock, the dequeuing
code does not handle the IRP. Instead, the Cancel routine removes the IRP
from the pending structure. See “Code for the Cancel Routine” for more
information.
2. Acquire the cancel spin lock to block IRP cancellation. This lock prevents the
Cancel routine for the IRP from running. The code does not need to check
whether the Cancel routine is already running; if the Cancel routine has already
run, it has already released the cancel spin lock and therefore has already
removed the IRP.
3. Remove the IRP from the structure.
4. Remove the Cancel routine for the IRP. Because this thread holds the cancel
spin lock, the Cancel routine cannot be running.
5. Release the cancel spin lock. The IRP can still be canceled after this point, but
because the Cancel routine has been removed, the I/O Manager sets
Irp->Cancel to TRUE and does not call a Cancel routine.
6. Complete the IRP.
The following notes refer to the circled numbers in the sample code:
1. Remove the IRP from the pending structure before releasing the cancel spin
lock to prevent any other thread from trying to dequeue it.
2. Release the cancel spin lock. The I/O Manager acquires the cancel spin lock
before it calls the Cancel routine; the Cancel routine must release the lock.
3. Complete the IRP. A driver must never call IoCompleteRequest while it holds
the cancel spin lock.
Example
The following three sections show sample code that queues a pending IRP,
removes the IRP from the queue, and cancels the IRP, using a driver-supplied lock.
The sample code demonstrates the correct techniques to avoid race conditions and
deadlocks. However, some error-handling code has been removed for brevity.
if (Irp->Cancel &&
    IoSetCancelRoutine (Irp, NULL)) {
    //
    // The IRP has already been canceled, but the I/O
    // Manager has not yet called the Cancel routine.
    //
    status = STATUS_CANCELLED;
    //
    // Complete the IRP later, after releasing the lock.
    //
}
else {
    IoMarkIrpPending(Irp);
    InsertTailList(&devExtension->PendingIrpQueue,
                   &Irp->Tail.Overlay.ListEntry);
    status = STATUS_PENDING;
}
}
KeReleaseSpinLock(&devExtension->QueueLock, oldIrql);
The following notes refer to the circled numbers in the sample code:
1. Obtain a lock and then set the Cancel routine. If the driver does not hold a lock,
the Cancel routine could run to completion before the next line of code is
executed, thus invalidating the reference to the IRP in the call to
IoSetCancelRoutine.
2. Test the Cancel flag to determine whether the IRP has already been canceled.
The IRP could have been canceled before the Cancel routine was set or
afterwards; point 3 determines when the cancellation occurred.
While the sample code holds the lock, the Cancel routine cannot run. If the
routine has already started to run, it is blocked while it waits for the lock.
Consequently, the IRP remains valid.
3. Determine whether the Cancel routine has started running. If not, remove the
routine to prevent it from starting.
The logical AND in the if statement is evaluated from left to right, and evaluation
stops as soon as the value is determined. Therefore:
• If Irp->Cancel is FALSE, the IRP has not been canceled. The test
  stops and execution continues with the else clause.
• If Irp->Cancel is TRUE, the call to IoSetCancelRoutine is
  executed, which removes the Cancel routine.
• If IoSetCancelRoutine returns a valid pointer (TRUE), the IRP has been
  canceled, but the Cancel routine is not running, so execution continues with
  point 4.
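The three outcomes of this test can be reproduced in a user-mode model. The sketch below is not driver code: ModelIrp, arm, check, and model_io_cancel_begin are hypothetical names, and a C11 atomic exchange stands in for the interlocked exchange that IoSetCancelRoutine performs. The queue lock is not modeled, so the interleavings are driven by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelIrp ModelIrp;
typedef void (*CancelFn)(ModelIrp *);

struct ModelIrp {
    atomic_bool       cancel;          /* stands in for Irp->Cancel */
    _Atomic(CancelFn) cancel_routine;  /* the IRP's Cancel routine slot */
};

typedef enum { QUEUE_IRP, COMPLETE_NOW } QueueDecision;

void irp_init(ModelIrp *irp)
{
    atomic_init(&irp->cancel, false);
    atomic_init(&irp->cancel_routine, (CancelFn)NULL);
}

/* Models IoSetCancelRoutine: swap in a routine, return the previous one. */
CancelFn set_cancel_routine(ModelIrp *irp, CancelFn fn)
{
    return atomic_exchange(&irp->cancel_routine, fn);
}

/* Placeholder Cancel routine; only its address matters in this model. */
void model_cancel(ModelIrp *irp) { (void)irp; }

/* Point 1: set the Cancel routine. */
void arm(ModelIrp *irp) { set_cancel_routine(irp, model_cancel); }

/* Points 2 and 3: the short-circuit test from the sample code. */
QueueDecision check(ModelIrp *irp)
{
    if (atomic_load(&irp->cancel) &&
        set_cancel_routine(irp, NULL) != NULL) {
        /* Canceled, and we reclaimed the routine before it could run. */
        return COMPLETE_NOW;
    }
    /* Either not canceled, or the Cancel routine already owns the IRP
     * and will remove and complete it once it gets the queue lock. */
    return QUEUE_IRP;
}

/* Simplified stand-in for the first steps of cancellation: set the flag
 * and claim whatever Cancel routine is currently installed. */
CancelFn model_io_cancel_begin(ModelIrp *irp)
{
    atomic_store(&irp->cancel, true);
    return set_cancel_routine(irp, NULL);
}
```

When the Cancel flag is clear, the second operand is never evaluated, so the Cancel routine set by arm stays installed. That side effect is what makes the short-circuit order significant.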
KeAcquireSpinLock(&devExtension->QueueLock, &oldIrql);
while(!IsListEmpty(&devExtension->PendingIrpQueue))
{
    listEntry =
        RemoveHeadList(&devExtension->PendingIrpQueue);
    nextIrp = CONTAINING_RECORD(listEntry, IRP,
                  Tail.Overlay.ListEntry);
    if (IoSetCancelRoutine (nextIrp, NULL))
    {
        //
        // Cancel routine cannot run now and cannot already have
        // started to run.
        //
        break;
    }
    else {
        //
        // The Cancel routine is running. Leave the IRP alone.
        //
        InitializeListHead(listEntry);
        nextIrp = NULL;
    }
}
KeReleaseSpinLock(&devExtension->QueueLock, oldIrql);
1. If the IRP is canceled before this thread obtains a lock, IRP cancellation can
proceed to any point in the Cancel routine.
2. If the IRP is canceled after this thread obtains the lock, the Cancel routine
blocks until it can acquire a lock.
3. Look for the IRP in the queue. If the IRP was canceled, and its Cancel routine
acquired a lock before the current thread, the Cancel routine has already
removed the IRP from the queue. Therefore, one of the following must be true
for all of the IRPs in the queue:
   • The IRP has not been canceled.
   • The IRP has been canceled, but its Cancel routine has blocked
     before obtaining a lock.
4. Remove the Cancel routine and determine whether the cancel code is active. If
IoSetCancelRoutine returns a non-NULL value (TRUE), the Cancel routine has
not already started; it cannot start after the Cancel routine is removed because
removal is performed in an interlocked exchange.
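As a rough user-mode illustration of the dequeue logic described in these notes, the sketch below walks a queue and skips any IRP whose Cancel routine has already been claimed. All names (ModelIrp, claim_cancel_routine, dequeue_next) are hypothetical, a C11 atomic exchange stands in for IoSetCancelRoutine(Irp, NULL), and the queue lock that the caller would hold is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct ListEntry { struct ListEntry *flink, *blink; } ListEntry;

typedef struct ModelIrp ModelIrp;
typedef void (*CancelFn)(ModelIrp *);

struct ModelIrp {
    ListEntry         entry;
    _Atomic(CancelFn) cancel_routine;
};

void list_init(ListEntry *h) { h->flink = h->blink = h; }
int  list_empty(const ListEntry *h) { return h->flink == h; }

void list_insert_tail(ListEntry *h, ListEntry *e)
{
    e->flink = h;
    e->blink = h->blink;
    h->blink->flink = e;
    h->blink = e;
}

ListEntry *list_remove_head(ListEntry *h)
{
    ListEntry *e = h->flink;
    h->flink = e->flink;
    e->flink->blink = h;
    return e;
}

/* Placeholder Cancel routine; only its address matters in this model. */
void model_cancel(ModelIrp *irp) { (void)irp; }

void enqueue(ListEntry *q, ModelIrp *irp)
{
    atomic_init(&irp->cancel_routine, model_cancel);
    list_insert_tail(q, &irp->entry);
}

/* Atomically clear the Cancel routine and return the previous value.
 * Whoever gets the non-NULL value owns the IRP. */
CancelFn claim_cancel_routine(ModelIrp *irp)
{
    return atomic_exchange(&irp->cancel_routine, (CancelFn)NULL);
}

/* The caller is assumed to hold the queue lock. */
ModelIrp *dequeue_next(ListEntry *q)
{
    while (!list_empty(q)) {
        ListEntry *e = list_remove_head(q);
        ModelIrp *irp = (ModelIrp *)((char *)e - offsetof(ModelIrp, entry));

        if (claim_cancel_routine(irp) != NULL)
            return irp;   /* the Cancel routine can no longer start */

        /* The Cancel routine was already claimed and is waiting for the
         * lock. Self-link the entry so its later removal is harmless. */
        list_init(e);
    }
    return NULL;
}
```

The same exchange is used by both sides, so exactly one of them observes the non-NULL routine pointer and takes responsibility for completing the IRP.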
IoReleaseCancelSpinLock(Irp->CancelIrql);
//
// Acquire the queue lock
//
KeAcquireSpinLock(&devExtension->QueueLock, &oldIrql);
//
// Remove the canceled IRP from the queue and
// release the queue lock.
//
RemoveEntryList(&Irp->Tail.Overlay.ListEntry);
KeReleaseSpinLock(&devExtension->QueueLock, oldIrql);
//
// Complete the request with STATUS_CANCELLED.
//
Irp->IoStatus.Status = STATUS_CANCELLED;
Irp->IoStatus.Information = 0;
IoCompleteRequest (Irp, IO_NO_INCREMENT);
The following notes refer to the circled numbers in the sample code:
1. Release the cancel spin lock that the I/O Manager acquired.
2. Acquire the driver-created lock. If another thread is traversing the protected
structure, the Cancel routine blocks while it waits for the lock.
3. Remove the IRP from the pending structure.
4. Release the lock.
5. Complete the IRP unconditionally.
All new drivers for Windows 2000 and later releases should use the IoCsqXxx
routines. See “Availability of Cancel-Safe IRP Queuing Routines” for information on
compiling and linking drivers that use these routines.
Example
The sample code in the following sections shows the callback routines that a driver
must implement to use the cancel-safe IRP queuing mechanism.
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;
devExtension = CONTAINING_RECORD(Csq,
                   DEVICE_EXTENSION, CancelSafeQueue);
if (!devExtension->CurrentIrp) {
    devExtension->CurrentIrp = Irp;
    return STATUS_UNSUCCESSFUL;
}
InsertTailList(&devExtension->PendingIrpQueue,
               &Irp->Tail.Overlay.ListEntry);
return STATUS_SUCCESS;
}
The following notes refer to the circled numbers in the sample code:
1. Get a pointer to the device extension. The driver has allocated the cancel-safe
queue in its device extension, as shown in “Code to Define and Initialize
Structures.”
2. Insert the IRP at the end of the queue.
devExtension = CONTAINING_RECORD(Csq,
DEVICE_EXTENSION, CancelSafeQueue);
RemoveEntryList(&Irp->Tail.Overlay.ListEntry);
}
The following notes refer to the circled numbers in the sample code:
1. Get a pointer to the device extension, which contains the cancel-safe queue.
2. Remove the IRP from the queue.
devExtension = CONTAINING_RECORD(Csq,
                   DEVICE_EXTENSION, CancelSafeQueue);
listHead = &devExtension->PendingIrpQueue;
//
// If the IRP is NULL, we will start peeking from the
// head of the list. If not, we will start from the
// entry that follows the given IRP.
//
if(Irp == NULL) {
    nextEntry = listHead->Flink;
} else {
    nextEntry = Irp->Tail.Overlay.ListEntry.Flink;
}
while(nextEntry != listHead) {
    nextIrp = CONTAINING_RECORD(nextEntry, IRP,
                  Tail.Overlay.ListEntry);
    irpStack = IoGetCurrentIrpStackLocation(nextIrp);
    //
    // If context is present, continue until we find a
    // matching IRP. If not, break out of the loop
    // as soon as we get the next IRP.
    //
    if(PeekContext) {
        if(irpStack->FileObject ==
               (PFILE_OBJECT)PeekContext) {
            break;
        }
    } else {
        break;
    }
    nextIrp = NULL;
    nextEntry = nextEntry->Flink;
}
//
// Check if this is from start packet.
//
if (PeekContext == NULL) {
    devExtension->CurrentIrp = nextIrp;
}
return nextIrp;
The following notes refer to the circled numbers in the sample code:
1. Initialize a pointer to the list of IRPs.
2. Get the next IRP from the list.
3. When the driver called IoCsqRemoveNextIrp, it passed a pointer to context
information that identifies the IRP to remove; in turn, the I/O Manager passes
that pointer to the XxxCsqPeekNextIrp routine. If the pointer is NULL, this
routine should return the next IRP in the list. If the pointer is not NULL, this
routine must walk the list until it finds an IRP with the matching context.
4. This driver uses the FileObject from the IRP stack location as the context, so
that it can remove the next IRP for a particular file. Passing the FileObject as
the peek context is useful for handling IRP_MJ_CLEANUP requests. The peek
context is not required to be the FileObject; it can be any information that the
driver requires to determine which IRP to remove from the queue. For example,
a driver that handles IRPs based on some internal priority scheme might pass
information related to the priority.
5. This driver implements an internal start-packet routine to sequence the
processing of I/O requests. If XxxCsqPeekNextIrp is called without a context,
the routine sets the current IRP.
6. Return a pointer to the IRP.
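The list walk in the peek routine can be tried out in user mode with a simplified model. In this sketch, ModelIrp, file_object, and peek_next are hypothetical names, the file_object field stands in for the FileObject in the IRP stack location, and the start-packet bookkeeping from note 5 is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListEntry { struct ListEntry *flink, *blink; } ListEntry;

typedef struct ModelIrp {
    ListEntry   entry;
    const void *file_object;   /* stands in for irpStack->FileObject */
} ModelIrp;

void list_init(ListEntry *h) { h->flink = h->blink = h; }

void list_insert_tail(ListEntry *h, ListEntry *e)
{
    e->flink = h;
    e->blink = h->blink;
    h->blink->flink = e;
    h->blink = e;
}

ModelIrp *irp_from_entry(ListEntry *e)
{
    return (ModelIrp *)((char *)e - offsetof(ModelIrp, entry));
}

/*
 * Return the first IRP after 'after' (or from the head of the list when
 * 'after' is NULL) that matches 'ctx'. A NULL context matches any IRP.
 * Returns NULL when no matching IRP remains.
 */
ModelIrp *peek_next(ListEntry *head, ModelIrp *after, const void *ctx)
{
    ListEntry *e = (after != NULL) ? after->entry.flink : head->flink;

    for (; e != head; e = e->flink) {
        ModelIrp *irp = irp_from_entry(e);
        if (ctx == NULL || irp->file_object == ctx)
            return irp;
    }
    return NULL;
}
```

A cleanup handler built on this idea would repeatedly ask for the next IRP that matches one file object until the routine returns NULL. In a real driver that loop would go through IoCsqRemoveNextIrp rather than calling the peek routine directly.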
devExtension = CONTAINING_RECORD(Csq,
DEVICE_EXTENSION, CancelSafeQueue);
KeAcquireSpinLock(&devExtension->QueueLock, Irql);
return;
}
The following notes refer to the circled numbers in the sample code:
1. Get a pointer to the device extension.
2. Acquire the spin lock, which is allocated in the device extension.
The following sample code releases the previously acquired lock:
VOID CsampReleaseLock(
    IN PIO_CSQ Csq,
    IN KIRQL Irql
    )
{
    PDEVICE_EXTENSION devExtension;
    devExtension = CONTAINING_RECORD(Csq,
                       DEVICE_EXTENSION, CancelSafeQueue);
    KeReleaseSpinLock(&devExtension->QueueLock, Irql);
}
The following notes refer to the circled numbers in the sample code:
1. Get a pointer to the device extension.
2. Release the spin lock.
Example
The following code samples show correct coding techniques for a StartIo routine
and the corresponding Cancel routine.
Irp->IoStatus.Status = ntStatus;
IoStartNextPacket(DeviceObject, TRUE);
return;
}
The StartIo routine performs the I/O, and then it calls IoStartNextPacket and
IoCompleteRequest to get the next IRP and complete the current IRP.
Because this driver used IoSetStartIoAttributes to disable recursive calls into its
StartIo routine, the StartIo routine can safely call IoStartNextPacket. By calling
IoStartNextPacket before IoCompleteRequest, the driver can keep the hardware
busy while the operating system completes the IRP.
{
    BOOLEAN wasQueued;
    //
    // Remove the IRP from the queue.
    //
    wasQueued =
        KeRemoveEntryDeviceQueue(&DeviceObject->DeviceQueue,
            &Irp->Tail.Overlay.DeviceQueueEntry );
    IoReleaseCancelSpinLock( Irp->CancelIrql );
    if(wasQueued)
    {
        Irp->IoStatus.Status = STATUS_CANCELLED;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }
    else
    {
        // The IRP was not found in the queue.
        // A serious problem has occurred.
        ASSERT(FALSE);
    }
    return;
}
The following notes refer to the circled numbers in the sample code:
1. Remove the IRP from the queue and release the cancel spin lock.
2. Cancel and complete the IRP.
3. In the routine that dequeues the IRP, remove the association and free the
context.
4. In the Cancel routine, remove the association and complete the IRP, but do not
free the context.
5. When the deferred procedure call (DPC) or interrupt service routine (ISR) runs,
which indicates that the hardware received a character, the driver does not
look at the IRP. Instead, the driver checks the context and then determines
whether the IRP has been canceled. If the IRP has not been canceled, the
driver completes the IRP. If the IRP has been canceled, the driver frees the
context.
• Queue the IRP, set the Cancel routine, mark the IRP pending, and return a
  pending status.
• Dequeue the IRP, remove its Cancel routine, and process the IRP (typically
  to completion).
• Cancel the IRP by removing it from the queuing structure and completing it.
A race condition can occur if an IRP is canceled at any of the following points:
• After a driver routine is called, but before it queues the IRP. A race
  condition can occur between the driver routine and the Cancel routine.
• After a driver routine is called, but before it starts to process an IRP. For
  example, an IRP might be canceled after a driver's StartIo routine is called, but
  before the StartIo routine removes the IRP from the device queue. In this case,
  the race condition occurs between the StartIo routine and the Cancel routine.
• After the driver routine dequeues the IRP, but before it starts the requested
  I/O. The race condition occurs between the dequeuing routine and the Cancel
  routine.
Note: A race condition can also occur if the event that completes a pending IRP occurs
before the driver has finished queuing the IRP. This is not strictly a cancellation problem, but
it is similar.
After a driver queues an IRP and releases any spin locks that protect the queue,
another thread can access and change the IRP. When the original thread resumes
— possibly as soon as the next line of code — the IRP might have already been
canceled or otherwise changed.
Understanding which thread owns, or has access to, the IRP at each point in
processing can help find such problems. Examples 1 and 2 trace IRP ownership to
demonstrate the issues.
Example 1
In the following example, a race condition can occur between the sample code and
the Cancel routine (not shown) if the IRP is canceled after the driver releases the
cancel spin lock but before it marks the IRP pending.
case IOCTL_GET_REQUEST:
IoAcquireCancelSpinLock (&cancelIRQL);
InsertTailList (&ConnectionIrpQueue,
&Irp->Tail.Overlay.ListEntry);
IoSetCancelRoutine (Irp, GetReqCancel);
IoReleaseCancelSpinLock (cancelIRQL);
IoMarkIrpPending (Irp);
As soon as this thread stores the address of the IRP in the global structure
(ConnectionIrpQueue) and releases the cancel spin lock, another thread can take
ownership of the IRP. The new owning thread then acquires the cancel spin lock
and reads the queued IRP by traversing the global structures. By the time that the
original thread marks the IRP pending, the other thread might already have
completed the IRP.
To correct the problem, the driver must hold the cancel spin lock when it marks the
IRP pending and sets status. For example:
case IOCTL_GET_REQUEST:
IoAcquireCancelSpinLock (&cancelIRQL);
InsertTailList (&ConnectionIrpQueue,
&Irp->Tail.Overlay.ListEntry);
IoSetCancelRoutine (Irp, GetReqCancel);
IoMarkIrpPending (Irp);
Example 2
In the following example, a race condition can occur between the routine that
dequeues the IRP and the Cancel routine for the IRP.
if(!IsListEmpty( &pDevExt->IrpQ[Type].ListHead) ) {
    KeAcquireSpinLock( &pDevExt->IrpQ[Type].SpinLock,
                       &kTempIrql );
    pHead = RemoveHeadList( &pDevExt->IrpQ[Type].ListHead );
    KeReleaseSpinLock( &pDevExt->IrpQ[Type].SpinLock,
                       kTempIrql );
    pIrp = CONTAINING_RECORD(
               pHead,                    // Address
               IRP,                      // Type
               Tail.Overlay.ListEntry ); // Field
The sample code includes several errors. The first line incorrectly checks whether
the list of IRPs is empty before it obtains a lock to protect the list. If the IRP is
canceled after the test completes but before the driver acquires the spin lock, the
code within the lock could attempt to remove an IRP from an empty list.
Furthermore, the sample code releases the lock before it removes the Cancel
routine. Because the sample code does not use the cancel spin lock, no locks are in
place when it calls IoSetCancelRoutine. Therefore, if the IRP is canceled after the
call to KeReleaseSpinLock, the Cancel routine might already have started to run
when the sample code tries to remove the Cancel routine. To correct this error, the
driver should check the result of IoSetCancelRoutine. If IoSetCancelRoutine
returns NULL, IRP cancellation is already in progress.
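The first error, testing for an empty list before taking the lock, has a simple shape that can be shown outside the kernel. The sketch below is a user-mode model with hypothetical names (Queue, queue_pop), and a C11 atomic_flag spin loop stands in for the driver's spin lock. The point is only that the emptiness test and the removal happen inside the same critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;
    int          id;
} Node;

typedef struct Queue {
    atomic_flag lock;    /* models the driver-supplied spin lock */
    Node       *head;
} Queue;

void queue_init(Queue *q)
{
    atomic_flag_clear(&q->lock);
    q->head = NULL;
}

void queue_lock(Queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire)) {
        /* spin until the holder releases the lock */
    }
}

void queue_unlock(Queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

void queue_push(Queue *q, Node *n)
{
    Node **pp;
    queue_lock(q);
    n->next = NULL;
    for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        /* walk to the tail */
    }
    *pp = n;
    queue_unlock(q);
}

/* Decide whether the queue is empty only while holding the lock, so the
 * answer cannot go stale between the test and the removal. */
Node *queue_pop(Queue *q)
{
    Node *n;
    queue_lock(q);
    n = q->head;
    if (n != NULL) {
        q->head = n->next;
        n->next = NULL;
    }
    queue_unlock(q);
    return n;
}
```

This sketch addresses only the unprotected emptiness test. The second error still requires checking the result of IoSetCancelRoutine, as the text above describes.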
The following code shows the Cancel routine for this IRP:
VOID
IrpqCancelRoutine(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    IoReleaseCancelSpinLock( Irp->CancelIrql );
    KeAcquireSpinLock( &pDevExt->IrpQ[q].SpinLock,
                       &kTempIrql );
    RemoveEntryList( &Irp->Tail.Overlay.ListEntry );
    KeReleaseSpinLock( &pDevExt->IrpQ[q].SpinLock,
                       kTempIrql );
}
In the Cancel routine, the call to RemoveEntryList might attempt to remove an IRP
that was previously removed by the dequeuing code. To correct this problem, the
Cancel routine must be able to detect whether the IRP is present in the queue. To
make such detection possible, the driver could call InitializeListHead from the
dequeue code to initialize the link fields in the IRP; alternatively, it could mark the
IRP in some way. If the Cancel routine detects that the IRP has already been
removed, it must not try to remove the IRP from the queue.
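One way to picture the detection technique is with a small user-mode model of a doubly linked list. In this sketch the names (ModelIrp, dequeue_head, cancel_remove) are hypothetical, list_init plays the role of InitializeListHead, and the lock that both routines would hold is omitted. The dequeue side self-links the entry it removes, and the cancel side treats a self-linked entry as "no longer queued".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ListEntry { struct ListEntry *flink, *blink; } ListEntry;
typedef struct ModelIrp  { ListEntry entry; } ModelIrp;

void list_init(ListEntry *e) { e->flink = e->blink = e; }
bool list_empty(const ListEntry *h) { return h->flink == h; }

void list_insert_tail(ListEntry *h, ListEntry *e)
{
    e->flink = h;
    e->blink = h->blink;
    h->blink->flink = e;
    h->blink = e;
}

void list_remove_entry(ListEntry *e)
{
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
}

/* Dequeue side: remove the head entry, then self-link it so that anyone
 * who looks at the entry later can tell it is no longer queued. */
ModelIrp *dequeue_head(ListEntry *q)
{
    ListEntry *e;
    if (list_empty(q))
        return NULL;
    e = q->flink;
    list_remove_entry(e);
    list_init(e);
    return (ModelIrp *)((char *)e - offsetof(ModelIrp, entry));
}

/* Cancel side: unlink only if the IRP is still queued.
 * Returns true if this call removed the IRP from the queue. */
bool cancel_remove(ModelIrp *irp)
{
    if (irp->entry.flink == &irp->entry)
        return false;              /* already removed by the dequeue code */
    list_remove_entry(&irp->entry);
    list_init(&irp->entry);
    return true;
}
```

A flag in a driver-defined IRP field would serve the same purpose. The self-link approach has the side benefit that removing an already self-linked entry only rewrites the entry's own pointers and leaves the queue untouched.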
• Calling IoCompleteRequest for an IRP while the Cancel routine is still set.
• Calling an IRP’s Cancel routine after the IRP has completed without
  cancellation.
• Cancel routines that return at an elevated IRQL.
The DC2 and Devctl tools can assist in finding “lost” IRPs. A “lost” IRP is a request
that the device finished handling, but the driver neither completed by calling
IoCompleteRequest nor passed to another driver.