Driver development

Since I was recently “sentenced” to developing a driver, I had to get to grips with all the low-level details of Windows. www.osronline.com, and in particular the people on its mailing list, helped me a great deal. So that the text does not get lost, and for anyone who is interested, I have saved one answer on the topic of IRQL here.

The documentation for every DDK function shows (or should) the IRQLs at which you can safely call that function.
NT provides many different mechanisms for synchronizing execution.
Each mechanism is designed for a different situation. Spinlocks are just one of those mechanisms. They are designed for synchronizing access to data, where that data needs to be accessed at DISPATCH_LEVEL.
Before continuing, make sure you understand why we have dispatch levels (IRQLs) at all, and why some DDK functions can only be called at certain dispatch levels.

PASSIVE_LEVEL is the lowest-priority dispatch level. When a thread is running at this level, the thread is “schedulable”: the kernel can choose to halt the thread and reschedule it at any time. This means that a thread running at PASSIVE_LEVEL does not “own” a processor; it is frequently given access to a processor if it is runnable, but it cannot be guaranteed access to one. When a thread is running at DISPATCH_LEVEL or above, the thread is not schedulable. This means that the thread temporarily “owns” the processor that it is running on. As long as that thread stays at such a level, the scheduler will never park it and schedule another one on the same processor (other processors may be running at different dispatch levels).

User-mode threads run at PASSIVE_LEVEL. When those threads call kernel entry points such as NtClose, NtCreateFile or NtSetEvent, the kernel half of the thread wakes up, the arguments are copied from the user-mode stack to the kernel-mode stack, and the kernel-mode part of the thread begins running the correct system call function (e.g. nt!NtClose, which is the NtClose routine in ntoskrnl.exe, not the one in ntdll.dll). The thread is still running at PASSIVE_LEVEL, even though it is now running in kernel mode.

Because threads running at PASSIVE_LEVEL are schedulable, they can wait for dispatcher objects such as events, mutexes, semaphores, threads, and processes. When such a thread calls ZwWaitForSingleObject and the object is not “signaled” (the meaning of “signaled” varies per object type), the thread is halted and placed in a “waiting” state. The scheduler then chooses the next runnable thread.

When a thread elevates to DISPATCH_LEVEL or above, this can no longer happen. The scheduler cannot halt the current thread, by definition. So you cannot call ZwWaitForSingleObject (unless you call it in polling mode, by passing a timeout of zero, that is, a pointer to a zero timeout value rather than a NULL pointer), or any other function that boils down to any kind of “wait” operation. This rules out many functions, including many of those that are exposed 1:1 in NTDLL.DLL. You can’t call ZwClose, ZwCreateFile, ZwReadFile, ZwWriteFile, ZwDeviceIoControlFile, etc. All of these functions can potentially block the current thread, and so are illegal at elevated IRQL.

One of the reasons for this is paging. The “passive” functions, like ZwReadFile, are allowed to operate on memory that can be paged (swapped out). Since these functions are meant for use by user-mode threads, and since memory allocated to user-mode threads is always pageable (with a few slight exceptions), these functions can all implicitly cause a thread to be halted, because any page fault (touching a page that is marked “not present”) requires the thread to wait while the page is read from disk.
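To make the waiting rule concrete, here is a minimal user-mode sketch in C. It is only a toy model, not kernel code: `model_wait`, `ModelEvent` and the `WaitResult` values are hypothetical names invented for illustration, and the timeout is a plain `long long` rather than the kernel's timeout type. It encodes the rule as described above: a blocking wait is rejected at DISPATCH_LEVEL or above, while a polling wait (pointer to a zero timeout) is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-mode model of the rule above; none of this is a real kernel API. */
typedef enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 } Irql;

typedef enum {
    WAIT_SATISFIED,   /* object was signaled, caller continues */
    WAIT_TIMED_OUT,   /* polling wait on an object that is not signaled */
    WAIT_WOULD_BLOCK, /* thread would be parked and another one scheduled */
    WAIT_ILLEGAL      /* blocking wait requested at DISPATCH_LEVEL or above */
} WaitResult;

/* Hypothetical stand-in for a dispatcher object such as an event. */
typedef struct { bool signaled; } ModelEvent;

/* timeout == NULL means "wait forever"; a pointer to 0 means "just poll". */
WaitResult model_wait(Irql irql, const ModelEvent *event, const long long *timeout)
{
    assert(event != NULL);
    bool polling = (timeout != NULL && *timeout == 0);

    /* At elevated IRQL the scheduler cannot park the thread, so only polling is allowed. */
    if (irql >= DISPATCH_LEVEL && !polling)
        return WAIT_ILLEGAL;
    if (event->signaled)
        return WAIT_SATISFIED;
    if (polling)
        return WAIT_TIMED_OUT;
    return WAIT_WOULD_BLOCK;
}
```

In this model, `model_wait(DISPATCH_LEVEL, &ev, NULL)` is rejected, whereas the same call with a pointer to a zero timeout simply reports whether the event is signaled.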

So what is the purpose of DISPATCH_LEVEL?
DISPATCH_LEVEL is a level that allows device drivers to run code without switching to a different thread. When a device completes a request, it typically asserts its interrupt line, and the HAL and kernel receive the interrupt. Interrupts occur at “high” priority, meaning higher than DISPATCH_LEVEL. Interrupt service routines (ISRs) are not allowed to swap the current thread; they inherit the constraint from DISPATCH_LEVEL that the current thread cannot be swapped. ISRs must do their job by borrowing the stack of the currently running thread. In general, an ISR checks its device, extracts any interrupt information from it, clears the interrupt so that the device stops asserting its interrupt line, and then schedules a DPC (deferred procedure call) routine to be run “later, but soon”. This inserts a DPC object into a queue of DPCs, usually on the current processor. The queued DPCs run when the current thread/processor *attempts* to lower its IRQL from DISPATCH_LEVEL to PASSIVE_LEVEL.
In effect, DPCs are kind of “another” scheduler. Rather than scheduling threads, DPC objects schedule the execution of DPC routines. DPC routines are allowed to briefly “borrow” the current processor, and they always pre-empt threads running at PASSIVE_LEVEL.
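The ISR-to-DPC hand-off can be sketched as a small user-mode model in C. Everything here is an assumption made for illustration: `ModelCpu`, `model_raise_irql`, `model_lower_irql`, `model_queue_dpc`, `model_isr` and `count_completion` are hypothetical names, and `MODEL_DEVICE_LEVEL` is an arbitrary stand-in for a device IRQL. The model only covers DPCs queued from DISPATCH_LEVEL or above, as an ISR would do, and drains the queue when the IRQL is about to drop below DISPATCH_LEVEL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-mode model; MODEL_DEVICE_LEVEL is an arbitrary stand-in for a device IRQL. */
typedef enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2, MODEL_DEVICE_LEVEL = 5 } Irql;

#define MAX_DPCS 8

typedef void (*DpcRoutine)(void *context);
typedef struct { DpcRoutine routine; void *context; } Dpc;
typedef struct { Irql irql; Dpc queue[MAX_DPCS]; size_t count; } ModelCpu;

/* Run every queued DPC at DISPATCH_LEVEL, then continue at resume_irql. */
static void model_drain_dpcs(ModelCpu *cpu, Irql resume_irql)
{
    cpu->irql = DISPATCH_LEVEL;
    for (size_t i = 0; i < cpu->count; i++)
        cpu->queue[i].routine(cpu->queue[i].context);
    cpu->count = 0;
    cpu->irql = resume_irql;
}

/* Raise the modelled IRQL and return the previous level. */
Irql model_raise_irql(ModelCpu *cpu, Irql new_irql)
{
    assert(new_irql >= cpu->irql);
    Irql old = cpu->irql;
    cpu->irql = new_irql;
    return old;
}

/* Lower the modelled IRQL; pending DPCs run before dropping below DISPATCH_LEVEL. */
void model_lower_irql(ModelCpu *cpu, Irql new_irql)
{
    assert(new_irql <= cpu->irql);
    if (new_irql < DISPATCH_LEVEL && cpu->count > 0)
        model_drain_dpcs(cpu, new_irql);
    else
        cpu->irql = new_irql;
}

/* Queue a DPC; returns 0 on success, -1 if the queue is full. */
int model_queue_dpc(ModelCpu *cpu, DpcRoutine routine, void *context)
{
    assert(cpu->irql >= DISPATCH_LEVEL); /* model limitation: queued from an ISR or DPC only */
    if (cpu->count == MAX_DPCS)
        return -1;
    cpu->queue[cpu->count].routine = routine;
    cpu->queue[cpu->count].context = context;
    cpu->count++;
    return 0;
}

/* Example DPC routine: counts one completed request. */
static void count_completion(void *context) { *(int *)context += 1; }

/* Example ISR: runs above DISPATCH_LEVEL and defers the real work to a DPC. */
void model_isr(ModelCpu *cpu, int *completions)
{
    Irql interrupted = model_raise_irql(cpu, MODEL_DEVICE_LEVEL);
    /* A real ISR would read and acknowledge the device here. */
    model_queue_dpc(cpu, count_completion, completions);
    model_lower_irql(cpu, interrupted);
}
```

If the modelled interrupt arrives while the processor is at PASSIVE_LEVEL, the DPC runs as soon as the ISR returns. If it arrives while the processor is already at DISPATCH_LEVEL, the DPC stays queued until that code lowers the IRQL again.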

Now we get back to your original question (sort of): what good are spinlocks? Spinlocks are useful for synchronizing code that runs at PASSIVE_LEVEL with code that runs in response to device interrupts. The device interrupt logic runs at “device IRQL”, which is higher than DISPATCH_LEVEL, so it schedules a DPC to run at DISPATCH_LEVEL. The passive-level code, such as your DispatchDeviceIoControl routine, runs at PASSIVE_LEVEL, so it elevates to DISPATCH_LEVEL by using KeRaiseIrql, or by using KeAcquireSpinLock (which also raises the current thread to DISPATCH_LEVEL). Then both “halves” of the driver can safely access shared data structures.

If you want to synchronize access to data structures that are in pageable memory, or that will only be accessed by threads that are always running at PASSIVE_LEVEL, then you have a lot of other options. You can use mutexes (KeInitializeMutex), semaphores (KeInitializeSemaphore), or events (KeInitializeEvent). These all leave the current thread at PASSIVE_LEVEL after the waitable object has become signaled (meaning: the mutex is acquired, the semaphore count has been decremented, or the event has been signaled). Or you can use fast mutexes (ExInitializeFastMutex), which leave the current thread at APC_LEVEL. (I haven’t discussed APC_LEVEL, but it’s sort of halfway between DISPATCH_LEVEL and PASSIVE_LEVEL, and if you’re new to dispatch levels, I would suggest you just avoid APC_LEVEL and master DISPATCH_LEVEL vs PASSIVE_LEVEL first.)

This is a bit rambly, of course, but I hope it points you in the right direction. The shortest summary is: dispatch levels are how threads control the scheduler (enabling and disabling scheduling), and how threads synchronize with interrupt service routines.
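The spinlock pattern described above (raise to DISPATCH_LEVEL, take the lock, touch the shared data, release, restore the old level) can be sketched in user-mode C with C11 atomics. This is a toy model under stated assumptions, not the kernel implementation: `ModelCpu`, `ModelSpinLock`, `DeviceState` and all `model_*` functions are hypothetical names, and the IRQL is just a field in a struct.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy user-mode model; none of these types or functions are real kernel APIs. */
typedef enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 } Irql;

typedef struct { Irql irql; } ModelCpu;
typedef struct { atomic_flag locked; } ModelSpinLock;

/* Hypothetical data shared between the passive-level half and the DPC half of a driver. */
typedef struct { ModelSpinLock lock; int pending_requests; } DeviceState;

void model_device_init(DeviceState *dev)
{
    atomic_flag_clear(&dev->lock.locked);
    dev->pending_requests = 0;
}

/* Raise to DISPATCH_LEVEL first, then spin until the lock is free. */
void model_acquire_spinlock(ModelCpu *cpu, ModelSpinLock *lock, Irql *old_irql)
{
    assert(cpu->irql <= DISPATCH_LEVEL);
    *old_irql = cpu->irql;
    cpu->irql = DISPATCH_LEVEL; /* the holder must not be rescheduled */
    while (atomic_flag_test_and_set_explicit(&lock->locked, memory_order_acquire))
        ; /* busy-wait: another processor holds the lock */
}

/* Release the lock, then return to the level the caller came from. */
void model_release_spinlock(ModelCpu *cpu, ModelSpinLock *lock, Irql old_irql)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
    cpu->irql = old_irql;
}

/* Passive-level half: records a new request under the lock. */
void model_dispatch_ioctl(ModelCpu *cpu, DeviceState *dev)
{
    Irql old;
    model_acquire_spinlock(cpu, &dev->lock, &old);
    dev->pending_requests++;
    model_release_spinlock(cpu, &dev->lock, old);
}

/* DPC half: already at DISPATCH_LEVEL, completes one request under the same lock. */
void model_dpc_complete(ModelCpu *cpu, DeviceState *dev)
{
    Irql old;
    assert(cpu->irql == DISPATCH_LEVEL);
    model_acquire_spinlock(cpu, &dev->lock, &old);
    if (dev->pending_requests > 0)
        dev->pending_requests--;
    model_release_spinlock(cpu, &dev->lock, old);
}
```

Both halves take the same lock at the same level, which is the point of the pattern: the passive-level code cannot be rescheduled while it holds the lock, and the DPC half simply stays at DISPATCH_LEVEL after releasing it.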

This goes together with the list of links from another reply:

http://www.microsoft.com/whdc/driver/kernel/locks.mspx
When you’re done with that, read this:
http://www.microsoft.com/whdc/driver/kernel/IRQL.mspx
After you’ve had a good long rest, then read this:
http://www.microsoft.com/whdc/driver/kernel/MP_issues.mspx
If you’re still curious, read this:
http://www.microsoft.com/whdc/driver/kernel/mem-mgmt.mspx

Thanks again to Jake Oshins and Arlie Davis for the explanation.
