Wednesday, April 12, 2017
Microsoft Windows 8.1 Kernel Patch Protection Analysis and Attack Vectors
Authors: Mark Ermolov, Artem Shishkin // Positive Research
Kernel Patch Protection (also known as "patchguard") is a Windows mechanism designed to
protect the integrity of vital code and data structures used by the operating system. It was
introduced in the x64 editions of Windows Server 2003 and has been steadily improved in later Windows
versions. In this article we present a descriptive analysis of patchguard on the latest
Windows 8.1 x64 OS, focusing primarily on patchguard initialization and the attack vectors related
to it.
Kernel patch protection has been developed incrementally, so the initialization
process is largely the same across all Windows versions that have patchguard. Many papers
about kernel patch protection on Windows describe this initialization
process, so you may use the references at the end of this article to obtain details.
Initialization sources
As is widely known, the main component of patchguard is initialized in a misleadingly named
function, "KiFilterFiberContext". It will be the starting point of our investigation. Looking for
cross-references does not help much in locating its call site, but several articles
state that patchguard initialization is called indirectly from the function
"KeInitAmd64SpecificState". By "indirectly" we mean not just an indirect call, but the use
of exception handlers. This is a very common trick in patchguard-related functions, as
we'll see further on. So, we have the following initialization call stack:
... --(call)--> Phase1InitializationDiscard --(call)--> KeInitAmd64SpecificState --(exception)--> KiFilterFiberContext
This type of initialization is described in more detail in [1]. Incidentally, it is always performed
on the last CPU core.
However, this is not the only way the kernel initializes patchguard. With a 4% probability, a
patchguard context can also be initialized from another misleadingly named function,
"ExpLicenseWatchInitWorker":
... --> Phase1InitializationDiscard --> sub_14071815C (its symbol is stripped, evidently because it processes the Windows license type of the current PC) --> ExpLicenseWatchInitWorker
The pseudocode of this function looks like this:
VOID ExpLicenseWatchInitWorker()
{
    PVOID KiFilterParam;
    // Declared as BOOLEAN to match the KiFilterFiberContext pseudocode later in the article
    BOOLEAN (*KiFilterFiberContext)(PVOID pFilterParam);
    BOOLEAN ForgetAboutPG;

    // KiServiceTablesLocked == KiFilterParam
    KiFilterParam = KiInitialPcr.Prcb.HalReserved[1];
    KiInitialPcr.Prcb.HalReserved[1] = NULL;
    KiFilterFiberContext = KiInitialPcr.Prcb.HalReserved[0];
    KiInitialPcr.Prcb.HalReserved[0] = NULL;

    ForgetAboutPG = (InitSafeBootMode != 0) | (KUSER_SHARED_DATA.KdDebuggerEnabled >> 1);
    // In 96% of cases this path skips patchguard initialization
    if ( __rdtsc() % 100 > 3 )
        ForgetAboutPG |= 1;

    if ( !ForgetAboutPG && KiFilterFiberContext(KiFilterParam) != 1 )
        KeBugCheckEx(SYSTEM_LICENSE_VIOLATION, 0x42424242, 0xC000026A, 0, 0);
}
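The 4% figure follows directly from the gate in the pseudocode above. The following is a minimal user-mode sketch that enumerates the residues of the modulo operation; it assumes only the condition `__rdtsc() % 100 > 3`, passes the timestamp in as a plain parameter, and uses hypothetical function names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when the gate from the pseudocode would set ForgetAboutPG,
   i.e. when this code path does NOT initialize patchguard. */
int gate_skips_patchguard(uint64_t tsc)
{
    return (tsc % 100) > 3;
}

/* Counts how many of the 100 possible residues let initialization proceed. */
int count_initializing_residues(void)
{
    int count = 0;
    for (uint64_t residue = 0; residue < 100; ++residue) {
        if (!gate_skips_patchguard(residue))
            ++count;
    }
    assert(count >= 0 && count <= 100);
    return count;
}
```

Only the residues 0 through 3 pass the gate, which is where the 4% comes from, provided the low bits of the timestamp counter are roughly uniform.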
As you may notice, a small "present" has been left in the HalReserved field of the processor control block
for this initialization case. Tracing back to whoever left it leads us to the very beginning of
system startup:
... --> KiSystemStartup --> KiInitializeKernel --> KeCompactServiceTable --> KiLockServiceTable -v ??????
We have to pause here, because no code writes data into the HalReserved fields
directly. Instead, it is done through an exception handler, and in a different way
from "KeInitAmd64SpecificState", because no exception is triggered. The code
looks up the current instruction pointer, manually finds the corresponding function
and its exception handler, and then calls that handler. The exception handler of the
"KiLockServiceTable" function is an unnamed stub that leads to "KiFatalExceptionFilter".
?????? ---> KiFatalExceptionFilter
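The idea of calling a handler without raising an exception can be illustrated with a toy lookup table. This is only a sketch under stated assumptions: `function_entry`, `lookup_entry`, `call_handler_for` and `demo_handler` are hypothetical names, and the table is a simplified stand-in for the real x64 unwind data, whose layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(void *param);

/* Hypothetical stand-in for one unwind-table entry: a code range plus its handler. */
typedef struct {
    uintptr_t begin;     /* first address of the function */
    uintptr_t end;       /* one past the last address of the function */
    handler_fn handler;  /* exception handler registered for that range */
} function_entry;

/* Finds the entry whose range contains ip; returns NULL if no range matches. */
const function_entry *lookup_entry(const function_entry *table, size_t count, uintptr_t ip)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (ip >= table[i].begin && ip < table[i].end)
            return &table[i];
    }
    return NULL;
}

/* Calls the handler that covers ip directly, without raising any exception.
   Returns -1 when no handler covers ip. */
int call_handler_for(const function_entry *table, size_t count, uintptr_t ip, void *param)
{
    const function_entry *entry = lookup_entry(table, count, ip);
    if (entry == NULL || entry->handler == NULL)
        return -1;
    return entry->handler(param);
}

/* Hypothetical handler used for demonstration: counts how often it was called. */
int demo_handler(void *param)
{
    int *hits = param;
    ++*hits;
    return 1;
}
```

A caller would pass its own instruction pointer as `ip`; the handler then runs as an ordinary function call, which is why a debugger watching for exceptions sees nothing.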
"KiFatalExceptionFilter" in turn looks up the exception handler of the "KiServiceTablesLocked"
function. Surprisingly, it is "KiFilterFiberContext"! The parameter passed to
"KiFilterFiberContext" is located right after the "KiServiceTablesLocked" function. It is a small
structure:
typedef struct _KI_FILTER_FIBER_PARAM
{
    NTSTATUS (*PsCreateSystemThread)();    // a pointer to the PsCreateSystemThread function
    KSTART_ROUTINE sub_140235C44;          // unnamed checker subroutine
    KDPC KiBalanceSetManagerPeriodicDpc;   // global DPC struct
} KI_FILTER_FIBER_PARAM, *PKI_FILTER_FIBER_PARAM;
"KiFatalExceptionFilter" stores these pointers in the HalReserved fields.
Creating patchguard context
Let's get back to the "KiFilterFiberContext" function. Its pseudocode is given below:
BOOLEAN KiFilterFiberContext(PVOID pKiFilterParam)
{
    BOOLEAN Result = TRUE;
    DWORD64 dwDpcIdx1 = __rdtsc() % 13;
    DWORD64 dwRand2 = __rdtsc() % 10;
    DWORD64 dwMethod1 = __rdtsc() % 6;

    AntiDebug();
    // Let's call sub_1406D6F78 KiInitializePatchGuardContext, since it initializes the patchguard context
    Result = KiInitializePatchGuardContext(dwDpcIdx1, dwMethod1, (dwRand2 < 6) + 1, pKiFilterParam, TRUE);
    // dwRand2 < 6 holds for 6 of the 10 residues, i.e. roughly a 60% chance to create two patchguard contexts
    if (dwRand2 < 6)
    {
        DWORD64 dwDpcIdx2 = __rdtsc() % 13;
        DWORD64 dwMethod2 = __rdtsc() % 6;
        // Redraw until the two methods differ; only method 0 may be used twice
        do
        {
            dwMethod2 = __rdtsc() % 6;
        }
        while ((dwMethod1 != 0) && (dwMethod1 == dwMethod2));
        Result = KiInitializePatchGuardContext(dwDpcIdx2, dwMethod2, 2, pKiFilterParam, FALSE);
    }
    AntiDebug();
    return Result;
}
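The selection of the second scheduling method can be tried out in user mode. The sketch below mirrors only the do/while loop and the `dwRand2 < 6` test from the pseudocode; `fake_tsc` and `fake_rdtsc` are hypothetical deterministic stand-ins for `__rdtsc()` (a simple linear congruential generator), and the other names are likewise invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical deterministic replacement for __rdtsc(): a 64-bit LCG. */
typedef struct { uint64_t state; } fake_tsc;

uint64_t fake_rdtsc(fake_tsc *t)
{
    t->state = t->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return t->state >> 33;  /* use the better-mixed high bits */
}

/* Mirrors the do/while loop: redraw until the methods differ,
   unless the first method is 0, which may be used twice. */
uint64_t pick_second_method(uint64_t method1, fake_tsc *t)
{
    uint64_t method2;
    assert(method1 < 6);
    do {
        method2 = fake_rdtsc(t) % 6;
    } while (method1 != 0 && method1 == method2);
    return method2;
}

/* Number of dwRand2 residues (0..9) for which a second context is created. */
int second_context_cases(void)
{
    int cases = 0;
    for (uint64_t rand2 = 0; rand2 < 10; ++rand2) {
        if (rand2 < 6)
            ++cases;
    }
    return cases;
}
```

With a first method other than 0 the loop never returns the same method again, while a first method of 0 accepts any draw, including 0.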
The logic is rather clear, and from this code we can assume that up to 4 patchguard contexts can
be active on a running system simultaneously, since each of the two initialization paths can create up to two.
Remember this function, because wherever it is
called, we can be 100% sure that a new patchguard context is being initialized.
The function that creates and initializes a patchguard context is the one we have named
"KiInitializePatchGuardContext". It is a huge obfuscated function. It seems fitting to
quote Alex Ionescu's tweet about it:
"I love the new #Windows 8 Patch Guard. Fixes so many of the obvious holes in downlevel, and the new hyper-inlined obfuscation makes me cry."
Indeed! IDA Pro's decompiler spends about 20 minutes on it on a Core i7-3770 CPU and produces 26K
lines of code. It is not worth dealing with as a single unit. Luckily, you can pick out small
pieces of information that hint at the methods the new patchguard uses. That is
why we did not reverse engineer it entirely; instead, we analyzed several parts of it.
Feel free to explore this function yourself; you may discover new wonderful things!
It takes 5 parameters on Windows 8.1:
1. Index of DPC routine to be called from a created patchguard DPC for checking the
patchguard context. It may be one of these:
// These do not use exception handlers to fire checks
KiTimerDispatch (copied to random pool allocation)
KiDpcDispatch (copied into patchguard context)
// These use exception handlers to fire patchguard checks
ExpTimerDpcRoutine
IopTimerDispatch
IopIrpStackProfilerTimer
PopThermalZoneDpc
CmpEnableLazyFlushDpcRoutine
CmpLazyFlushDpcRoutine
KiBalanceSetManagerDeferredRoutine
ExpTimeRefreshDpcRoutine
ExpTimeZoneDpcRoutine
ExpCenturyDpcRoutine
The latter 10 DPCs are regular system DPCs with a useful payload, but when they encounter a
DeferredContext that holds a non-canonical address, they call the corresponding
KiCustomAccessRoutine function.
These functions are only called when an appropriate scheduling method (0, 1, 2 or 5) is used.
2. Scheduling method:
These are the methods that are used to fire a patchguard DPC object that is created inside
"KiInitializePatchGuardContext" function.
- KeSetCoalescableTimer (0). A timer object is created with a random fire period between 2 minutes and 2 minutes 10 seconds.
- Prcb.AcpiReserved (1). A patchguard DPC is fired when a certain ACPI event occurs, for example a transition to an idle state. "HalpTimerDPCRoutine" checks whether 2 minutes have passed since it last queued a DPC and, if so, queues another one, taken from the Prcb.AcpiReserved field.
- Prcb.HalReserved (2). A patchguard DPC is queued in "HalpMcaQueueDpc" when a HAL timer clock interrupt occurs, also with a period of at least 2 minutes. The queued patchguard DPC is taken from the Prcb.HalReserved field.
- PsCreateSystemThread (3). No patchguard DPC routine is used; instead, a system thread is created. The thread procedure is taken from the KI_FILTER_FIBER_PARAM structure. The patchguard DPC is used only as a container for the address of the newly created patchguard context.
- KeInsertQueueApc (4). A regular kernel APC with the "KiDispatchCallout" APC procedure is queued to one of the system threads. No patchguard DPC is fired in this case either. The system thread is chosen by its start address, which must be equal to either PopIrpWorkerControl or CcQueueLazyWriteScanThread.
- KiBalanceSetManagerPeriodicDpc (5). The patchguard DPC is stored in a global variable named "KiBalanceSetManagerPeriodicDpc". It is queued in the "KiUpdateTimeAssist" and "KeClockInterruptNotify" functions once every "KiBalanceSetManagerPeriod" ticks.
3. This parameter can be either 1 or 2. We are not sure how it affects the "KiInitializePatchGuardContext" function, but it is somehow connected to the number of checks
performed while the patchguard context verification routine executes.
4. A pointer to the KI_FILTER_FIBER_PARAM structure. Notably, the set of methods available inside
"KiInitializePatchGuardContext" depends on the presence of this parameter. If it is
present, the method bit mask is tested against 0x29 (101001b), which allows methods 0, 3 and 5.
Otherwise methods 0, 1, 2 and 4 are available. That makes sense, because methods 3 and 5
require a valid KI_FILTER_FIBER_PARAM structure.
5. A Boolean parameter that tells whether the checksums of NT kernel functions have to be recalculated.
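The method mask described for the fourth parameter can be sketched as a simple bit test. The value 0x29 comes from the text; the value 0x17 is an assumption derived here from the listed methods 0, 1, 2 and 4, since the article does not give that constant. The function and macro names are hypothetical, and how the kernel handles a rejected method is not described, so the sketch only reports whether a method passes the mask.

```c
#include <assert.h>
#include <stdint.h>

#define PG_METHOD_COUNT 6u

/* 0x29 = 101001b, taken from the text: methods 0, 3 and 5. */
#define PG_MASK_WITH_PARAM 0x29u
/* Assumption: 0x17 = 010111b, derived from the listed methods 0, 1, 2 and 4. */
#define PG_MASK_WITHOUT_PARAM 0x17u

/* Returns 1 when the scheduling method passes the mask, 0 otherwise. */
int method_allowed(unsigned method, int has_filter_param)
{
    uint32_t mask = has_filter_param ? PG_MASK_WITH_PARAM : PG_MASK_WITHOUT_PARAM;
    assert(method < PG_METHOD_COUNT);
    return (int)((mask >> method) & 1u);
}
```

Method 0 is the only one present in both masks, which fits the timer-based method being usable regardless of whether the structure is passed.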
As you might guess, the only scheduling method that can be initialized twice is 0, so
"KiFilterFiberContext" takes this into account when it chooses a method for the second call of
"KiInitializePatchGuardContext".
Firing a patchguard check
Methods that fire patchguard DPC
The main principle of the patchguard check is to launch a patchguard context verification
routine at DPC level, and then queue a work item that checks vital system structures at
passive level, followed by context recreation and rescheduling. The verification work item
uses a copy of the "FsRtlUninitializeSmallMcb" function. You can examine it if you want to
figure out how the check works.
For the methods that use DPC activation, the 10 listed DPC
routines share common code that checks whether "DeferredContext" is a non-canonical address. If the address is canonical, the DPC
just executes its payload. Otherwise one of the 10 "KiCustomAccessRoutineX" functions is called.
When "KiCustomAccessRoutineX" is called, the last 2 bits of "DeferredContext" plus 1 are taken and
used as a roll count for "KiCustomRecurseRoutineX". These recursive routines call one another in a cycle,
incrementing the X value. When the roll is over, "KiCustomRecurseRoutineX" tries to dereference the
DeferredContext value as a pointer, which inevitably generates a #GP exception since the
address is non-canonical.
// Inside DPC routine
// True when DeferredContext is a non-canonical address
// (its upper bits are neither all ones nor all zeros)
if ( (DeferredContext >> 47) < 0xFFFFFFFFFFFFFFFFui64 && DeferredContext >> 47 != 0 )
{
    ...
    KiCustomAccessRoutineX(DeferredContext);
    ...
}

void KiCustomAccessRoutine9(DWORD64 DeferredContext)
{
    KiCustomRecurseRoutine9((DeferredContext & 3) + 1, DeferredContext);
}

void KiCustomRecurseRoutine9(DWORD dwRoll, DWORD64 DeferredContext)
{
    DWORD dwNextRoll;
    DWORD64 go_go_GP;

    dwNextRoll = dwRoll - 1;
    if ( dwNextRoll )
        KiCustomRecurseRoutine0(dwNextRoll, DeferredContext);
    go_go_GP = *(DWORD64 *)DeferredContext; // #GP
}
// DPC routine call sequence
ExpTimerDpcRoutine -> KiCustomAccessRoutine0 -> KiCustomRecurseRoutine0 ... KiCustomRecurseRoutineN
IopTimerDispatch -> KiCustomAccessRoutine1 -> KiCustomRecurseRoutine1 ... KiCustomRecurseRoutineN
IopIrpStackProfilerTimer -> KiCustomAccessRoutine2 -> KiCustomRecurseRoutine2 ... KiCustomRecurseRoutineN
PopThermalZoneDpc -> KiCustomAccessRoutine3 -> KiCustomRecurseRoutine3 ... KiCustomRecurseRoutineN
CmpEnableLazyFlushDpcRoutine -> KiCustomAccessRoutine4 -> KiCustomRecurseRoutine4 ... KiCustomRecurseRoutineN
CmpLazyFlushDpcRoutine -> KiCustomAccessRoutine5 -> KiCustomRecurseRoutine5 ... KiCustomRecurseRoutineN
KiBalanceSetManagerDeferredRoutine -> KiCustomAccessRoutine6 -> KiCustomRecurseRoutine6 ... KiCustomRecurseRoutineN
ExpTimeRefreshDpcRoutine -> KiCustomAccessRoutine7 -> KiCustomRecurseRoutine7 ... KiCustomRecurseRoutineN
ExpTimeZoneDpcRoutine -> KiCustomAccessRoutine8 -> KiCustomRecurseRoutine8 ... KiCustomRecurseRoutineN
ExpCenturyDpcRoutine -> KiCustomAccessRoutine9 -> KiCustomRecurseRoutine9 ... KiCustomRecurseRoutineN
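The canonical-address test and the roll through the recursive routines can be modeled in user mode. The sketch below assumes 48-bit virtual addressing (bits 63..47 must all be equal) and assumes that the routines cycle through indices 0..9, wrapping from 9 back to 0 as the KiCustomRecurseRoutine9 pseudocode suggests. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* With 48-bit virtual addresses, an address is canonical when
   bits 63..47 are all zeros or all ones. */
int is_canonical(uint64_t address)
{
    uint64_t top = address >> 47;  /* the 17 upper bits, logical shift */
    return top == 0 || top == 0x1FFFF;
}

/* Roll count passed by KiCustomAccessRoutineX: the last two bits plus one. */
unsigned initial_roll(uint64_t deferred_context)
{
    return (unsigned)(deferred_context & 3) + 1;
}

/* Index X of the KiCustomRecurseRoutineX that finally dereferences the
   context, assuming each routine calls the next index and 9 wraps to 0. */
unsigned faulting_routine(unsigned first_routine, uint64_t deferred_context)
{
    unsigned roll = initial_roll(deferred_context);
    assert(first_routine < 10);
    return (first_routine + roll - 1) % 10;
}
```

For a context whose low two bits are 11b, the roll is 4, so a chain that starts in routine 9 passes through routines 0 and 1 and faults in routine 2.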
Here structured exception handling comes into play again. If you look up all the exception handlers for
these DPC routines, you'll discover several nested __try/__except and
__try/__finally blocks. For example, "ExpTimerDpcRoutine" looks something like this:
...
__try
{
    __try
    {
        __try
        {
            __try
            {
                KiCustomAccessRoutine0(DeferredContext);
            }
            __finally
            {
                FinalSub1();
            }
        }
        __except (FilterSub1()) // patchguard context decryption occurs here
        {
            // Nothing
        }
    }
    __finally
    {
        FinalSub2();
    }
}
__except (FilterSub2())
{
    // Nothing
}
...
ExpCenturyDpcRoutine, ExpTimeZoneDpcRoutine, ExpTimeRefreshDpcRoutine,
KiBalanceSetManagerDeferredRoutine, CmpLazyFlushDpcRoutine, CmpEnableLazyFlushDpcRoutine,
PopThermalZoneDpc, ExpTimerDpcRoutine -> _C_specific_handler
IopIrpStackProfilerTimer, IopTimerDispatch -> _GSHandlerCheck_SEH (GS check + _C_specific_handler)
Depending on the DPC routine, the decryption routine (based on the KiWaitAlways and KiWaitNever
variables) may reside in one of the exception filters, exception handlers or termination handlers.
Patchguard context verification also occurs inside the decryption routine, right after
decryption.
The "KiTimerDispatch" and "KiDpcDispatch" DPC routines, in contrast, call patchguard context
verification directly. Also, depending on the DPC routine, a different type of patchguard context
encryption is used (or none at all).
Other methods
Method 3 creates a system thread. The thread procedure sleeps for between 2 minutes and 2
minutes 10 seconds using "KeDelayExecutionThread" or "KeWaitForSingleObject" on a
kernel object that is never signaled. After the wait times out, it decrypts the patchguard
context and executes the verification routine.
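A wait in this range can be expressed as a relative interval of the kind "KeDelayExecutionThread" accepts: a negative value counted in 100-nanosecond units. The sketch below is an assumption about how such an interval could be computed; the article does not say how the random part is derived, so the uniform jitter and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HUNDRED_NS_PER_SECOND 10000000LL

/* Returns a relative wait interval: negative, in 100-nanosecond units,
   between 2 minutes and just under 2 minutes 10 seconds. */
int64_t patchguard_wait_interval(uint64_t random_value)
{
    int64_t base = 120 * HUNDRED_NS_PER_SECOND;          /* 2 minutes */
    uint64_t jitter_range = 10 * HUNDRED_NS_PER_SECOND;  /* 10 seconds */
    int64_t jitter = (int64_t)(random_value % jitter_range);
    int64_t interval = -(base + jitter);
    assert(interval < 0);
    return interval;
}
```

In a driver, the result would be stored in a LARGE_INTEGER and passed as the interval argument; here it is kept as a plain 64-bit integer so the arithmetic can be checked in user mode.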
Method 4 inserts an APC with the "KiDispatchCallout" function as its kernel routine and
"EmpCheckErrataList" as its normal routine. Patchguard context decryption and validation occur
upon APC delivery to the target waiting thread, which happens almost immediately. A 2-minute
wait is located inside the verifier work item routine i