In April 2023, we published a blog post about a zero-day exploit we discovered in ransomware attacks that was patched as CVE-2023-28252 after we promptly reported it to Microsoft.
In that blog post, we mentioned that the zero-day exploit we discovered was very similar to other Microsoft Windows elevation-of-privilege (EoP) exploits that we have seen in ransomware attacks throughout the year. We found that since June 2022, attackers have used exploits for at least five different Common Log File System (CLFS) driver vulnerabilities. Four of these vulnerabilities used by the attackers (CVE-2022-24521, CVE-2022-37969, CVE-2023-23376, CVE-2023-28252) have been captured in the wild as zero-days.
Seeing a Win32k driver zero-day being used in attacks isn’t really surprising these days, as the design issues with that component are well known and have been exploited time and time again. But we had never seen so many CLFS driver exploits used in active attacks before, and then suddenly so many of them were captured in just one year.
Is there something seriously wrong with the CLFS driver? Are all these vulnerabilities similar? Was Microsoft somehow lax in patching these vulnerabilities? These questions piqued my interest and encouraged me to take a closer look at the CLFS driver and its vulnerabilities.
This study turned out to be quite long, so for the convenience of the reader, it is divided into six parts:
- This part will cover the internals of the Common Log File System (CLFS) driver and its design flaws.
- The next five parts will cover the actual root causes and exploitation of five vulnerabilities that were used in ransomware attacks throughout the year.
You can skip to the other parts using this table of contents or using the link at the end of this part.
- Part 1 – Windows CLFS and five exploits of ransomware operators
- Part 2 – Windows CLFS and five exploits of ransomware operators (Exploit #1 – CVE-2022-24521)
- Part 3 – Windows CLFS and five exploits of ransomware operators (Exploit #2 – September 2022)
- Part 4 – Windows CLFS and five exploits of ransomware operators (Exploit #3 – October 2022)
- Part 5 – Windows CLFS and five exploits of ransomware operators (Exploit #4 – CVE-2023-23376)
- Part 6 – Windows CLFS and five exploits of ransomware operators (Exploit #5 – CVE-2023-28252)
Common Log File System (CLFS) internals
To understand the root causes of vulnerabilities and their exploitation, it’s very important to understand what CLFS is, how it works, and its design quirks.
Common Log File System (CLFS) is a general-purpose log file subsystem. It is used by the OS itself, but can be used by any application that needs high-performance data/event logging, and Microsoft provides documentation for both its kernel-mode and user-mode APIs. It first appeared in Windows Server 2003 R2 / Windows Vista and is implemented in the clfs.sys driver.
Logs are created/opened with the API function CreateLogFile and consist of a special master file with metadata (it is called Base Log File and has a .blf file extension) and any number of containers for storing actual data. These containers are created using the API functions AddLogContainer and AddLogContainerSet.
As you might guess, the Base Log File (BLF), a file with metadata, plays a key role in working with logs. The format of this file is not documented by Microsoft, and it is intended that any work with it will be done through the provided API. But the format itself is not very complicated, and Microsoft provides debug symbols for clfs.sys, so it was only a matter of time before someone reverse-engineered it. Alex Ionescu has published detailed unofficial documentation of this format.
And it’s no wonder that this format was not documented by Microsoft, because just looking at it sets off alarm bells. The BLF files consist of kernel memory structures, and there are even fields for storing memory pointers!
Excerpt from CLFS documentation hinting at the structure of BLF files
Although Microsoft does not advertise this, it cannot be said that they hide it, because it is basically stated in the documentation. The documentation says that CLFS is optimized for performance, and all work is done in buffers that are flushed to disk without copying. This implies that these buffers are read from disk in the same way.
CLFS exists to solve a fairly complicated task and therefore has complex functionality. Its code base is quite old. It parses files of questionable structure in the kernel. All code is optimized for performance. In my experience, code with all these characteristics is usually susceptible to vulnerabilities.
And this case is no exception. A search for “Windows Common Log File System Driver Elevation Of Privilege Vulnerability” among Security Update Guides reveals that there have been more than 30 such vulnerabilities patched since 2018, including the four previously mentioned zero-days that were captured in the wild.
Now let’s take a closer look at the BLF file format. Alex Ionescu’s documentation was very helpful in conducting this research, although I must admit I had to write my own CLFS parser and reverse engineer much of the same material myself to fully understand the root causes of the vulnerabilities. Below I will describe all the details about the BLF file format that are necessary to understand the root causes of the vulnerabilities we are about to discuss and how they are exploited.
The following is some key information for understanding the format. BLF files are made up of records. These records are stored in blocks. These blocks are written/read sector by sector. The size of a sector is equal to 0x200 bytes. The last two bytes of a sector are used to store the sector signature. If the last two bytes are occupied by the sector signature, where are the original bytes of the block stored? At some other location specified by the offset in the block header, but we will get back to that in a moment. Records may also contain additional data structures, depending on their type.
Every block begins with a block header – CLFS_LOG_BLOCK_HEADER. Its format is shown in the image below. There is no description of this structure in the debug symbols, so to avoid confusion I am using the field names from the previously mentioned documentation.
The block header contains information about the number of sectors, the checksum of the block data, and other fields that are not so important for us. We are only interested in two fields. The first one is RecordOffsets, which is an array of record offsets. The format allows a block to contain many records, but the code only ever uses the first offset. The second one is SignaturesOffset. This offset points to the location in the block where the original bytes are stored, that is, the bytes replaced by the sector signatures mentioned earlier. All offsets are relative to the beginning of the block header: when an offset is used, it is added to the address of the block header.
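The sector signature mechanism can be illustrated with a minimal Python sketch. It is not the driver's code. It assumes that the saved bytes form a packed array (two bytes per sector, in sector order) starting at SignaturesOffset; the function name, the signature value 0xAAAA and the array position 0x300 used in the demo are all made up for illustration.

```python
SECTOR_SIZE = 0x200   # sector size given in the text
SIG_SIZE = 2          # the last two bytes of every sector hold the signature


def restore_sector_tails(block: bytes, signatures_offset: int) -> bytes:
    """Return a copy of `block` with every sector's last two bytes restored.

    Assumption: the saved bytes form a packed array, two bytes per sector,
    in sector order, starting at `signatures_offset` from the block start.
    """
    if len(block) == 0 or len(block) % SECTOR_SIZE:
        raise ValueError("block must be a whole number of sectors")
    sectors = len(block) // SECTOR_SIZE
    end = signatures_offset + sectors * SIG_SIZE
    if signatures_offset < 0 or end > len(block):
        raise ValueError("signature array falls outside the block")

    saved = block[signatures_offset:end]   # read the array before patching
    out = bytearray(block)
    for i in range(sectors):
        tail = (i + 1) * SECTOR_SIZE - SIG_SIZE
        out[tail:tail + SIG_SIZE] = saved[i * SIG_SIZE:(i + 1) * SIG_SIZE]
    return bytes(out)


# Demo: a two-sector block whose sector tails carry a made-up signature.
demo = bytearray(2 * SECTOR_SIZE)
demo[0x1FE:0x200] = b"\xAA\xAA"            # signature over the tail of sector 0
demo[0x3FE:0x400] = b"\xAA\xAA"            # signature over the tail of sector 1
demo[0x300:0x304] = b"\x11\x22\x33\x44"    # saved original bytes (hypothetical spot)
restored = restore_sector_tails(bytes(demo), 0x300)
```

Note that the sketch refuses a SignaturesOffset that would place the array outside the block. Since the offset is read from the file, a parser that skips such a check would copy bytes from wherever the offset happens to point.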
BLF files consist of six blocks. These blocks have the following names/types: CONTROL, CONTROL_SHADOW, GENERAL, GENERAL_SHADOW, SCRATCH, SCRATCH_SHADOW. However, there are not actually six different types, but only three. SHADOW blocks contain the previous copy of the recorded metadata and are used to restore data if the recording is interrupted.
The image below shows the location and size of the blocks in the newly created BLF file.
Layout of blocks in the newly created BLF file
Newly created BLF files always have the same layout, and exploits take advantage of that – there’s no need to build from scratch or carry a prebuilt BLF file to trigger the vulnerability, it’s enough to ask the OS to create a new BLF file and patch data at hardcoded offsets.
Now let’s talk about records that are stored in blocks. Records stored in CONTROL blocks are defined by the CLFS_CONTROL_RECORD structure; records stored in GENERAL blocks are defined by the CLFS_BASE_RECORD_HEADER structure; records stored in SCRATCH blocks are defined by the CLFS_TRUNCATE_RECORD_HEADER structure.
All these record structures begin with the CLFS_METADATA_RECORD_HEADER structure, which has a DumpCount field.
The ReadMetadataBlock function uses DumpCount to choose between a regular block and its SHADOW copy. It selects the most recent valid block, that is, the one with the higher DumpCount that also has a valid checksum and valid values in other fields.
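The selection logic can be sketched roughly as follows. This is an illustration of the idea, not a reconstruction of ReadMetadataBlock: the BlockCopy type and pick_block function are hypothetical, a single `valid` flag stands in for the checksum and other checks, and the tie-breaking rule (the regular block wins) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockCopy:
    name: str
    dump_count: int   # DumpCount from CLFS_METADATA_RECORD_HEADER
    valid: bool       # stands in for the checksum and other header checks


def pick_block(primary: BlockCopy, shadow: BlockCopy) -> Optional[BlockCopy]:
    """Choose the most recent valid copy, or None if both copies are damaged."""
    candidates = [b for b in (primary, shadow) if b.valid]
    if not candidates:
        return None
    # max() returns the first maximal item, so the primary wins a tie
    # (the tie-breaking rule is an assumption of this sketch).
    return max(candidates, key=lambda b: b.dump_count)


general = BlockCopy("GENERAL", dump_count=7, valid=False)        # interrupted write
general_shadow = BlockCopy("GENERAL_SHADOW", dump_count=6, valid=True)
chosen = pick_block(general, general_shadow)
```

In the demo the regular block has the higher DumpCount but fails validation, so the older SHADOW copy is used instead.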
The CONTROL block is located at the very beginning of the BLF file, and CLFS_CONTROL_RECORD contains information on the location of other blocks. It exists to allow CLFS to increase/decrease the log size.
Most of the fields in the CLFS_CONTROL_RECORD structure are used by the log resizing function, but there is also the rgBlocks field that we are most interested in for now. This field is an array of CLFS_METADATA_BLOCK structures with information about all the blocks present in the file. The image above suggests that this array is arbitrary in size, but in fact its size is hardcoded to six (because there are six blocks in the BLF). Each CLFS_METADATA_BLOCK structure contains information about the block size, its offset (from the beginning of the file), and a placeholder to store the block’s kernel pointer when it’s loaded to memory.
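To make the rgBlocks array more concrete, here is a hedged Python sketch that parses six entries and checks that each block lies inside the file. The 24-byte entry layout (pointer placeholder, size, offset, type, padding) is an assumption based on public reverse-engineering notes, not something confirmed by Microsoft, and the function name and demo values are invented.

```python
import struct
from typing import List, NamedTuple

BLOCK_COUNT = 6                    # hardcoded array size, per the text
ENTRY = struct.Struct("<QIII4x")   # assumed layout: pointer, size, offset, type, padding


class MetadataBlock(NamedTuple):
    pointer: int      # placeholder for the kernel pointer; never trust it from disk
    size: int
    offset: int       # from the beginning of the file
    block_type: int


def parse_rg_blocks(data: bytes, array_offset: int, file_size: int) -> List[MetadataBlock]:
    """Parse the six entries and reject any block that leaves the file."""
    blocks = []
    for i in range(BLOCK_COUNT):
        start = array_offset + i * ENTRY.size
        if start < 0 or start + ENTRY.size > len(data):
            raise ValueError(f"entry {i} is truncated")
        entry = MetadataBlock(*ENTRY.unpack_from(data, start))
        if entry.size == 0 or entry.offset + entry.size > file_size:
            raise ValueError(f"block {i} lies outside the file")
        blocks.append(entry)
    return blocks


# Demo with made-up sizes and offsets (not the real layout of a new BLF file).
demo_entries = [
    (0, 0x400, 0x0000, 0),   # CONTROL
    (0, 0x400, 0x0400, 1),   # CONTROL_SHADOW
    (0, 0x800, 0x0800, 2),   # GENERAL
    (0, 0x800, 0x1000, 3),   # GENERAL_SHADOW
    (0, 0x200, 0x1800, 4),   # SCRATCH
    (0, 0x200, 0x1A00, 5),   # SCRATCH_SHADOW
]
demo_file_size = 0x1C00
demo_record = b"".join(ENTRY.pack(*e) for e in demo_entries)
demo_blocks = parse_rg_blocks(demo_record, 0, demo_file_size)
```

The pointer field is carried along only to show that it exists in the on-disk data. A careful parser would overwrite it after loading the block rather than use whatever value the file contains.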
The GENERAL block is the one that contains the actual information stored in the BLF file. It contains information about clients (those using the log), containers (files with actual data), and security descriptors for containers.
The CLFS_BASE_RECORD_HEADER structure is quite large, taking up 10 sectors. This is because it contains five huge arrays with offsets. Information about clients and containers is represented as CLFS_CLIENT_CONTEXT and CLFS_CONTAINER_CONTEXT structures that are stored in the GENERAL block as symbols. What is a symbol? It is a combination of the CLFSHASHSYM structure and the CONTEXT structure immediately following it. This is all done so that the code can quickly find a CONTEXT structure using a hash search. The rgClientSymTbl, rgContainerSymTbl and rgSecuritySymTbl arrays store offsets to CONTEXT structures in the form of symbols. The rgClients and rgContainers arrays are used to store offsets that point directly to the same CONTEXT structures, but bypass the CLFSHASHSYM structures. The CLFS driver uses all these arrays, and different functions use different methods to access the CLFS_CLIENT_CONTEXT and CLFS_CONTAINER_CONTEXT structures. This is clearly a bad design decision that, as you will see, has backfired.
We are also interested in the cbSymbolZone field. Since more clients and containers can be assigned to the log at runtime, the code uses this field to get the next free offset in the GENERAL block where it can create a new symbol. This zone for new structures starts immediately after the CLFS_BASE_RECORD_HEADER structure.
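The role of cbSymbolZone can be pictured as a simple bump allocator. The sketch below is an assumption about the general idea, not the driver's logic: the SymbolZone class, its parameters and the demo numbers are hypothetical, and it treats cbSymbolZone as the number of bytes already used in the zone.

```python
class SymbolZone:
    """Toy model of the symbol zone that follows CLFS_BASE_RECORD_HEADER."""

    def __init__(self, zone_start: int, block_end: int, cb_symbol_zone: int = 0):
        self.zone_start = zone_start          # first byte after the record header
        self.block_end = block_end            # end of the GENERAL block
        self.cb_symbol_zone = cb_symbol_zone  # bytes of the zone already in use

    def allocate(self, size: int) -> int:
        """Return the offset for a new symbol and advance cbSymbolZone."""
        if size <= 0:
            raise ValueError("size must be positive")
        offset = self.zone_start + self.cb_symbol_zone
        # cbSymbolZone comes from the file, so it must be checked before use.
        if self.cb_symbol_zone < 0 or offset + size > self.block_end:
            raise ValueError("new symbol would fall outside the block")
        self.cb_symbol_zone += size
        return offset


# Demo with made-up numbers: a 0x100-byte zone starting at 0x1400.
zone = SymbolZone(zone_start=0x1400, block_end=0x1500)
first = zone.allocate(0x80)
second = zone.allocate(0x80)
```

The bounds check matters because the field is stored in the file: if it is tampered with and the code does not validate it, the next symbol is created at a location of the attacker's choosing.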
All structures present in the symbol zone (including CLFSHASHSYM, CLFS_CLIENT_CONTEXT and CLFS_CONTAINER_CONTEXT) are represented as nodes. All of these structures start with a unique magic number that identifies the type of node, followed by the size of the structure.
One interesting fact related to CLFSHASHSYM is that some functions simply take the address of the CONTEXT structure, subtract 12 or 16 from it, and work with the cbSymName and cbOffset fields of the CLFSHASHSYM structure that is simply expected to be there.
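Why this back-stepping is risky can be shown with a short sketch. It assumes a hypothetical CLFSHASHSYM size of 0x30 bytes and a 4-byte field located 12 bytes before the CONTEXT structure; the function names and all demo offsets and values are invented for illustration.

```python
import struct

HASHSYM_SIZE = 0x30   # hypothetical size of the CLFSHASHSYM header
BACKSTEP = 12         # one of the fixed distances mentioned in the text


def read_field_by_backstep(block: bytes, context_offset: int) -> int:
    """Mimic the risky pattern: trust that a CLFSHASHSYM precedes the context."""
    return struct.unpack_from("<I", block, context_offset - BACKSTEP)[0]


def offsets_agree(symbol_offset: int, context_offset: int) -> bool:
    """Cross-check the two access paths to the same CONTEXT structure."""
    return context_offset == symbol_offset + HASHSYM_SIZE


block = bytearray(0x400)
symbol_offset = 0x100                      # offset as stored in a symbol table
honest_context = symbol_offset + HASHSYM_SIZE
struct.pack_into("<I", block, honest_context - BACKSTEP, 0x1234)

forged_context = 0x200                     # direct offset that skips the header
struct.pack_into("<I", block, forged_context - BACKSTEP, 0x41414141)

honest_value = read_field_by_backstep(bytes(block), honest_context)
forged_value = read_field_by_backstep(bytes(block), forged_context)
```

With the honest offset the read lands inside the real header. With the forged direct offset the same code reads whatever bytes precede the fake CONTEXT structure, and only a cross-check such as offsets_agree would notice that the two access paths no longer describe the same symbol.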
The CLFS_CLIENT_CONTEXT structure contains many fields, many of which are self-explanatory. To understand the root causes of the vulnerabilities described below, we are most interested in the llCreateTime/llAccessTime/llWriteTime and fAttributes fields. The first three are self-explanatory, and fAttributes contains FILE_ATTRIBUTE flags associated with the BLF file.
The CLFS_CONTAINER_CONTEXT is the last structure we need to look at. Please note the pContainer field. It’s a placeholder to store a kernel pointer to the CClfsContainer class. This may need to be reiterated: CLFS_CONTAINER_CONTEXT and all the other structures discussed previously are read from BLF files stored on disk. Therefore, if attackers manage to inject a malicious CLFS_CONTAINER_CONTEXT into a BLF file, and it is processed by code without proper validation/initialization, attackers will be able to hijack the control flow and elevate their privileges from user-level to kernel.
Fatal flaws of the Common Log File System (CLFS)
CLFS is perhaps way too “optimized for performance”. It would be better to have a reasonable file format instead of a dump of kernel structures written to a file. All the work with these kernel structures (with pointers) happens right there in the blocks read from disk. Because changes are made to the blocks and kernel structures stored there, and those changes need to be flushed to disk, the code parses the blocks starting from CLFS_LOG_BLOCK_HEADER over and over again every time it needs to access something. All this parsing is done using relative offsets, which can point to any location within a block. If one of these offsets becomes corrupted in memory during execution, the consequences can be catastrophic as attackers will be able to supply a malicious CLFS_CONTAINER_CONTEXT. But perhaps worst of all, offsets in the BLF file on disk can be manipulated in such a way that different structures overlap, leading to unforeseen consequences. All these factors lead to a large number of vulnerabilities and their easy exploitation.
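The overlap problem is easy to state in code. The following sketch shows the kind of check that catches overlapping structures, under the assumption that every structure can be described by an offset and a size within one block. It is not taken from clfs.sys or from any Microsoft patch, and the labels, offsets and sizes in the demo are made up.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]   # (label, offset, size)


def find_overlaps(regions: List[Region], block_size: int) -> List[Tuple[str, str]]:
    """Return pairs of labels whose byte ranges overlap inside one block."""
    for label, offset, size in regions:
        if offset < 0 or size <= 0 or offset + size > block_size:
            raise ValueError(f"{label} lies outside the block")
    ordered = sorted(regions, key=lambda r: r[1])
    overlaps = []
    for i, (label_a, off_a, size_a) in enumerate(ordered):
        for label_b, off_b, _ in ordered[i + 1:]:
            if off_b >= off_a + size_a:
                break   # sorted by offset, so later regions cannot overlap either
            overlaps.append((label_a, label_b))
    return overlaps


# Demo: a client context placed so that it starts inside a container context.
layout = [
    ("hashsym", 0x100, 0x30),
    ("container_context", 0x130, 0x30),
    ("client_context", 0x150, 0x88),
]
found = find_overlaps(layout, block_size=0x1000)
```

A check like this has to run on every offset taken from the file, and again whenever offsets are re-read from the in-memory block, which is exactly what becomes hard when the same structures are reachable through several independent offset arrays.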
Use the following link to read the next part:
- Part 2 – Windows CLFS and five exploits of ransomware operators (Exploit #1 – CVE-2022-24521)