Analysis of a Caddy Wiper Sample Targeting Ukraine
Introduction
CaddyWiper was first reported by ESET, which described it as follows:
Dubbed CaddyWiper by ESET analysts, the malware was first detected at 11.38 a.m. local time (9.38 a.m. UTC) on Monday. The wiper, which destroys user data and partition information from attached drives, was spotted on several dozen systems in a limited number of organizations. It is detected by ESET products as Win32/KillDisk.NCX.
One of my friends pinged me a few days later with a link to a CaddyWiper sample. Since this sample was a particularly small one, I decided to write a blog post going through each function from scratch and introducing the tools I used to make my life easier. Hopefully, this can serve as a reference to junior malware analysts who want to get started with this craft.
First off, I’m a Linux user myself and I mainly use Linux tools to analyse malware. pev is a set of command-line utilities providing a high-level analysis of a PE binary. It consists of the following tools:
|
|
Running pehash on the sample gives the following:
|
|
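If you don't have pev at hand, the plain cryptographic hashes can be reproduced with a few lines of Python. This is only a minimal sketch: the function name file_hashes is my own, and it covers MD5, SHA-1 and SHA-256 only, not the ssdeep or imphash values.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5, SHA-1 and SHA-256 hex digests of the file at `path`."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling file_hashes on the path of the sample returns a dictionary keyed by algorithm name.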
readpe
result:
|
|
If you’re new to analyzing a PE, I highly recommend looking at the official Microsoft documents for PE Format. Some notes from the link:
At location 0x3c, the stub has the file offset to the PE signature. This information enables Windows to properly execute the image file, even though it has an MS-DOS stub. This file offset is placed at location 0x3c during linking. After the MS-DOS stub, at the file offset specified at offset 0x3c, is a 4-byte signature that identifies the file as a PE format image file. This signature is “PE\0\0” (the letters “P” and “E” followed by two null bytes).
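To make that note concrete, here's a minimal Python sketch that follows the offset stored at 0x3c and checks for the signature. The function name and the synthetic buffer are my own, used only for illustration; a real file would be read from disk instead.

```python
import struct


def pe_signature_offset(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MS-DOS 'MZ' header")
    # The 4-byte little-endian value at 0x3c points at the signature.
    (offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[offset:offset + 4] != b"PE\0\0":
        raise ValueError("no PE signature at offset 0x%x" % offset)
    return offset


# Synthetic stand-in for a real binary: a 0x80-byte stub followed by the signature.
stub = bytearray(0x80)
stub[:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x80)
image = bytes(stub) + b"PE\0\0"
print(hex(pe_signature_offset(image)))  # 0x80
```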
Main function Analysis
The main function starts at 00401000 and it doesn’t appear to return a status code. In C terms, that means main is written as void main(...).
In the main function, we can see a call to the external function DsRoleGetPrimaryDomainInformation
at 0040113a
:
According to the Microsoft documentation, the DsRoleGetPrimaryDomainInformation function retrieves state data for the computer. This data includes the state of the directory service installation and domain data.
If we take a closer look at the call, we can see that the function is called with three parameters: DsRoleGetPrimaryDomainInformation(0, 1, &empty_int_pointer). The 0 is the lpServer parameter, meaning the function is called on the local computer (refer to the link above for more info on that). The 1 is the InfoLevel parameter, which specifies the level of output needed, as well as the type of output written through our empty_int_pointer. Referring to the Microsoft documentation, we can see that 1 is the first item in the C++ enum, which is DsRolePrimaryDomainInfoBasic:
|
|
Following the docs, the output type for this level is DSROLE_PRIMARY_DOMAIN_INFO_BASIC, which is described on this page. Our result will come back in this struct:
|
|
Clearly the attacker is interested in MachineRole, and compares it with the value 5. Let’s dig deeper to see what 5 means. If we go to this doc, we’ll see the following enum:
|
|
5 is the primary domain controller. Looking at the code, you can see the attacker does not intend to wipe primary DCs, and skips them.
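The check can be modelled in a few lines of Python. This is only a sketch of the logic as I read it in the decompiler output: the numeric values follow Microsoft's DSROLE_MACHINE_ROLE enum, while the Python names and the wiper_would_run helper are my own.

```python
from enum import IntEnum


class MachineRole(IntEnum):
    """Values of DSROLE_MACHINE_ROLE, in the order Microsoft documents them."""
    STANDALONE_WORKSTATION = 0
    MEMBER_WORKSTATION = 1
    STANDALONE_SERVER = 2
    MEMBER_SERVER = 3
    BACKUP_DOMAIN_CONTROLLER = 4
    PRIMARY_DOMAIN_CONTROLLER = 5


def wiper_would_run(machine_role: int) -> bool:
    """Hypothetical model of the check in main: every role except 5 gets wiped."""
    return machine_role != MachineRole.PRIMARY_DOMAIN_CONTROLLER


print(wiper_would_run(MachineRole.MEMBER_WORKSTATION))         # True
print(wiper_would_run(MachineRole.PRIMARY_DOMAIN_CONTROLLER))  # False
```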
After getting all the info, I started to rename the functions and add a few comments, as well as converting types in Ghidra to make the code readable:
Now we can see there’s a wiper function, which runs on C:\\Users and then on D:\\ and the letters that follow it (E:\\, F:\\, ...) for 24 chars, which means basically all drive letters. Let’s take a look at the wiper function. That’s where the attacker’s malicious code is located.
The wiper function
The function itself returns void, meaning the attacker didn’t really care whether the wiping succeeded. Reading through the function, the first interesting bit shows up at around line 180: another function gets called with both the * and \\ values.
|
|
After digging around the wipe
function, you can see kernel32.dll
as a stack string with these functions being called from it (in order):
|
|
All of the above functions are thoroughly documented in Microsoft’s official Win32 API docs.
Essentially, the wiper starts at C:\Users and at D: through Z:, finds the first entry in each directory with FindFirstFileA, walks the remaining entries with FindNextFileA, opens each file, scrambles its header, and repeats this across all folders recursively. Here’s the main wiper function with function names and syscalls renamed to a more readable format:
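To illustrate the traversal without touching anything, here's a minimal dry-run sketch in Python. It is not the sample's code: it uses os.scandir in place of FindFirstFileA and FindNextFileA, it only reports what would be hit, and the header_size value is a placeholder I picked, since I'm not asserting how many bytes the sample overwrites.

```python
import os


def plan_wipe(root, header_size=4096):
    """Yield (path, bytes_that_would_be_overwritten) for every file under root.

    Dry run only: nothing is opened for writing. `header_size` is a
    hypothetical value, not taken from the sample.
    """
    try:
        entries = list(os.scandir(root))
    except OSError:
        return  # unreadable directory: skip it and carry on
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Recurse into subfolders, mirroring the recursive walk.
            yield from plan_wipe(entry.path, header_size)
        elif entry.is_file(follow_symlinks=False):
            try:
                size = entry.stat().st_size
            except OSError:
                continue
            yield entry.path, min(size, header_size)
```

Pointing plan_wipe at a scratch directory lists every file the recursive walk would reach, along with how many leading bytes fall inside the assumed header window.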
Subfunction FUN_00402a80
Before we rename this function to something human-readable, we should know what it does. Here’s the pseudo-code of the function itself:
The function appears to concatenate two strings with a couple of while loops and write the result into the buffer that the first parameter points to. In Python terms, it basically means param_1 = param_2 + param_3. From now on, I’ll refer to FUN_00402a80 as concat.
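For readers who think better in code, here's a rough Python model of the two while loops working on NUL-terminated buffers. It's a sketch of the behaviour as I understand it from the pseudo-code, not a transcription; the parameter names are my own.

```python
def concat(dest: bytearray, first: bytes, second: bytes) -> bytearray:
    """Copy two NUL-terminated strings back to back into `dest`."""
    out = 0
    i = 0
    while first[i] != 0:       # first loop: copy `first` up to its terminator
        dest[out] = first[i]
        out += 1
        i += 1
    i = 0
    while second[i] != 0:      # second loop: append `second`
        dest[out] = second[i]
        out += 1
        i += 1
    dest[out] = 0              # terminate the result
    return dest


buf = bytearray(32)
concat(buf, b"C:\\Users\0", b"\\*\0")
print(buf[:buf.index(0)].decode("ascii"))  # C:\Users\*
```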
Subfunction FUN_00401530
After concatenating the paths with * and \\, FUN_00401530 gets called with two parameters, FindFirstFileA and kernel32.dll, as seen in the lines directly after the two concat calls (lines 190 to 200 inside the wipe function in Ghidra).
Even though the logic of the function looks complicated, judging by its inputs and output it’s safe to assume that it is a wrapper around Win32 API calls. The DLL filename and the name of the specific API function are passed in, and the result is an integer that corresponds to the API response code. From now on, I’ll refer to FUN_00401530 as syscall_wrapper.
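The interface (DLL name in, function name in) can be mimicked in Python with ctypes. This is only a sketch of the idea of resolving an API by name at run time; it says nothing about how the sample does it internally, and resolve_by_name and its loader parameter are names I made up. The sketch returns a callable and leaves the actual call to the caller.

```python
import ctypes


def resolve_by_name(library_name, function_name, loader=ctypes.CDLL):
    """Load `library_name` and return the callable exported as `function_name`.

    On Windows one would typically pass loader=ctypes.WinDLL, for example
    resolve_by_name("kernel32.dll", "FindFirstFileA", loader=ctypes.WinDLL).
    """
    library = loader(library_name)
    # ctypes resolves exports lazily by attribute name.
    return getattr(library, function_name)
```

Because nothing is imported statically, neither the DLL name nor the function name has to appear in the import table, which is the property that makes this pattern attractive to malware authors.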
Other Interesting Functions
FUN_00401a10
Using the same trick we did before, it’s easy to see this function using the same syscall_wrapper
to invoke multiple functions from advapi32.dll
:
|
|
This function appears to inspect each file’s ownership and tries to get around the ACLs and “access denied” errors that it comes across. I would describe it as a basic attempt to make a file writable enough to destroy it, although I didn’t read each individual syscall to back that claim. FUN_00401750 is the main carrier of this operation. In FUN_00401750, we can see the following functions:
|
|
FUN_00401750 simply checks whether the malware has enough privileges to change permissions on a file. I’ll rename it to priv_check. As a result, based on my guess, FUN_00401a10 is renamed to priv_set.
Putting it all together
This is a small malware sample, and it’s effective and fast. In a nutshell, this is what the attacker had in mind:
- Checks whether the computer is a primary domain controller. If it is, the malware leaves it alone and doesn’t wipe it.
- Identifies C:\Users and D: through Z: as primary attack targets
- Recursively:
  - Finds the first file in the folder
  - Checks whether it has permission to write to the file
  - Tries elevating privileges to gain permission to write to the file
  - Opens the file in write mode
  - Rewrites the file header with gibberish
  - Closes the file
Interestingly, if you run the binary through something like the strings command, you’ll only see a few strings, like so:
|
|
This is because the attacker is making use of stack strings. This link has a good explanation of what stack strings are and how they are used to avoid detection.
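As a rough illustration, here's a Python sketch that recovers ASCII stack strings from raw code bytes. It assumes the characters are written one at a time with `mov byte ptr [ebp+disp8], imm8` (bytes C6 45, then the displacement, then the character); other encodings and wide strings are not handled, and the function name and synthetic test bytes are my own.

```python
import re

# Assumed encoding: C6 45 <disp8> <imm8>, repeated at least four times in a row.
MOV_BYTE_RUN = re.compile(rb"(?:\xC6\x45..){4,}", re.DOTALL)


def recover_stack_strings(code: bytes, min_length=4):
    """Return printable ASCII strings assembled byte by byte on the stack."""
    found = []
    for match in MOV_BYTE_RUN.finditer(code):
        immediates = match.group()[3::4]   # every fourth byte is the character
        for piece in immediates.split(b"\0"):
            if len(piece) >= min_length and all(32 <= b < 127 for b in piece):
                found.append(piece.decode("ascii"))
    return found


# Synthetic code bytes that build "FindClose\0" on the stack.
code = b"".join(
    bytes([0xC6, 0x45, 0xE0 + i, ch]) for i, ch in enumerate(b"FindClose\0")
)
print(recover_stack_strings(code))  # ['FindClose']
```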
Detection
The easiest detection for this particular sample would be a hash value. But since this malware is so small, hashes, even fuzzy ones like ssdeep, are not a very good idea. Let’s try to build a YARA rule that captures what we learned from the malware.
|
|
As we saw, since the attacker was clever enough to use stack strings, our YARA rule is going to be slow and regex-y, but it still works. Interestingly, for WriteFile and FindClose I had to adjust my regex to account for the slightly smaller MOV assembly code. I’ve also put a file size cap on the sample to ignore potentially different variants of this malware.
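The same idea can be prototyped in Python before committing to a YARA rule. The sketch below is not the rule itself: it assumes a single MOV encoding (C6 45, displacement, character), so it would miss the shorter variant mentioned above, and the 50 KB cap and all function names are placeholders of mine.

```python
import re

MAX_SAMPLE_SIZE = 50 * 1024  # hypothetical cap, not the value used in the rule


def stack_string_pattern(text: str):
    """Build a regex matching `text` written to the stack one byte at a time."""
    parts = [rb"\xC6\x45." + re.escape(bytes([ch])) for ch in text.encode("ascii")]
    return re.compile(b"".join(parts), re.DOTALL)


def looks_like_sample(data: bytes, api_names, max_size=MAX_SAMPLE_SIZE) -> bool:
    """True if the data is under the size cap and contains every API name."""
    if len(data) > max_size:
        return False
    return all(stack_string_pattern(name).search(data) for name in api_names)


def fake_stack_string(text: str) -> bytes:
    """Helper for the demo: synthetic code bytes that build `text` on the stack."""
    return b"".join(
        bytes([0xC6, 0x45, 0x80 + i, ch]) for i, ch in enumerate(text.encode("ascii"))
    )


code = fake_stack_string("FindFirstFileA") + b"\x90" * 8 + fake_stack_string("FindNextFileA")
print(looks_like_sample(code, ["FindFirstFileA", "FindNextFileA"]))  # True
print(looks_like_sample(code, ["WriteFile"]))                        # False
```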
As an exercise, you can create a similar detection for the DLL file names, which are a bit trickier considering they’re both wide strings and stack strings.
Hope you enjoyed this brief analysis. I’ll put the zipped Ghidra project alongside the scripts, comments, etc. in a GitHub repo if anyone is interested. Let me know which malware I should dissect next :)