Windows API Hashing in Malware
Windows API hashing is a common technique that malware uses to obfuscate the API calls it makes to the operating system. Because the function names need not appear as plain strings or import table entries in the binary, static analysis and detection by security solutions such as EDR (Endpoint Detection and Response) and antivirus (AV) become more difficult. Instead of calling API functions directly by name, the malware computes a hash of the function name and uses this hash to dynamically resolve the function's address at runtime.
How Does It Work?
Rather than referencing API function names directly, malware calculates a hash based on the function name it intends to call. When the malware executes, it parses the Export Table of loaded DLLs (such as kernel32.dll, user32.dll, etc.), which contains all the exported functions. The malware then applies the same hashing algorithm to the exported function names and compares the resulting hash to its pre-calculated hashes. If a match is found, it retrieves the corresponding function address and invokes it.
Ordinals and Hashing
Some malware may also rely on ordinals when resolving API functions, especially if a consistent ordinal number is associated with a specific function. This offers a shortcut, as ordinals are fixed in certain DLLs. However, using hashes is a more common technique as it does not rely on static ordinals and offers flexibility across different operating systems or versions where ordinals might change.
PowerShell Script to Extract API Hashes
The PowerShell script for this section calculates API hashes: it iterates through a list of API function names, calculates a hash value for each, and reports a collision whenever two names produce the same hash. Collisions matter because a resolver that matches on the hash alone would return whichever colliding export it encounters first. The script is useful for understanding how malware developers compute the hash values that are later compared at runtime.
This technique is an excellent introduction to how malware developers generate and utilize hashes for API function resolution.
C++ Implementation for Resolving Functions by Hash
The C++ implementation resolves API function addresses dynamically based on pre-calculated hash values. It uses the CalculateHash function to generate a hash for each exported function name in a module (like kernel32.dll). If the hash matches a known target hash, it retrieves the corresponding function address and uses it.
In this image, we can see the output of a Windows API Hashing program. The program calculates the hash values for several API functions: VirtualAlloc, CreateThread, and WaitForSingleObject.
The calculated hash for WaitForSingleObject is 0x397566.
The hash for VirtualAlloc is 0xe0dabf.
The hash for CreateThread is 0xf92f7b.
The hash for WaitForSingleObject is displayed twice, confirming that the same hash value is used when calculated both before and after the function resolution process.
On the right side, a message box is displayed with the title "hello" and the text "joas". This message box is generated as a result of executing shellcode that invokes the MessageBox function. The shellcode likely contains pre-configured instructions to display this message box, which is shown once the shellcode is executed in memory after resolving the necessary APIs dynamically using the hashing mechanism demonstrated.
This showcases the correct functioning of both the API hashing process and the execution of shellcode, dynamically resolving API functions during runtime without directly referencing the function names.
Conclusion
Windows API hashing is a technique often used by malware to obfuscate its use of Windows API calls, making it more challenging for security solutions to detect malicious behavior. By calculating hash values for API functions instead of referencing them by name, malware can resolve functions dynamically and keep their names out of reach of static analysis. This technique is useful for malware developers but is also a good exercise for ethical hackers and reverse engineers seeking to understand and combat sophisticated threats. The scripts and code provided above demonstrate the basic principles of API hashing and how it can be implemented in both PowerShell and C++.