Understanding Virtual Memory - A Beginner's Guide
If you’re reading this, chances are you’re a fellow tech enthusiast who finds virtual memory a bit mysterious and you’re not alone. Even seasoned developers often treat it like a black box. In this post, I’ll try to shed some light on how it works, so that the next time you write software, you can make smarter decisions to optimize it for speed and efficiency.
If you’re anything like me, you’ve probably sunk a fair number of hours into video games over the years because who doesn’t enjoy a good escape into pixelated chaos every now and then? Along the way, you might’ve stumbled upon those enigmatic players pulling off stunts that seem to defy the laws of the game. I’m talking about perfect headshots from miles away, levitating like it’s Hogwarts, or tanking bullets like they’ve got a personal vendetta against the concept of dying. In gamer lingo, we call these folks distinguished gentlemen hackers. Stick around till the end of this post, and I’ll even peel back the curtain on how some of these tricks are actually pulled off not with some sci-fi spell book of code, but through clever exploits in how our computer systems are built. It’s not magic… but it sure feels like it.
Before we begin, feel free to take a look at some of the basic terms related to virtual memory if you're unfamiliar with them:
- Page — A fixed-size block of words of real memory. Typically 4 KB in size for 64-bit operating systems.
- Word — Any type that is the size of a pointer. This corresponds to the width of the CPU’s registers. In Rust, `usize` and `isize` are word-length types. In C/C++, this typically corresponds to `intptr_t` and `uintptr_t` from `<stdint.h>`.
- Page fault — An error raised by the CPU when a valid memory address is requested that is not currently in physical RAM. This signals to the OS that at least one page must be swapped back into memory.
- Swapping — Migrating a page of memory stored temporarily on disk to main memory upon request.
- Virtual memory — The program’s view of its memory. All data accessible to a program is provided in its address space by the OS.
- Real memory — The operating system’s view of the physical memory available on the system. In many technical texts, real memory is defined independently from physical memory, which becomes more of an electrical engineering term.
- Page table — The data structure maintained by the OS to manage translating from virtual to real memory.
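To make the word-size definition concrete, here is a tiny sketch (in Rust, since the examples below use it) confirming that `usize` matches the width of a pointer:

```rust
use std::mem::size_of;

fn main() {
    // A word is pointer-sized: usize, isize, and raw pointers share one width.
    println!("usize:     {} bytes", size_of::<usize>());
    println!("isize:     {} bytes", size_of::<isize>());
    println!("*const u8: {} bytes", size_of::<*const u8>());
    // On a 64-bit OS, all three print 8.
}
```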
The Beginnings
Intuitively, a program’s memory is a series of bytes that starts at location 0 and ends at location n. If a program reports 100 KB of RAM usage, it would seem that n would be somewhere near 100,000.
Let’s test that intuition by writing a simple program.
```rust
fn main() {
    let mut n_nonzero = 0;
    for i in 0..10000 {
        let ptr = i as *const u8;
        let byte_at_addr = unsafe { *ptr };
        if byte_at_addr != 0 {
            n_nonzero += 1;
        }
    }
    println!("non-zero bytes in memory: {}", n_nonzero);
}
```
If you try to run this, it will crash, simply because you tried to dereference a NULL pointer (the pointer at address 0). So let's try scanning from 1 to 10,000 instead:
```rust
fn main() {
    let mut n_nonzero = 0;
    for i in 1..10000 {
        let ptr = i as *const u8;
        let byte_at_addr = unsafe { *ptr };
        if byte_at_addr != 0 {
            n_nonzero += 1;
        }
    }
    println!("non-zero bytes in memory: {}", n_nonzero);
}
```
This, however, still doesn't work. It crashes upon execution, and the number of non-zero bytes is never printed to the console. This time the culprit is what's known as a segmentation fault. Segmentation faults are generated when the CPU and OS detect that your program is attempting to access memory regions it isn't entitled to. Memory regions are divided into segments, hence the name. Now you might say: okay, that's an easy problem to solve. We can just create some values ourselves, allocate some on the stack and some on the heap, and inspect their addresses.
```rust
// Lives in the program's static data segment for the whole run.
static GLOBAL_STATIC_INT: i32 = 1000;

// Deliberately returns a pointer to a local variable, which dangles
// as soon as the function returns.
fn return_stack_pointer() -> *const i32 {
    let stack_allocated_int = 12345;
    &stack_allocated_int as *const i32
}

fn main() {
    // Note: the string literal itself lives in read-only static data;
    // only the reference to it is on the stack.
    let stack_str_literal = "a";
    let stack_int = 123;
    let heap_char = Box::new('b');
    let heap_int = Box::new(789);
    let dangling_ptr_from_fn = return_stack_pointer();

    println!("GLOBAL_STATIC_INT: {:p}", &GLOBAL_STATIC_INT as *const i32);
    println!("stack_str_literal: {:p}", stack_str_literal as *const str);
    println!("stack_int: {:p}", &stack_int as *const i32);
    println!("heap_int (Box): {:p}", Box::into_raw(heap_int));
    println!("heap_char (Box): {:p}", Box::into_raw(heap_char));
    println!("dangling_ptr_from_fn: {:p}", dangling_ptr_from_fn);
}
```
This will output something like:

```
GLOBAL_STATIC_INT: 0x1001d479c
stack_str_literal: 0x1001d47a0
stack_int: 0x16fc62be4
heap_int (Box): 0x600002e58040
heap_char (Box): 0x600002e58030
dangling_ptr_from_fn: 0x16fc62b8c
```
By this point you may have realized some important lessons:
- Some memory addresses are illegal. The program crashes if you try accessing something that is out of bounds.
- Memory addresses are not arbitrary. Although values seem to be spread quite far apart within the address space, they are clustered together within pockets.
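One way to see those pockets is to round each address down to its page boundary; values in the same cluster share a page. Here's a minimal sketch, assuming the typical 4 KB page size:

```rust
const PAGE_SIZE: usize = 4096;

/// Round an address down to the start of its page
/// (assumes PAGE_SIZE is a power of two).
fn page_base(addr: usize) -> usize {
    addr & !(PAGE_SIZE - 1)
}

fn main() {
    let stack_int = 123;
    let heap_int = Box::new(789);
    let stack_addr = &stack_int as *const i32 as usize;
    let heap_addr = &*heap_int as *const i32 as usize;
    // Stack and heap values typically land in different, far-apart pages.
    println!("stack value lives in page 0x{:x}", page_base(stack_addr));
    println!("heap value lives in page 0x{:x}", page_base(heap_addr));
}
```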
Before progressing to the main event (the cheat show), let's take a look at the process that translates these virtual addresses into physical memory addresses.
The dance of address translation
Accessing data in a program requires virtual addresses, the only addresses that the program itself has access to. These get translated into physical addresses. This process involves a dance between the program, the OS, the CPU, the RAM hardware, and occasionally hard drives and other devices. The CPU is responsible for performing this translation, but the OS maintains the mapping tables it consults.
CPUs contain a memory management unit (MMU) that is designed for this one job. For every running program, every virtual address is mapped to a physical address. Those instructions are stored at a predefined address in memory as well. That means, in the worst case, every attempt at accessing memory addresses incurs two memory lookups. But it’s possible to avoid the worst case.
The CPU maintains a cache of recently translated addresses. It has its own (fast) memory to speed up accessing memory. For historic reasons, this cache is known as the translation look-aside buffer, often abbreviated as TLB. Programmers optimizing for performance need to keep data structures lean and avoid deeply nested structures. Reaching the capacity of the TLB (typically around 100 pages for x86 processors) can be costly.
Looking into how the translation system operates reveals more, often quite complex, details. Virtual addresses are grouped into blocks called pages, which are typically 4 KB in size. This practice avoids the need to store a translation mapping for every single variable in every program. Having a uniform size for each page also assists in avoiding a phenomenon known as memory fragmentation, where pockets of empty, yet unusable, space appear within available RAM.
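The translation step itself amounts to splitting an address in two: with 4 KB pages, the low 12 bits are an offset within the page, and the remaining upper bits select an entry in the page table. A sketch of that split (the sample address is taken from the output above):

```rust
const PAGE_SHIFT: usize = 12; // 4 KB = 2^12 bytes

/// Split a virtual address into (virtual page number, offset within page).
fn split(vaddr: usize) -> (usize, usize) {
    (vaddr >> PAGE_SHIFT, vaddr & ((1usize << PAGE_SHIFT) - 1))
}

fn main() {
    let (vpn, offset) = split(0x600002e58040);
    // Translation swaps the page number for a physical frame number via the
    // page table; the offset within the page is carried over unchanged.
    println!("page number: 0x{:x}, offset: 0x{:x}", vpn, offset);
}
```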
This is a general guide only. The details of how the OS and CPU cooperate to manage memory differs significantly in some environments. In particular, constrained environments such as microcontrollers can use real addressing. For those interested in learning more, the research field is known as computer architecture.
The OS and CPU can play some interesting tricks when data lives within pages of virtual memory. For example:
- Having a virtual address space allows the OS to over-allocate.
- Inactive memory pages can be swapped to disk in a byte-for-byte manner until they're requested by the active program.
- Other size optimizations such as compression can be performed.
- Programs are able to share data quickly. (For example, if your program requests a large block of zeroes, say, for a newly created array, the OS might point you towards a page filled with zeroes that is currently being used by three other programs. None of the programs are aware that the others are looking at the same physical memory, and the zeroes may have different positions within each virtual address space.)
- Paging can speed up the loading of shared libraries.
- Paging adds security between programs. (As you discovered earlier in this section, some parts of the address space are illegal to access.)
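On Linux you can actually watch those protections: each line of `/proc/self/maps` describes one mapped region of the current process, with its permission flags in the second column. This sketch is Linux-specific and simply does nothing elsewhere:

```rust
use std::fs;

/// Extract the permission field (e.g. "r-xp") from one /proc/<pid>/maps line.
fn perms(line: &str) -> Option<&str> {
    line.split_whitespace().nth(1)
}

fn main() {
    // Linux-specific: on other systems the file won't exist and we skip this.
    if let Ok(maps) = fs::read_to_string("/proc/self/maps") {
        let read_only = maps.lines().filter(|l| perms(l) == Some("r--p")).count();
        let no_access = maps.lines().filter(|l| perms(l) == Some("---p")).count();
        println!("read-only regions: {read_only}, inaccessible regions: {no_access}");
    }
}
```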
Making effective use of the virtual memory system in day-to-day programs requires thinking about how data is represented in RAM. As promised here are some guidelines:
- Keep hot working portions of your program within 4 KB of size. This maintains fast lookups.
- If 4 KB is unreasonable for your application, then the next target to keep under is 4 KB * 100. That rough guide should mean that the CPU can maintain its translation cache (the TLB) in good order to support your program.
- Avoid deeply nested data structures with pointer spaghetti. If a pointer points to another page, then performance suffers.
- Test the ordering of your nested loops. CPUs read small blocks of bytes, known as a cache line, from the RAM hardware. When processing an array, you can take advantage of this by checking whether you are iterating column-wise or row-wise.
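To illustrate the loop-ordering point, here is a sketch with two traversals of the same grid. Both return the same sum, but the row-major version reads memory sequentially and so makes full use of each cache line it fetches, while the column-major version jumps a whole row per step:

```rust
const N: usize = 512;

fn sum_row_major(grid: &[Vec<u64>]) -> u64 {
    let mut total = 0;
    // Inner loop walks adjacent elements, so each fetched cache line is fully used.
    for row in 0..N {
        for col in 0..N {
            total += grid[row][col];
        }
    }
    total
}

fn sum_col_major(grid: &[Vec<u64>]) -> u64 {
    let mut total = 0;
    // Inner loop jumps to a different row each step, touching a new cache line
    // (and possibly a new page) on nearly every access.
    for col in 0..N {
        for row in 0..N {
            total += grid[row][col];
        }
    }
    total
}

fn main() {
    let grid = vec![vec![1u64; N]; N];
    // Same answer either way; the row-major version is typically much faster.
    println!("{} {}", sum_row_major(&grid), sum_col_major(&grid));
}
```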
Virtualization makes this situation worse. If you’re running an app inside a virtual machine, the hypervisor must also translate addresses for its guest operating systems. This is why many CPUs ship with virtualization support, which can reduce this extra overhead. Running containers within virtual machines adds another layer of indirection and, therefore, latency. For bare-metal performance, run apps on bare metal.
How an Executable Becomes a Running Program
Have you ever wondered what happens when you run a program on your computer? How does an executable file, just a collection of bytes on disk, turn into a living, breathing process in memory?
The transformation from an executable file to a program’s virtual address space is a fascinating and fundamental process in computing. Interestingly, the layout of executable files (also known as binaries) closely mirrors the address space diagram we explored earlier in the "Heap vs Stack" section.
While the exact mechanics can vary based on the operating system and the executable file format (such as ELF on Linux or PE on Windows), the underlying concept remains largely consistent.
Here is a simplified version. Each segment of a process’s virtual memory, like the text, data, heap, and stack, has a corresponding definition in the executable file. When you start a program, the operating system:
- Reads the binary file from disk,
- Loads specific bytes into their designated memory regions,
- Sets up the process’s virtual address space accordingly.
Once everything is in place, the OS tells the CPU to jump to the starting address of the `.text` segment (where the executable code resides), and just like that, your program starts running.
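As a small illustration of the first step, a program can open its own binary and read the format magic a loader checks before anything else. The use of `current_exe` here and the 0x7F-"ELF" magic are Linux-flavoured assumptions; other platforms use different formats and magic numbers:

```rust
use std::env;
use std::fs::File;
use std::io::Read;

/// Read the first four bytes of a file: the format magic a loader checks first.
fn magic(path: &std::path::Path) -> std::io::Result<[u8; 4]> {
    let mut buf = [0u8; 4];
    File::open(path)?.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    let me = env::current_exe()?;
    // On Linux, an ELF binary starts with 0x7F followed by the bytes "ELF".
    println!("{:02x?}", magic(&me)?);
    Ok(())
}
```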
The Cheat Show: Hacking Through Virtual Memory
As promised, let’s pull back the curtain on how some of those game "hacks" — the headshots from across the map, infinite ammo, walking through walls — are actually possible. Spoiler: it all comes down to virtual memory manipulation.
Games, like all programs, run in virtual memory. But here's the twist — that memory can often be inspected, and even modified, while the game is running. Tools like Cheat Engine or custom-built memory scanners allow hackers to:
- Scan for specific values (e.g., your current health = 100),
- Identify the memory address holding that value,
- Modify it on the fly (e.g., set health to 9999 or freeze it in place).
Here’s where things get clever: most modern games use dynamic memory allocation for in-game values, so the memory address of “health” can change between runs. That’s where pointer chains (a sequence of pointers leading to the actual data) and reverse engineering come in. By analyzing how memory is structured (and even disassembling parts of the binary), hackers can trace these dynamic allocations and automate modifications.
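You can sketch the pointer-chain idea without touching a real game: treat a byte buffer as the target process's memory, use offsets as "addresses", and resolve the chain hop by hop. The layout below (a player object holding a health field) is entirely invented for illustration:

```rust
/// Follow a chain of offsets through simulated process memory.
/// Each hop reads an 8-byte "pointer" stored at (current + offset); the
/// final offset points at the value itself and is simply added on.
fn resolve(memory: &[u8], base: usize, offsets: &[usize]) -> usize {
    let mut addr = base;
    for &off in &offsets[..offsets.len() - 1] {
        let slot = addr + off;
        addr = usize::from_le_bytes(memory[slot..slot + 8].try_into().unwrap());
    }
    addr + offsets[offsets.len() - 1]
}

fn main() {
    let mut memory = vec![0u8; 64];
    // Invented layout: a "player object" pointer at offset 0 points to
    // address 16, and the health value lives 8 bytes into that object.
    memory[0..8].copy_from_slice(&16usize.to_le_bytes());
    memory[24..32].copy_from_slice(&100u64.to_le_bytes()); // health = 100
    let health_addr = resolve(&memory, 0, &[0, 8]);
    let health = u64::from_le_bytes(memory[health_addr..health_addr + 8].try_into().unwrap());
    println!("health lives at {health_addr}, value = {health}");
}
```

This is why cheat tables ship as a base address plus a list of offsets: the chain stays stable across runs even though the final address changes.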
And remember paging? If a value is not currently in RAM (maybe it's swapped out to disk), that can affect the timing of your hacks, or cause them to fail entirely. Hackers need to be aware of how and when values are loaded into memory to stay effective.
Here’s another trick: DLL injection. A hacker can inject their own code into the game’s address space, for example by hooking into a function that handles rendering or movement logic. Since virtual memory is isolated per process but still mutable from within the process (or via privileged access), this kind of injection allows cheats like:
- Auto-aim,
- Wallhacks (by making invisible enemies visible),
- Speedhacks (by modifying in-game timers or physics constants).
Security Note: Modern anti-cheat engines use techniques like kernel-level drivers, behavior analysis, memory integrity checks, and virtualization detection to combat these hacks — but it’s a cat-and-mouse game. Understanding virtual memory doesn’t just help in making software faster, it helps in securing it too.
🧠 Happy hacking. And remember — with great power comes great responsibility.