Module 10: Virtual Memory
Introduction
As we discussed in the last module, virtual memory gives many advantages in memory management.
The most apparent advantage is the ability to run a process requiring more memory than the system
physically has. Also, physical memory used by a process no longer has to be contiguous when using
virtual memory; this is a pretty important advantage which would encourage us to continue to use virtual
memory techniques, even if we had more physical memory than we would ever need (and thus never
needed the secondary storage in the swap area). On the flip side, managing virtual memory is more
complex. Virtual memory could be implemented in the operating system without support from the
hardware, but it would be very slow, and most modern processors have integrated Memory Management
Units (MMUs) that make the use of virtual addresses quick and practical.
In this module, we look at how virtual addresses are translated into physical addresses. We will examine
issues with using virtual memory with larger RAM resources, and also some of the efficiency issues when
using virtual memory.
Paging
One important aspect of virtual memory is that physical memory is divided into fixed-size page frames, and
each process's virtual address space is divided into pages of the same size. Each page, which covers a
contiguous range of virtual addresses, can then be mapped onto any physical page frame.
The above diagram shows the virtual memory layout of two processes, on the far right and far left. In the
middle is a set of physical page frames that can be mapped into each of the processes. Recall that one
possible layout for a process is to have the text (instructions), global data and the heap in the lower
memory addresses, and the stack in the higher addresses. Thus, we see sets of physical frames mapped
into low virtual addresses for the global data and text, and some pages mapped into higher addressed
virtual memory for the stack. Note that this diagram, particularly in the virtual pages allocated to the
processes, is not to scale. That is, there is a lot of empty space in the virtual address space of the
process. This gives the process a lot of room to grow. To add to the stack, the operating system only
needs to allocate another page frame to the top of the stack, and likewise if the heap needs to grow, a
page is allocated to the top of the heap. Look carefully at the diagram, and you can also note that there is
a page frame that is allocated to both processes. This is a quick and easy way for processes to share
memory; this can be done with memory that can be modified by both processes. It is also common
(although not illustrated here) for processes running the same program to share text pages that are read-
only.
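The sharing described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how any real operating system implements it: the 4-byte page size, the frame numbers, and the names page_table_a, page_table_b, write_byte and read_byte are all hypothetical.

```python
PAGE_SIZE = 4                                   # tiny pages keep the example readable
physical_memory = bytearray(8 * PAGE_SIZE)      # 8 physical page frames

# Hypothetical per-process page tables: virtual page number -> frame number.
page_table_a = {0: 2, 7: 5}
page_table_b = {0: 3, 4: 5}     # frame 5 is mapped into both processes


def physical_address(page_table, vaddr):
    """Translate a virtual address using the given process's page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def write_byte(page_table, vaddr, value):
    physical_memory[physical_address(page_table, vaddr)] = value


def read_byte(page_table, vaddr):
    return physical_memory[physical_address(page_table, vaddr)]


# Process A writes into its virtual page 7; process B reads its virtual page 4.
write_byte(page_table_a, 7 * PAGE_SIZE + 1, 0x42)
shared_value = read_byte(page_table_b, 4 * PAGE_SIZE + 1)
print(hex(shared_value))
```

Both processes reach the same byte because their page tables name the same frame, even though they use different virtual addresses for it. Virtual page 0 of each process, by contrast, lands in a different frame, so those bytes stay private.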
Virtual Memory Address Translation
One of the keys to understanding virtual memory is how virtual addresses map to physical addresses.
(Note that in some references virtual addresses are also called logical addresses.) In addition to these
two addresses, a page table must also be present. The virtual address is split into (at least) two parts:
the higher bits select an entry in the page table, and the lower bits give an offset into that
page. If the virtual address is split into two components, the number of bits in the least-significant portion
will tell you the size of each page (since those bits have to be able to reference all bytes in the page). The
number of bits in the most-significant portion indicates how many pages the process can reference.
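The relationship between the bit counts and the sizes can be checked with a little arithmetic. The following is a minimal sketch; the 16-bit address width, the 12-bit offset, and the function name split_virtual_address are assumptions chosen for illustration and do not come from the text.

```python
def split_virtual_address(vaddr, offset_bits):
    """Split a virtual address into (page number, offset within the page)."""
    page_size = 1 << offset_bits          # the offset bits determine the page size
    page_number = vaddr >> offset_bits    # most-significant bits select the page
    offset = vaddr & (page_size - 1)      # least-significant bits locate the byte
    return page_number, offset


# Hypothetical layout: 16-bit virtual addresses with a 12-bit offset.
ADDRESS_BITS = 16
OFFSET_BITS = 12

page_size = 1 << OFFSET_BITS                           # bytes per page
pages_per_process = 1 << (ADDRESS_BITS - OFFSET_BITS)  # pages a process can reference

page, offset = split_virtual_address(0x3ABC, OFFSET_BITS)
print(page_size, pages_per_process, page, hex(offset))
```

With these assumed widths, each page is 4 KiB and a process can reference 16 pages; adding one bit to the offset doubles the page size and halves the number of pages.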
The following presentation illustrates the concept of virtual address mapping using very simple 5 bit
addresses (hence a virtual address space of 32 bytes). This is adapted from an example in Silberschatz.
http://media.ep.jhu.edu/EPmedia/play/?file=epmedia/courses/online/605_412/Mod10_01.mp4
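For readers who want to experiment with 5-bit addresses directly, here is a small sketch of a single-level lookup. The split into 3 page bits and 2 offset bits (4-byte pages) and the contents of page_table are assumptions made up for this illustration; they are not taken from the presentation.

```python
OFFSET_BITS = 2     # assumed: 4-byte pages
PAGE_BITS = 3       # 5-bit address minus 2 offset bits leaves 8 virtual pages

# Hypothetical page table: index = virtual page number, value = frame number.
# None marks a virtual page that is not currently mapped to a frame.
page_table = [5, 6, 1, 2, None, None, None, 7]


def translate(vaddr):
    """Translate a 5-bit virtual address into a physical address."""
    if not (0 <= vaddr < (1 << (PAGE_BITS + OFFSET_BITS))):
        raise ValueError("address outside the 5-bit virtual address space")
    page = vaddr >> OFFSET_BITS                   # high bits: page table index
    offset = vaddr & ((1 << OFFSET_BITS) - 1)     # low bits: byte within the page
    frame = page_table[page]
    if frame is None:
        raise LookupError(f"page fault: virtual page {page} is not mapped")
    return (frame << OFFSET_BITS) | offset        # frame number replaces page number


for vaddr in (0b00110, 0b01101, 0b11111):
    print(format(vaddr, "05b"), "->", translate(vaddr))
```

Note that only the page number is translated; the offset bits pass through unchanged.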
What about today's systems that have 64-bit addresses? The predominant 64-bit processors are the x86-64
CPUs. With these processors, different page table structures are used depending on the page size. For
the typical 4 KiB page size, a four-level page table is used; 9 bits index each level of page table, and the offset
into the page is of course 12 bits (to be able to address all the bytes in the 4 KiB page). That means that
in this architecture, 48 bits of a virtual address are used for translation (4 x 9 + 12 = 48); the 16 most
significant bits in the address are currently unused. With different page sizes, the page table configurations
are different; it is interesting to note that you can have 1 GiB pages; even then two levels of page tables
are used. Note that other 64-bit processors use inverted page tables (see below).
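The four-level split for 4 KiB pages can be illustrated by pulling the fields out of an address. This is a sketch of the bit arithmetic only; the function name split_x86_64 and the sample address are hypothetical, and no actual page tables are walked.

```python
LEVEL_BITS = 9       # each table index is 9 bits, so 512 entries per table
OFFSET_BITS = 12     # 4 KiB pages
LEVELS = 4           # four levels of page table


def split_x86_64(vaddr):
    """Return ([top-level index, ..., lowest-level index], page offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    indices = []
    for level in range(LEVELS - 1, -1, -1):               # top level first
        shift = OFFSET_BITS + level * LEVEL_BITS
        indices.append((vaddr >> shift) & ((1 << LEVEL_BITS) - 1))
    return indices, offset


translated_bits = LEVELS * LEVEL_BITS + OFFSET_BITS      # bits used for translation
indices, offset = split_x86_64(0x00007FFFDEADBEEF)       # arbitrary sample address
print(translated_bits, indices, hex(offset))
```

Each index selects one of 512 entries in the table at that level, and the entry found there points to the table for the next level down (or, at the lowest level, to the page frame).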
Structure of a Page Table Entry
What does a page table entry look like? It does depend on the system architecture. Your text mentions
several important fields that are typically in a page table entry:
• Page frame number
• Present/absent
• Protection
• Modified
• Referenced
• Caching Disabled
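One way to picture these fields is as bits packed into a single integer. The layout below is a hypothetical one invented for this sketch (it does not match any particular architecture), as are the helper names make_pte, frame_of and has_flag.

```python
# Hypothetical entry layout (illustration only, not a real architecture):
#   bit 0  present/absent        bit 3  referenced
#   bit 1  writable (protection) bit 4  caching disabled
#   bit 2  modified              bits 12 and up: page frame number
PRESENT, WRITABLE, MODIFIED, REFERENCED, CACHE_DISABLED = (1 << i for i in range(5))
FRAME_SHIFT = 12


def make_pte(frame, flags):
    """Pack a page frame number and flag bits into one entry."""
    return (frame << FRAME_SHIFT) | flags


def frame_of(pte):
    """Extract the page frame number from an entry."""
    return pte >> FRAME_SHIFT


def has_flag(pte, flag):
    """Report whether a given flag bit is set in an entry."""
    return bool(pte & flag)


pte = make_pte(0x1A2B, PRESENT | WRITABLE)
pte |= REFERENCED | MODIFIED      # what would be recorded after a write to the page
print(hex(frame_of(pte)), has_flag(pte, MODIFIED), has_flag(pte, CACHE_DISABLED))
```

Packing the flags next to the frame number means a single memory read gives the hardware everything it needs to check and translate one reference.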
Complete the following self-check activity to confirm your knowledge of the page table entries. This will be
important in the next module as we examine some of the page replacement algorithms.
http://media.ep.jhu.edu/EPmedia/courses/online/605_412/Mod10Quiz1/viewer.swf
Translation Lookaside Buffers
Translation Lookaside Buffers (TLBs) are a hardware-supported optimization of virtual memory. A TLB
basically caches recently used page table entries in faster memory. On systems that use this capability,
a memory reference is first checked for in the TLB. If it is not in the TLB, the process is suspended
(perhaps a spin lock would be ok here, since the suspension of the process would be very short), while
the appropriate page table entry is put into the TLB cache. The process then resumes, and this time the
memory reference will be found in the TLB, so the process can continue. Tanenbaum mentions that
the TLB uses associative memory (allowing searches to be done extremely fast and in parallel). Thus, the
TLB is usually fairly small (Tanenbaum states that the TLB is rarely more than 256 entries). Is that still
true today? Modern processors now generally have multiple levels of TLBs, but the fastest level is still
small. The Intel i7 processors have 64 entries in their fastest TLB level, and up to 512
entries in the second-level TLB. Note (from the reference noted below) that there are also separate
data and instruction TLBs.
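The hit and miss behavior can be modeled in software to get a feel for it. The sketch below is a simplification under stated assumptions: a real TLB is hardware, and the least-recently-used replacement policy, the class name TLB, and the sample page_table are choices made for this illustration, not details from the text.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB model with least-recently-used eviction."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()   # virtual page number -> frame number
        self.hits = 0
        self.misses = 0

    def lookup(self, page, page_table):
        if page in self.entries:                 # fast path: entry is cached
            self.hits += 1
            self.entries.move_to_end(page)       # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = page_table[page]                 # slow path: consult the page table
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)     # evict the least recently used entry
        self.entries[page] = frame
        return frame


page_table = {page: page + 100 for page in range(8)}   # hypothetical mappings
tlb = TLB(capacity=2)
for page in [0, 1, 0, 2, 1]:
    tlb.lookup(page, page_table)
print(tlb.hits, tlb.misses)
```

With only two entries, the reference to page 2 pushes out page 1, so the final reference to page 1 misses again. This is why programs with good locality of reference benefit most from a small TLB.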
Inverted Page Tables
Inverted page tables are suggested for use when there is an extremely large virtual memory space (e.g.
64-bit virtual addresses). The table has one entry per physical page frame and is indexed by page frame
number, rather than by virtual page number. As Tanenbaum mentions, one problem in this approach is that the
translation from virtual address to physical address becomes much more difficult; most systems use a hash
of the virtual address to locate the entry in the inverted page table. Aside from the issues with inverted
page tables mentioned in
your text, another difficulty is how a physical page can be shared between different virtual addresses
(which is simple using the standard page table approach, and is how shared memory between processes
is implemented). According to the ubiquitous Wikipedia, several modern processors including the
PowerPC, UltraSparc, and the IA-64 use inverted page tables. The x86-64 architecture, however,
does not, using multi-level page tables instead.
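A hashed inverted page table can be sketched as follows. This is a toy model under assumptions: the eight-frame memory, the use of Python's built-in hash, the bucket lists, and the names map_page and translate are all hypothetical, and real hardware uses its own hash function and collision handling.

```python
NUM_FRAMES = 8

# One entry per physical frame: (process id, virtual page number), or None if free.
frames = [None] * NUM_FRAMES
# Hash table: bucket number -> list of frame numbers whose owners hash to that bucket.
buckets = {}


def _bucket(pid, page):
    return hash((pid, page)) % NUM_FRAMES


def map_page(pid, page):
    """Place (pid, page) in the first free frame and record it in the hash table."""
    frame = frames.index(None)        # raises ValueError if no frame is free
    frames[frame] = (pid, page)
    buckets.setdefault(_bucket(pid, page), []).append(frame)
    return frame


def translate(pid, page):
    """Find the frame holding (pid, page) by searching only its hash bucket."""
    for frame in buckets.get(_bucket(pid, page), []):
        if frames[frame] == (pid, page):
            return frame
    raise LookupError("page fault: no frame holds this page")


first = map_page(1, 0x4000)
second = map_page(2, 0x4000)          # same virtual page number, different process
print(first, second, translate(2, 0x4000))
```

The table size depends on the number of frames, not on the size of the virtual address space. The sketch also shows the sharing difficulty noted above: each frame entry records a single (process, page) owner, so mapping one frame into two address spaces needs some extra mechanism.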
Other Resources
• http://dropsafe.crypticide.com/article/3848 - Interesting article on swapping on Mac OS X
• http://sg80bab.blogspot.com/2007/03/is-my-mac-using-too-much-memory.html - A good
description of what 'wired' memory is in Mac OS
• http://oreilly.com/catalog/opensources/book/kirkmck.html - Search for 'VAX Unix' and read about
the motivation to add virtual memory to UNIX.
• http://www.eecs.umich.edu/~tnm/papers/vm-hardware.ps - Compares MMU processing for a
variety of processors
• http://en.wikipedia.org/wiki/Memory_management_unit - The ubiquitous Wikipedia page for
MMUs. This page does describe the page table layout for the x86-64 processor families.
• http://www.tomshardware.com/reviews/Intel-i7-nehalem-cpu,2041-11.html - Discussion of the
TLB on the Intel i7 processors.
• http://www.informit.com/articles/article.aspx?p=29961 - Virtual memory in the IA-64 Linux
kernel. This article is an interesting excerpt addressing implementation of virtual memory from a
Linux kernel book.