Memory Management

Below are notes based on a CS lecture found on YouTube.  The course, Computer Science 377, is offered by the University of Massachusetts.  This video is Lecture 11: Memory Management.  Many thanks to Sean Barker (the lecturer) for his excellent work.

Memory Management

As is discussed in the article Operating Systems: Process Management, a process is the execution of a program in a computer.  With all modern operating systems, multiple processes run simultaneously – as of this writing there are 89 processes running on this computer!  This article will introduce how modern operating systems allow multiple processes to run at once within main memory (RAM).

Background: Computer Architecture

First of all, all programs are stored on a storage device (a traditional hard drive, SSD, or other storage) while not in use.  When a user attempts to open a program, the computer fetches the executable file from the disk and loads it into RAM.  It isn’t necessary for the entire program to be loaded into memory before it begins to execute; rather, the operating system continues to load the program as it runs.

Memory Management: Terminology

There are a few terms that are necessary in order to discuss memory management:

  • Segment:  a chunk of memory assigned to a process
  • Physical Address:  a real address of the data stored in memory
  • Virtual Address: an address relative to the start of a process’ address space

Where do Memory Addresses Come From?

Programs generate instructions and data addresses in three different ways:

  1. Compile Time: When the program is compiled, the compiler generates the exact physical location in memory for instructions and data addresses, starting from a position k.  In this scenario, the operating system doesn’t need to do anything.  This method was only used in the early computer era and is never used in modern computing.
  2. Load Time: When the program is compiled, the compiler generates instructions and data addresses, but instead of assigning them to a specific location, at load time the operating system determines the starting position in memory for the process.  Once the process loads, its position in memory is fixed.  While it is still possible, this method is not commonly used in modern computing.
  3. Execution Time: When the program is compiled, the compiler generates instructions and data addresses, but the mapping to physical memory is deferred until run time, giving the operating system complete freedom to place the process anywhere in memory and even to move it during execution.  This is the method used by modern operating systems.

Uniprogramming Operating Systems

Some of the earliest operating systems were “uniprogrammed” – only one program could run on the computer at once.  For example, the disk operating system (DOS), which was used in the 1980s and early 1990s, was only capable of running one program at a time.  In DOS, the highest memory addresses were reserved for the operating system and the lowest addresses were for whatever program was running.

Instructions and data addresses could be assigned by the compiler during this era.  Processes were always loaded starting at address zero (0).  The maximum address that the compiler could assign was the size of the RAM minus the amount of memory that the operating system occupied (memory size – operating system size).

The operating system was protected from the running process by checking the addresses that process used, making sure they did not fall within the range reserved for the operating system.

Multiprogramming Operating Systems

Multiprogramming was a huge step forward in operating system design.  In order to run multiple programs at once, several conditions have to be fulfilled:

  1. Transparency – multiple processes have to coexist in memory.  In order to work properly, no process should be aware that the memory is shared with others.  Processes shouldn’t care what portion of memory they are assigned to, only that there is enough memory for them to complete.
  2. Safety – it should be impossible for processes to corrupt each other.  More importantly, processes should not be able to corrupt the operating system.
  3. Efficiency – running multiple processes should not excessively degrade the system (at least not while running a reasonable number of processes).

Relocation

Operating systems can manage where each process is stored in memory using a technique called relocation.  The operating system is often stored in the highest memory addresses.  When a program is compiled, its addresses start at zero (0), and the maximum address is equal to the total memory size minus the operating system size.  The process is loaded and allocated a contiguous segment of memory.  The first (smallest) physical address of the process is the base address and the largest physical address the process can access is the limit address.  There are two methods of relocation: static and dynamic.

Static Relocation

The first method of relocation is known as static relocation.  In this process, the operating system adjusts the memory addresses of a process to reflect its starting position in memory.  Once a process is assigned a starting position in memory, it executes within the space it has been allocated.  Once static relocation has completed, the operating system can no longer relocate the process until it terminates.
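
To make the idea concrete, here is a minimal sketch in C of what a loader performing static relocation might do: it patches every address stored in the program image by the chosen load address.  The image structure and field names are hypothetical, purely for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* A toy program image: code/data words plus a table recording which words
 * hold addresses.  The layout is illustrative, not a real executable format. */
struct image {
    uint32_t words[64];      /* code and data, compiled as if loaded at address 0 */
    size_t   addr_slots[8];  /* indices of words that contain addresses           */
    size_t   addr_count;
};

/* Static relocation: once the OS picks a load address, every address in the
 * image is patched by that offset.  After loading, the process cannot move. */
void relocate_static(struct image *img, uint32_t load_address)
{
    for (size_t i = 0; i < img->addr_count; i++)
        img->words[img->addr_slots[i]] += load_address;
}
```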

Dynamic Relocation

Dynamic relocation is the second (and more advanced) method of relocation.  In this method, hardware adds a relocation register (the base value) to the virtual address generated by the compiler, translating it into a physical memory address.  Hardware then compares this physical address with the limit register (the highest address available in the allocated segment); the address must be less than the limit.  If the address is higher than the limit, the processor takes an address trap and ignores the physical address.
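
To make the mechanism concrete, here is a minimal sketch in C of base-and-limit translation as described above.  The register values, function name, and trap handling are illustrative assumptions, not the lecture’s notation.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative base-and-limit registers (hypothetical values). */
static uint32_t relocation_register = 0x40000; /* base: where the segment starts    */
static uint32_t limit_register      = 0x50000; /* highest physical address allowed  */

/* Translate a virtual address the way the relocation hardware would. */
uint32_t translate(uint32_t virtual_addr)
{
    uint32_t physical = relocation_register + virtual_addr; /* add the base */

    if (physical >= limit_register) {   /* result must stay below the limit */
        fprintf(stderr, "address trap: 0x%x is outside the segment\n",
                (unsigned) physical);
        exit(EXIT_FAILURE);             /* in real hardware, the OS handles the trap */
    }
    return physical;
}

int main(void)
{
    printf("virtual 0x100 -> physical 0x%x\n", (unsigned) translate(0x100));
    return 0;
}
```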

The advantages of dynamic relocation are numerous.  The first and most important advantage is that the operating system can easily move a process if necessary.  This leads to the second advantage: a process can grow over time (because it can easily be relocated to a larger memory block).  Dynamic relocation is performed by hardware and is simple – it requires two special registers, a simple addition, and a simple comparison.  This is very inexpensive in computing terms.

There are also disadvantages to dynamic relocation.  Although relatively inexpensive, it does slow down the hardware because of the additional operations outlined above.  Another consequence of dynamic relocation is that processes cannot share memory.  Each process still requires a fixed amount of physical memory (whatever size the entire process is), and this can limit the degree of multiprogramming because each active process must fit into memory.

Properties of Relocation

Relocation has three main properties:

  1. Transparency: processes are unaware that they are sharing memory.
  2. Safety: memory references are checked to ensure that they fall between the base and limit registers, which keeps other processes safe from corruption.
  3. Efficiency: memory checks and assurance that the process is within its range are fast because they are performed in hardware.  However, if a process grows it may have to be moved, which is a very time-expensive operation.

Memory Management: Memory Allocation

Once a process is fetched from storage and it begins to execute, it grows and eventually terminates.  Throughout this process, the operating system must keep track of which memory is available and which is utilized.  Pieces of memory that are free are known as holes.  Given a new process, the operating system has to decide in which hole to place the new process.

Memory Allocation Policies

There are three main policy options available for allocating memory.  These policies are:

  1. First-Fit: allocate the new process in the first hole in which the process will fit.  The search for a hole that is large enough can start with the first hole in memory or, alternatively, the search can begin wherever the previous first-fit search ended.
  2. Best-Fit: allocate the new process to the smallest hole that is big enough to hold the process.  In this approach, the operating system must search the entire list and find the appropriately-sized hole, or alternatively, store a list of available holes sorted by size.
  3. Worst-Fit: allocate the new process to the largest hole available.  The reasoning behind this policy is that perhaps the remaining space will still be useful for new processes later.  Similar to the best-fit policy, the operating system must search the entire list and find the largest hole, or alternatively, store a list of available holes sorted by size.

Simulations show that the first-fit policy and the best-fit policy usually yield better storage utilization than the worst-fit policy.  Simulations also show that the first-fit policy is generally faster than best-fit.
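
As an illustration, the following C sketch shows how first-fit and best-fit might scan a free list of holes.  The data structures and function names are assumptions made for this example, not taken from the lecture.

```c
#include <stddef.h>

/* A free "hole" in memory, kept in a singly linked list (illustrative). */
struct hole {
    size_t       start;
    size_t       size;
    struct hole *next;
};

/* First-fit: return the first hole large enough for the request. */
struct hole *first_fit(struct hole *free_list, size_t request)
{
    for (struct hole *h = free_list; h != NULL; h = h->next)
        if (h->size >= request)
            return h;
    return NULL;                /* no hole is big enough */
}

/* Best-fit: scan the whole list and return the smallest hole that still fits. */
struct hole *best_fit(struct hole *free_list, size_t request)
{
    struct hole *best = NULL;
    for (struct hole *h = free_list; h != NULL; h = h->next)
        if (h->size >= request && (best == NULL || h->size < best->size))
            best = h;
    return best;
}

/* Worst-fit would be the same full scan, keeping the largest hole instead. */
```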

Fragmentation

Fragmentation is the breaking up of blocks of memory (or storage) into small, unusable pieces that cost computing time to keep track of but are not useful.  It is always important to avoid fragmentation.  There are two different forms of fragmentation:

  1. External Fragmentation
    The process of loading and unloading programs causes free space in memory to become fragmented.  In a case of extreme external fragmentation, it is possible that there could technically be enough space in memory to fit a new process, but the available space is divided up and not contiguous.  For example, 60 KB of free memory scattered across six 10 KB holes cannot hold a single 50 KB process.  Simulations of memory management show that up to one-third (1/3) of memory can be wasted due to external fragmentation.  For this reason, minimizing external fragmentation is a priority in operating system design.
  2. Internal Fragmentation
    If a process is allocated a block of memory that is only slightly larger than its requirements, it is more efficient to allocate the process the entire block than it is to keep the extra memory as a separate block.  Keeping track of only one or two bytes of free space is more expensive than the value they add as free space.  This wastage of memory internal to a partition is known as internal fragmentation.

Compaction

Compaction is the process of moving memory around to eliminate most of the smaller holes and make one larger hole.  On storage disks, this is known as disk defragmentation – a well-used utility in most Windows operating systems.  Like defragmentation, compaction is very time expensive.  It is possible to perform a partial compaction, simply to create a hole in memory large enough to house the new process that is attempting to execute.
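
As a rough illustration of compaction under dynamic relocation, the sketch below slides each allocated segment down toward address zero so that the free space coalesces into a single hole.  The structures and the flat memory array are hypothetical simplifications.

```c
#include <string.h>
#include <stddef.h>

/* An allocated segment: its base (relocation register value) and size.
 * Both the structure and the flat memory array are illustrative. */
struct segment {
    size_t base;
    size_t size;
};

/* Compact: copy each segment down next to the previous one and update its
 * base, leaving a single large hole after the last segment.  Assumes the
 * segments array is sorted by base address. */
size_t compact(unsigned char *memory, struct segment *segs, size_t count)
{
    size_t next_free = 0;
    for (size_t i = 0; i < count; i++) {
        if (segs[i].base != next_free) {
            memmove(memory + next_free, memory + segs[i].base, segs[i].size);
            segs[i].base = next_free;  /* the OS would also reload the relocation register */
        }
        next_free += segs[i].size;
    }
    return next_free;   /* start of the single remaining hole */
}
```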

Swapping

When memory is becoming full, it is possible to swap a process out onto a storage device, which releases all the memory that the process is holding.  This is only done with a process that has stalled (waiting for I/O, for example) and is not in the ready queue.  Once the process is reactivated, the operating system reloads it into memory.  If static relocation was used, the process must be put back in the same memory location as before.  If dynamic relocation was used, the operating system can find a new position in memory and update the relocation and limit registers.

If the operating system utilizes swapping, this is also an ideal time to incorporate compaction.  Combining several smaller holes together to create one large enough to accommodate another process is easier when one or more of the processes is swapped out to a storage disk.

Memory Management Challenges

Thus far in this discussion, there have been multiple challenges facing the operating system designer.  External fragmentation requires frequent compaction, an expensive operation.  The need for contiguous allocation makes it difficult to fit processes into memory, and very difficult to grow or shrink the amount of memory allocated to a process.  Further, up to this point it has been a requirement for all of a process to reside in memory, and while process swapping helps, it doesn’t solve this problem completely, mostly because swapping itself is a very expensive operation.  Memory management using these techniques is clearly a large challenge.  Enter paging.

Paging

The basic process of paging takes place in the following way:

  1. Processes are divided into fixed-size pieces called pages.
  2. These pages are assigned to frames (fixed-size blocks of physical memory).
  3. The operating system manages (moves, removes, reallocates) pages in memory as required by the process.
  4. Pages are not necessarily assigned contiguously; they can be placed anywhere in memory.

Paging: Motivation and Features

The idea of paging arose after computer scientists realized the 90/10 rule of computing: in general, processes spend 90% of their time accessing 10% of their space in memory.  The other 90% of the allocated memory is used only 10% of the time.  The idea of paging was to keep only the parts of a process that are actually being used in memory.

Paging greatly simplifies the problem of insufficient memory space.  Using paging, the process sees the memory it is using as contiguous, but pages don’t actually have to be physically contiguous in memory.  This happens by dividing physical memory into fixed-size blocks called frames and dividing processes into fixed-size sections called pages.  The frames don’t need to be contiguous in physical memory, but from the point of view of the process, each page is contiguous with the next.  Note that paging does eliminate external fragmentation, but not internal fragmentation.  Due to the fixed-size nature of pages and frames, the last page of a process is usually only partially filled; for example, with 4 KB pages a process that needs 10 KB occupies three pages and wastes roughly 2 KB of the third.

Paging Hardware

If pages are not allocated contiguously, then it is clear that finding them in memory could be a challenge.  This is where the idea of virtual addresses comes into play.  When a process is executed it uses a virtual (aka logical) address to name each of the memory locations it will require.  These virtual addresses are contiguous, from 0 to the size of the process.  The operating system divides the process into pages, and the paging hardware translates the virtual addresses into actual physical addresses in memory (frames) that could appear anywhere, in any order, within memory.  The hardware manages the translation from virtual address to physical, meaning that the operating system merely requests a page and the hardware fetches it, wherever it may be in physical memory.  The operating system employs page tables to keep track of this information.
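
The following C sketch illustrates the translation the paging hardware performs, assuming 4 KB pages and a simple one-level page table.  The page size, table contents, and names are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u                 /* assumed page/frame size: 4 KB */
#define NUM_PAGES 16u                   /* assumed tiny address space    */

/* Page table: page number -> frame number.  Frames need not be in order. */
static uint32_t page_table[NUM_PAGES] = {
    7, 3, 12, 5, /* remaining entries default to frame 0 in this toy example */
};

/* Translate a virtual address to a physical address the way the hardware would. */
uint32_t translate_page(uint32_t virtual_addr)
{
    uint32_t page   = virtual_addr / PAGE_SIZE;   /* which page?            */
    uint32_t offset = virtual_addr % PAGE_SIZE;   /* where within the page? */

    if (page >= NUM_PAGES) {
        fprintf(stderr, "trap: virtual address 0x%x out of range\n",
                (unsigned) virtual_addr);
        exit(EXIT_FAILURE);
    }
    return page_table[page] * PAGE_SIZE + offset; /* frame base + offset */
}

int main(void)
{
    /* Virtual addresses in the first 8 KB look contiguous to the process,
     * but they map to frames 7 and 3, which are far apart in physical memory. */
    printf("0x0000 -> 0x%05x\n", (unsigned) translate_page(0x0000));
    printf("0x1004 -> 0x%05x\n", (unsigned) translate_page(0x1004));
    return 0;
}
```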

Paging is a form of dynamic relocation, where each virtual address is bound by the paging hardware to a physical address.  It may be easier to think of the page table as a set of relocation registers, one for each frame.  Mapping is invisible to the process; the operating system maintains the mapping and the hardware will perform the translation.  Protection is provided with the same mechanisms as used in dynamic relocation.