Memory - Part 3

Virtual memory

There is now only a small step from paging and segmentation systems to virtual memory systems. Since we are already translating between the address the process uses and the address where the memory is actually stored, why not do a bit of extra work and give ourselves more memory than we can afford (or than the hardware actually provides)?

So we allocate pages (I'll use pages as the example here; most popular systems use pages) as required. When we no longer have any free frames of memory we start moving pages out to disk, thus freeing up frames for new allocations.

When we refer to a page which has been paged out to disk (this event is known as a page fault) we find a free frame for it (possibly paging out another page) and bring it back into real memory. The page table space which normally holds the frame number for a page in memory can be used to store the disk position for pages which are not in memory.

Bits and pieces

Along with the frame number or disk position information, the page table entries have some extra bits.

The valid bit can now be used to indicate that the page is not currently stored in real memory. The memory management system has to check each memory violation to see whether it was caused by a truly invalid access or simply because the page the process was referring to is not currently in real memory.
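As a concrete illustration, a page table entry might look something like the sketch below. The field names and widths are assumptions for illustration, not the layout of any particular machine.

/* Sketch of a page table entry, assuming one 32-bit word per entry.
   Field names and widths are illustrative only; later sections add
   further bits (referenced, dirty) to entries like this. */
typedef struct {
    unsigned int valid      : 1;   /* page is present in real memory         */
    unsigned int protection : 3;   /* read / write / execute permissions     */
    unsigned int other_bits : 2;   /* e.g. referenced and dirty, see below   */
    unsigned int frame      : 26;  /* frame number if valid, else disk slot  */
} pte_t;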

Note, that with the introduction of virtual memory we talk of real and virtual memory and addresses. In most circumstances "real" is synonymous with "physical" and "virtual" is synonymous with "logical".

This way each process can use as much memory as it likes up to the maximum address size.

Why does it work?

It may seem strange that in many situations systems which use virtual memory do not suffer an appalling slowdown. After all, we know how slow disk access is compared with ordinary memory access. The only way virtual memory can be bearable is if we don't have to retrieve pages from disk very often; "not very often" means only about one disk access every million or so real memory accesses (see pg 311 of the textbook).

Of course the reason it works is "locality of reference". As long as we can provide, in real memory, the subset of pages which the program currently needs to keep running, we have a good system. We have seen this subset before: it is known as the "working set" of the process. If we can't maintain a working set for each process in the system we get into difficulty, as we will see shortly.

Since we can't see the future we assume that the working set is the set of recently accessed pages. This works pretty well due to locality of reference: the frequency with which a page is referenced is a slowly changing function of time.

We are going to have to remember this when we decide which pages are going to be replaced.

Working set window

The one outstanding problem in determining a process's working set is how long the time interval we use to determine it should be.

What happens if this time interval is too short? Or too long?

If this time interval is too short then pages which should be included in a process's working set will be missed, leading to an unnecessary increase in page faults. If it is too long, the working set will be larger than required and we won't be able to have as many active processes.
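As a minimal sketch of the window idea (the window size DELTA, the page count and the notion of a reference counter are all assumptions): keep the "virtual time" of the last reference to each page, and treat a page as part of the working set if it was referenced within the last DELTA references.

#include <stdbool.h>

#define NPAGES 1024               /* pages in this process's address space    */
#define DELTA  10000              /* working set window, in memory references */

static unsigned long last_ref[NPAGES];   /* time of last reference per page   */

/* A page is in the working set if it was referenced within the window. */
bool in_working_set(int page, unsigned long now)
{
    return now - last_ref[page] <= DELTA;
}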

Swap devices and swap space

We either have special high speed disk devices or sections of ordinary disk devices set aside as the place where the pages get paged out (commonly referred to as swapped out). These devices or sections are called swap devices and the space on them is called swap space.

Keeping track of memory

We keep track of where a page is via the valid bit of the page table entry, and of whether the page is accessible for reading, writing or executing via the protection bits associated with that entry. On top of this we also have to remember that the memory may be shared between several processes. This means that the page table entries of all such processes have to be altered to reflect the fact that the page is no longer in memory if it gets paged out (and of course the other way around when the page is brought back into memory).

It is not good enough for the operating system to keep track of which frame corresponds to which process, because a frame may correspond to several processes.

One way around this problem is to require that shared pages are held in memory and are not paged out.
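If shared pages are allowed to be paged out, one possible design (a sketch only; the structures and names are assumptions) is to keep a reverse map from each frame to every page table entry that points at it, so that all of them can be marked invalid together.

/* Sketch of a reverse map from a physical frame to the page table
   entries that map it.  All names here are assumptions. */
struct process;                   /* whatever the system uses for a process   */

struct mapping {
    struct process *proc;         /* process whose page table maps the frame  */
    unsigned long   vpage;        /* virtual page number in that process      */
    struct mapping *next;
};

struct frame_info {
    struct mapping *mappers;      /* every (process, page) sharing this frame */
};

/* On page-out, walk frame_info.mappers and clear the valid bit in each
   process's page table entry; on page-in, set them all again. */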

Page faults

What has to happen if a reference is made to a page which is not in real memory?

The memory management hardware determines that the page is missing and sends a signal (a page fault interrupt) to the processor, which causes a jump to the operating system code which deals with page faults.

The operating system works out from the page table entry where the missing page is stored on disk.

A space must be found in real memory into which the page will be placed.

This may necessitate a page being moved out to make room for it.

After the space has been found, the required page can be read into it.

The page table entry (or entries as noted above) has to be updated, to show the presence of the page.

Only then can the process be restarted. (NB in the meantime other processes will have been running. In a multiprogramming system we stop processes as soon as they have to wait for something.)

The instruction may be restarted from the beginning (it helps tremendously if the hardware can back up the state of the instruction to what it was before the page fault) or from a saved position from which it can complete successfully.
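Putting those steps together, the operating system's side of a page fault might look roughly like the sketch below. Every helper function here is a hypothetical placeholder for machinery described in the text, not a real kernel interface.

/* Sketch of a page fault handler.  The helpers are placeholders,
   declared only so the overall shape is visible. */
struct process;

int  address_is_valid(struct process *p, unsigned long vpage);
void kill_process(struct process *p);
void block_process(struct process *p);
void unblock_process(struct process *p);
int  find_free_frame(void);                /* -1 if none free              */
int  evict_some_page(void);                /* replacement policy, see later */
void read_page_from_disk(struct process *p, unsigned long vpage, int frame);
void update_page_tables(struct process *p, unsigned long vpage, int frame);

void handle_page_fault(struct process *p, unsigned long vpage)
{
    if (!address_is_valid(p, vpage)) {     /* a true memory violation      */
        kill_process(p);
        return;
    }

    block_process(p);                      /* other processes run meanwhile */

    int frame = find_free_frame();
    if (frame < 0)
        frame = evict_some_page();         /* may write a dirty page out   */

    read_page_from_disk(p, vpage, frame);
    update_page_tables(p, vpage, frame);   /* set valid bit, for every sharer */

    unblock_process(p);                    /* the instruction restarts here */
}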

How often

How often a page fault occurs is of crucial significance both for our running processes and for the overall system. We have to make sure there is enough real memory for our currently running processes to carry on without causing excessive page faulting.

Obviously if there are only a small number of frames of real memory and many processes then there will be an enormous number of page faults, whereas if there is a vast amount of free memory then there will be hardly any page faults. This may lead us to expect an inverse linear relationship between the number of page faults and the number of frames. It turns out to look more hyperbolic.

The number of page faults drops rapidly at first as the number of frames increases, and then a point is reached where adding more frames hardly alters the rate at which paging occurs. If we could tell in advance the type of programs our system will be running, we should be able to estimate the minimum number of frames necessary to keep things running smoothly.

Since different programs will have different curves of page faults versus the number of frames allocated, it makes sense to try to allocate different numbers of frames to each process.

How many frames do I need, how many do I get?

It is interesting to estimate the greatest number of pages an instruction may refer to in a real machine. A common type of instruction in older machines was something like:

add @x, @y

This instruction could cause 8 memory accesses and in the worst case 6 page faults. We hope that this does not occur too often. This means that in a machine with this type of architecture each process must always be allocated at least 6 frames. What would happen if a process couldn't get 6 frames?
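One plausible accounting (an assumption about the encoding, not something the notes specify) is that the instruction occupies three words: the opcode word plus the two addresses x and y. Fetching those three words, fetching the pointer stored at x, fetching the value it points to, fetching the pointer stored at y, fetching the value it points to, and writing the result back through y's pointer gives 8 memory accesses. In the worst case the three instruction words straddle a page boundary (2 pages) and each of the four operand words lies on a different page, touching 6 distinct pages and hence possibly causing 6 page faults.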

One simple method allocates frames according to the virtual memory size of the process: larger processes get more frames. In practice, of course, a large process may only use a small section of its address space.

PFF

The page fault frequency of a process is a measure of how often it is faulting, usually gauged by how long it has been since the process last had a page fault. If the process is faulting too often, the system should allocate it another frame. If the process doesn't fault very often, the system should try to reclaim some of its frames (ones holding pages that aren't being used, if possible).
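A sketch of such a policy (the thresholds, the notion of time as a count of memory references, and the helper names are all assumptions):

/* Sketch of a page fault frequency policy. */
#define TOO_SOON      1000    /* faulting again within 1000 references   */
#define LONG_ENOUGH 100000    /* no fault for 100000 references          */

struct process;
void give_extra_frame(struct process *p);
void reclaim_idle_frames(struct process *p);

void on_page_fault(struct process *p, unsigned long now, unsigned long *last_fault)
{
    unsigned long gap = now - *last_fault;
    *last_fault = now;

    if (gap < TOO_SOON)
        give_extra_frame(p);        /* faulting too often: give it more   */
    else if (gap > LONG_ENOUGH)
        reclaim_idle_frames(p);     /* faulting rarely: take some back    */
}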

There are other methods as well - see the textbook.

Thrashing

One poor example of system management is to increase the number of processes if the usage of the processor drops below a certain level. The idea is that the more processes are active, the more the processor will be utilized.

In a virtual memory system, however, the reason the processor was not highly utilized may have been a large number of page faults. Now our scheduling policy leads to disaster. As more processes are started up the scheduler sees that the processor is being used less, not more, because of the even higher number of page faults, and so it starts up even more processes.

This is an example of thrashing. (It is, however, by no means the only way thrashing can occur.)

Thrashing occurs when the number of real frames is not enough to hold the working sets of the active processes. What does this mean in action?

Since the working set of a process is the set of pages it accesses frequently, and there is no longer room to hold it, page faults occur regularly. Unfortunately, to handle a page fault we need to bring the required page into memory, which means some other page will have to be moved out. The page which was moved out was in the working set of another process, so when that process comes to run, it too causes a page fault.

In this way the system spends almost all of its time servicing page faults. If it gets bad enough, it is possible for no worthwhile work to be done on the system at all: by the time a process gets its page and tries to run again, it has lost another page from its working set.

It is a real phenomenon. Once you have experienced it you don't forget it. Everything stops. Even simple things like getting echoes back to the terminal can take minutes.

Replacement strategies

Something we have ignored so far in talking about moving pages in and out of memory is which pages we pick on. We know which pages have to be brought in: they are the pages which caused the page faults.

There are many different strategies for choosing pages to move out to disk if there is no free frame. It can be shown that the optimal strategy is to swap out the page which will be referenced furthest in the future. The only hitch with this simple scheme is that it is impossible to foretell which page will be referred to furthest in the future. So we try some other schemes which attempt to approximate it.

Random

Just about the only thing in favour of a random page replacement strategy is that it treats every process fairly. It is also easy to implement. As long as the number of pages in the system is quite large, the method won't too often replace a page which is just about to be used.

FIFO

We could store information on when pages were allocated and keep them in a queue. Then the page which was allocated longest ago is the next one to go. This FIFO method seems appealing because of its simplicity, until you realise that very important pages (such as parts of the operating system) which are referenced frequently will be paged out just as readily as pages which are hardly ever referred to.

This method requires some form of date stamp to say when the page was brought into memory.
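A sketch of the bookkeeping (the ring buffer and NFRAMES are assumptions): remember the frames in the order their pages were loaded and evict the oldest.

/* Sketch of FIFO replacement.  Since there are only NFRAMES frames,
   count can never exceed NFRAMES. */
#define NFRAMES 256

static int fifo[NFRAMES];        /* frame numbers, oldest load first */
static int head = 0, count = 0;

void note_page_loaded(int frame)             /* page brought into frame    */
{
    fifo[(head + count) % NFRAMES] = frame;
    count++;
}

int fifo_victim(void)                        /* longest-resident page goes */
{
    int frame = fifo[head];
    head = (head + 1) % NFRAMES;
    count--;
    return frame;
}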

LRU

If we complicate our system a little to keep track of when each page was last referenced, we can try the least recently used strategy. This is based on the idea that recently referenced pages will be accessed again soon. However it requires hardware assistance to timestamp memory references, and it also requires scanning through the page table information to find the loser.

Another way of implementing the method is to maintain a list of pages. As each page is referenced it is moved from its current position to the top of the list. Pages are replaced from the bottom of the list.

This works well but is expensive: every memory access must update either the timestamp field or the list.
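As a sketch of the list form (the doubly linked list is an assumption): every reference moves the page's node to the head of the list, and the victim is whatever sits at the tail.

#include <stddef.h>

/* Sketch of LRU kept as a list: most recently used at the head,
   the replacement victim at the tail. */
struct page_node {
    int               page;
    struct page_node *prev, *next;
};

static struct page_node *head, *tail;

/* Called on every memory reference: move the page's node to the head. */
void touch(struct page_node *n)
{
    if (n == head)
        return;

    /* unlink from its current position */
    if (n->prev) n->prev->next = n->next;
    if (n->next) n->next->prev = n->prev;
    if (n == tail) tail = n->prev;

    /* push onto the head of the list */
    n->prev = NULL;
    n->next = head;
    if (head) head->prev = n;
    head = n;
}

The expense mentioned above is visible here: touch() has to run on every single memory access, which is why this form is rarely practical without hardware help.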

LFU

The Least Frequently Used method is similar, except that a count of usage is kept for each page. The method must somehow protect new pages, because they haven't been in memory long enough to build up a count.

In practice we wouldn't want to keep a count of all references to a page since it was loaded. So we try …

NUR

A method which has a minimal overhead is the Not Used Recently method. We add a bit to each page table entry: the referenced bit. When a page is loaded the referenced bit is cleared; only when the page is referred to is the referenced bit set. Then when we are searching for a page to swap out we look for pages which have clear referenced bits.

Of course on a demand-paging system (we get to that in a minute) every page in memory will have been referenced, and there is no discrimination between pages that were referenced minutes ago and those that have been referred to in the last few microseconds. A way of dealing with this is to clear all referenced bits every so often. This way the referenced bits are set only on pages which truly have been referred to recently. Of course things can still go wrong: with all of these methods the page about to be referenced next can be the very one swapped out.

Frequently the hardware provides another bit along with the referenced bit: the "dirty" bit. This bit is set whenever the page has been modified. If we are going to swap a page out, we can save ourselves the bother of writing it to disk if it hasn't been altered, since a valid copy already exists there. Our replacement algorithm then looks first for pages which haven't been referenced recently and which also haven't been modified. If they are code pages, we may happily leave the original page in the file system and never move it to the swap space.
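Combining the referenced and dirty bits gives four classes of page, and a sketch of the preference order might look like this (the code is an illustration, not any particular system's policy):

#include <stdbool.h>

/* Sketch: classify a page by its (referenced, dirty) bits and evict
   from the lowest-numbered non-empty class first. */
int page_class(bool referenced, bool dirty)
{
    return (referenced ? 2 : 0) + (dirty ? 1 : 0);
    /* 0: not referenced, clean  -> best victim, nothing to write back
       1: not referenced, dirty  -> must be written to swap space first
       2: referenced,     clean
       3: referenced,     dirty  -> worst victim                        */
}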

Second chance and clock replacement

Now that we have referenced bits we can try other tricks. We can implement a FIFO strategy without the problem of throwing out frequently accessed pages. Every time a page gets to the front of the queue we check to see if it has been referenced; if so we clear the referenced bit and send it to the back of the queue (just like a newly allocated page).

A similar system uses clock replacement, whereby all pages are inspected in order when a new page is required. If the page has not been referenced it is replaced; if it has, the referenced bit is cleared and we look at the next page.
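A sketch of the clock scheme (NFRAMES and the arrays are assumptions): the "hand" sweeps around the frames, clearing referenced bits as it passes, and stops at the first frame whose bit was already clear.

#include <stdbool.h>

#define NFRAMES 256

static bool referenced[NFRAMES];   /* copy of each frame's referenced bit */
static int  hand = 0;              /* the clock hand                      */

/* Sweep until a frame with a clear referenced bit is found, clearing
   the bits of frames passed over (their second chance). */
int clock_victim(void)
{
    while (referenced[hand]) {
        referenced[hand] = false;
        hand = (hand + 1) % NFRAMES;
    }
    int victim = hand;
    hand = (hand + 1) % NFRAMES;
    return victim;
}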

Paging Daemons

A common method is to have a daemon (a system process) which scans for pages which haven't been referenced and keeps a list of them. Then, when a page is needed, the pages on the list are checked to see whether they have still not been referenced, and if so they are replaced.

Demand-paging

Another question we have ignored so far is how many pages we allocate to a process when it starts running.

It would be possible to load the entire process into memory; however this is rather slow because there is usually not enough room in real memory, and hence the process will have some of its pages (usually the first ones it wants to access) swapped out even before it has started running.

One way around this would be to move the entire program into the swap space directly, without trying to place any of it in real memory. In some systems this has to be done, regardless of how many pages we give to the process when it first starts running. However, if the pager can read pages from the ordinary file system as well as from the paging disk space, then we have another solution.

A pure demand-paging system is one in which no pages are loaded into real memory until they have been requested (demanded). In such a system no pages at all are loaded when the process starts: the very first instruction will cause a page fault, and the first page will be fetched then.

There is an obvious problem with pure demand-paging systems: the start of any process's life is slowed down as it gradually builds up its working set. This is why some otherwise demand-paging systems will read in a certain number of pages to start with.

Where are the other pages? As I mentioned a moment ago, some systems move all information which could possibly be paged into the swap space.

However as long as the page tables can be set up to refer to the original program file residing on disk then the pages can be brought in from there the first time, and indeed any time if the page is never modified.

This means that it is perfectly possible for the memory of a process to exist in three distinct places: in real memory, in the swap space, and in the original program file on disk.

Page sizes

It is interesting to attempt to estimate the ideal size for a page. Ideal in this sense means the size of page which minimizes the amount of memory which is not available for use by processes.

There are two competing factors. As the size of a page increases the amount of memory wasted through internal fragmentation also increases. Remember the figure of 1/2 a page per process.

Unfortunately, as the size of a page decreases we also waste more memory. What? Where? To maintain a paging system each process has to have a page table which shows where its pages are, and as the size of a page decreases the size of the page table increases. Surely the size of the page table is insignificant? Unfortunately not.
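A rough sketch of the trade-off (the symbols are assumptions the notes don't introduce): if a process occupies s bytes on average, the page size is p bytes and a page table entry takes e bytes, then the page table costs about s*e/p bytes while internal fragmentation costs about p/2 bytes, so the total overhead s*e/p + p/2 is smallest when p = sqrt(2*s*e).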

Page tables are too big

With a 32-bit address space (4,294,967,296 bytes) and 4-Kbyte pages we have over a million possible page table entries for each process. If each of these entries occupies only 4 bytes we could be consuming 4 megabytes just for the page table of one process.
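Spelling out the arithmetic: 2^32 bytes divided into 4-Kbyte (2^12 byte) pages gives 2^32 / 2^12 = 2^20 = 1,048,576 pages, and 1,048,576 entries at 4 bytes each comes to exactly 4 megabytes.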

OK, it is very unlikely for a process to use up its address space. But even if it uses only a fraction of its address space, remember this is the page table for only one process.

Things obviously get worse as the size of the page decreases.

Because of this problem some machines actually page page-tables. In this case a memory access may be slowed down by a page fault on the page table entry the process needs in order to continue. In practice this doesn't unduly affect performance, as page table working sets will cover large chunks of the program.

To keep the page table down to size most modern systems have multiple levels of page table. The lower levels are only allocated when the process needs them.
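As an illustration of a two-level scheme (the 10/10/12 split is a common choice for 32-bit addresses with 4-Kbyte pages, but these macros are an assumption, not any particular machine's layout):

#include <stdint.h>

/* Sketch of splitting a 32-bit virtual address for a two-level page
   table with 4-Kbyte pages: 10 bits pick the top-level entry, 10 bits
   pick the entry in a second-level table, 12 bits are the page offset. */
#define TOP_INDEX(va)     (((uint32_t)(va) >> 22) & 0x3FF)
#define SECOND_INDEX(va)  (((uint32_t)(va) >> 12) & 0x3FF)
#define PAGE_OFFSET(va)   ((uint32_t)(va) & 0xFFF)

Only the second-level tables covering regions the process actually uses need to be allocated, which is how the 4-megabyte worst case is avoided.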
