
Chap. 7 Intertask Communication and Synchronization

3. Hold and wait
4. No preemption

Countering any one of the four necessary conditions is enough to prevent deadlock. Mutual exclusion applies to those resources that can't be shared (e.g., printers, disk devices, output channels). Mutual exclusion can be removed, for example, through the use of SPOOLers, which allow these resources to appear to be shareable to an application task.

The circular wait condition occurs when a circular chain of processes exists that hold resources needed by other processes further down the chain (such as in cyclic processing). One way to eliminate circular wait is to impose an ordering on the resources and to force all processes to request resources in increasing order of enumeration. For example, consider the following list of resources and their (increasing) order number.

Resource        Order number
Disk            1
Printer         2
Motor control   3
Monitor         4

Now if a process wishes to use both the printer and the monitor, it must request first the printer and then the monitor. It can be proved that such a scheme eliminates the possibility of deadlock.

The hold and wait condition occurs when processes request a resource and then lock that resource until subsequent requests are filled. One solution to this problem is to allocate to a process all potentially required resources at the same time. This can, however, lead to starvation of other processes. Another solution is never to allow a process to lock more than one resource at a time. For example, when writing one semaphore-protected disk file to another, lock one file and copy a record, unlock that file, lock the other file, write the record, and so on. This, of course, can lead to poor resource utilization as well as windows of opportunity for other processes to interrupt and interfere with resource utilization.
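The resource-ordering rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order numbers come from the enumeration above, but the helper names `acquire_in_order` and `release_all`, and the use of `threading.Lock` objects as stand-ins for the devices, are hypothetical and not part of the text.

```python
import threading

# Order numbers from the enumeration above; the locks are stand-ins for the devices.
RESOURCE_ORDER = {"disk": 1, "printer": 2, "motor control": 3, "monitor": 4}
LOCKS = {name: threading.Lock() for name in RESOURCE_ORDER}


def acquire_in_order(names):
    """Lock the named resources in increasing order number and return that order."""
    ordered = sorted(set(names), key=RESOURCE_ORDER.__getitem__)
    for name in ordered:
        LOCKS[name].acquire()
    return ordered


def release_all(ordered):
    """Release resources in the reverse of the order in which they were locked."""
    for name in reversed(ordered):
        LOCKS[name].release()


held = acquire_in_order(["monitor", "printer"])
print(held)  # the printer is requested before the monitor
release_all(held)
```

Because every task sorts its requests by the same numbering, no task can hold a higher-numbered resource while waiting for a lower-numbered one, so a circular chain cannot form.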

resource and signal the semaphore. If we allow the higher priority task to preempt the lower one, then the deadlock can be eliminated. However, this can lead to starvation in the low-priority process as well as to nasty interference problems.

Sec. 7.6 Deadlock

(For example, what if the low-priority task had locked the printer for output, and now the high-priority task starts printing?) Two other ways of combating deadlock are to avoid it through avoidance algorithms like the banker's algorithm, or to detect it and recover from it. Detection of deadlock is not always easy, although in embedded systems watchdog timers can be used and in organic systems monitors are appropriate.

7.6.1 Avoidance
Several techniques for avoiding deadlock are available. (A more thorough discussion of the topic can be found in [146].) For example, if the semaphores protecting critical resources are implemented by mailboxes with time-outs, then deadlocking cannot occur. But starvation of one or more tasks is possible. Starvation occurs when a task does not receive sufficient resources to complete processing in its allocated time.
A second method for preventing deadlock is to allow preemption. That is, tasks of higher priority which need resources should be allowed to grab them from lower priority tasks. Unfortunately, this can cause problems like starvation or incomplete I/O operations.
The fact that each task acquires a resource and then does not relinquish it until it can acquire another resource is called a wait and hold condition. If we eliminate this condition, then deadlock can be avoided.

EXAMPLE 7.11
A task needs to read from file 1 and write to file 2. It might open file 1, read a record, close file 1. Then it opens file 2, writes the record, and closes file 2. The process is repeated for each record until the file is transferred.

This technique can, however, slow down response times greatly.
Finally, a technique known as the banker's algorithm can sometimes be used to prevent deadlock situations. The technique, suggested by Dijkstra [36], uses the analogy of a small-town bank. The banker's algorithm works on like resources, for example, pools of memory or printers. The algorithm ensures that the number of resources attached to all processes can never exceed the number of resources for the system. In addition, we can never make a "dangerous allocation," that is, allocate resources in such a way that we do not have enough left to satisfy the requirements of any process.
EXAMPLE 7.12

Consider a system with three processes, A, B, and C, and a pool of 10 resources of a certain type (e.g., memory blocks). It is known that process A will never need more than 6 blocks at any one time. For processes B and C the totals are 5 and 7, respectively. A table such as the one below is constructed to keep track of the resource needs and availability.


Process   Max Requirement   Used
A         6                 0
B         5                 0
C         7                 0
Total Available: 10

When resources are requested, the operating system updates the table, ensuring that a deadlock state is not reached. An example of a "safe state" is

Process   Max Requirement   Used
A         6                 2
B         5                 3
C         7                 1
Total Available: 4

Here, the requirements of process A or B can be satisfied, so the state is safe. An "unsafe state" is

Process   Max Requirement   Used
A         6                 4
B         5                 3
C         7                 2
Total Available: 1

In this case, the total requirements of no task can be met with the total available, so deadlock could ensue.

The banker's algorithm is often too slow for real-time systems. Habermann [56] has implemented the algorithm for mixed resources, but this is not always practical. Finally, resource needs for each task may not be known a priori.
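The safe and unsafe states of Example 7.12 can be checked mechanically. The sketch below is a minimal Python rendering of the usual safety test for a single resource type. The function name `is_safe` and the dictionary layout are assumptions made for illustration. The test applied is the standard one (some order must let every process reach its maximum and finish), which is slightly stronger than the one-step check used in the example.

```python
def is_safe(max_need, used, total):
    """Return True if some order lets every process obtain its maximum and finish."""
    available = total - sum(used.values())
    unfinished = set(max_need)
    progressed = True
    while unfinished and progressed:
        progressed = False
        for p in sorted(unfinished):
            if max_need[p] - used[p] <= available:
                available += used[p]   # p runs to completion and returns its blocks
                unfinished.remove(p)
                progressed = True
                break
    return not unfinished


max_need = {"A": 6, "B": 5, "C": 7}
print(is_safe(max_need, {"A": 2, "B": 3, "C": 1}, 10))  # True: the safe state
print(is_safe(max_need, {"A": 4, "B": 3, "C": 2}, 10))  # False: the unsafe state
```

An allocator built on this test would grant a request only if the table that results from the grant still passes `is_safe`.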

7.6.2 Detect and Recover
Assuming that a deadlock situation can be detected (for example, by using a watchdog timer), what can be done? One technique, known as the ostrich algorithm, advises that the problem be ignored. If the deadlock situation is known to occur infrequently, for example, once per year, and the system is not a critical one, this approach may be acceptable. For example, if in a video game this problem is known to occur infrequently, the effort needed to detect and correct the problem may not be justified given the cost and function of the system (tell that to the child who just lost his quarter). If the system is a manned one or in control of, say, an assembly line, then the ostrich algorithm is unacceptable.
Another method for handling the deadlock is to reset the system completely. Again, this may be unacceptable for certain critical systems.
Finally, if a deadlock is detected, some form of rollback to a pre-deadlock state can be performed, although this may lead to a recurrent deadlock, and operations such as writing to certain files or devices cannot be rolled back easily.

7.7 EXERCISES
1. What effect would size N of a ring buffer have on its performance? How would you determine the optimal size?
2. For a machine you are familiar with, discuss whether the counting semaphore implementation given in this chapter has any critical region problems. That is, can the semaphore itself be interrupted in a harmful way?
3. Why is it not wise to disable interrupts before the while statement in the binary semaphore, P?
4. Rewrite the ring buffer read-and-write procedures in
   (a) C or C++
   (b) Ada or Ada 95
   (c) Modula-2
5. Modify the write procedure for the ring buffer to handle the overflow condition.
6. Write a set of pseudocode routines to access (read from and write to) a 20-item ring buffer. The routines should use semaphores to allow more than one user to access the buffer.
7. Consider a binary semaphore, counting semaphore, queues, and mailboxes. Any three can be implemented with the fourth. We have shown how binary semaphores can be used to implement counting semaphores and vice versa, how mailboxes can be used to implement binary semaphores, and how mailboxes can be used to implement queues. For each pair, show how one can be used to implement the other:
   (a) Binary semaphores for implementing mailboxes.
   (b) Binary semaphores for implementing queues.
   (c) Queues for implementing binary semaphores.
   (d) Queues for implementing mailboxes.
   (e) Queues for implementing counting semaphores.
   (f) Counting semaphores for implementing mailboxes.
   (g) Counting semaphores for implementing queues.
   (h) Mailboxes for implementing counting semaphores.
8. Discuss the problems that can arise if the test and set in the P(S) operation are not atomic. What could happen if the simple assignment statement in the V(S) operation were not atomic?
9. Rewrite the binary semaphore implementation of the counting semaphore in
   (a) C or C++
   (b) Ada or Ada 95
   (c) Modula-2
10. Using the ANSI-C raise and signal facilities, implement the pend(data,S) and post(data,S) operations for arbitrary mailbox S.


11. Rewrite the test_and_set procedure in:
   (a) C or C++
   (b) Ada or Ada 95
   (c) Modula-2
12. The TANDS instruction can be used in a multiprocessing system to prevent simultaneous access to a global semaphore by two processors. The instruction is made indivisible by the CPU refusing to issue a DMA acknowledge (DMACK) signal in response to a DMA request (DMARQ) signal during execution of the instruction. The other processors sharing the bus are locked out of accessing memory. What are the real-time implications for a processor trying to access memory when another processor is executing a process that is looping for a semaphore using the following code?

      getlock  TANDS  semaphore
               JNE    getlock

    If this busy wait must be used, is there a better way to test the semaphore in process 2 so that the bus is not tied up?
13. Rewrite the exception handler in Example 7.8 in
   (a) Ada or Ada 95
   (b) Modula-2
   (c) C++
14. Write a function to compute x factorial, x!, where x is some nonnegative integer. (Recall that x! = x · (x - 1) · ... · 1 and 0! = 1.) Write an associated exception handler that handles errors related to trying to take x! for x < 0 and to handle overflow conditions. The factorial function should invoke the exception handler if either error type occurs. Do this in
   (a) C
   (b) Ada or Ada 95
   (c) C++
15. Investigate the use of signal handlers in the implementation of a Unix process communication mechanism called a pipeline. Pipelines allow the outputs of processes to be used as inputs to other processes. Your investigation can be done by examining the source code to any Unix operating system, if available, or by consulting one of the many texts on the Unix operating system.

Real-Time Memory Management

KEY POINTS OF THE CHAPTER
1. Dynamic memory management of any kind in real-time, though usually necessary, is detrimental to real-time performance and schedulability analysis.
2. Stacks are typically used in foreground/background systems and the task-control block used in commercial, generic executives.
3. Techniques for managing stacks and task-control blocks are given in the chapter.

An often neglected discussion, dynamic memory allocation, is important in terms of both the use of on-demand memory by applications tasks and the requirements of the operating system. Applications tasks use memory explicitly, for example, through requests for heap memory, and implicitly through the maintenance of the run-time memory needed to support sophisticated high-order languages. The operating system (or kernel) needs to perform extensive memory management in order to keep the tasks isolated.
Dangerous allocation of memory is any allocation that can preclude system determinism. Dangerous allocation can destroy event determinism, for example, by overflowing the stack, or it can destroy temporal determinism by entering a deadlock situation (Chapter 11). It is important to avoid dangerous allocation of memory while at the same time reducing the overhead incurred by memory allocation. This overhead is a standard component of the context switch time and must be minimized. Static memory allocation schemes (that is, the partitioning of memory at system generation time) are discussed in Chapter 9.

Although some of the memory management schemes discussed in Section 8.2 may seem archaic (for example, MFT dates back to the early 1960s), these schemes have recently become relevant again. For example, cache memories are generally very small relative to main memory (just as main memory was small relative to secondary storage devices in early computers). In the case of cache, some of the replacement rules such as LRU and working sets are used to manage the contents of the cache.

8.1 PROCESS STACK MANAGEMENT
In a multitasking system, context for each task needs to be saved and restored in order to switch processes. This can be done by using one or more run-time stacks or the task-control block model. Run-time stacks work best for interrupt-only systems and foreground/background systems, whereas the task-control block model works best with full-featured real-time operating systems. Substantial formalization of this statement can be found in [10].

8.1.1 Task-Control Block Model
If the task-control block model is used, then a list of task-control blocks is kept. This list can be either fixed or dynamic. In the fixed case, n task-control blocks are allocated at system generation time, all in the dormant state. As tasks are created, the task-control block enters the ready state. Prioritization or time slicing will then move the task to the execute state. If a task is to be deleted, its task-control block is simply placed in the dormant state. In the case of a fixed number of task-control blocks, no real-time memory management is necessary.
In the dynamic case, task-control blocks are added to a linked list or some other dynamic data structure as tasks are created. Again, the tasks are in the suspended state upon creation and enter the ready state via an operating system call or event. The tasks enter the execute state owing to priority or time-slicing. When a task is deleted, its task-control block is removed from the linked list, and its heap memory allocation is returned to the unoccupied or available status. In this scheme, real-time memory management consists of managing the heap needed to supply the task-control blocks; however, other data structures such as a list or queue can be used. (A heap is a special kind of data structure based on a binary tree. For a discussion of these structures, consult any text on data structures, for example, [83].)

8.1.2 Managing the Stack
If a run-time stack is to be used, certain considerations are required. In order to handle the run-time saving and restoring of context, two simple routines, "save" and "restore," are necessary. The save routine is called by an interrupt handler


to save the current context of the machine into a stack area. This call should be made immediately after interrupts have been disabled to prevent disaster. The restore routine should be called just before interrupts are enabled and before returning from the interrupt handler.

EXAMPLE 8.1
Consider the implementation of the save routine. Assume that global variable "stack" is to point to the top of the stack and that eight general registers (R0-R7) are to be saved on a stack. The memory location "PC" corresponds to the interrupt return vector location, and so it contains the PC value at the time of interruption. We need to save this on the stack to allow stacking of interrupts. The code for a 2-address architecture resembles the following:

save(context):               context is a pseudo-argument
     DPI                     disable interrupts
     STORE  R0,stack,I       save contents of register 0 onto stack
     LOAD   R0,stack         load index register with address of stack
     ADD    R0,1
     STORE  R1,R0,I          save register 1
     ADD    R0,1
     STORE  R2,R0,I          save register 2
     ADD    R0,1
     STORE  R3,R0,I          save register 3
     ADD    R0,1
     STORE  R4,R0,I          save register 4
     ADD    R0,1
     STORE  R5,R0,I          save register 5
     ADD    R0,1
     STORE  R6,R0,I          save register 6
     ADD    R0,1
     STORE  R7,R0,I          save register 7
     ADD    R0,1
     STORE  PC,R0,I          save return location
     ADD    R0,1
     STORE  R0,stack         save new stack pointer
     EPI                     enable interrupts

The save operation is illustrated in Figure 8.1. Next consider the restore routine, written in 2-address code.

restore(context):            context is a pseudo-argument
     DPI                     disable interrupts
     LOAD   R0,stack         load index register with address of stack
     SUB    R0,1
     LOAD   PC,R0,I          restore return location
     SUB    R0,1
     LOAD   R7,R0,I          restore register 7
     SUB    R0,1
     LOAD   R6,R0,I          restore register 6
     SUB    R0,1
     LOAD   R5,R0,I          restore register 5
     SUB    R0,1
     LOAD   R4,R0,I          restore register 4
     SUB    R0,1
     LOAD   R3,R0,I          restore register 3
     SUB    R0,1
     LOAD   R2,R0,I          restore register 2
     SUB    R0,1
     LOAD   R1,R0,I          restore register 1
     SUB    R0,1             step back to the slot holding register 0
     STORE  R0,stack         reset stack pointer
     LOAD   R0,R0,I          restore register 0
     EPI                     enable interrupts

The restore operation is illustrated in Figure 8.2.

Certain machine architectures allow block save and block restore instructions to store and load n general registers in n consecutive memory locations. These instructions greatly simplify the implementation of the save and restore routines. Be aware that such macroinstructions may be designed to be interruptable (to reduce context switch time), so that if interrupts have not already been disabled, they should be.

Figure 8.1 The save operation.

Figure 8.2 The restore operation.

8.1.3 Run-Time Ring Buffer
A run-time stack cannot be used in a round-robin system because of the first-in/first-out nature of the scheduling. In this case a ring buffer or circular queue can be used to save context. The context is saved to the tail of the list and restored from the head. The save and restore routines can be easily modified to accomplish this operation.
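The tail-save and head-restore discipline can be sketched as follows. This is a Python illustration only: the class name `ContextRing`, the fixed capacity, and the use of a dictionary to stand in for a saved register set are assumptions, not the routines of this chapter.

```python
class ContextRing:
    """Fixed-size circular queue of saved contexts (first in, first out)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0      # index of the next context to restore
        self.tail = 0      # index of the next free slot
        self.count = 0

    def save(self, context):
        """Save a context at the tail of the ring."""
        if self.count == len(self.slots):
            raise OverflowError("context ring buffer is full")
        self.slots[self.tail] = context
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1

    def restore(self):
        """Remove and return the context at the head of the ring."""
        if self.count == 0:
            raise IndexError("no saved context")
        context = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return context


ring = ContextRing(3)
for task in ("task1", "task2", "task3"):
    ring.save({"task": task, "pc": 0})
print(ring.restore()["task"])  # task1, the context saved first
```

Unlike a stack, the context restored is always the oldest one saved, which matches the order in which a round-robin scheduler resumes tasks.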

8.1.4 Maximum Stack Size
The maximum amount of space needed for the run-time stack needs to be known a priori. In general, stack size can be determined if recursion is not used and heap data structures are avoided. If maximum stack memory requirements are not known, then a catastrophic memory allocation can occur, and the system will fail to satisfy event determinism. Ideally, provision for at least one more task than anticipated should be allocated to the stack to allow for spurious interrupts and time overloading. We will discuss this matter further in Chapter 11; also see [95].

8.1.5 Multiple Stack Arrangements
Often a single run-time stack is inadequate to manage several processes in, say, a foreground/background system. Of course, in a multiprocessing system, each process will manage its own stack, but this is not the kind of multiple stack scheme we are talking about.
A multiple stack scheme uses a single run-time stack and several application stacks. Using multiple stacks in embedded real-time systems has several advantages.
1. It permits tasks to interrupt themselves, thus allowing for handling transient overload conditions or for detecting spurious interrupts.
2. The system may be written in a language that supports re-entrancy and recursion, such as C or Pascal. Individual run-time stacks can be kept for each process, which contain the appropriate activation records with dynamic links needed to support recursion. Or two stacks for each process can be kept, one for the activation records and the other for the display (a stack of pointers used to keep track of variable and procedure scope). In either case, a pointer to these stacks needs to be saved in the context or task-control block associated with that task.
3. Only non-re-entrant languages such as older versions of FORTRAN or assembly language can be supported with a single-stack model.
We can rewrite the save and restore routines to use the context argument as a pointer to the stack. That is,

save(context)
     DPI                     disable interrupts
     STORE  R0,context,I     save contents of register 0 onto stack
     LOAD   R0,context       load index register
     ADD    R0,1
     STORE  R1,R0,I          save register 1
     ADD    R0,1
     STORE  R2,R0,I          save register 2
     ADD    R0,1
     STORE  R3,R0,I          save register 3
     ADD    R0,1
     STORE  R4,R0,I          save register 4
     ADD    R0,1
     STORE  R5,R0,I          save register 5
     ADD    R0,1
     STORE  R6,R0,I          save register 6
     ADD    R0,1
     STORE  R7,R0,I          save register 7
     ADD    R0,1
     STORE  PC,R0,I          save return location
     ADD    context,9        increment stack pointer
     EPI                     enable interrupts

This is the new restore procedure.

restore(context)
     DPI                     disable interrupts
     LOAD   R0,context       load index register
     SUB    R0,1
     LOAD   PC,R0,I          restore return location
     SUB    R0,1
     LOAD   R7,R0,I          restore register 7
     SUB    R0,1
     LOAD   R6,R0,I          restore register 6
     SUB    R0,1
     LOAD   R5,R0,I          restore register 5
     SUB    R0,1
     LOAD   R4,R0,I          restore register 4
     SUB    R0,1
     LOAD   R3,R0,I          restore register 3
     SUB    R0,1
     LOAD   R2,R0,I          restore register 2
     SUB    R0,1
     LOAD   R1,R0,I          restore register 1
     SUB    R0,1
     LOAD   R0,R0,I          restore register 0
     SUB    context,9        decrement stack pointer
     EPI                     enable interrupts

The individual interrupt-handler routines to save to a main stack, written in Pascal, follow.

procedure int;                     { main interrupt handler }
begin
   save(mainstack);
   case interrupt of
      1: int1;
      2: int2;
      3: int3
   end;
   restore(mainstack)
end;

procedure int1;                    { interrupt handler 1 }
begin
   save(stack1);                   { save context on stack }
   task1;                          { execute task 1 }
   restore(stack1)                 { restore context from stack }
end;

procedure int2;                    { interrupt handler 2 }
begin
   save(stack2);                   { save context on stack }
   task2;                          { execute task 2 }
   restore(stack2)                 { restore context from stack }
end;

procedure int3;                    { interrupt handler 3 }
begin
   save(stack3);                   { save context on stack }
   task3;                          { execute task 3 }
   restore(stack3)                 { restore context from stack }
end;


EXAMPLE 8.2
Suppose three processes are running in an interrupt-only system where a single interrupt based on three prioritized interrupts is generated. Let task1, task2, and task3 be as follows:

procedure task1;
begin
   applic1;
   applic2
end;

procedure task2;
begin
   applic2;
   applic3
end;

procedure task3;
begin
   applic3;
   applic4
end;

Suppose task1 is running when it is interrupted by task2 during applic2. Later, task2 is interrupted by task3 during applic3. The main and run-time stacks will then look like Figure 8.3.

Figure 8.3 Main and run-time stacks for Example 8.2 (main stack, task2 stack, task3 stack).

8.1.6 Task-Control Block Model
When implementing the task-control block (TCB) model of real-time multitasking, the chief memory management issue is the maintenance of the linked lists for the ready and suspended tasks. As shown in Figure 8.4, when the currently executing task completes, is preempted, or is suspended while waiting for a resource, the next highest priority task in the ready list is removed and is made the executing one.


Figure 8.4 Memory management in the task-control block model (ready list and suspended list; the dormant list is not shown).

If the executing task needs to be added to the suspended list, that is done. (If the executing task has completed, then its TCB is no longer needed.) Hence, by properly managing the linked lists, updating the status word in the TCBs, and adhering to the appropriate scheduling policy by checking the priority word in the TCBs, round-robin, preemptive priority, or both kinds of scheduling can be induced. Other memory management can include the maintenance of reserved blocks of memory that are allocated to individual task applications as requested.
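The list handling described here can be illustrated with a small Python model. The `TCB` fields, the `Scheduler` class, and the convention that a smaller number means a higher priority are all assumptions made for the sketch. A real executive would keep these structures in linked lists inside the kernel.

```python
from dataclasses import dataclass


@dataclass
class TCB:
    name: str
    priority: int          # smaller number means higher priority (an assumption)
    status: str = "ready"  # the status word


class Scheduler:
    def __init__(self):
        self.ready = []
        self.suspended = []
        self.executing = None

    def make_ready(self, tcb):
        tcb.status = "ready"
        self.ready.append(tcb)

    def dispatch(self):
        """Remove the highest-priority TCB from the ready list and execute it."""
        if not self.ready:
            self.executing = None
            return None
        best = min(self.ready, key=lambda t: t.priority)
        self.ready.remove(best)
        best.status = "executing"
        self.executing = best
        return best

    def suspend_executing(self):
        """The executing task waits for a resource: move it to the suspended list."""
        task = self.executing
        task.status = "suspended"
        self.suspended.append(task)
        return self.dispatch()

    def resume(self, tcb):
        """The awaited resource is free: move the TCB back to the ready list."""
        self.suspended.remove(tcb)
        self.make_ready(tcb)


s = Scheduler()
a, b = TCB("a", 1), TCB("b", 2)
s.make_ready(b)
s.make_ready(a)
print(s.dispatch().name)           # a
print(s.suspend_executing().name)  # b
```

Because `min` returns the earliest entry among equal priorities, tasks of the same priority are taken in arrival order, which gives round-robin behavior within a priority level.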

8.2 DYNAMIC ALLOCATION
Dynamic allocation used to satisfy individual task requirements for memory is accomplished by using a data structure such as a list or heap. For example, memory allocation calls to the procedure "malloc" in C are implemented through library calls to the operating system. In Pascal, the NEW function can be used to generate a new record type in a dynamic memory scheme. Ada and Modula-2 provide similar constructs. How these languages implement the allocation and deallocation of memory is compiler dependent. And, as we discussed before, languages such as FORTRAN and BASIC do not have dynamic allocation constructs. A good book on data structures (e.g., [83]) can be consulted in order to implement these dynamic memory allocation schemes.
In this section, however, we are interested in dynamic memory allocation for process code in main memory, and certain aspects of this need to be considered as they relate to real-time systems. In particular, we are interested in schemes where two or more programs can co-reside in main memory. Several schemes allow this capability, and we will review some of them briefly with respect to their real-time implications. Interested readers can consult a good text on operating systems such as [129] for a more detailed coverage. In general, the types of dynamic allocation that we are about to discuss are not recommended in embedded real-time systems.

8.2.1 Swapping
The simplest scheme that allows the operating system to allocate memory to two processes "simultaneously" is swapping. In this case, the operating system is always memory resident, and one process can co-reside in the memory space not required by the operating system, called the user space. When a second process needs to run, the first process is suspended and then swapped, along with its context, to a secondary storage device, usually a disk. The second process, along with its context, is then loaded into the user space and initiated by the dispatcher.
This type of scheme can be used along with round-robin or preemptive priority systems, but we would like the execution time of each process to be long relative to the swap time. The access time to the secondary store is the principal contributor to the context switch overhead and real-time response delays.

8.2.2 Overlays
A technique that allows a single program to be larger than the allowable user space is called overlaying. In this case the program is broken up into dependent code and data sections called overlays, which can fit into available memory. Special program code must be included that permits new overlays to be swapped into memory as needed (over the existing overlays), and care must be exercised in the design of such systems.
This technique has negative real-time implications because the overlays must be swapped from secondary storage devices. Nevertheless, overlaying can be used in conjunction with any of the techniques mentioned later in this chapter to extend the available address space. Many commercial tools are available that facilitate overlaid linking and loading in conjunction with commonly used programming languages and machines.
Note that in both swapping and overlaying a portion of memory is never swapped to disk or overlaid. This memory contains the swap or overlay manager (and in the case of overlaying any code that is common to all overlays is called the root).

8.2.3 MFT
A more elegant scheme than simple swapping allows more than one process to be memory-resident at any one time by dividing the user space into a number of fixed-size partitions. This scheme is called MFT (multiprogramming with a fixed number of tasks) and is useful in systems where the number of tasks to be executed is known and fixed, as in many embedded applications. Partition swapping to disk can occur when a task is preempted. Tasks, however, must reside in contiguous partitions, and the dynamic allocation and deallocation of memory cause problems.
In some cases main memory can become checkered with unused but available partitions, as in Figure 8.5. In this case the memory space is said to be externally fragmented. This type of fragmentation causes problems when memory requests cannot be satisfied because a contiguous block of the size requested does not exist, even though the actual memory is available.

EXAMPLE 8.3
In Figure 8.5, even though 40 megabytes of memory are available, they are in noncontiguous blocks, so the request cannot be honored.

Figure 8.5 Fragmented memory.

Another problem, internal fragmentation, occurs in fixed partition schemes when, for example, a process requires 1 megabyte of memory when only 2-megabyte partitions are available. The amount of wasted memory or internal fragmentation can be reduced by creating fixed partitions of several sizes and then allocating the smallest partition greater than the required amount. Both internal and external fragmentation hamper efficient memory usage and ultimately degrade real-time performance because of the overhead associated with their correction.
MFT is not particularly desirable in the real-time operating system because it uses memory inefficiently as a result of the overhead associated with fitting a process to available memory and disk swapping. However, in some implementations, particularly in commercial real-time executives, memory can be divided into regions in which each region contains a collection of different-sized, fixed-size partitions. For example, one region of memory might consist of 10 blocks of size 16Mb, while another region might contain 5 blocks of 32Mb and so on. The operating system then tries to satisfy a memory request (either directly from the program via a system call or through the operating system in the assignment of that process to memory), so that the smallest available partitions are used. This approach tends to reduce internal fragmentation.
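The smallest-adequate-partition rule can be sketched as follows. The function names and the list-of-pairs representation are hypothetical; the partition sizes simply reuse the 16 and 32 figures from the example above, in the same units.

```python
def allocate(partitions, request):
    """Return the index of the smallest free partition that holds `request`, or None.

    `partitions` is a list of [size, in_use] pairs; the chosen entry is marked in use.
    """
    best = None
    for i, (size, in_use) in enumerate(partitions):
        if not in_use and size >= request:
            if best is None or size < partitions[best][0]:
                best = i
    if best is not None:
        partitions[best][1] = True
    return best


def internal_fragmentation(partitions, index, request):
    """Memory wasted inside the allocated partition."""
    return partitions[index][0] - request


# Two regions: ten blocks of 16 and five blocks of 32.
pool = [[16, False] for _ in range(10)] + [[32, False] for _ in range(5)]
i = allocate(pool, 20)
print(pool[i][0], internal_fragmentation(pool, i, 20))  # 32 12
```

A request for 20 units cannot use a 16-unit block, so it lands in a 32-unit block and wastes 12 units. A request for 10 units would instead take a 16-unit block and waste only 6.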

8.2.4 MVT
In MVT (or multiprogramming with a variable number of tasks), memory is allocated in amounts that are not fixed, but rather are determined by the requirements of the process to be loaded into memory. This technique is more appropriate when the number of real-time tasks is unknown or varies. In addition, memory utilization is better for this technique than for MFT because little or no internal fragmentation can occur, as the memory is allocated in the amount needed for each process. External fragmentation can still occur because of the dynamic nature of memory allocation and deallocation, and because memory must still be allocated to a process contiguously.
In MVT, however, external fragmentation can be mitigated by a process of compressing fragmented memory so that it is no longer fragmented. This technique is called compaction (see Figure 8.6). Compaction is a CPU-intensive process and is not encouraged in hard real-time systems. If compaction must be performed, it should be done in the background, and it is imperative that interrupts be disabled while memory is being shuffled.
The bottom line is that MVT is useful when the number of real-time tasks is unknown or can vary. Unfortunately, its context-switching overhead is much higher than in simpler schemes such as MFT, and thus it is not always appropriate for embedded real-time systems. It is more likely to be found in a commercial real-time operating system.

[Figure 8.6: memory before and after compaction.]
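The effect of compaction on external fragmentation can be seen in a toy model. In this Python sketch, memory is a list of (owner, size) blocks with `None` marking a free block. The layout, the block sizes, and the function names are invented for illustration, and adjacent free blocks are assumed to have been merged already.

```python
def largest_hole(blocks):
    """Size of the largest contiguous free block; `blocks` is a list of (owner, size)."""
    return max((size for owner, size in blocks if owner is None), default=0)


def compact(blocks):
    """Slide allocated blocks together and merge all free space into one hole at the end."""
    used = [(owner, size) for owner, size in blocks if owner is not None]
    free = sum(size for owner, size in blocks if owner is None)
    return used + ([(None, free)] if free else [])


memory = [("A", 30), (None, 20), ("B", 10), (None, 20), ("C", 20)]
print(largest_hole(memory))           # 20: a 40-unit request fails
print(largest_hole(compact(memory)))  # 40: the same request now fits
```

The sketch hides the real cost: every allocated block that moves must be copied and its owner's addresses fixed up, which is why the text advises doing this in the background.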

8.2.5 Demand Paging
In demand page systems, program segments are permitted to be loaded in noncontiguous memory as they are requested in fixed-size chunks called pages or page frames. This scheme helps to eliminate external fragmentation. Program code that is not held in main memory is "swapped" to secondary storage, usually a disk. When a memory reference is made to a location within a page not loaded in main memory, a page fault exception is raised. The interrupt handler for this exception checks for a free page slot in memory. If none is found, a page frame must be selected and swapped to disk (if it has been altered), a process called page stealing.
Paging, which is provided by most commercial operating systems, is advantageous because it allows nonconsecutive references to pages via a page table. In addition, paging can be used in conjunction with bank switching hardware to extend the virtual address space. In either case, pointers are used to access the desired page (see Figure 8.7). These pointers may represent memory-mapped locations to map into the desired hard-wired memory bank; may be implemented through associative memory; or may be simple offsets into memory, in which case the actual address in main memory needs to be calculated with each memory reference.
Paging can lead to problems including very high paging activity called thrashing, internal fragmentation, and the more serious deadlock (see Chapter 7). But it is unlikely that you would use so complex a scheme as paging in an embedded real-time system where the overhead would be too great and the associated hardware support is not usually available.

Figure 8.7 Paged memory using pointers.


8.2.5.1 Replacement Algorithms: Least Recently Used Rule
Several methods can be used to decide which page should be swapped out of memory to disk, such as first-in-first-out (FIFO). This method is the easiest to implement, and its overhead is only the recording of the loading sequence of the pages. Although other algorithms exist, the best nonpredictive algorithm is the least recently used (LRU) rule. The LRU method simply states that the least recently used page will be swapped out if a page fault occurs. To illustrate the method, consider the following.

EXAMPLE 8.4
A paged memory system is divided into sixteen 256-megabyte pages of which any 4 can be loaded at the same time. Each page is tagged (1, 2, etc.). The operating system keeps track of the usage of each page. For example, the page reference string

2 3 4 5

indicates that pages 2, 3, 4, and 5 have been used in that order. If a request is made for page 7, then page 2 will be swapped out in order to make room for page 7, because it was the least recently used. The loaded pages would then be 3, 4, 5, and 7 with reference string

2 3 4 5 7

Please note that references to pages already loaded in memory cause no page fault. For instance, if a reference is now made to page 3, no pages need to be swapped because page 3 is loaded in memory. If this reference is followed by one to page 6, page 4 would have to be swapped out because it had the least recent reference. The loaded pages would then be 3, 5, 7, and 6 with reference string

2 3 4 5 7 3 6

Note that in a paging memory scheme, the worst possible scenario involves page stealing for each request of memory. This occurs, for example, in a four-page system when five pages are requested cyclically as in the page reference string

2 4 6 8 9 2 4 6 8 9 ...

You should note that the performance of LRU is the same in this case as FIFO (in terms of number of page faults).
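The reference strings of Example 8.4 can be traced with a short simulation. This is a sketch only: the function names lru_faults and fifo_faults are hypothetical, and a plain list stands in for whatever usage record an operating system would actually keep.

```python
def lru_faults(refs, frames):
    """Return (page faults, loaded pages) under the LRU rule."""
    loaded = []  # least recently used page first
    faults = 0
    for page in refs:
        if page in loaded:
            loaded.remove(page)      # hit: the page is re-appended as most recent
        else:
            faults += 1
            if len(loaded) == frames:
                loaded.pop(0)        # swap out the least recently used page
        loaded.append(page)
    return faults, loaded

def fifo_faults(refs, frames):
    """Return (page faults, loaded pages) under FIFO replacement."""
    loaded = []  # oldest loaded page first
    faults = 0
    for page in refs:
        if page not in loaded:
            faults += 1
            if len(loaded) == frames:
                loaded.pop(0)        # swap out the page loaded earliest
            loaded.append(page)
    return faults, loaded

# Reference string from Example 8.4, four pages loadable at once.
faults, loaded = lru_faults([2, 3, 4, 5, 7, 3, 6], frames=4)
print(faults, sorted(loaded))

# Worst case: five pages requested cyclically in a four-page system.
cyclic = [2, 4, 6, 8, 9] * 2
print(lru_faults(cyclic, 4)[0], fifo_faults(cyclic, 4)[0])
```

For the first string the simulation ends with pages 3, 5, 6, and 7 loaded, as in the example. For the cyclic string every reference faults under both rules.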

In FIFO page replacement schemes (whether or not used in conjunction with working sets), we might think that by increasing the number of pages in memory (or windows in the working set) we can reduce the number of page faults. Often this is the case, but occasionally an anomalous condition occurs whereby increasing the number of pages actually increases the number of page faults. This is Belady's Anomaly, which, as it turns out, does not occur in LRU replacement schemes.
To conclude, the overhead for the LRU scheme rests in recording the access sequence to all pages, which can be quite substantial. Therefore, the benefits of using LRU need to be weighed against the effort in implementing it vis-à-vis FIFO.
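Belady's Anomaly can be reproduced with a small experiment. The reference string below does not come from this chapter; it is a commonly used textbook string chosen here as an assumption, and count_faults is a hypothetical helper.

```python
def count_faults(refs, frames, policy):
    """Count page faults for policy "FIFO" or "LRU"."""
    loaded = []  # next page to be replaced is at the front
    faults = 0
    for page in refs:
        if page in loaded:
            if policy == "LRU":
                loaded.remove(page)  # a hit refreshes recency under LRU only
                loaded.append(page)
            continue
        faults += 1
        if len(loaded) == frames:
            loaded.pop(0)
        loaded.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]  # assumed illustrative string
for policy in ("FIFO", "LRU"):
    print(policy, count_faults(refs, 3, policy), count_faults(refs, 4, policy))
```

With this string, FIFO incurs 9 faults with three pages in memory but 10 with four, whereas LRU improves from 10 faults to 8.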


8.2.5.2 Memory Locking
In addition to thrashing, the chief disadvantage of page swapping in real-time systems is the lack of predictable execution times. In a real-time system, it is often desirable to lock all or certain parts of a process into memory in order to reduce the overhead involved in paging and to make the execution times more predictable. Certain commercial real-time kernels provide this feature, called memory locking. These kernels typically allow code or data segments, or both, for a particular process, as well as the run-time stack segment, to be locked into main memory. Any process with one or more locked pages is then prevented from being swapped out to disk. Memory locking decreases execution times for the locked modules and, more importantly, can be used to guarantee execution times. At the same time, it makes fewer pages available for the application, encouraging contention.

8.2.5.3 Other Points About Paging
In summary,
1. Paging is most efficient when supported by the appropriate hardware.
2. Paging allows multitasking and extension of the address space.
3. When a page is referenced that is not in main memory, a page fault occurs, which usually causes an interrupt.
4. The hardware registers that are used to do page frame address translation are part of a task's context and add additional overhead when doing a context switch.
5. If hardware page mapping is not used, then additional overhead is incurred in the physical address calculations.
6. The least recently used rule is the best nonpredictive page-swapping algorithm.
7. In time-critical real-time systems, we cannot afford the overhead associated with disk swapping in simple swapping, overlays, MFT, MVT, or paging schemes.

8.2.6 Working Sets
Working sets are based on the model of locality-of-reference. The idea is that if you examine a list of recently executed program instructions on a logic analyzer, you will note that most of the instructions are localized to within a small number of instructions in most cases. (For example, in the absence of interrupts and branching, the program is executed sequentially. Or the body of a loop may be executed a large number of times.) However, when interrupts, procedure calls, or branching occurs, the locality-of-reference is altered. The idea in working sets is that a set of local code windows is maintained in the cache and that upon accessing a memory location not contained in one of the working sets, one of the windows in the working set is replaced (using a replacement rule such as FIFO or LRU). The performance of the scheme is based entirely on the size of the working set window, the number of windows in the working set, and the locality-of-reference of the code being executed.
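The dependence on window size, window count, and locality can be seen in a small simulation. This is a sketch under assumptions: windows are taken to be aligned blocks of addresses, LRU is used as the replacement rule, and the two address traces are invented for illustration.

```python
def window_misses(addresses, window_size, num_windows):
    """Count accesses that fall outside every cached window (LRU replacement)."""
    windows = []  # base addresses of cached windows, least recently used first
    misses = 0
    for addr in addresses:
        base = addr - addr % window_size   # aligned window containing addr
        if base in windows:
            windows.remove(base)           # hit: re-appended as most recent
        else:
            misses += 1
            if len(windows) == num_windows:
                windows.pop(0)             # replace the least recently used window
        windows.append(base)
    return misses

# A 16-instruction loop body executed 50 times: strong locality.
loop_trace = list(range(100, 116)) * 50
# Repeated branching among 8 widely separated locations: poor locality.
jumpy_trace = [i * 64 for i in range(8)] * 50

print(window_misses(loop_trace, window_size=16, num_windows=4))
print(window_misses(jumpy_trace, window_size=16, num_windows=4))
```

The loop touches only two windows, so it misses twice in 800 accesses. The branching trace cycles through eight windows with room for four, so every one of its 400 accesses misses.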

8.2.7 Real-Time Garbage Collection
In a memory-management context, garbage is memory that has been allocated but is no longer being used by a task (that is, the task has abandoned it). Garbage can accumulate when tasks terminate abnormally without releasing memory resources [144]. It can also occur in object-oriented systems and as a normal byproduct of nonprocedural languages [4], [162].

In C, for example, if memory is allocated using the malloc procedure and the pointer for that memory block is lost, then that block cannot be used or properly freed. The same situation can occur in Pascal when records created with the new statement are not properly disposed of.

Garbage collection algorithms generally have unpredictable performance (although average performance may be known). Garbage can be reclaimed using the following procedure. Tag all memory from the heap which is pointed to by a variable (including those variables in procedure activation frames, a nondeterministic data structure). Then reclaim all nontagged memory for the heap. The loss of determinism results from the unknown amount of garbage, the tagging time of the nondeterministic data structures, and the fact that many incremental garbage collectors require that every memory allocation or deallocation from the heap be willing to service a page-fault trap handler.

Another technique is to build a heap or table of memory blocks along with an associated process ID for the owner of the memory block. This data structure is then periodically checked to determine whether memory has been allocated to a process that no longer exists. If this is the case, the memory can be released. Because of the overhead involved, this method should not be implemented in high-frequency cycles, and ideally garbage collection should be performed as a background activity or not performed at all [4]. Nevertheless, research in real-time garbage collection is still open.
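Both reclamation techniques can be sketched with dictionaries standing in for the heap. Everything here is illustrative: the function names, block numbers, and process IDs are hypothetical, and the tagging step follows pointers from block to block, which is an assumption beyond the one-sentence procedure given above.

```python
def tag_and_reclaim(heap, roots):
    """Tag every block reachable from a variable, then reclaim the rest.

    heap  : block id -> list of block ids that block points to
    roots : block ids pointed to directly by program variables
    """
    tagged = set()
    pending = list(roots)
    while pending:                      # running time depends on heap contents
        blk = pending.pop()
        if blk not in tagged:
            tagged.add(blk)
            pending.extend(heap[blk])
    garbage = sorted(set(heap) - tagged)
    for blk in garbage:
        del heap[blk]                   # return the block to the heap
    return garbage

def reclaim_orphans(block_owner, live_pids):
    """Release blocks whose owning process no longer exists.

    block_owner : block id -> process ID of the owner
    live_pids   : set of process IDs that currently exist
    """
    orphans = sorted(b for b, pid in block_owner.items() if pid not in live_pids)
    for blk in orphans:
        del block_owner[blk]
    return orphans

heap = {1: [2], 2: [], 3: [4], 4: [3], 5: []}
print(tag_and_reclaim(heap, roots=[1, 5]))

owners = {10: 101, 11: 102, 12: 101, 13: 103}
print(reclaim_orphans(owners, live_pids={101, 103}))
```

Blocks 3 and 4 point only at each other, so nothing tags them and both are reclaimed. In the owner table, block 11 belongs to a process that has exited and is released.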

8.2.8 Contiguous File Systems
Disk I/O is a problem in many real-time systems that can be exacerbated by file fragmentation. File fragmentation is analogous to memory fragmentation and has the same associated problems, only worse. In addition to the logical overhead incurred in finding the next allocation unit in the file, the physical overhead of the disk mechanism is a factor. For example, the physical overhead involved in moving the disk's read/write head to the desired sector can be significant.
To reduce or eliminate this problem altogether, many commercial real-time systems, such as real-time UNIX, force all allocated sectors to follow one another on the disk. This technique is called contiguous file allocation.
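A minimal sketch of contiguous allocation follows. It assumes the disk is described by a free-sector map and that head movement is proportional to the distance between successive sector numbers; both are simplifications, and allocate_contiguous and head_travel are hypothetical names.

```python
def allocate_contiguous(free_map, n_sectors):
    """Claim the first run of n_sectors adjacent free sectors.

    free_map is a list of booleans (True means the sector is free).
    Returns the allocated sector numbers, or None if no run is long enough.
    """
    run_start, run_len = 0, 0
    for i, free in enumerate(free_map):
        if not free:
            run_len = 0
            continue
        if run_len == 0:
            run_start = i
        run_len += 1
        if run_len == n_sectors:
            for s in range(run_start, run_start + n_sectors):
                free_map[s] = False
            return list(range(run_start, run_start + n_sectors))
    return None

def head_travel(sectors):
    """Total head movement when the sectors are read in file order."""
    return sum(abs(b - a) for a, b in zip(sectors, sectors[1:]))

free_map = [False, True, True, False, True, True, True, True]
file_sectors = allocate_contiguous(free_map, 3)
print(file_sectors, head_travel(file_sectors))
print(head_travel([1, 7, 2]))            # the same file stored in scattered sectors
print(allocate_contiguous(free_map, 3))  # three sectors are free, but not adjacent
```

The last call shows the price of the scheme: the request fails even though enough free sectors remain, because they no longer follow one another.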


8.3 STATIC SCHEMES
Static memory issues revolve around the partitioning of memory into the appropriate amount of RAM, ROM, memory-mapped I/O space, and so on. This problem of resource allocation is discussed in Chapter 9.

8.4 EXERCISES
1. Rewrite the save and restore routines assuming that eight general registers (R0-R7) and the program counter are to be saved on a stack. Do this for
   (a) 0-address machine
   (b) 1-address machine
   (c) 3-address machine
2. Rewrite the save and restore routines in 2-address code, assuming block move (BMOVE) and restore (BRESTORE) instructions are available. Make the necessary assumptions about the format of these instructions.
3. Rewrite the save and restore routines so that they save and restore to the head and tail of a ring buffer, respectively.
4. Rewrite the save and restore routines in Pascal so that they employ push and pop procedures.
5. Write a pseudocode algorithm that allocates pages of memory on request. Assume that 100 pages of size 1 megabyte, 2 megabytes, and 4 megabytes are available. The algorithm should take the size of the page requested as an argument, and return a pointer to the desired page. The smallest available page should be used, but if the smallest size is unavailable, the next smallest should be used.
6. Write a pseudocode algorithm compacting 64 megabytes of memory that is divided into 1-megabyte pages. Use a pointer scheme.
7. For a four-page memory system with memory reference string, for example,

   2 4 6 8 9 2 4 6 8 9 ...

   show that the number of page faults for FIFO replacement is the same as for the LRU replacement scheme.
8. A paged memory system is divided into sixteen 256-megabyte pages of which any four can be loaded at the same time. Each page is tagged (1, 2, etc.). Write a pseudocode algorithm to implement the least recently used rule.
9. Write a heap manager to handle arbitrary-sized data blocks in a linked list (analogous to the C malloc() routine). Remember that the run-time stack can collide with the heap. Do this in
   (a) C
   (b) Ada
   (c) Pascal
   (d) Modula-2
   (e) C++
10. Modify the heap manager in the previous exercise so that a table consisting of the memory block number and process ID is stored. Write a garbage collection routine to accompany the heap manager which consults a second table consisting of a list of all existing process IDs and frees all memory blocks belonging to extinct processes.