
Imperial College London

Department of Computing

Computer Systems 113 Architecture 110

Memory Organisation CPU Organisation and Operation Introduction to Assembly Programming

Dr. N. Dulay

November 2007

N. Dulay

Main memory organisation (1)

Preliminaries
Welcome to the Computer Systems (113) / Architecture (110) course. Over the next two terms we'll study the basic operation and organisation of a computer. Hopefully you've all completed the self-study on integer representations and arithmetic. Some preliminaries first:

How to contact me
There are 2 options: (i) see me at a tutorial, (ii) email me using nd@doc.ic.ac.uk

Workload
The course consists of lectures (2 per week), tutorials (1 per week), a number of pieces of assessed coursework and a lab exercise. The tutorials and lab exercise are not examined but are important as they give you an opportunity to check on your progress and to discuss problems and ideas with the tutorial helpers. The assessed coursework is marked and undertaken as homework. The lab exercise is scheduled as part of the PPT programme.

Exams
There is a single combined Christmas test that normally has 1 architecture question. The main examinations are held at the start of the summer term. Past exam papers can be accessed and viewed via the departmental homepage.

Textbooks
Unfortunately there is no single ideal introductory textbook [1]. The following textbooks are good however:

Structured Computer Organization (5th edition) by Andrew S. Tanenbaum, Prentice Hall.
Computer Organization and Architecture: Designing for Performance (7th Ed) by William Stallings, Prentice-Hall.

The course also involves Intel Pentium assembly programming. For this you may wish to consider acquiring a book about Pentium programming such as:
Guide to Assembly Language Programming in Linux by S. Dandamudi, Springer.

An excellent web resource is Wikipedia at http://en.wikipedia.org/

Website
The course notes, tutorials, slides etc will be available via the following URL:
https://www.doc.ic.ac.uk/~nd/architecture

Note: the URL starts https, not http. You need to print your own copies.

[1] More ambitious students might also wish to consider Computer Organization & Design by David A. Patterson and John L. Hennessy, Morgan Kaufmann Publishing.


Main Memory (RAM) Organisation


Computers employ many different types of memory (semi-conductor, magnetic disks and tapes, DVDs etc.) to hold data and programs. Each type has its own characteristics and uses. We will look at the way that Main Memory (RAM) is organised and very briefly at the characteristics of Register Memory and Disk Memory. Let's locate these types of memory in an abstract computer:

[Figure: an abstract computer. The CPU contains the Registers, the Arithmetic & Logic Unit and the Control Unit; Main Memory is made up of RAM; the I/O Controller(s) connect to the Disk Drives.]

Register Memory
Registers are memories located within the Central Processing Unit (CPU). They are few in number (there are rarely more than 64 registers) and also small in size; typically a register is no larger than 64 bits, and 32-bit and more recently 64-bit registers are common in desktops. The contents of a register can however be "read" or "written" very quickly [2], often an order of magnitude faster than main memory and several orders of magnitude faster than disk memory.

Different kinds of register are found within the CPU. General Purpose Registers [3] are available for general use by the programmer. Unless the context implies otherwise we'll use the term "register" to refer to a General Purpose Register within the CPU. Most modern CPUs have between 16 and 64 general purpose registers. Special Purpose Registers have specific uses and are either non-programmable and internal to the CPU or accessed with special instructions by the programmer. Examples of such registers that we will encounter later in the course include: the Program Counter register (PC), the Instruction Register (IR), the ALU Input & Output registers, the Condition Code (Status/Flags) register, the Stack Pointer register (SP). The size (the number of bits in the register) of these registers varies according to register type. The Word Size of an architecture is often (but not always!) defined by the size of the general purpose registers [4].

[2] e.g. less than a nanosecond (10^-9 sec)
[3] Occasionally called Working Registers.
[4] Used for performing calculations, moving and manipulating data etc.

In contrast to main memory and disk memory, registers are referenced directly by specific instructions or by encoding a register number within a computer instruction. At the programming (assembly) language level of the CPU, registers are normally specified with special identifiers (e.g. R0, R1, R7, SP, PC).

As a final point, the contents of a register are lost if power to the CPU is turned off, so registers are unsuitable for holding long-term information or information that is needed for retention after a power-shutdown or failure. Registers are, however, the fastest memories, and if exploited can result in programs that execute very quickly.

Main Memory (RAM)


If we were to sum all the bits of all registers within the CPU, the total amount of memory probably would not exceed 5,000 bits. Most computational tasks undertaken by a computer require a lot more memory. Main memory is the next fastest memory within a computer [5] and is much larger in size. Typical main memory capacities for different kinds of computers are: PC 512MB [6], fileserver 2GB, database server 8GB. Computer architectures also impose an architectural constraint on the maximum allowable RAM. This constraint is normally equal to 2^WordSize memory locations.

RAM [7] (Random [8] Access Memory) is the most common form of Main Memory. RAM is normally located on the motherboard and so is typically less than 12 inches from the CPU. ROM (Read Only Memory) is like RAM except that its contents cannot be overwritten and its contents are not lost if power is turned off (ROM is non-volatile).

Although slower than register memory, the contents of any location [9] in RAM can still be "read" or "written" very quickly [10]. The time to read or write is referred to as the access time and is constant for all RAM locations.

In contrast to register memory, RAM is used to hold both program code (instructions) and data (numbers, strings etc). Programs are "loaded" into RAM from a disk prior to execution by the CPU. Locations in RAM are identified by an addressing scheme e.g. numbering the bytes in RAM from 0 onwards [11]. Like registers, the contents of RAM are lost if the power is turned off.

[5] Actually many computer systems also include Cache memory, which is faster than Main memory, but slower than register memory. We will ignore Cache memories in this course.
[6] 1K = 2^10 = 1024, 1M = 2^20, 1G = 2^30. 'B' will be used for Bytes, and 'b' or 'bit' for bits, cf. 1MB and 1Mbit.
[7] There are many types of RAM technologies.
[8] Random is a misnomer. Direct Access Memory would have been a better term.
[9] Typically a byte multiple.
[10] e.g. less than 10 nanoseconds (10 x 10^-9 sec)
[11] Some RAM locations (typically those with the lowest & highest addresses) may cause side-effects, e.g. cause data to be transferred to/from external devices.


Disk Memory
Disk memory [12] is used to hold programs and data over the longer term. The contents of a disk are NOT lost if the power is turned off. Typical hard disk capacities range from 40GB to over 500GB (about 5 x 10^11 bytes). Disks are much slower than register and main memory: the access-time (known as the seek-time) to data on disk is typically between 2 and 4 milli-seconds, although disk drives can transfer thousands of bytes in one go, achieving transfer rates from 25MB/s to 500MB/s.

Disks can be housed internally within a computer "box" or externally in an enclosure connected by a fast USB or firewire cable [13]. Disk locations are identified by special disk addressing schemes (e.g. track and sector numbers).

Summary of Characteristics

[12] Some authors refer to disk memory as disk storage.
[13] For details about how disks and other storage devices work, check out Tanenbaum or Stallings.


SRAM, DRAM, SDRAM, DDR SDRAM


There are many kinds of RAM and new ones are invented all the time. One of the aims is to make RAM access as fast as possible in order to keep up with the increasing speed of CPUs.

SRAM (Static RAM) is the fastest form of RAM but also the most expensive. Due to its cost it is not used as main memory but rather for cache memory. Each bit requires a 6-transistor circuit.

DRAM (Dynamic RAM) is not as fast as SRAM but is cheaper and is used for main memory. Each bit uses a single capacitor and single transistor circuit. Since capacitors lose their charge, DRAM needs to be refreshed every few milliseconds. The memory system does this transparently. There are many implementations of DRAM; two well-known ones are SDRAM and DDR SDRAM.

SDRAM (Synchronous DRAM) is a form of DRAM that is synchronised with the clock of the CPU's system bus, sometimes called the front-side bus (FSB). As an example, if the system bus operates at 167MHz over an 8-byte (64-bit) data bus, then an SDRAM module could transfer 167 x 8 ~ 1.3GB/sec.

DDR SDRAM (Double-Data Rate DRAM) is an optimisation of SDRAM that allows data to be transferred on both the rising edge and falling edge of a clock signal, effectively doubling the amount of data that can be transferred in a period of time. For example a PC-3200 DDR-SDRAM module operating at 200MHz can transfer 200 x 8 x 2 ~ 3.2GB/sec over an 8-byte (64-bit) data bus.
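The two transfer-rate figures above follow from the same product: clock rate x bus width x transfers per clock. The following is a minimal sketch of that arithmetic; the function name `peak_transfer_rate_mb_per_s` is made up for illustration, and the result is a theoretical peak rather than a measured rate.

```python
def peak_transfer_rate_mb_per_s(clock_mhz, bus_bytes, transfers_per_clock=1):
    """Theoretical peak rate in MB/s: clock (MHz) x bus width (bytes) x transfers per clock."""
    return clock_mhz * bus_bytes * transfers_per_clock


# SDRAM: one transfer per clock on a 167MHz, 8-byte (64-bit) bus.
sdram = peak_transfer_rate_mb_per_s(167, 8)

# DDR SDRAM: two transfers per clock (rising and falling edge) at 200MHz.
ddr = peak_transfer_rate_mb_per_s(200, 8, transfers_per_clock=2)

print(f"SDRAM: {sdram} MB/s (~{sdram / 1000:.1f} GB/s)")
print(f"DDR:   {ddr} MB/s (~{ddr / 1000:.1f} GB/s)")
```

The module name PC-3200 reflects the 3200 MB/s peak figure computed for the DDR case.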

ROM, PROM, EPROM, EEPROM, Flash


In addition to RAM, there is also a range of other semi-conductor memories that retain their contents when the power supply is switched off.

ROM (Read Only Memory) is a form of semi-conductor memory that can be written to once, typically in bulk at a factory. ROM was used to store the "boot" or start-up program (so-called firmware) that a computer executes when powered on, although it has now fallen out of favour to more flexible memories that support occasional writes. ROM is still used in systems with fixed functionalities, e.g. controllers in cars, household appliances etc.

PROM (Programmable ROM) is like ROM but allows end-users to write their own programs and data. It requires special PROM writing equipment. Note: users can only write once to PROM.

EPROM (Erasable PROM). With EPROM we can erase (using strong ultra-violet light) the contents of the chip and rewrite it with new contents, typically several thousand times. It is commonly used to store the "boot" program of a computer, known as the firmware. PCs call this firmware the BIOS (Basic I/O System). Other systems use Open Firmware. Intel-based Macs use EFI (Extensible Firmware Interface).

EEPROM (Electrically Erasable PROM). As the name implies, the contents of EEPROMs are erased electrically. EEPROMs are also limited in the number of erase-writes that can be performed (e.g. 100,000) but support updates (erase-writes) to individual bytes, whereas EPROM updates the whole memory and only supports around 10,000 erase-write cycles.

FLASH memory is a cheaper form of EEPROM where updates (erase-writes) can only be performed on blocks of memory, not on individual bytes. Flash memories are found in USB sticks and flash cards, and typically range in size from 32MB to 2GB. The number of erase/write cycles to a block is typically several hundred thousand before the block can no longer be written.


Main Memory Organisation


Main memory can be considered to be organised as a matrix of bits. Each row represents a memory location; typically this is equal to the word size of the architecture, although it can be a word multiple (e.g. 2 x Wordsize) or a partial word (e.g. half the wordsize). For simplicity we will assume that data within main memory can only be read or written a single row (memory location) at a time. For a 96-bit memory we could organise the memory as 12 x 8 bits, or 8 x 12 bits, or 6 x 16 bits, or even as 96 x 1 bits or 1 x 96 bits. Each row also has a natural number called its address [14] which is used for selecting the row:

[Figure: three organisations of a 96-bit memory: 12 rows of 8 bits (addresses 0 to 11), 8 rows of 12 bits (addresses 0 to 7) and 6 rows of 16 bits (addresses 0 to 5).]

[14] The concept of an address is very important to properly understanding how CPUs work.


Byte Addressing
Main-memories generally store and recall rows, which are multi-byte in length (e.g. 16-bit word = 2 bytes, 32-bit word = 4 bytes). Many architectures, however, make main memory byte-addressable rather than word-addressable. In such architectures the CPU and/or the main memory hardware is capable of reading/writing any individual byte. Here is an example of a main memory with 16-bit memory locations [15]. Note how the memory locations (rows) have even addresses.

[Figure: a column of 16-bit (= 2 byte) memory words with word addresses 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20.]

Byte Ordering
The ordering of bytes within a multi-byte data item defines the endian-ness of the architecture.

In BIG-ENDIAN systems the most significant byte of a multi-byte data item always has the lowest address, while the least significant byte has the highest address.

In LITTLE-ENDIAN systems, the least significant byte of a multi-byte data item always has the lowest address, while the most significant byte has the highest address.

In the following example, table cells represent bytes, and the cell numbers indicate the address of that byte in main memory. Note: by convention we draw the bytes within a memory word left-to-right for big-endian systems, and right-to-left for little-endian systems.

Big-Endian
Word Address    MSB --------> LSB
 0               0    1    2    3
 4               4    5    6    7
 8               8    9   10   11
12              12   13   14   15

Little-Endian
Word Address    MSB --------> LSB
 0               3    2    1    0
 4               7    6    5    4
 8              11   10    9    8
12              15   14   13   12

[15] To avoid confusion we will use the term memory word for a word-sized memory location.


Note: an N-character ASCII string value is not treated as one large multi-byte value, but rather as N byte values, i.e. the first character of the string always has the lowest address, the last character has the highest address. This is true for both big-endian and little-endian. An N-character Unicode string would be treated as N two-byte values and each two-byte value would require suitable byte-ordering.

Example: Show the contents of memory at word address 24 if that word holds the number given by 122E 5F01H in both the big-endian and the little-endian schemes?

Big Endian        MSB --------> LSB
Byte address      24   25   26   27
Word 24           12   2E   5F   01

Little Endian     MSB --------> LSB
Byte address      27   26   25   24
Word 24           12   2E   5F   01

Example: Show the contents of main memory from word address 24 if those words hold the text JIM SMITH.

Big Endian        +0   +1   +2   +3
Word 24            J    I    M   (space)
Word 28            S    M    I    T
Word 32            H    ?    ?    ?

Little Endian     +3   +2   +1   +0
Word 24        (space)  M    I    J
Word 28            T    I    M    S
Word 32            ?    ?    ?    H

The bytes labelled with ? are unknown. They could hold important data, or they could be don't care bytes; the interpretation is left up to the programmer.

Unfortunately computer systems [16] in use today are split between those that are big-endian, and those that are little-endian [17]. This leads to problems when a big-endian computer wants to transfer data to a little-endian computer. Some architectures, for example the PowerPC and ARM, allow the endian-ness of the architecture to be changed programmatically.
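The two byte orderings for the number 122E 5F01H can be checked with a short script. This is an illustrative sketch only: it uses Python's built-in `int.to_bytes`, and the variable names and base address 24 simply mirror the worked example.

```python
value = 0x122E5F01
base_address = 24  # word address used in the worked example

# Index 0 of each bytes object is the byte stored at the lowest address.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

for name, data in (("big-endian", big), ("little-endian", little)):
    cells = "  ".join(f"[{base_address + i}]={b:02X}" for i, b in enumerate(data))
    print(f"{name:13} {cells}")

# An ASCII string is N separate byte values, so its layout does not depend
# on endian-ness: the first character always has the lowest address.
text = "JIM SMITH".encode("ascii")
print(f"[{base_address}]={chr(text[0])!r} ... [{base_address + len(text) - 1}]={chr(text[-1])!r}")
```

In the big-endian case the most significant byte (12) sits at address 24; in the little-endian case the least significant byte (01) does.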

Word Alignment
Although main-memories are generally organised as byte-addressed rows of words and accessed a row at a time, some architectures allow the CPU to access any word-sized bit-group regardless of its byte address. We say that accesses that begin on a memory word boundary are aligned accesses while accesses that do not begin on word boundaries are unaligned accesses.

[Figure: a byte-addressed memory of 16-bit words at addresses 0, 2, 4 and 6, each word drawn as MSB then LSB. The word starting at Address 0 is aligned; the word starting at Address 5 is unaligned.]

[16] The interested student might want to read the paper, On Holy Wars and a Plea for Peace, D. Cohen, IEEE Computer, Vol 14, Pages 48-54, October 1981.
[17] The Motorola 68000 architecture is big-endian, while the Intel Pentium architecture is little-endian.


Reading an unaligned word from RAM requires (i) reading of adjacent words, (ii) selecting the required bytes from each word and (iii) concatenating those bytes together => SLOW. Writing an unaligned word is more complex and slower [18]. For this reason some architectures prohibit unaligned word accesses, e.g. on the 68000 architecture, words must not be accessed starting from an odd address (e.g. 1, 3, 5, 7 etc); on the SPARC architecture, 64-bit data items must have a byte address that is a multiple of 8.
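The three steps of an unaligned read can be sketched in a few lines. This is a toy model under stated assumptions, not a description of any real memory controller: memory is byte-addressed with 16-bit rows, byte order is big-endian, and the helper names `read_row` and `read_word` are invented for the example.

```python
WORD_BYTES = 2  # assumption: 16-bit memory words, so rows start at even addresses


def read_row(memory, byte_address):
    """One memory access: return the aligned 2-byte row containing byte_address."""
    start = byte_address - (byte_address % WORD_BYTES)
    return memory[start:start + WORD_BYTES]


def read_word(memory, byte_address):
    """Return (word value, number of row accesses) for a 16-bit big-endian read."""
    if byte_address % WORD_BYTES == 0:
        # Aligned: the word is exactly one row.
        return int.from_bytes(read_row(memory, byte_address), "big"), 1
    # Unaligned: (i) read the two adjacent rows,
    first_row = read_row(memory, byte_address)
    second_row = read_row(memory, byte_address + 1)
    # (ii) select the required byte from each row,
    selected = bytes([first_row[1], second_row[0]])
    # (iii) concatenate those bytes together.
    return int.from_bytes(selected, "big"), 2


memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
aligned = read_word(memory, 4)
unaligned = read_word(memory, 5)
print(f"address 4: {aligned[0]:04X}H in {aligned[1]} access")
print(f"address 5: {unaligned[0]:04X}H in {unaligned[1]} accesses")
```

The unaligned read needs twice as many row accesses, which is the cost the text refers to.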

Memory Modules, Memory Chips


So far, we have looked at the logical organisation of main memory. Physically RAM comes on small memory modules (little green printed circuit-boards about the size of a finger). A typical memory module holds 512MB to 2GB. The computer's motherboard will have slots to hold 2, 4, maybe 8 memory modules. Each memory module is itself comprised of several memory chips. For example, here are 3 ways of forming a 256 x 8 bit memory module.

[Figure: three ways of forming the module: one 256 x 8-bit RAM; two 256 x 4-bit RAMs; eight 256 x 1-bit RAMs.]

In the first case, main memory is built with a single memory chip. In the second, we use two memory chips: one gives us the most significant 4 bits, the other the least significant 4 bits. In the third we use 8 memory chips, each chip gives us 1 bit; to read an 8 bit memory word, we would have to access all 8 memory chips simultaneously and concatenate the bits.

On PCs, memory modules are known as DIMMs (dual inline memory modules) and support 64-bit transfers. The previous generation of modules were called SIMMs (single inline memory modules) and supported 32-bit data transfers.

Example: Given Main Memory = 1M x 16 bit (word addressable), RAM chips = 256K x 4 bit

[Figure: sixteen RAM chips (CHIP 0 to CHIP 15) arranged in four modules (Module 0 to Module 3). The memory address is split into 18 bits that address a row within the chips and 2 bits that select the module.]

[18] Describe a method for doing an unaligned word write operation.


RAM chips per memory module = Width of Memory Word / Width of RAM Chip = 16/4 = 4 (each module is four 4-bit chips side by side).

18 bits are required to address a RAM chip (since 256K = 2^18 = Length of RAM Chip).

A 1M x 16 bit word-addressed memory requires 20 address bits (since 1M = 2^20). Therefore 2 bits (= 20 - 18) are needed to select a module.

The total number of RAM Chips = (1M x 16) / (256K x 4) = 16.

Total number of Modules = Total number of RAM chips / RamChipsPerModule = 16/4 = 4
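The calculation above can be reproduced mechanically. The sketch below uses variable names of my own choosing and assumes, as the example does, that chip and memory lengths are powers of two.

```python
K, M = 2**10, 2**20

memory_words, word_bits = 1 * M, 16   # main memory: 1M x 16 bit
chip_rows, chip_bits = 256 * K, 4     # RAM chip: 256K x 4 bit

chips_per_module = word_bits // chip_bits                 # 16 / 4
chip_address_bits = (chip_rows - 1).bit_length()          # log2(256K)
memory_address_bits = (memory_words - 1).bit_length()     # log2(1M)
module_select_bits = memory_address_bits - chip_address_bits
total_chips = (memory_words * word_bits) // (chip_rows * chip_bits)
total_modules = total_chips // chips_per_module

print("chips per module   :", chips_per_module)
print("chip address bits  :", chip_address_bits)
print("memory address bits:", memory_address_bits)
print("module select bits :", module_select_bits)
print("total chips        :", total_chips)
print("total modules      :", total_modules)
```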

Interleaved Memory
When memory consists of several memory modules, some address bits will select the module, and the remaining bits will select a row within the selected module. When the module selection bits are the least significant bits of the memory address we call the resulting memory a low-order interleaved memory. When the module selection bits are the most significant bits of the memory address we call the resulting memory a high-order interleaved memory.

Interleaved memory can yield performance advantages if more than one memory module can be read/written at a time:
(i) for low-order interleave, if we can read the same row in each module. This is good for a single multi-word access of sequential data such as program instructions, or elements in a vector.
(ii) for high-order interleave, if different modules can be independently accessed by different units. This is good if the CPU can access rows in one module, while at the same time, the hard disk (or a second CPU) can access different rows in another module.

Example: Given that Main Memory = 1M x 8 bits, RAM chips = 256K x 4 bit. For this memory we would require 4 x 2 = 8 RAM chips. Each chip would require 18 address bits (i.e. 2^18 = 256K) and the full 1M x 8 bit memory would require 20 address bits (i.e. 2^20 = 1M).
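To see the difference between the two schemes, the sketch below splits a 20-bit address into a module number and a row number both ways. It assumes the 4-module, 256K-row layout of the example (2 module-select bits, 18 row bits); the function names are illustrative.

```python
MODULE_BITS = 2   # 4 modules
ROW_BITS = 18     # 256K rows per module


def low_order(address):
    """Low-order interleave: module = least significant bits, row = the rest."""
    return address & ((1 << MODULE_BITS) - 1), address >> MODULE_BITS


def high_order(address):
    """High-order interleave: module = most significant bits, row = the rest."""
    return address >> ROW_BITS, address & ((1 << ROW_BITS) - 1)


print("address  low-order (module, row)  high-order (module, row)")
for address in range(6):
    print(f"{address:7}  {str(low_order(address)):24}  {high_order(address)}")
```

With low-order interleaving, consecutive addresses fall in different modules (same row), so a sequential multi-word access can be spread across all modules. With high-order interleaving, consecutive addresses stay in one module, leaving the other modules free for other units.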


CPU Organisation & Operation


The Fetch-Execute Cycle
The operation of the CPU [19] is usually described in terms of the Fetch-Execute cycle [20].

Fetch-Execute Cycle
1. Fetch the Instruction
2. Increment the Program Counter
3. Decode the Instruction
4. Fetch the Operands
5. Perform the Operation
6. Store the results
7. Repeat forever

The cycle raises many interesting questions, e.g.

Fetch the Instruction: What is an Instruction? Where is the Instruction? Why does it need to be fetched? Isn't it okay where it is? How does the computer keep track of instructions? Where does it put the instruction it has just fetched?
Increment the Program Counter: What is the Program Counter? What does the Program Counter count? Increment by how much? Where does the Program Counter point to after it is incremented?
Decode the Instruction: Why does the instruction need to be decoded? How does it get decoded?
Fetch the Operands: What are operands? What does it mean to fetch? Is this fetching distinct from the fetching in Step 1 above? Where are the operands? How many are there? Where do we put the operands after we fetch them?
Perform the Operation: Is this the main step? Couldn't the computer simply have done this part? What part of the CPU performs this operation?
Store the results: What results? Where from? Where to?
Repeat forever: Repeat what? Repeat from where? Is it really an infinite loop? Why? How do these steps execute any instructions at all?

In order to appreciate the operation of a computer we need to answer such questions and to consider in more detail the organisation of the CPU.

Representing Programs
Each complex task carried out by a computer needs to be broken down into a sequence of simpler tasks and a binary machine instruction is needed for the most primitive tasks. Consider a task that adds two numbers [21], held in memory locations designated by B and C [22], and stores the result in the memory location designated by A.

A = B + C

This assignment can be broken down (compiled) into a sequence of simpler tasks or assembly instructions, e.g:

[19] Central Processing Unit.
[20] Sometimes called the Fetch-Decode-Execute Cycle.
[21] Let's assume they are held in two's complement form.
[22] A, B and C are actually main memory addresses, i.e. natural binary numbers.


Assembly Instruction    Effect
LOAD R2, B              Copy the contents of the memory location designated by B into Register 2
ADD R2, C               Add the contents of the memory location designated by C to the contents of Register 2 and put the result back into Register 2
STORE R2, A             Copy the contents of Register 2 into the memory location designated by A

Each of these assembly instructions needs to be encoded into binary for execution by the Central Processing Unit (CPU). Let's try this encoding for a simple architecture called TOY1.

TOY1 Architecture
TOY1 is a fictitious architecture with the following characteristics:

1024 x 16-bit words of RAM maximum. RAM is word-addressable.
4 general purpose registers R0, R1, R2 and R3. Each general purpose register is 16-bits (the same size as a memory location).
16 different instructions that the CPU can decode and execute, e.g. LOAD, STORE, ADD, SUB and so on. These different instructions constitute the Instruction Set of the Architecture.
The representation for integers will be two's complement.

For this architecture, the architect (us) needs to define a coding scheme for instructions. This is termed the Instruction Format [23]. Let's look at an example before we consider how we arrived at it. Here's our instruction format for TOY1:

TOY1 Instruction Format


TOY1 instructions are 16-bits (so they will fit into a main-memory word). Each instruction is divided into a number of instruction fields that each encode a different piece of information for the CPU.

Field Name     OPCODE    REG       ADDRESS
Field Width    4-bits    2-bits    10-bits

The OPCODE [24] field identifies the CPU operation [25] required. Since TOY1 only supports 16 instructions, these can be encoded as a 4-bit natural number. For TOY1, opcodes 1 to 4 will be:

0001 = LOAD
0010 = STORE
0011 = ADD
0100 = SUB

The REG field defines a General CPU Register. Arithmetic operations will use 1 register operand and 1 main memory operand; results will be written back to the register. Since TOY1 has 4 registers, these can be encoded as a 2-bit natural number:

[23] Most architectures actually have different instruction formats for different categories of instruction.
[24] Operation Code.
[25] The meaning of CPU operations is defined in the Architecture's Instruction Set Manual.


00 = Register 0
01 = Register 1
10 = Register 2
11 = Register 3

The ADDRESS field defines the address of a word in RAM. Since TOY1 can have up to 1024 memory locations, a memory address can be encoded as a 10-bit natural number. If we define addresses 200H, 201H and 202H for A, B and C, we can encode the example above as:

Assembly Instruction    Machine Instruction
LOAD R2, [201H]         0001 10 10 0000 0001
ADD R2, [202H]          0011 10 10 0000 0010
STORE R2, [200H]        0010 10 10 0000 0000
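The encoding is just three bit-fields packed into a 16-bit word. The following sketch packs and unpacks TOY1 instructions using the field widths and opcodes given above; the helper names `encode` and `decode` are my own and are not part of the TOY1 definition.

```python
OPCODES = {"LOAD": 0b0001, "STORE": 0b0010, "ADD": 0b0011, "SUB": 0b0100}


def encode(mnemonic, reg, address):
    """Pack OPCODE (4 bits), REG (2 bits) and ADDRESS (10 bits) into one 16-bit word."""
    if not (0 <= reg < 4 and 0 <= address < 1024):
        raise ValueError("register must be 0..3 and address 0..1023")
    return (OPCODES[mnemonic] << 12) | (reg << 10) | address


def decode(word):
    """Split a 16-bit instruction word back into (opcode, reg, address)."""
    return (word >> 12) & 0xF, (word >> 10) & 0x3, word & 0x3FF


program = [("LOAD", 2, 0x201), ("ADD", 2, 0x202), ("STORE", 2, 0x200)]
for mnemonic, reg, address in program:
    word = encode(mnemonic, reg, address)
    print(f"{mnemonic:5} R{reg}, [{address:03X}H] -> {word:016b} = {word:04X}H")
```

The hexadecimal forms (1A01H, 3A02H, 2A00H) are the values that appear as memory contents when the program is placed in main memory.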

Memory Placement of Program and Data


In order to execute a TOY1 program, its instructions and data need to be placed within main memory [26]. We'll place our 3-instruction program in memory starting at address 080H and we'll place the variables A, B and C at memory words 200H, 201H, and 202H respectively. Such placement results in the following memory layout prior to program execution. For convenience, memory addresses and memory contents are also given in hex.

Memory Address            Machine Instruction            Assembly Instruction
binary           hex      OP   Reg Address        hex
0000 1000 0000   080      0001 10  10 0000 0001   1A01   LOAD R2, [201H]
0000 1000 0001   081      0011 10  10 0000 0010   3A02   ADD R2, [202H]
0000 1000 0010   082      0010 10  10 0000 0000   2A00   STORE R2, [200H]
Etc                       Etc                            Etc
0010 0000 0000   200      0000 0000 0000 0000     0000   A = 0
0010 0000 0001   201      0000 0000 0000 1001     0009   B = 9
0010 0000 0010   202      0000 0000 0000 0110     0006   C = 6

Of course, the big question is "How is such a program executed by the TOY1 CPU?"

[26] The Operating System software is normally responsible for undertaking this task.


CPU Organisation
[Figure: TOY1 CPU organisation. Inside the Central Processing Unit (CPU): the General Registers R0, R1, R2, R3; the ALU with Input Register 1, Input Register 2 and the Output Register; the Program Counter; the Instruction Register; the Instruction Decoder; and the Control Unit, all connected by the Internal Bus. The CPU is connected to Memory (addresses 000H to 3FFH) by the Address Bus, the Data Bus and the Control Bus (Read/Write).]

The Program Counter (PC) is a special register that holds the address of the next instruction to be fetched from Memory (for TOY1, the PC is 10-bits wide). The PC is incremented [27] to "point to" the next instruction while an instruction is being fetched from main memory.

The Instruction Register (IR) is a special register that holds each instruction after it is fetched from main memory. For TOY1, the IR is 16-bits since instructions are 16-bit wide.

The Instruction Decoder is a CPU component that decodes and interprets the contents of the Instruction Register, i.e. it splits the whole instruction into fields for the Control Unit to interpret. The Instruction Decoder is often considered to be a part of the Control Unit.

The Control Unit is the CPU component that co-ordinates all activity within the CPU. It has connections to all parts of the CPU, and includes a sophisticated timing circuit.

The Arithmetic & Logic Unit (ALU) is the CPU component that carries out arithmetic and logical operations e.g. addition, comparison, boolean AND/OR/NOT.

The ALU Input Registers 1 & 2 are special registers that hold the input operands for the ALU.

The ALU Output Register is a special register that holds the result of an ALU operation. On completion of an ALU operation, the result is copied from the ALU Output register to its final destination, e.g. to a CPU register, or main-memory, or to an I/O device.

The General Registers R0, R1, R2, R3 are available for the programmer to use in his/her programs. Typically the programmer tries to maximise the use of these registers in order to speed program execution. For TOY1, the general registers are the same size as memory locations, i.e. 16-bits.

[27] By the appropriate number of memory words.


The Buses serve as communication highways for passing information within the CPU (CPU internal bus) and between the CPU and the main memory (the address bus, the data bus, and the control bus). The address bus is used to send addresses from the CPU to the main memory; these addresses indicate the memory location the CPU wishes to read or write. Unlike the address bus, the data bus is bi-directional; for writing, the data bus is used to send a word from the CPU to main-memory; for reading, the data bus is used to send a word from main-memory to the CPU. For TOY1, the Control bus [28] is used to indicate whether the CPU wishes to read from a memory location or write to a memory location. For simplicity we've omitted two special registers, the Memory Address Register (MAR) and the Memory Data Register (MDR). These registers lie at the boundary of the CPU and Address bus and Data bus respectively and serve to buffer data to/from the buses.

Buses can normally transfer more than 1-bit at a time. For the TOY1, the address bus is 10-bits (the size of an address), the data bus is 16-bits (size of a memory location), and the control bus is 1-bit (to indicate a memory read operation or a memory write operation).

Interlude: the Von Neumann Machine Model


Most computers conform to the von Neumann machine model, named after the Hungarian-American mathematician John von Neumann (1903-57). In von Neumann's model, a computer has 3 subsystems: (i) a CPU, (ii) a main memory, and (iii) an I/O system. The main memory holds the program as well as data and the computer is allowed to manipulate its own program [29]. In the von Neumann model, instructions are executed sequentially (one at a time).

In the von Neumann model a single path exists between the control unit and main-memory. This leads to the so-called "von Neumann bottleneck": since memory fetches are the slowest part of an instruction, they become the bottleneck in any computation.

Instruction Execution (Fetch-Execute-Cycle Micro-steps)


In order to execute our 3-instruction program, the control unit has to issue and coordinate a series of micro-instructions. These micro-instructions form the fetch-execute cycle. For our example we will assume that the Program Counter register (PC) already holds the address of the first instruction, namely 080H.

[28] Most control-buses are wider than a single bit; these extra bits are used to provide more sophisticated memory operations and I/O operations.
[29] This type of manipulation is not regarded as a good technique for general assembly programming.


LOAD R2, [201H]


Memory word 0000 1000 0000 (080H) holds 0001 10 10 0000 0001 (1A01H): copy the value in memory word 201H into Register 2.

Control Unit Action                         Data flow
FETCH INSTRUCTION [30]
PC to Address Bus [31]                      080H -> Address Bus
0 to Control Bus [32]                       0 -> Control Bus
Address Bus to Memory                       080H -> Memory
Control Bus to Memory                       0 -> Memory (READ)
Increment PC [33]                           080H -> PC becomes PC+1 [34] = 081H (INC)
Memory [080H] to Data Bus                   1A01H -> Data Bus
Data Bus to Instruction Register            1A01H -> Instruction Register
DECODE INSTRUCTION
IR to Instruction Decoder                   1A01H -> Instruction Decoder
Instruction Decoder to Control Unit [35]    1, 2, 201H -> Control Unit
EXECUTE INSTRUCTION [36]
Control Unit to Address Bus                 201H -> Address Bus
0 to Control Bus                            0 -> Control Bus
Address Bus to Memory                       201H -> Memory
Control Bus to Memory                       0 -> Memory (READ)
Memory [201H] to Data Bus                   0009H -> Data Bus
Data Bus to Register 2                      0009H -> Register 2

[30] The micro-steps in the Fetch and Decode phases are common for all instructions.
[31] This and the next 4 micro-steps initiate a fetch of the next instruction to be executed, which is to be found at memory address 080H. In practice a Memory Address Register (MAR) acts as an intermediate buffer for the Address; similarly a Memory Data Register (MDR) buffers data to/from the data bus.
[32] We will use 0 for a memory READ request, and 1 for a memory WRITE request.
[33] For simplicity, we will assume that the PC is capable of performing the increment internally. If not, the Control Unit would have to transfer the contents of the PC to the ALU, get the ALU to perform the increment and send the results back to the PC. All this while we are waiting for the main-memory to return the word at address 080H.
[34] We add 1 since TOY1's main-memory is word-addressed and all instructions are 1 word. If main-memory was byte-addressed we would need to add 2.
[35] The Instruction Decoder splits the instruction into the individual instruction fields OPCODE, REG and ADDRESS for interpretation by the Control Unit.
[36] The micro-steps for the execute phase actually perform the operation.


ADD R2, [202H]


Memory word 081H holds 0011 10 10 0000 0010 (3A02H): add the value in memory word 202H to Register 2.

Control Unit Action                     Data flow

FETCH INSTRUCTION
PC to Address Bus                       081H → Address Bus
0 to Control Bus                        0 → Control Bus
Address Bus to Memory                   081H → Memory
Control Bus to Memory                   0 → Memory (READ)
Increment PC                            081H → 082H (INC: PC becomes PC+1)
Memory [081H] to Data Bus               3A02H → Data Bus
Data Bus to Instruction Register        3A02H → Instruction Register

DECODE INSTRUCTION
IR to Instruction Decoder               3A02H → Instruction Decoder
Instruction Decoder to Control Unit     3, 2, 202H → Control Unit

EXECUTE INSTRUCTION
Register 2 to ALU Input Reg 1           0009H → ALU Input Reg 1
Control Unit to Address Bus             202H → Address Bus
0 to Control Bus                        0 → Control Bus
Address Bus to Memory                   202H → Memory
Control Bus to Memory                   0 → Memory (READ)
Memory [202H] to Data Bus               0006H → Data Bus
Data Bus to ALU Input Reg 2             0006H → ALU Input Reg 2
Control Unit to ALU                     ADD: 000FH → Output Register
ALU Output Reg to Register 2            000FH → Register 2

Note: The ALU performs the ADD using two's complement arithmetic.
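As a rough illustration of the ALU step, here is two's complement addition on 16-bit words in Python. The 16-bit word width is an assumption taken from the instruction encodings in these notes, and the helper names `add16` and `to_signed` are invented for the sketch.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit word


def add16(a, b):
    """Add two 16-bit words, discarding any carry out of bit 15."""
    return (a + b) & WORD_MASK


def to_signed(word):
    """Interpret a 16-bit word as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


# The ADD example above: Register 2 holds 0009H, memory word 202H holds 0006H.
print(hex(add16(0x0009, 0x0006)))  # 0xf
# Adding -1 (FFFFH) and 1 wraps round to zero.
print(add16(0xFFFF, 0x0001))       # 0
```

Because the carry out of the top bit is simply dropped, the same adder works for positive and negative operands.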


STORE R2, [200H]


Memory word 082H holds 0010 10 10 0000 0000 (2A00H): copy the value in Register 2 into memory word 200H.

Control Unit Action                     Data flow

FETCH INSTRUCTION
PC to Address Bus                       082H → Address Bus
0 to Control Bus                        0 → Control Bus
Address Bus to Memory                   082H → Memory
Control Bus to Memory                   0 → Memory (READ)
Increment PC                            082H → 083H (INC: PC becomes PC+1)
Memory [082H] to Data Bus               2A00H → Data Bus
Data Bus to Instruction Register        2A00H → Instruction Register

DECODE INSTRUCTION
IR to Instruction Decoder               2A00H → Instruction Decoder
Instruction Decoder to Control Unit     2, 2, 200H → Control Unit

EXECUTE INSTRUCTION
Register 2 to Data Bus                  000FH → Data Bus
Control Unit to Address Bus             200H → Address Bus
1 to Control Bus                        1 → Control Bus
Data Bus to Memory                      000FH → Memory
Address Bus to Memory                   200H → Memory
Control Bus to Memory                   1 → Memory (WRITE)


TOY1 Programming
How is a computer such as TOY1 programmed? We'll consider this question with some examples. Let's first define a basic Instruction Set for the TOY1 architecture:

OP Code   Assembler Format      Action
0000      STOP                  Stop Program Execution
0001      LOAD  Rn, [addr]      Rn = Memory [addr]
0010      STORE Rn, [addr]      Memory [addr] = Rn
0011      ADD   Rn, [addr]      Rn = Rn + Memory [addr]
0100      SUB   Rn, [addr]      Rn = Rn - Memory [addr]
0101      GOTO  addr            PC = addr
0110      IFZER Rn, addr        IF Rn = 0 THEN PC = addr
0111      IFNEG Rn, addr        IF Rn < 0 THEN PC = addr

Example 1: Multiplication
Given these instructions, let's write a TOY1 assembly program which will perform the following assignment: A = B × C, where A, B and C denote integers placed at memory words 100H, 101H and 102H respectively. The first point to observe with this example is that a multiply operation is not available in the TOY1 instruction set! Therefore we need to consider whether we can use other instructions to carry out the multiplication. The obvious solution is to use repeated addition:

$B \times C = \sum_{N=1}^{C} B$

Example: 12 × 3 = 12 + 12 + 12
         12 × 1 = 12
         12 × 0 = 0

Note: Only half of the possible sixteen instructions are defined. The remaining 8 will be defined later. The STOP instruction does not make use of the Register or Address fields, while the GOTO instruction does not make use of the Register field.
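To see how an assembler line from the instruction set table becomes a machine word, here is a small Python sketch. It assumes the same 4-bit OPCODE, 2-bit REG, 10-bit ADDRESS layout as the worked examples; the `encode` function and `OPCODES` table are names introduced only for this illustration.

```python
# Opcodes from the basic TOY1 instruction set table.
OPCODES = {
    "STOP": 0b0000, "LOAD": 0b0001, "STORE": 0b0010, "ADD": 0b0011,
    "SUB": 0b0100, "GOTO": 0b0101, "IFZER": 0b0110, "IFNEG": 0b0111,
}


def encode(mnemonic, reg=0, address=0):
    """Pack one instruction into a 16-bit word (assumed 4/2/10-bit fields).

    Unused fields (e.g. the register field of GOTO) are left as zero.
    """
    if not 0 <= reg <= 3:
        raise ValueError("TOY1 has registers R0 to R3")
    if not 0 <= address <= 0x3FF:
        raise ValueError("address must fit in 10 bits")
    return (OPCODES[mnemonic] << 12) | (reg << 10) | address


print(hex(encode("LOAD", 2, 0x201)))   # 0x1a01
print(hex(encode("GOTO", 0, 0x82)))    # 0x5082
```

The first result matches the word fetched from 080H in the LOAD R2, [201H] walkthrough.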


Let's first write the multiplication algorithm in Pseudo Code:

    ; Given: A, B, C
    ; Pre:   C >= 0
    ; Post:  A = B * C

    sum = 0              ; sum will accumulate the answer
    n = C                ; n will indicate how many additions remain
    loop
        exit when n <= 0
        sum = sum + B
        n = n - 1
    end loop
    A = sum

Why do we have this pre-condition?

Let's try translating (compiling) this Pseudo Code to TOY1 instructions. Since we have 4 general registers, it is worthwhile allocating frequently used variables to them, as this will lead to faster execution. Let's allocate Register 1 to hold 'sum', and Register 2 to hold 'n'.
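Before translating to TOY1, the Pseudo Code can be checked by transcribing it line for line into Python. This is only a sketch; the function name `multiply` is chosen for the example.

```python
def multiply(b, c):
    """Multiply by repeated addition, following the Pseudo Code step by step."""
    total = 0          # sum = 0
    n = c              # n = C
    while True:        # loop
        if n <= 0:     #     exit when n <= 0
            break
        total += b     #     sum = sum + B
        n -= 1         #     n = n - 1
    return total       # A = sum


print(multiply(12, 3))   # 36
print(multiply(12, 0))   # 0
print(multiply(12, -2))  # 0
```

The last call hints at the answer to the question about the pre-condition: for a negative C the loop exits immediately, so the result is 0 rather than B × C.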

sum = 0
The first assignment sum = 0 yields our first problem. How do we get zero (or any constant) into a Register? The only instruction that we can use to set a register is LOAD Rn, [addr]. Therefore we must reserve a memory word and pre-set it to zero before program execution begins. Let's place zero in memory word 200H. Now to perform sum = 0 we have:

    LOAD R1, [200H]    ; sum = 0

Let's place instructions starting at memory word 80H:

Address   Assembler Instruction   Comment
80H       LOAD R1, [200H]         ; sum = 0
200H      0                       ; holds zero

n=C
The next statement is n = C. This is easy to translate:

81H       LOAD R2, [102H]         ; n = C

Note: Try to comment each line of an assembly-language program.


exit when n <= 0


What does the loop exit when n <= 0 statement mean in TOY1 terms? Let's consider a simpler example first: loop exit when n = 0. On the TOY1 this statement has a simpler translation, namely:

Pseudo Code              Address   Instruction
loop                     L0        IFZER R2, Ly
    exit when n = 0      L1        instructions
    instructions         ...       ...
end loop                 Lx        GOTO L0
                         Ly

Note: GOTO alters the Program Counter register, thereby causing an unconditional branch in the order of program execution. IFZER alters the Program Counter only if the contents of the specified Register are zero.

To handle exit when n <= 0 we need to skip to the end of the loop if R2 is zero or if R2 is negative:

Pseudo Code              Address   Instruction
loop                     L0        IFZER R2, Ly
    exit when n <= 0     L1        IFNEG R2, Ly
    instructions         L2        instructions
                         ...       ...
end loop                 Lx        GOTO L0
                         Ly

For our example we now have the following assembly program:

Address   Assembler Instruction   Comment
80H       LOAD  R1, [200H]        ; sum = 0
81H       LOAD  R2, [102H]        ; n = C
82H       IFZER R2, Ly            ; exit when n <= 0
83H       IFNEG R2, Ly            ; we will define Ly when we can
.....     ..................      ...............
Lx        GOTO  82H               ; end loop
Ly        ....
.....
100H      A                       ; holds A
101H      B                       ; holds B
102H      C                       ; holds C
.....
200H      0                       ; holds 0


sum = sum + B, n = n - 1


Let's continue with sum = sum + B. This is easy, namely:

84H       ADD R1, [101H]          ; sum = sum + B

For n = n - 1 we will assume that location 201H is pre-set to the constant 1:

85H       SUB R2, [201H]          ; n = n - 1
201H      1                       ; Holds the value 1

end loop, A = sum


Adding STORE R1, [100H] for A = sum and a STOP instruction we arrive at the final program:

Addr   Machine Instruction    Assembler Instruction   Comment
80H    0001 0110 0000 0000    LOAD  R1, [200H]        ; sum = 0
81H    0001 1001 0000 0010    LOAD  R2, [102H]        ; n = C
82H    0110 1000 1000 0111    IFZER R2, 87H           ; exit when n = 0
83H    0111 1000 1000 0111    IFNEG R2, 87H           ; exit when n < 0
84H    0011 0101 0000 0001    ADD   R1, [101H]        ; sum = sum + B
85H    0100 1010 0000 0001    SUB   R2, [201H]        ; n = n - 1
86H    0101 0000 1000 0010    GOTO  82H               ; end loop
87H    0010 0101 0000 0000    STORE R1, [100H]        ; A = sum
88H    0000 0000 0000 0000    STOP                    ; End of program
...
100H   initial value of A     A                       ; Holds A
101H   initial value of B     B                       ; Holds B
102H   initial value of C     C                       ; Holds C
...
200H   0000 0000 0000 0000    0                       ; Holds Zero
201H   0000 0000 0000 0001    1                       ; Holds One
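One way to convince yourself that the final program works is to run its machine words through a tiny simulator. The Python sketch below is an illustration, not a definitive model of TOY1: it assumes 16-bit words with wrap-around arithmetic, treats bit 15 as the sign bit for IFNEG, and uses the made-up names `run_toy1` and `multiply_on_toy1`.

```python
def run_toy1(memory, pc=0x80, max_steps=100_000):
    """Fetch, decode and execute TOY1 words in `memory` (dict: address -> word)."""
    regs = [0, 0, 0, 0]                       # R0..R3
    for _ in range(max_steps):
        word = memory.get(pc, 0)              # fetch
        pc += 1                               # increment PC
        opcode = (word >> 12) & 0xF           # decode (assumed 4/2/10-bit fields)
        reg = (word >> 10) & 0x3
        addr = word & 0x3FF
        if opcode == 0x0:                     # STOP
            return regs
        elif opcode == 0x1:                   # LOAD
            regs[reg] = memory.get(addr, 0)
        elif opcode == 0x2:                   # STORE
            memory[addr] = regs[reg]
        elif opcode == 0x3:                   # ADD
            regs[reg] = (regs[reg] + memory.get(addr, 0)) & 0xFFFF
        elif opcode == 0x4:                   # SUB
            regs[reg] = (regs[reg] - memory.get(addr, 0)) & 0xFFFF
        elif opcode == 0x5:                   # GOTO
            pc = addr
        elif opcode == 0x6:                   # IFZER
            if regs[reg] == 0:
                pc = addr
        elif opcode == 0x7:                   # IFNEG (sign bit set)
            if regs[reg] & 0x8000:
                pc = addr
        else:
            raise ValueError(f"undefined opcode {opcode:04b}")
    raise RuntimeError("step limit reached")


# Machine Instruction column of the final program, 80H to 88H, in hexadecimal.
PROGRAM = [0x1600, 0x1902, 0x6887, 0x7887, 0x3501, 0x4A01, 0x5082, 0x2500, 0x0000]


def multiply_on_toy1(b, c):
    """Load the program and data, run it, and return memory word 100H (A)."""
    memory = {0x80 + i: word for i, word in enumerate(PROGRAM)}
    memory.update({0x100: 0, 0x101: b, 0x102: c, 0x200: 0, 0x201: 1})
    run_toy1(memory)
    return memory[0x100]


print(multiply_on_toy1(12, 3))  # 36
```

The step limit is only a safety net for experimenting with programs that never reach STOP.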


Multiplication (An improvement)


The multiply program will work correctly but can be improved. Consider 3 × 1000: if C is greater than B then it will be faster to compute 1000 × 3. How can we adapt our program to handle this case? Consider and work through the following solution:

    sum = 0
    if B <= C then
        big = C, n = B
    else                 ; C < B
        big = B, n = C
    end if
    loop
        exit when n <= 0
        sum = sum + big
        n = n - 1
    end loop
    A = sum

Addr   Instruction           Comment
80H    LOAD  R1, [200H]      ; sum = 0
81H    LOAD  R0, [102H]      ; if C < B
82H    SUB   R0, [101H]      ; then ELSE
83H    IFNEG R0, 88H
84H    LOAD  R0, [102H]      ; then
85H    STORE R0, [202H]      ; big = C
86H    LOAD  R2, [101H]      ; n = B
87H    GOTO  8BH
88H    LOAD  R0, [101H]      ; else
89H    STORE R0, [202H]      ; big = B
8AH    LOAD  R2, [102H]      ; n = C
8BH    etc                   ; loop
...    ....
202H   ...                   ; Holds big
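The saving can be made concrete by counting loop iterations. The Python sketch below follows the improved Pseudo Code; the function name `multiply_counting` is invented here, and the sketch assumes both B and C are non-negative, since the loop counter n may now be taken from either operand.

```python
def multiply_counting(b, c):
    """Return (product, number of additions) using the smaller operand as the count.

    Assumes b >= 0 and c >= 0.
    """
    if b <= c:
        big, n = c, b      # then: big = C, n = B
    else:
        big, n = b, c      # else: big = B, n = C
    total = 0
    additions = 0
    while n > 0:           # same test as: exit when n <= 0
        total += big
        n -= 1
        additions += 1
    return total, additions


print(multiply_counting(3, 1000))  # (3000, 3)
print(multiply_counting(1000, 3))  # (3000, 3)
```

With B = 3 and C = 1000 the original program would go round its loop 1000 times; after the swap only 3 additions are needed whichever way round the operands are given.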

Example 2: Vector Sum


Write a sequence of TOY1 instructions (and constants) to sum 100 integers stored consecutively starting at memory word 200H. The sum is to be left in Register 0. Again, let's first write the Pseudo Code for the problem:

    sum = 0
    n = 100
    addr = 200H
    loop
        exit when n <= 0
        sum = sum + RAM [addr]
        addr = addr + 1
        n = n - 1
    end loop

Looking at this code, we find that the main "difficulty" is how to perform sum = sum + RAM [addr]. There doesn't appear to be any way of accessing memory words based on a "variable". We therefore need to extend TOY1 to include an indirect addressing capability.

Note: In fact there is a way of writing this program without resorting to indirect memory access instructions. Can you think what the way might be?


Indirect Addressing Instructions for TOY1


OP Code   Assembler Format      Action
1001      LOAD  Rn, [Rm]        Rn = Memory [Rm]
1010      STORE Rn, [Rm]        Memory [Rm] = Rn
1011      ADD   Rn, [Rm]        Rn = Rn + Memory [Rm]
1100      SUB   Rn, [Rm]        Rn = Rn - Memory [Rm]

A second Instruction Format is also needed for these instructions. We will use the following:

OPCODE    REGn      REGm      Unused
4 bits    2 bits    2 bits    8 bits

Example: Given this format the TOY1 instruction ADD R1, [R2] would be coded as 1011 01 10 0000 0000 in binary or B600H in hexadecimal.
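The second instruction format can be checked with a short Python sketch that packs the fields in the order given above. The names `INDIRECT_OPCODES` and `encode_indirect` are introduced purely for this illustration.

```python
# Opcodes of the indirect addressing instructions.
INDIRECT_OPCODES = {"LOAD": 0b1001, "STORE": 0b1010, "ADD": 0b1011, "SUB": 0b1100}


def encode_indirect(mnemonic, rn, rm):
    """Pack OPCODE (4 bits), REGn (2 bits), REGm (2 bits), 8 unused zero bits."""
    if not (0 <= rn <= 3 and 0 <= rm <= 3):
        raise ValueError("TOY1 has registers R0 to R3")
    return (INDIRECT_OPCODES[mnemonic] << 12) | (rn << 10) | (rm << 8)


word = encode_indirect("ADD", 1, 2)       # ADD R1, [R2]
print(f"{word:016b}", f"{word:04X}H")     # 1011011000000000 B600H
```

The result agrees with the hand-coded example: 1011 01 10 0000 0000, or B600H.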

Vector Sum Example Contd.


The vector sum example is now straightforward. This program will be placed at 0FH onwards, and the registers allocated as follows: R0 for 'sum', R1 for 'n', R2 for 'addr'.

    sum = 0
    n = 100
    addr = 200H
    loop
        exit when n <= 0
        sum = sum + RAM [addr]
        addr = addr + 1
        n = n - 1
    end loop
    ; Result in Register R0

Addr   Contents              Comment
0      0                     ; Holds 0
1      1                     ; Holds 1
2      100                   ; Holds 100
3      200H                  ; Holds 200H
...
0FH    LOAD  R0, [0]         ; sum = 0
10H    LOAD  R1, [2]         ; n = 100
11H    LOAD  R2, [3]         ; addr = 200H
12H    IFZER R1, 18H         ; exit when n <= 0
13H    IFNEG R1, 18H
14H    ADD   R0, [R2]        ; sum = sum + ...
15H    ADD   R2, [1]         ; addr = addr + 1
16H    SUB   R1, [1]         ; n = n - 1
17H    GOTO  12H             ; end loop
18H    STOP

Note: Memory [Rm] denotes the contents of the memory word whose address is given by Register m.

Note: Is having more than one instruction format a good idea?

Note: For these instructions 8 bits are unused. A more advanced CPU could allow two address-indirect instructions to be encoded into one word, thus skipping one instruction fetch.
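The vector sum program can be traced with a small symbolic interpreter. The Python sketch below is a simplification under stated assumptions: instructions are kept as tuples in a separate `program` table rather than as machine words in main memory, `ADD_IND` is a made-up name for the indirect ADD Rn, [Rm], arithmetic wraps at 16 bits, and the test data 1 to 100 is chosen only for illustration.

```python
def run(program, memory, pc, max_steps=10_000):
    """Interpret symbolic TOY1 instructions (mnemonic, register, operand)."""
    regs = [0, 0, 0, 0]
    for _ in range(max_steps):
        op, rn, operand = program[pc]
        pc += 1
        if op == "STOP":
            return regs
        elif op == "LOAD":                      # Rn = Memory [addr]
            regs[rn] = memory[operand]
        elif op == "ADD":                       # Rn = Rn + Memory [addr]
            regs[rn] = (regs[rn] + memory[operand]) & 0xFFFF
        elif op == "SUB":                       # Rn = Rn - Memory [addr]
            regs[rn] = (regs[rn] - memory[operand]) & 0xFFFF
        elif op == "ADD_IND":                   # Rn = Rn + Memory [Rm]
            regs[rn] = (regs[rn] + memory[regs[operand]]) & 0xFFFF
        elif op == "GOTO":
            pc = operand
        elif op == "IFZER":
            if regs[rn] == 0:
                pc = operand
        elif op == "IFNEG":
            if regs[rn] & 0x8000:
                pc = operand
        else:
            raise ValueError(f"unknown instruction {op}")
    raise RuntimeError("step limit reached")


PROGRAM = {
    0x0F: ("LOAD", 0, 0),      # sum = 0
    0x10: ("LOAD", 1, 2),      # n = 100
    0x11: ("LOAD", 2, 3),      # addr = 200H
    0x12: ("IFZER", 1, 0x18),  # exit when n <= 0
    0x13: ("IFNEG", 1, 0x18),
    0x14: ("ADD_IND", 0, 2),   # sum = sum + RAM [addr]
    0x15: ("ADD", 2, 1),       # addr = addr + 1
    0x16: ("SUB", 1, 1),       # n = n - 1
    0x17: ("GOTO", 0, 0x12),   # end loop
    0x18: ("STOP", 0, 0),
}

memory = {0: 0, 1: 1, 2: 100, 3: 0x200}                    # constants
memory.update({0x200 + i: i + 1 for i in range(100)})      # vector 1..100

regs = run(PROGRAM, memory, pc=0x0F)
print(regs[0])  # 5050
```

Note how Register 2 plays the role of the "variable" address: it is incremented on each pass, so the same ADD instruction reads a different memory word every time round the loop.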
