COMPUTER ORGANIZATION & MICROPROCESSORS
(R20A1201)
L/T/P/C: 3/-/-/3
COURSE OBJECTIVES:
Students should be able:
1. To understand the basic components of computers and the architecture of the 8086 microprocessor.
2. To learn to classify the instruction formats and various addressing modes of the 8086 microprocessor.
3. To know how to represent data and understand how computations are performed at machine level.
4. To have knowledge of memory organization and I/O organization.
5. To understand parallelism, both in terms of single and multiple processors.
UNIT - I
Digital Computers: Introduction, Block diagram of Digital Computer, Definition of Computer Organization, Computer Design and Computer Architecture.
Basic Computer Organization and Design: Instruction codes, Computer Registers, Computer Instructions, Timing and Control, Instruction cycle, Memory Reference Instructions, Input-Output and Interrupt, Complete Computer Description.
Micro Programmed Control: Control memory, Address sequencing, micro program example, design of control unit.
UNIT - II
Central Processing Unit: The 8086 Processor Architecture, Register organization, Physical memory organization, General Bus Operation, I/O Addressing Capability, Special Processor Activities, Minimum and Maximum mode system and timings.
8086 Instruction Set and Assembler Directives: Machine language instruction formats, Addressing modes, Instruction set of 8086, Assembler directives and operators.
UNIT - III
Assembly Language Programming with 8086: Machine level programs, Machine coding the programs, Programming with an assembler, Assembly Language example programs. Stack structure of 8086, Interrupts and Interrupt service routines, Interrupt cycle of 8086, Interrupt programming, Passing parameters to procedures, Macros, Timings and Delays.
UNIT - IV
Computer Arithmetic: Introduction, Addition and Subtraction, Multiplication Algorithms,
Division Algorithms, Floating - point Arithmetic operations.
Input-Output Organization: Peripheral Devices, Input-Output Interface, Asynchronous data transfer, Modes of Transfer, Priority Interrupt, Direct Memory Access, Input-Output Processor (IOP), Intel 8089 IOP.
UNIT - V
Memory Organization: Memory Hierarchy, Main Memory, Auxiliary Memory, Associative Memory, Cache Memory.
Pipeline and Vector Processing: Parallel Processing, Pipelining, Arithmetic Pipeline,
Instruction Pipeline, RISC Pipeline, Vector Processing, Array Processors.
TEXT BOOKS:
1. Computer Organization and Architecture, William Stallings, 9th Edition, Pearson.
2. Microprocessors and Interfacing, D V Hall, SSSP Rao, 3rd Edition, McGraw Hill India Education Private Ltd.
REFERENCE BOOKS:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky: Computer Organization, 5th Edition, Tata McGraw Hill, 2002.
2. David A. Patterson, John L. Hennessy: Computer Organization and Design - The Hardware/Software Interface, ARM Edition, 4th Edition, Elsevier, 2009.
COURSE OUTCOMES:
Students will be able:
To identify the basic components and the design of CPU, ALU and Control Unit.
To interpret memory hierarchy and describe its impact on computer cost/performance.
To express instruction-level parallelism and pipelining for high-performance processor design.
To represent the instruction set, instruction formats and addressing modes of 8086.
To write assembly language programs to solve problems.
UNIT - I
Lecture Notes
Digital Computers: Introduction, Block diagram of Digital Computer, Definition of Computer Organization, Computer Design and Computer Architecture.
Basic Computer Organization and Design: Instruction codes, Computer Registers, Computer Instructions, Timing and Control, Instruction cycle, Memory Reference Instructions, Input-Output and Interrupt, Complete Computer Description.
Micro Programmed Control: Control memory, Address sequencing, micro program example, design of control unit.
Introduction: A digital computer can be considered as a digital system that performs various computational tasks. The first electronic digital computer was developed in the late 1940s and was used primarily for numerical computations. By convention, digital computers use the binary number system, which has two digits: 0 and 1. A binary digit is called a bit.
A computer system is subdivided into two functional entities: hardware and software. The hardware consists of all the electronic components and electromechanical devices that comprise the physical entity of the device. The software of the computer consists of the instructions and data that the computer manipulates to perform various data-processing tasks.
The Central Processing Unit (CPU) contains an arithmetic and logic unit for manipulating data, a number
of registers for storing data, and a control circuit for fetching and executing instructions.
The memory unit of a digital computer contains storage for instructions and data.
The Random Access Memory (RAM) is used for real-time processing of data.
The input-output devices receive inputs from the user and display the final results to the user.
The input-output devices connected to the computer include the keyboard, mouse, terminals, magnetic disk drives, and other communication devices.
Computer Organization:
Computer Organization is the realization of what is specified by the computer architecture. It deals with how operational attributes are linked together to meet the requirements specified by the computer architecture. Some organizational attributes are hardware details, control signals, and peripherals.
EXAMPLE: Say you are in a company that manufactures cars. The design and all low-level details of the car come under computer architecture (the abstract, programmer's view), while making its parts piece by piece and connecting together the different components of that car, keeping the basic design in mind, comes under computer organization (physical and visible).
Computer Architecture:
Computer Architecture deals with the operational attributes of the computer, or the processor to be specific. It deals with details like physical memory, the ISA (Instruction Set Architecture) of the processor, the number of bits used to represent data types, input-output mechanisms, and techniques for addressing memory.
Computer Architecture vs. Computer Organization:
1. Computer Architecture is concerned with the way hardware components are connected together to form a computer system. Computer Organization is concerned with the structure and behaviour of a computer system as seen by the user.
2. Architecture acts as the interface between hardware and software. Organization deals with the components of a connection in a system.
3. Computer Architecture helps us to understand the functionalities of a system. Computer Organization tells us how exactly all the units in the system are arranged and interconnected.
4. A programmer can view architecture in terms of instructions, addressing modes and registers, whereas Organization expresses the realization of architecture.
5. While designing a computer system, architecture is considered first. An organization is done on the basis of architecture.
6. Computer Architecture deals with high-level design issues. Computer Organization deals with low-level design issues.
7. Architecture involves logic (instruction sets, addressing modes, data types, cache optimization). Organization involves physical components (circuit design, adders, signals, peripherals).
Instruction Codes
Computer instructions are the basic components of a machine language program. They are also known as macro operations, since each one is comprised of sequences of micro operations. Each instruction initiates a sequence of micro operations that fetch operands from registers or memory, possibly perform arithmetic, logic, or shift operations, and store results in registers or memory.
Instructions are encoded as binary instruction codes. Each instruction code contains an operation code, or opcode, which designates the overall purpose of the instruction (e.g. add, subtract, move, input, etc.). The number of bits allocated for the opcode determines how many different instructions the architecture supports.
In addition to the opcode, many instructions also contain one or more operands, which indicate where in registers or memory the data required for the operation is located. For example, an add instruction requires two operands, and a not instruction requires one.
 15      12 11       6 5        0
+----------+----------+----------+
|  Opcode  | Operand  | Operand  |
+----------+----------+----------+
The opcode and operands are most often encoded as unsigned binary numbers in order to minimize the number
of bits used to store them. For example, a 4-bit opcode encoded as a binary number could represent up to 16
different operations.
The control unit is responsible for decoding the opcode and operand bits in the instruction register, and then
generating the control signals necessary to drive allother hardware in the CPU to perform the sequence of
microoperations that comprise the instruction.
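As a sketch of the idea, the 16-bit layout in the figure above (a 4-bit opcode and two 6-bit operand fields) can be packed and unpacked with simple bit operations. The field widths follow the figure; the specific opcode values are invented for illustration.

```python
# Sketch: encoding/decoding a hypothetical 16-bit instruction code with a
# 4-bit opcode (bits 15-12) and two 6-bit operand fields (bits 11-6, 5-0).

def encode(opcode, op1, op2):
    assert 0 <= opcode < 16 and 0 <= op1 < 64 and 0 <= op2 < 64
    return (opcode << 12) | (op1 << 6) | op2

def decode(word):
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F

instr = encode(0b0011, 5, 42)        # opcode 3 with operands 5 and 42
assert decode(instr) == (3, 5, 42)   # round-trips through the 16-bit word
```

A 4-bit opcode field gives at most 2^4 = 16 distinct operations, which is exactly the counting argument made above.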
Computer Instructions
All Basic Computer instruction codes are 16 bits wide. There are 3 instruction code formats:
Memory-reference instructions take a single memory address as an operand, and have the format:
 15  14    12  11            0
+---+--------+----------------+
| I |   OP   |    Address     |
+---+--------+----------------+
If I = 0, the instruction uses direct addressing. If I = 1, addressing is indirect. How many memory-reference instructions can exist?
Register-reference instructions operate solely on the AC register, and have the following format:
 15  14    12  11            0
+---+--------+----------------+
| 0 |  111   |      OP        |
+---+--------+----------------+
How many register-reference instructions can exist? How many memory-reference instructions can coexist with
register-reference instructions?
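The two formats above can be told apart by inspecting bits 14-12: opcode 111 is reserved for register-reference (and, with I = 1, input-output) instructions, leaving seven opcodes (000-110) for memory-reference instructions. A minimal decoder sketch, with the example words chosen only for illustration:

```python
# Sketch: decoding the Basic Computer's 16-bit instruction formats.
# Memory-reference:   bit 15 = I, bits 14-12 = opcode (000-110), bits 11-0 = address.
# Register-reference: bits 15-12 = 0111, bits 11-0 select the operation.

def decode(word):
    i = (word >> 15) & 1
    op = (word >> 12) & 0b111
    low12 = word & 0xFFF
    if op != 0b111:
        return ("memory-reference", "indirect" if i else "direct", op, low12)
    elif i == 0:
        return ("register-reference", low12)
    else:
        return ("input-output", low12)

assert decode(0x8123)[1] == "indirect"            # I=1, opcode 000
assert decode(0x7800)[0] == "register-reference"  # 0111 in bits 15-12
```

So 7 memory-reference opcodes (each usable with direct or indirect addressing) coexist with register-reference instructions, because opcode 111 is never used for a memory reference.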
Control unit design and implementation can be done by two general methods:
A hardwired control unit is designed from scratch using traditional digital logic design techniques to
produce a minimal, optimized circuit. In other words, the control unit is like an ASIC (application-
specific integrated circuit).
A micro-programmed control unit is built from some sort of ROM. The desired control signals are
simply stored in the ROM, and retrieved in sequence to drive the micro operations needed by a
particular instruction.
Instruction Cycle
In this chapter, we examine the sequences of micro operations that the Basic Computer goes through for each
instruction. Here, you should begin to understand how the required control signals for each state of the CPU are
determined, and how they are generated by the control unit.
The CPU performs a sequence of micro operations for each instruction. The sequence for each instruction of the
Basic Computer can be refined into 4 abstract phases
1. Fetch instruction
2. Decode
3. Fetch operand
4. Execute
Program execution can be represented as a top-down design:
1. Program execution
a. Instruction 1
i. Fetch instruction
ii. Decode
iii. Fetch operand
iv. Execute
b. Instruction 2
i. Fetch instruction
ii. Decode
iii. Fetch operand
iv. Execute
c. Instruction 3 ...
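The four abstract phases above can be sketched as a toy interpreter loop. The instruction names and memory layout here are invented for illustration, not real Basic Computer or 8086 encodings.

```python
# Sketch: the fetch / decode / fetch-operand / execute cycle as a loop
# over a tiny hypothetical instruction memory.

memory = {0: ("LOAD", 10), 1: ("ADD", 11), 2: ("HALT", None),
          10: 5, 11: 7}                    # operands live at addresses 10 and 11
pc, ac, running = 0, 0, True

while running:
    opcode, addr = memory[pc]; pc += 1     # 1. fetch instruction  2. decode
    operand = memory.get(addr)             # 3. fetch operand (if any)
    if opcode == "LOAD":                   # 4. execute
        ac = operand
    elif opcode == "ADD":
        ac += operand
    elif opcode == "HALT":
        running = False

assert ac == 12                            # 5 + 7 accumulated in AC
```

Each pass through the loop is one instruction cycle; a real control unit sequences the same four phases with micro operations rather than Python statements.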
The Basic Computer I/O consists of a simple terminal with a keyboard and a printer/monitor.
The keyboard is connected serially (1 data wire) to the INPR register. INPR is a shift register capable of shifting
in external data from the keyboard one bit at a time. INPR outputs are connected in parallel to the ALU.
              Shift enable
                   |
                   v
+----------+  1  +--------+
| Keyboard |--/->|  INPR  |<--- serial I/O clock
+----------+     +--------+
                     |
                     / 8
                     |
                     v
                 +--------+
                 |  ALU   |
                 +--------+
                     |
                     / 16
                     v
                 +--------+
                 |   AC   |<--- CPU master clock
                 +--------+
I/O Operations:
Since input and output devices are not under the full control of the CPU (I/O events are asynchronous), the
CPU must somehow be told when an input device has new input ready to send, and an output device is ready
to receive more output. The FGI flip-flop is set to 1 after a new character is shifted into INPR. This is done by
the I/O interface, not by the control unit. This is an example of an asynchronous input event (not synchronized
with or controlled by the CPU).
The FGI flip-flop must be cleared after transferring the INPR contents to AC. This must be done as a micro operation controlled by the CU, so we must include it in the CU design. The FGO flip-flop is set to 1 by the I/O interface after the terminal has finished displaying the last character sent. It must be cleared by the CPU after transferring a character into OUTR. Since the keyboard controller only sets FGI and the CPU only clears it, a JK flip-flop is convenient:
                        +-----------+
Keyboard controller --->| J       Q |------>
                 \      |           |
                  OR -->|>   FGI    |
                 /      |           |
CPU ------------------->| K         |
                        +-----------+
How do we control the CK input on the FGI flip-flop? (Assume leading-edge triggering.)
There are two common methods for detecting when I/O devices are ready, namely software polling and
interrupts. These two methods are discussed in the following sections.
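Software polling of the FGI/FGO handshake can be sketched as follows. A stand-in device object plays the role of the asynchronous keyboard controller (which sets FGI); the loop plays the role of the CPU (which tests and clears it). All names here are illustrative.

```python
# Sketch: software polling of the FGI flag described above.
# Real hardware sets FGI asynchronously; here a fake device does it.

import random

class Terminal:
    def __init__(self, text):
        self.pending = list(text)
        self.fgi = 0              # set by the "keyboard controller" when INPR is full
        self.inpr = 0

    def tick(self):               # stands in for the asynchronous I/O interface
        if self.pending and self.fgi == 0 and random.random() < 0.5:
            self.inpr = ord(self.pending.pop(0))
            self.fgi = 1

term = Terminal("hi")
received = []
while len(received) < 2:
    term.tick()
    if term.fgi == 1:             # CPU polls FGI ...
        received.append(chr(term.inpr))
        term.fgi = 0              # ... and clears it after transferring INPR to AC

assert "".join(received) == "hi"
```

Note that the device never overwrites INPR while FGI is still set, which is exactly the purpose of the handshake flag.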
Micro Programmed Control:
Control Memory:
Control memory is a random access memory (RAM) consisting of addressable storage registers. It is primarily used in mini and mainframe computers. It is used as temporary storage for data. Access to control memory data requires less time than access to main memory; this speeds up CPU operation by reducing the number of memory references for data storage and retrieval. Access is performed as part of a control section sequence while the master clock oscillator is running. The control memory addresses are divided into two groups: a task mode and an executive (interrupt) mode.
Addressing words stored in control memory is via the address select logic for each of the register
groups. There can be up to five register groups in control memory. These groups select a register for fetching
data for programmed CPU operation or for maintenance console or equivalent display or storage of data via
maintenance console or equivalent. During programmed CPU operations, these registers are accessed directly by
the CPU logic. Data routing circuits are used by control memory to interconnect the registers used in control
memory. Some of the registers contained in a control memory that operate in the task and the executive modes include the following:
• Accumulators
• Indexes
• Monitor clock status indicating registers
• Interrupt data registers
• The control variables can be represented by a string of 1's and 0's called a control word.
• A micro-programmed control unit is a control unit whose binary control variables are stored in memory.
• Dynamic microprogramming permits a micro program to be loaded and uses a writable control memory.
• A computer with a micro-programmed control unit will have two separate memories: a main memory and a control memory.
• The micro program consists of microinstructions that specify various internal control signals for execution of register micro operations.
• The control memory address register specifies the address of the microinstruction.
• The control data register holds the microinstruction read from memory.
• The microinstruction contains a control word that specifies one or more micro operations for the data processor.
• The location of the next microinstruction may, or may not, be the next in sequence.
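The points above can be sketched as a tiny micro-programmed control unit: the control memory is a table of control words, the CAR selects one, and each control word carries both the signals to assert and the next CAR value. The signal names are invented for illustration.

```python
# Sketch: a micro-programmed control unit as a table of control words.
# Each entry holds (control signals asserted, next CAR value).

CONTROL_MEMORY = [
    ({"mem_read", "load_ir"}, 1),   # fetch
    ({"decode"},              2),   # decode
    ({"alu_add", "load_ac"},  0),   # execute, then back to fetch
]

car = 0                             # control address register
cdr_trace = []                      # what the control data register held each step
for _ in range(6):                  # run two full instruction cycles
    signals, nxt = CONTROL_MEMORY[car]
    cdr_trace.append(sorted(signals))
    car = nxt

assert cdr_trace[0] == ["load_ir", "mem_read"]
assert car == 0                     # back at the fetch microinstruction
```

Replacing the table contents changes the machine's behaviour without redesigning any logic, which is the appeal of dynamic microprogramming with a writable control memory.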
Address Sequencing:
Each machine instruction is executed through the application of a sequence of microinstructions.
Clearly, we must be able to sequence these; the collection of microinstructions which implements a
particular machine instruction is called a routine.
The MCU typically determines the address of the first microinstruction which implements a machine instruction based on that instruction's opcode. Upon machine power-up, the CAR should contain the address of the first microinstruction to be executed.
The MCU must be able to execute microinstructions sequentially (e.g., within routines), but must also be able to "branch" to other microinstructions as required; hence, the need for a sequencer.
The microinstructions executed in sequence can be found sequentially in the CM, or can be found by
branching to another location within the CM. Sequential retrieval of microinstructions can be done by
simply incrementing the current CAR contents; branching requires determining the desired CW address,
and loading that into the CAR.
CAR - Control Address Register.
Control ROM - control memory (CM); holds the control words (CWs).
opcode - opcode field from the machine instruction.
mapping logic - hardware which maps the opcode into a microinstruction address.
branch logic - determines how the next CAR value will be chosen from all the various possibilities.
multiplexers - implement the choice of branch logic for the next CAR value.
incrementer - generates CAR + 1 as a possible next CAR value.
SBR - subroutine register; used to hold the return address for subroutine-call branch operations.
Subroutine branches are helpful to have at the micro program level. Many routines contain identical sequences of microinstructions; putting them into subroutines allows those routines to be shorter, thus saving memory. Mapping of opcodes to microinstruction addresses can be done very simply. When the CM is designed, a "required" length is determined for the machine instruction routines (i.e., the length of the longest one). This is rounded up to the next power of 2, yielding a value k such that 2^k microinstructions will be sufficient to implement any routine.
Alternately, the n-bit opcode value can be used as the "address" input of a 2^n x M ROM; the contents of the selected "word" in the ROM will be the desired M-bit CAR address for the beginning of the routine implementing that instruction. (This technique allows for variable-length routines in the CM.) We choose between all the possible ways of generating CAR values by feeding them all into a multiplexer bank, and implementing special branch logic which will determine how the muxes will pass the next address on to the CAR.
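The ROM-based mapping just described is simply a lookup: the n-bit opcode indexes a 2^n-entry table whose entries are routine start addresses in the CM. A sketch with made-up addresses:

```python
# Sketch: mapping an n-bit opcode to the CM address of its routine via a
# 2^n-entry lookup ROM, allowing variable-length routines as described above.

N = 3                                         # assume a 3-bit opcode
MAPPING_ROM = [0, 4, 7, 12, 12, 12, 12, 12]   # start address per opcode
                                              # (opcodes 4-7 share one routine here)

def first_microaddress(opcode):
    assert 0 <= opcode < 2 ** N
    return MAPPING_ROM[opcode]                # this value is loaded into the CAR

assert first_microaddress(0b010) == 7
```

Because the table entries are arbitrary, routine lengths need not be equal, unlike the fixed 2^k-per-routine scheme.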
UNIT-II
Central Processing Unit
Control Flags: Control flags control the operations of the execution unit.
Following is the list of control flags:
1. Trap flag − It is used for single step control and allows the user to execute
one instruction at a time for debugging. If it is set, then the program can be
run in a single step mode.
2. Interrupt flag − It is an interrupt enable/disable flag, i.e. used to
allow/prohibit the interruption of a program. It is set to 1 for interrupt enabled
condition and set to 0 for interrupt disabled condition.
3. Direction flag − It is used in string operations. As the name suggests, when it is set, string bytes are accessed from the higher memory address to the lower memory address, and vice versa.
3. General purpose registers: There are 8 general purpose registers, i.e., AH, AL, BH, BL, CH, CL, DH, and DL. These registers can be used individually to store 8-bit data and can be used in pairs to store 16-bit data. The valid register pairs are AH and AL, BH and BL, CH and CL, and DH and DL; these pairs are referred to as AX, BX, CX, and DX respectively.
a) AX register − It is also known as accumulator register. It is used to store
operands for arithmetic operations.
b) BX register − It is used as a base register. It is used to store the starting base
address of the memory area within the data segment.
c) CX register − It is referred to as counter. It is used in loop instruction to store
the loop counter.
d) DX register − This register is used to hold I/O port address for I/O
instruction.
4. Stack Pointer Register: It is a 16-bit register, which holds the address from the start
of the segment to the memory location, where a word was most recently stored on the
stack.
BIU (Bus Interface Unit)
The BIU takes care of all data and address transfers on the buses for the EU, like sending addresses, fetching instructions from the memory, reading data from the ports and the memory, as well as writing data to the ports and the memory. The EU has no direct connection with the system buses, so this is possible only through the BIU. The EU and BIU are connected by the internal bus.
It has the following functional parts −
Instruction queue − The BIU contains the instruction queue. The BIU gets up to 6 bytes of the next instructions and stores them in the instruction queue. When the EU executes instructions and is ready for its next instruction, it simply reads the instruction from this instruction queue, resulting in increased execution speed.
Fetching the next instruction while the current instruction executes is called pipelining.
Segment registers − The BIU has 4 segment registers, i.e. CS, DS, SS & ES. They hold the addresses of instructions and data in memory, which are used by the processor to access memory locations. The BIU also contains one pointer register, IP, which holds the address of the next instruction to be executed by the EU.
o CS − It stands for Code Segment. It is used for addressing a memory location
in the code segment of the memory, where the executable program is stored.
o DS − It stands for Data Segment. It consists of data used by the program andis
accessed in the data segment by an offset address or the content of other
register that holds the offset address.
o SS − It stands for Stack Segment. It handles memory to store data and
addresses during execution.
o ES − It stands for Extra Segment. ES is an additional data segment, which is used by string operations to hold the extra destination data.
Instruction pointer − It is a 16-bit register used to hold the address of the next
instruction to be executed.
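The 6-byte prefetch queue can be modeled as a bounded FIFO that the BIU fills while the EU drains it. This is only a behavioural sketch (no bus-cycle timing, no queue flush on jumps); the class and method names are invented.

```python
# Sketch: the BIU's 6-byte instruction queue as a bounded FIFO.

from collections import deque

class BIU:
    def __init__(self, code):
        self.code = code                  # byte stream of the program
        self.ip = 0                       # next byte to prefetch from memory
        self.queue = deque()              # 8086 queue holds up to 6 bytes

    def prefetch(self):                   # runs whenever the bus is free
        while len(self.queue) < 6 and self.ip < len(self.code):
            self.queue.append(self.code[self.ip])
            self.ip += 1

    def next_byte(self):                   # EU reads from the queue, not memory
        self.prefetch()
        return self.queue.popleft()

biu = BIU(bytes(range(10)))
assert [biu.next_byte() for _ in range(3)] == [0, 1, 2]
```

Because prefetching overlaps with execution, the EU rarely waits on an instruction fetch; a real 8086 additionally flushes this queue on every jump or call.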
Register Organization:
A register is a very small amount of fast memory that is built into the CPU (or processor) in order to speed up its operation. Registers are much faster and more efficient than other memories like RAM, ROM, and external memory, which is why registers occupy the top position in the memory hierarchy model.
The 8086 microprocessor has a total of fourteen registers that are accessible to the
programmer. All these registers are 16-bit in size. The registers of 8086 are categorized into 5
different groups:
a) General Registers
b) Index Registers
c) Segment Registers
d) Pointer Registers
e) Status Registers
General Registers:
All general registers of the 8086 microprocessor can be used for arithmetic and logic operations. All of these general registers can be used as either 8-bit or 16-bit registers. The general registers are:
a) AX (Accumulator Register): The register AX is used as the accumulator; it stores operands for arithmetic and logic operations.
b) BX (Base Register): The register BX is used as a base register; it stores the starting base address of a memory area within the data segment.
c) CX (Count Register): The register CX is used as the default counter in case of string and loop instructions. The count register can also be used as a counter in string manipulation and shift/rotate instructions.
d) DX (Data Register): The DX register is a general-purpose register which may be used as an implicit operand or destination in case of a few instructions. The data register can also be used as a port number in I/O operations.
Index Register:
The index registers can be used for arithmetic operations but their use is usually concerned
with the memory addressing modes of the 8086 microprocessor (indexed, base indexed and
relative base indexed addressing modes). The index registers are particularly useful for string
manipulation.
a) SI (Source Index): SI is a 16-bit register. This register is used to store the offset of source data in the data segment. In other words, the Source Index register is used to point to memory locations in the data segment.
b) DI (Destination Index): DI is also a 16-bit register. It is used to store the offset of destination data, typically in the extra segment, during string operations.
Segment Register:
The 8086 architecture uses the concept of segmented memory. The 8086 is able to access a memory capacity of up to 1 megabyte. This 1 megabyte of memory is divided into 16 logical segments. Each segment contains 64 Kbytes of memory. There are four segment registers to access this 1 megabyte of memory. The segment registers of 8086 are:
a) CS (Code Segment): Code segment (CS) is a 16-bit register that is used for
addressing memory location in the code segment of the memory (64Kb), where the
executable program is stored. CS register cannot be changed directly. The CS register
is automatically updated during far jump, far call and far return instructions.
b) Stack Segment (SS): Stack Segment (SS) is a 16-bit register that is used for addressing the stack segment of the memory (64 Kb), where stack data is stored. The SS register can be changed directly using the POP instruction.
c) Data Segment (DS): Data Segment (DS) is a 16-bit register that points to the data segment of the memory (64 Kb), where the program data is stored. The DS register can be changed directly using POP and LDS instructions.
d) Extra Segment (ES): Extra Segment (ES) is a 16-bit register that also points to a data segment of the memory (64 Kb), where program data is stored. The ES register can be changed directly using POP and LES instructions.
Pointer Registers:
Pointer registers contain the offset of data (variables, labels) and instructions from their base segments (default segments). The 8086 microprocessor contains three pointer registers.
a) SP (Stack Pointer): The Stack Pointer register points to the top of the program stack; SP stores the offset, from the start of the Stack Segment, of the word most recently pushed on the stack.
b) BP (Base Pointer): The Base Pointer register also points into the same stack segment. Unlike SP, we can use BP to access data in the other segments as well.
c) IP (Instruction Pointer): The Instruction Pointer is a register that holds the address
of the next instruction to be fetched from memory. It contains the offset of the next
word of instruction code instead of its actual address.
Status Register:
The status register is also called the flag register. The 8086 flag register contents indicate the results of computation in the ALU. It also contains some flag bits to control the CPU operations.
The flag register is a 16-bit register with only nine bits implemented. Six of these are status flags. The complete bit configuration of the 8086 flag register is shown in the figure.
PF (Parity Flag): This flag is set to 1 if the lower byte of the result contains an even number of 1's.
0 - Odd Parity
1 – Even Parity
CF (Carry Flag): This flag is set, when there is a carry out of MSB in case of addition or
borrow in case of subtraction.
0 – No Carry/Borrow
1 – Carry/Borrow
TF (Trap Flag): If this flag is set, the processor enters the single step execution mode. When
in the single-step mode, it executes an instruction and then jumps to a special service routine
that may determine the effect of executing the instruction. This type of operation is very
useful for debugging programs.
IF (Interrupt Flag): If this flag is set, the maskable interrupts are recognized by the CPU,
otherwise they are ignored.
AF (Auxiliary Carry Flag): This is set when there is a carry out of the lowest nibble (i.e., out of bit three during addition), or a borrow into the lowest nibble (i.e., into bit three during subtraction).
OF (Overflow Flag): This flag is set if an overflow occurs, i.e., if the result of a signed operation is too large to be accommodated in the destination register.
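The flag definitions above can be checked with a small sketch that computes CF, PF, AF and OF for an 8-bit addition. This mirrors the stated rules (PF from the low byte's parity, AF from the bit-3 carry, OF from signed overflow); it is an illustration, not the processor's actual circuitry.

```python
# Sketch: computing 8086-style status flags for an 8-bit addition.

def add8_flags(a, b):
    result = a + b
    cf = 1 if result > 0xFF else 0                    # carry out of the MSB
    result &= 0xFF
    pf = 1 if bin(result).count("1") % 2 == 0 else 0  # even parity of low byte
    af = 1 if ((a & 0xF) + (b & 0xF)) > 0xF else 0    # carry out of bit 3
    # signed overflow: both operands disagree in sign with the result
    of = 1 if ((a ^ result) & (b ^ result) & 0x80) else 0
    return result, {"CF": cf, "PF": pf, "AF": af, "OF": of}

res, flags = add8_flags(0x7F, 0x01)   # 127 + 1 overflows the signed 8-bit range
assert res == 0x80
assert flags == {"CF": 0, "PF": 0, "AF": 1, "OF": 1}
```

Note how 0x7F + 0x01 sets OF but not CF: the unsigned result (128) fits in a byte, while the signed result (+128) does not, which is exactly the distinction between the two flags.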
Memory Organization
As we know, the 8086 is a 16-bit processor that can support 1 Mbyte (i.e. a 20-bit address bus: 2^20) of external memory over the address range 00000H to FFFFFH. The 8086 organizes memory as individual bytes of data. The 8086 can access any two consecutive bytes as a word of data. The lower-addressed byte is the least significant byte of the word, and the higher-addressed byte is its most significant byte.
The above figure represents: the storage location at address 00009H contains the value 7H, while the location at address 00010H contains the value 7DH. The 16-bit word 225AH is stored in the locations 0000CH to 0000DH.
A word of data at an even-address boundary (i.e. the address of its least significant byte is even) is called an aligned word. A word of data at an odd-address boundary is called a misaligned word, as shown in the figure below.
Figure: Aligned and misaligned word
To store a double word, four locations are needed. A double word whose least significant byte address is a multiple of 4 (e.g. 0H, 4H, 8H, ...) is called an aligned double word. A double word at an address that is not a multiple of 4 is called a misaligned double word, as shown in the figure below.
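The byte-ordering and alignment rules above can be sketched with a toy byte-addressable memory, storing the same word 225AH used in the example:

```python
# Sketch: little-endian word storage and alignment checks, following the
# rule that the lower-addressed byte is the least significant byte.

memory = bytearray(16)

def store_word(addr, value):             # low byte at the lower address
    memory[addr] = value & 0xFF
    memory[addr + 1] = (value >> 8) & 0xFF

def load_word(addr):
    return memory[addr] | (memory[addr + 1] << 8)

store_word(0x0C, 0x225A)
assert memory[0x0C] == 0x5A and memory[0x0D] == 0x22
assert load_word(0x0C) == 0x225A
assert 0x0C % 2 == 0                     # even address: aligned word
assert 0x0C % 4 == 0                     # multiple of 4: aligned double word
```

Address 0x0D, by contrast, would hold a misaligned word (odd address) and cost the 8086 an extra bus cycle to access.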
a) Memory Segmentation: The size of the address bus of the 8086 is 20 bits, so it is able to address 1 Mbyte of physical memory, but all this memory is not active at one time. Actually, this 1 Mbyte of memory is partitioned into 16 parts named segments. The size of each segment is 64 Kbytes (65,536 bytes).
The segment registers are user accessible, which means that the programmer can change the
content of segment registers through software.
b) Programming model:
How can a 20-bit address be obtained if there are only 16-bit registers? The largest register is only 16 bits (64K), so physical addresses have to be calculated. These calculations are done in hardware within the microprocessor.
The 16-bit contents of a segment register give the starting/base address of a particular segment. To address a specific memory location within a segment we need an offset address. The offset address is also 16 bits wide and is provided by one of the associated pointer or index registers.
Figure: Software model of 8086 microprocessor
To be able to program a microprocessor, one does not need to know all of its hardware
architectural features. What is important to the programmer is being aware of the various
registers within the device and to understand their purpose, functions, operating capabilities,
and limitations.
The above figure illustrates the software architecture of the 8086 microprocessor. From this diagram, we see that it includes fourteen 16-bit internal registers: the instruction pointer (IP), four data registers (AX, BX, CX, and DX), two pointer registers (BP and SP), two index registers (SI and DI), four segment registers (CS, DS, SS, and ES) and the status register (SR), with nine of its bits implemented as status and control flags.
The point to note is that the beginning segment address must begin at an address divisible by 16. Also note that the four segments need not be defined separately. It is allowable for all four segments to completely overlap (CS = DS = ES = SS).
c) Logical and Physical Address: Addresses within a segment can range from address 0000H to address FFFFH. This corresponds to the 64-Kbyte length of the segment. An address within a segment is called an offset or logical address.
A logical address gives the displacement from the base address of the segment to the desired
location within it, as opposed to its "real" address, which maps directly anywhere into the 1
MByte memory space. This "real" address is called the physical address.
What is the difference between the physical and the logical address?
The physical address is 20 bits long and corresponds to the actual binary code output by the
BIU on the address bus lines. The logical address is an offset from location 0 of a given
segment.
You should also be careful when writing addresses on paper to do so clearly. To specify the
logical address XXXX in the stack segment, use the convention SS:XXXX,
which is equal to [SS] * 16 + XXXX.
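The SS:XXXX convention above translates directly into arithmetic: shift the 16-bit segment value left by 4 bits (multiply by 16) and add the 16-bit offset, keeping 20 bits. A sketch:

```python
# Sketch: forming a 20-bit physical address from a segment:offset pair,
# as in the convention above (physical = segment * 16 + offset).

def physical_address(segment, offset):
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF   # wrap to 20 bits

assert physical_address(0x1234, 0x0022) == 0x12362
# different logical addresses can map to the same physical address:
assert physical_address(0x1000, 0x0010) == physical_address(0x1001, 0x0000)
```

The second assertion shows why physical and logical addresses must be distinguished: many segment:offset pairs alias one physical location.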
Address bits A1 through A19 select the storage location that is to be accessed. They are applied to both banks in parallel. A0 and bank high enable (BHE) are used as bank-select signals.
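The bank-select rule can be sketched as a small truth-table function. BHE is active-low in hardware; here 0 means "asserted/selected". This covers a single bus cycle, so a misaligned word (which needs two cycles, as described below) would call it once per cycle.

```python
# Sketch: deriving the A0 and BHE bank-select signals for one 8086 bus
# cycle (0 = bank selected, matching the active-low convention).

def bank_select(addr, is_word):
    a0 = addr & 1              # A0 = 0 selects the low (even) bank
    if is_word:
        bhe = 0                # a word access also uses the high bank
    else:
        bhe = 0 if a0 == 1 else 1   # an odd byte lives in the high bank only
    return a0, bhe

assert bank_select(0x0004, is_word=True) == (0, 0)    # aligned word: both banks
assert bank_select(0x0005, is_word=False) == (1, 0)   # odd byte: high bank only
assert bank_select(0x0004, is_word=False) == (0, 1)   # even byte: low bank only
```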
Case 4: When a word of data at an odd address (a misaligned word) is to be accessed, the 8086 needs two bus cycles to access it:
a) During the first bus cycle, the odd byte of the word (in the high bank) is addressed.
b) During the second bus cycle, the even byte of the word (in the low bank) is addressed.
The bus can be demultiplexed using a few latches and transceivers, whenever required.
Basically, all the processor bus cycles consist of at least four clock cycles. These
are referred to as T1, T2, T3, T4. The address is transmitted by the processor
during T1. It is present on the bus only for one cycle.
The negative edge of this ALE pulse is used to separate the address and the data or
status information. In maximum mode, the status lines S0, S1 and S2 are used to
indicate the type of operation.
Status bits S3 to S7 are multiplexed with higher order address bits and the BHE
signal.
Address is valid during T1 while status bits S3 to S7 are valid during T2 through
T4.
The microprocessor 8086 is operated in minimum mode by strapping its MN/MX pin
to logic 1.
In this mode, all the control signals are given out by the microprocessor chip itself.
There is a single microprocessor in the minimum mode system.
The remaining components in the system are latches, transceivers, the clock generator, memory and I/O devices.
Latches are generally buffered output D-type flip-flops like 74LS373 or 8282. They
are used for separating the valid address from the multiplexed address/data signals
and are controlled by the ALE signal generated by 8086.
Transceivers are bidirectional buffers and are sometimes called data amplifiers. They
are required to separate the valid data from the time-multiplexed address/data signals.
They are controlled by two signals, namely DEN and DT/R.
The DT/R signal indicates the direction of data, i.e. from or to the processor, while
DEN enables the transceivers.
The system contains memory for the monitor and user program storage. Usually,
EPROMs are used for monitor storage, while RAM is used for user program storage. A
system may also contain I/O devices.
The opcode fetch and read cycles are similar. Hence the timing diagram can be
categorized in two parts, the first is the timing diagram for read cycle and the second
is the timing diagram for write cycle.
The read cycle begins in T1 with the assertion of the address latch enable (ALE) signal
and also the M/IO signal. During the negative-going edge of ALE, the valid address is
latched on the local bus.
The BHE and A0 signals select the low byte, the high byte, or both bytes. From T1 to
T4, the M/IO signal indicates a memory or I/O operation.
At T2, the address is removed from the local bus and is sent to the output. The bus is
then tristated. The read (RD) control signal is also activated in T2.
The read (RD) signal causes the addressed device to enable its data bus drivers. After
RD goes low, the valid data is available on the data bus.
The addressed device will drive the READY line high. When the processor returns the
read signal to high level, the addressed device will again tristate its bus drivers.
A write cycle also begins with the assertion of ALE and the emission of the address.
The M/IO signal is again asserted to indicate a memory or I/O operation. In T2, after
sending the address in T1, the processor sends the data to be written to the addressed
location.
The data remains on the bus until the middle of the T4 state. WR becomes active at the
beginning of T2 (unlike RD, which is somewhat delayed in T2 to give the bus time to
float).
The BHE and A0 signals are used to select the proper byte or bytes of the memory or
I/O word to be read or written. The M/IO, RD and WR signals indicate the type of data
transfer, as specified in the table below.
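The referenced table does not appear in this text; the following is a hedged Python sketch of the standard minimum-mode decode it would contain (M/IO high selects memory, low selects I/O; RD and WR are active low):

```python
def transfer_type(m_io: int, rd: int, wr: int) -> str:
    """Decode the kind of bus transfer from M/IO, RD and WR.
    M/IO = 1 -> memory, 0 -> I/O; RD and WR are active low."""
    kind = "memory" if m_io else "I/O"
    if rd == 0:
        return f"{kind} read"
    if wr == 0:
        return f"{kind} write"
    return "no transfer"

print(transfer_type(1, 0, 1))  # memory read
```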
Hold Response sequence:
The HOLD pin is checked at leading edge of each clock pulse. If it is received active
by the processor before T4 of the previous cycle or during T1 state of the current
cycle, the CPU activates HLDA in the next clock cycle and for succeeding bus cycles,
the bus will be given to another requesting master.
The processor does not regain control of the bus until the requesting master drops the
HOLD pin low.
When the request is dropped by the requesting master, the HLDA is dropped by the
processor at the trailing edge of the next clock.
Hold Response Timing Cycle
In the maximum mode, the 8086 is operated by strapping the MN/MX pin to ground.
In this mode, the processor derives the status signal S2, S1, S0. Another chip called
bus controller derives the control signal using this status information.
In the maximum mode, there may be more than one microprocessor in the system
configuration. The components in the system are same as in the minimum mode
system.
The basic function of the bus controller chip IC 8288 is to derive control signals like
RD and WR (for memory and I/O devices), DEN, DT/R, ALE, etc. using the status
information made available by the processor on the status lines.
The bus controller chip has input lines S2, S1, S0 and CLK. These inputs to 8288 are
driven by CPU.
It derives the outputs ALE, DEN, DT/R, MRDC, MWTC, AMWC, IORC, IOWC and
AIOWC. The AEN, IOB and CEN pins are especially useful for multiprocessor
systems.
AEN and IOB are generally grounded. CEN pin is usually tied to +5V. The
significance of the MCE/PDEN output depends upon the status of the IOB pin.
The INTA pin is used to issue two interrupt acknowledge pulses to the interrupt
controller or to an interrupting device.
IORC and IOWC are the I/O read command and I/O write command signals,
respectively. These signals enable an I/O interface to read or write data from or to the
addressed port.
The MRDC, MWTC are memory read command and memory write command signals
respectively and may be used as memory read or write signals.
All these command signals instruct the memory to accept or send data from or to the
bus.
The only difference in the timing diagram between minimum mode and maximum
mode is the status signals used and the available control and advanced command
signals.
S0, S1 and S2 are set at the beginning of the bus cycle. The 8288 bus controller will
output a pulse on ALE and apply the required signal to its DT/R pin during T1.
In T2, 8288 will set DEN=1 thus enabling transceivers, and for an input it will
activate MRDC or IORC. These signals are activated until T4.
For an output, the AMWC or AIOWC is activated from T2 to T4 and MWTC or
IOWC is activated from T3 to T4.
The status bits S0 to S2 remain active until T3 and become passive during T3 and T4.
If the READY input is not activated before T3, wait states will be inserted between T3
and T4.
8086 Instruction Set and Assembler Directives
A machine language instruction format has one or more number of fields associated
with it.
The first field is called as operation code field or op-code field, which indicates the
type of operation to be performed by the CPU.
The instruction format also contains other fields known as operand fields.
The CPU executes the instruction using the information which resides in these fields.
There are six general formats of instructions in 8086 instructions set.
The length of an instruction may vary from 1 byte to 6 bytes.
The instruction formats are described as follows:
The MOD field shows the mode of addressing. The MOD, R/M, REG and the
‘W’ fields are described in Table 2.2.
4. Register to/from Memory with Displacement:
This type of instruction format contains 1 or 2 additional bytes for
displacement along with 2 byte format of the register to/from memory without
displacement. The format is as shown below
Sequential control flow instructions are the instructions which after execution,
transfer control to the next instruction appearing immediately after it (in the
sequence) in the program.
For Example, the arithmetic, logic, data transfer and processor control
instructions are Sequential control flow instructions.
The control transfer instructions on the other hand transfer control to some
predefined address or the address somehow specified in the instruction, after their
execution.
For Example, INT, CALL, RET & JUMP instructions fall under this
category.
The addressing modes for Sequential flow instructions are explained as follows:
1. Immediate addressing mode: In this mode, the immediate data is a part of the
instruction and appears in the form of successive byte or bytes.
Example: MOV AX, 0005H
In the above example, 0005H is the immediate data. The immediate data may be 8-bit
or 16-bit in size.
2. Direct addressing mode: In the direct addressing mode, a 16-bit memory address
(offset) is directly specified in the instruction as a part of it.
3. Register addressing mode: In the register addressing mode, the data is stored in a
register and is referred to using the particular register. All the registers, except IP, may
be used in this mode.
5. Indexed addressing mode: In this addressing mode, the offset of the operand is
stored in one of the index registers. DS and ES are the default segments for index
registers SI and DI respectively.
Example: MOV AX, [SI]
6. Register relative addressing mode: In this addressing mode, the data is available at
an effective address formed by adding an 8-bit or 16-bit displacement to the content of
any one of the registers BX, BP, SI or DI in the default segment (either DS or ES).
7. Based indexed addressing mode: The effective address of data is formed in this
addressing mode, by adding content of a base register (any one of BX or BP) to the
content of an index register (any one of SI or DI). The default segment register may
be ES or DS.
Example: MOV AX, [BX][SI]
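The based-indexed computation can be illustrated in Python (the register values used here are arbitrary examples; the 16-bit wrap on the effective address is the usual real-mode assumption):

```python
def effective_address(base: int, index: int, disp: int = 0) -> int:
    """Based-indexed EA = base register + index register (+ optional
    displacement), truncated to 16 bits."""
    return (base + index + disp) & 0xFFFF

# MOV AX, [BX][SI] with BX = 0200H and SI = 0030H -> EA = 0230H
ea = effective_address(0x0200, 0x0030)
# With DS = 1000H, the physical address is DS * 16 + EA = 10230H
print(hex(ea), hex((0x1000 << 4) + ea))
```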
Intersegment direct:
In this mode, the address to which the control is to be transferred is in a different segment.
This addressing mode provides a means of branching from one code segment to another
code segment. Here, the CS and IP of the destination address are specified directly in the
instruction.
Intersegment indirect:
In this mode, the address to which control is to be transferred lies in a different
segment and is passed to the instruction indirectly, i.e. as the contents of a memory
block containing four bytes: IP(LSB), IP(MSB), CS(LSB) and CS(MSB),
sequentially.
The starting address of the memory block may be referred using any of the addressing
modes, except immediate mode.
Example: JMP [2000H].
Jump to an address in the other segment specified at effective address 2000H in DS.
Intrasegment indirect:
In this mode, the displacement to which control is to be transferred is in the same
segment in which the control transfer instruction lies, but it is passed to the instruction
indirectly. Here, the branch address is found as the content of a register or a memory
location.
The addressing mode may be used in unconditional branch instructions.
Data transfer instruction, as the name suggests is for the transfer of data from memory to
internal register, from internal register to memory, from one register to another register,
from input port to internal register, from internal register to output port etc.
1. MOV Instruction:
Here the source and destination must be of the same size, i.e. both 8-bit or
both 16-bit. The MOV instruction does not affect any flags.
Example:
2. PUSH Instruction:
The PUSH instruction decrements the stack pointer by two and copies a word
from the source to the location where the stack pointer now points. The source must be
word-sized data. The source can be a general-purpose register, a segment register or a
memory location.
The PUSH instruction first pushes the most significant byte to SP-1, then the least
significant byte to SP-2.
Example:
3. POP Instruction:
The POP instruction copies a word from the stack location pointed by the stack
pointer to the destination. The destination can be a General purpose register, a
segment register or a memory location. After the content is copied, the stack
pointer is automatically incremented by two.
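The PUSH/POP behavior described above can be modeled with a small Python sketch (the initial SP value and the dictionary-backed memory are illustrative assumptions):

```python
class Stack8086:
    """Toy model of the 8086 stack: PUSH decrements SP by 2 and stores
    a word; POP loads a word and increments SP by 2 (LIFO order)."""

    def __init__(self, sp: int = 0xFFFE):
        self.sp = sp
        self.mem = {}  # sparse byte-addressed memory

    def push(self, word: int) -> None:
        self.mem[self.sp - 1] = (word >> 8) & 0xFF  # MSB at SP-1
        self.mem[self.sp - 2] = word & 0xFF         # LSB at SP-2
        self.sp -= 2

    def pop(self) -> int:
        word = self.mem[self.sp] | (self.mem[self.sp + 1] << 8)
        self.sp += 2
        return word

s = Stack8086()
s.push(0x1234)
s.push(0xABCD)
print(hex(s.pop()), hex(s.pop()))  # last in, first out
```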
Example:
4. IN & OUT Instructions:
The IN instruction will copy data from a port to the accumulator. If 8 bit is read the
data will go to AL and if 16 bit then to AX. Similarly OUT instruction is used to
copy data from accumulator to an output port.
Both IN and OUT instructions can use direct (fixed-port) and indirect (port in DX) addressing modes.
Example:
5. XCHG Instruction:
The XCHG instruction exchanges contents of the destination and source. Here
destination and source can be register and register or register and memory location,
but XCHG cannot interchange the value of 2 memory locations.
Example:
1. ADD Instruction: Add instruction is used to add the current contents of destination
with that of source and store the result in destination. Here we can use register and/or
memory locations. AF, CF, OF, PF, SF, and ZF flags are affected.
Example:
ADD AL, 0FH: Add the immediate content, 0FH to the content of AL and
store the result in AL
ADD AX, BX; AX <= AX+BX
ADD AX, 0100H – IMMEDIATE
ADD AX, BX – REGISTER
ADD AX,[SI] – REGISTER INDIRECT OR INDEXED
ADD AX, [5000H] – DIRECT
ADD [5000H], 0100H – IMMEDIATE
ADD 0100H – DESTINATION AX (IMPLICIT)
2. ADC: ADD WITH CARRY: This instruction performs the same operation as ADD
instruction, but adds the carry flag bit (which may be set as a result of the previous
calculation) to the result. All the condition code flags are affected by this instruction.
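ADC's main use is chaining additions wider than 16 bits; the carry propagation can be sketched in Python (only CF is modeled, not the full 8086 flag set):

```python
def add_with_carry(a: int, b: int, carry_in: int):
    """16-bit ADC: return (result, carry_out), where carry_out plays
    the role of CF for the next, more significant addition."""
    total = a + b + carry_in
    return total & 0xFFFF, 1 if total > 0xFFFF else 0

# 32-bit addition built from 16-bit pieces: ADD the low words,
# then ADC the high words so the carry propagates.
lo, cf = add_with_carry(0xFFFF, 0x0001, 0)   # low words overflow
hi, _  = add_with_carry(0x0001, 0x0002, cf)  # carry flows into high words
print(hex(hi), hex(lo))
```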
The examples of this instruction along with the modes are as follows:
Example:
ADC AX,BX – REGISTER
ADC AX,[SI] – REGISTER INDIRECT OR INDEXED
ADC AX, [5000H] – DIRECT
ADC [5000H], 0100H – IMMEDIATE
ADC 0100H – IMMEDIATE (AX IMPLICIT)
3. SUB: SUBTRACT: This instruction subtracts the source operand from the
destination operand and stores the result in the destination. All the condition code flags
are affected by this instruction.
Example:
SUB AL, 0FH: Subtract the immediate content, 0FH, from the content of AL
and store the result in AL
SUB AX, BX ; AX <= AX-BX
SUB AX,0100H – IMMEDIATE (DESTINATION AX)
SUB AX,BX – REGISTER
SUB AX,[5000H] – DIRECT
SUB [5000H], 0100H – IMMEDIATE
4. SBB: SUBTRACT WITH BORROW: This instruction subtracts the source operand
and the borrow (carry) flag, which may be set as a result of a previous calculation,
from the destination operand.
The result is stored in the destination operand. All the flags are affected (condition
code) by this instruction. The examples of this instruction are as follows:
Example:
SBB AX, 0100H – IMMEDIATE (DESTINATION AX)
SBB AX, BX – REGISTER
SBB AX,[5000H] – DIRECT
SBB [5000H], 0100H – IMMEDIATE
5. CMP: COMPARE: The instruction compares the source operand, which may be a
register or an immediate data or a memory location, with a destination operand that
may be a register or a memory location. For comparison, it subtracts the source
operand from the destination operand but does not store the result anywhere. The
flags are affected depending upon the result of the subtraction. If both of the operands
are equal, zero flag is set. If the source operand is greater than the destination
operand, carry flag is set or else, carry flag is reset.
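The ZF/CF behavior of CMP on unsigned 16-bit operands can be sketched in Python (only these two flags are modeled; the others are omitted for brevity):

```python
def cmp_flags(dest: int, src: int):
    """Flags CMP would set for unsigned 16-bit operands:
    ZF = 1 when the operands are equal,
    CF = 1 when src > dest (a borrow would be needed)."""
    diff = (dest - src) & 0xFFFF   # result is computed but discarded
    zf = 1 if diff == 0 else 0
    cf = 1 if src > dest else 0
    return zf, cf

print(cmp_flags(0x0005, 0x0005))  # equal operands set ZF
print(cmp_flags(0x0003, 0x0007))  # larger source sets CF
```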
EXAMPLE:
6. INC & DEC Instructions: INC and DEC instructions are used to increment and
decrement the content of the specified destination by one. AF, OF, PF, SF, and ZF flags
are affected; note that CF is not affected by INC or DEC.
Example:
7. AND Instruction: This instruction logically ANDs each bit of the source
byte/word with the corresponding bit in the destination and stores the result in the
destination. The source can be an immediate number, a register or a memory location;
the destination can be a register or a memory location.
The CF and OF flags are both cleared to zero; PF, ZF and SF are affected by the
operation and AF is undefined.
Example:
AND BL, AL; suppose BL=1000 0110 and AL = 1100 1010 then after the
operation BLwould be BL= 1000 0010.
AND CX, AX; CX <= CX AND AX
AND CL, 08; CL<= CL AND (0000 1000)
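The worked AND examples can be checked directly in Python:

```python
# AND BL, AL from the example: 1000 0110 AND 1100 1010
bl, al = 0b10000110, 0b11001010
result = bl & al
assert result == 0b10000010   # matches BL = 1000 0010 in the text

# AND CL, 08H keeps only bit 3 of CL, clearing all the others
cl = 0b11001010
print(bin(result), bin(cl & 0x08))
```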
8. OR Instruction: This instruction logically ORs each bit of the source byte/word
with the corresponding bit in the destination and stores the result in the destination.
The source can be an immediate number, a register or a memory location; the
destination can be a register or a memory location.
The CF and OF flags are both cleared to zero; PF, ZF and SF are affected by the
operation and AF is undefined.
Example:
OR BL, AL; suppose BL=1000 0110 and AL=11001010 then
after the operation BL would be BL= 1100 1110.
OR CX, AX ; CX <= CX OR AX
OR CL, 08 ; CL <= CL OR (0000 1000)
9. NOT Instruction: This instruction complements (inverts) each bit of the operand. It
affects no flags.
Example:
NOT AX (before: AX = (1011)2 = (B)16; after execution: AX = (0100)2 = (4)16)
NOT [5000H]
10. XOR Instruction: The XOR operation is again carried out in a similar way to the
AND and OR operation. The constraints on the operands are also similar. The XOR
operation gives a high output, when the 2 input bits are dissimilar. Otherwise, the
output is zero. The example instructions are as follows:
Example:
XOR AX,0098H
XOR AX, BX
XOR AX, [5000H]
A near CALL is a call to a procedure which is in the same code segment as the CALL
instruction. When the 8086 encounters a near CALL, it decrements SP by 2 and copies
the offset of the next instruction after the CALL onto the stack. It then loads IP with
the offset of the procedure to start execution of the procedure.
A far CALL is a call to a procedure residing in a different segment. Here the value of
CS and the offset of the next instruction are both backed up on the stack. The processor
then branches to the procedure by loading CS with the base of the segment containing
the procedure and IP with the offset of the first instruction of the procedure.
Example:
Near Call:
CALL PRO : PRO is the name of the procedure
CALL CX : here CX contains the offset of the first instruction of the
procedure, i.e. it replaces the content of IP with the content of CX
Far Call:
CALL DWORD PTR [BX] : new values for CS and IP are fetched from four
memory locations in the DS. The new value for CS is fetched from [BX] and
[BX+1]; the new IP is fetched from [BX+2] and [BX+3].
2. RET Instruction: RET instruction will return execution from a procedure to the
next instruction after the CALL instruction in the calling program. If it was a near
call, then IP is replaced with the value at the top of the stack; if it had been a far call,
then another POP of the stack is required. This second popped value from the stack is
put in CS, thus resuming execution of the calling program.
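The near CALL/RET mechanism can be modeled as a short Python sketch (the register values and dictionary-backed memory are illustrative assumptions):

```python
def near_call(sp: int, mem: dict, ip: int, target: int):
    """Near CALL: push the return offset (IP of the instruction after
    the CALL), then transfer control by loading IP with the target."""
    sp -= 2
    mem[sp] = ip           # return address saved on the stack
    return sp, target      # new SP, new IP

def near_ret(sp: int, mem: dict):
    """Near RET: pop the saved offset from the stack back into IP."""
    ip = mem[sp]
    return sp + 2, ip

mem = {}
sp, ip = near_call(0xFFFE, mem, ip=0x0105, target=0x0300)
print(hex(ip))             # now executing inside the procedure
sp, ip = near_ret(sp, mem)
print(hex(sp), hex(ip))    # back at the instruction after the CALL
```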
3. JMP Instruction: This is also called as unconditional jump instruction, because the
processor jumps to the specified location rather than the instruction after the JMP
instruction. Jumps can be short jumps when the target address is in the same
segment as the JMP instruction or far jumps when it is in a different segment.
2. WAIT Instruction: When this instruction is executed, the 8086 enters an idle
state. This idle state continues until the TEST input pin becomes active (low) or a
valid interrupt signal is received. WAIT affects no flags. It is generally used to
synchronize the 8086 with a peripheral device(s).
5. NOP Instruction: At the end of the NOP instruction, no operation is done other than
the fetching and decoding of the instruction. It takes 3 clock cycles. NOP is used to fill
in time delays or to provide space for instructions while troubleshooting. NOP affects
no flags.
Shift/Rotation Instruction
Shift instructions move binary data to the left or right by shifting it within the register
or memory location. They can also perform multiplication and division by powers of 2
(2^n).
There are two types of shifts: logical shifting and arithmetic shifting. The latter is used
with signed numbers, while the former is used with unsigned numbers.
Rotate, on the other hand, rotates the information in a register or memory either from
one end to the other or through the carry flag.
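The multiply/divide-by-2^n property of shifts is easy to confirm in Python (note that Python's `>>` on a negative int is an arithmetic shift, which mirrors SAR rather than SHR):

```python
x = 13
# SHL/SAL by n multiplies by 2^n
assert x << 2 == x * 2**2 == 52
# SHR by n divides an unsigned value by 2^n
assert 52 >> 2 == 52 // 2**2 == 13
# An arithmetic right shift preserves the sign (SAR-like behavior)
print(-52 >> 2)
```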
Both these instructions shift each bit to the left, place the MSB in CF, and make the LSB 0.
The destination can be of byte size or of word size, also it can be a register or amemory
location. Number of shifts is indicated by the count.
All flags are affected.
Example:
MOV BL, B7H : BL is made B7H
SHR BL, 1 : shift the content of BL register one place to the right.
2. ROL Instruction: This instruction rotates all the bits in a specified byte or word to
the left some number of bit positions. MSB is placed as a new LSB and a new CF. The
destination can be of byte size or of word size, also it can be a register or a memory
location. Number of shifts is indicated by the count.
Example:
3. ROR Instruction: This instruction rotates all the bits in a specified byte or word to
the right some number of bit positions. LSB is placed as a new MSB and a new CF.
The destination can be of byte size or of word size, also it can be a register or a
memory location. Number of shifts is indicated by the count.
Example:
MOV BL, B7H : BL is made B7H
ROR BL, 1 : shift the content of BL register one place to the right
4. RCR Instruction: This instruction rotates all the bits in a specified byte or word to
the right some number of bit positions along with the carry flag. LSB is placed in a
new CF and previous carry is placed in the new MSB. The destination can be of byte
size or of word size, also it can be a register or a memorylocation. Number of shifts is
indicated by the count.
Example:
MOV BL, B7H : BL is made B7H
RCR BL, 1 : shift the content of BL register one place to the right
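The ROL/ROR behavior can be sketched for 8-bit operands in Python (carry-flag handling is omitted; only the rotated value is modeled):

```python
def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left: each MSB shifted out re-enters
    as the new LSB."""
    count %= 8
    value &= 0xFF
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value: int, count: int) -> int:
    """Rotate an 8-bit value right: each LSB shifted out re-enters
    as the new MSB (a right rotate is a left rotate by 8 - count)."""
    return rol8(value, 8 - (count % 8))

# ROR BL, 1 with BL = B7H (1011 0111) gives DBH (1101 1011)
print(hex(ror8(0xB7, 1)))
```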
1. STC Instruction: This instruction sets the carry flag. It does not affect any other flag.
2. CLC Instruction: This instruction resets the carry flag to zero. CLC does not affect
any other flag.
3. CMC Instruction: This instruction complements the carry flag. CMC does not affect
any other flag.
4. STD Instruction: This instruction is used to set the direction flag to one so that SI
and/or DI can be decremented automatically after execution of string instruction. STD
does not affectany other flag.
5. CLD Instruction: This instruction is used to reset the direction flag to zero so that SI
and/or DI can be incremented automatically after execution of string instruction. CLD
does not affect any other flag.
6. STI Instruction: This instruction sets the interrupt flag to 1. This enables INTR
interrupt of the 8086. STI does not affect any other flag.
7. CLI Instruction: This instruction resets the interrupt flag to 0. Due to this the 8086
will not respond to an interrupt signal on its INTR input. CLI does not affect any other
flag.
String Instructions
When direction flag is 0, SI and DI are incremented and when it is 1, SI and DI are
decremented.
MOVS affects no flags. MOVSB is used for byte-sized movements while MOVSW is
used for word-sized movements.
Example:
Example:
REP → repeat until CX = 0
REPE/REPZ → repeat until CX = 0 or ZF = 0
REPNE/REPNZ → repeat until CX = 0 or ZF = 1
Example:
MOV DI, OFFSET D_STRING: assign DI with destination address
STOS D_STRING: assembler uses string name to determine byte or
word, if byte then AL is used and if of word size, AX is used.
5. CMPS/CMPSB/CMPSW: CMPS is used to compare the strings, byte wise or word
wise. The comparison is effected by subtraction of the content pointed to by DI from
that pointed to by SI. The AF, CF, OF, PF, SF and ZF flags are affected by this
instruction,
but neither operand is affected.
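A hedged Python model of the REPE CMPSB stopping rule (index bookkeeping stands in for SI/DI and CX; flags are not modeled):

```python
def repe_cmpsb(src: bytes, dst: bytes) -> int:
    """Model of REPE CMPSB: compare byte strings element by element
    until the count runs out or a mismatch clears ZF. Returns the
    index of the first difference, or -1 if the strings match."""
    for i, (a, b) in enumerate(zip(src, dst)):
        if a != b:          # ZF = 0, so REPE stops repeating
            return i
    return -1               # count exhausted with ZF still set

print(repe_cmpsb(b"GOOD", b"GOLD"))  # first mismatch at index 2
```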
Example:
Assembler Directive
There are some instructions in the assembly language program which are not a part of
processor instruction set. These instructions are instructions to the assembler, linker
and loader. These are referred to as pseudo-operations or as assembler directives. The
assembler directives enable us to control the way in which a program assembles and
lists. They act during the assembly of a program and do not generate any executable
machine code.
1. ASSUME:
It is used to tell the assembler the name of the logical segment to use for a specified
segment register.
E.g.: ASSUME CS: CODE tells that the instructions for a program are in
a logical segment named CODE.
2. DB-Define Byte:
The DB directive is used to reserve byte or bytes of memory locations in the available
memory. While preparing the EXE file, this directive directs the assembler to allocate
the specified number of memory bytes to the said data type that may be a constant,
variable, string, etc. Another option of this directive also initializes the reserved
memory bytes with the ASCII codes of the characters specified as a string.
The following examples show how the DB directive is used for different purposes:
RANKS DB 01H,02H,03H,04H
This statement directs the assembler to reserve four memory locations for list
named RANKS and initialize them with the above specified four values
MESSAGE DB “GOOD MORNING”
This makes the assembler reserve the number of bytes of memory equal to the number
of characters in the string named MESSAGE and initializes those locations with the
ASCII equivalents of these characters.
VALUE DB 50H DUP (?)
This statement directs the assembler to reserve 50H memory bytes and leave them
uninitialized for the variable named VALUE.
Example:
PACKED_BCD DT 11223344556677889900 declares an array that is 10 bytes in
length.
6. DW – Define Word
The DW directive serves the same purpose as the DB directive, but it makes the
assembler reserve the number of memory words (16-bit) instead of bytes.
Some examples are given to explain this directive.
This makes the assembler reserve four words in memory (8 bytes), and
initialize the words with the specified values in the statements. During
initialization, the lower bytes are stored at the lower memory addresses, while
the upper bytes are stored at the higher addresses.
NUMBER1 DW 1245H
This makes the assembler reserve one word in memory.
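The little-endian layout described above can be confirmed with Python's struct module:

```python
import struct

# DW stores the low byte at the lower address (little endian),
# so NUMBER1 DW 1245H lays out in memory as 45H then 12H.
layout = struct.pack("<H", 0x1245)  # "<H" = little-endian 16-bit word
print(layout)
```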
ENDP - End Procedure: Used along with the name of a procedure to indicate the end of
that procedure.
Example:
ENDS - End Segment: This directive marks the end of a logical segment. The logical
segments are assigned names using the SEGMENT directive. The names appear with
the ENDS directive as prefixes to mark the end of those particular segments. Whatever
the contents of the segments, they should appear in the program before ENDS. Any
statement appearing after ENDS will be excluded from the segment.
The structure shown below explains the fact more clearly.
DATA SEGMENT
...
DATA ENDS
CODE SEGMENT
...
CODE ENDS
10. EQU-Equate: Used to give a name to some value or symbol. Each time the
assembler finds the given name in the program, it will replace the name with the
value.
Example:
CORRECTION_FACTOR EQU 03H
MOV AL, CORRECTION_FACTOR
11. EVEN - Tells the assembler to increment the location counter to the next
even address if it is not already at an even address.
Used because the processor can read even-addressed word data in a single bus cycle.
12. EXTRN - Tells the assembler that the names or labels following the directive are
in some other assembly module.
Example: if a procedure is in a program module assembled at a different time from the
module that contains the CALL instruction, this directive is used to tell the assembler
that the procedure is external.
14. GROUP - Used to tell the assembler to group the logical segments named after the
directive into one logical group segment, allowing the contents of all the segments to
be accessed from the same group segment base.
15. INCLUDE - Used to tell the assembler to insert a block of source code from
the named file into the current source module. This will shorten the source code.
16. LABEL - Used to give a name to the current value in the location counter. This
directive is followed by a term that specifies the type you want associated with that
name.
17. NAME - Used to give a specific name to each assembly module when programs
consisting of several modules are written.
18. OFFSET- Used to determine the offset or displacement of a named data item or
procedure from the start of the segment which contains it.
19. ORG- The location counter is set to 0000 when the assembler starts reading a
segment. The ORG directive allows setting a desired value at any point in the
program.
Example: ORG 2000H
20. PROC- Used to identify the start of a procedure.
22. PUBLIC - Used to tell the assembler that a specified name or label will be
accessed from other modules.
Example: PUBLIC DIVISOR, DIVIDEND makes the two variables DIVISOR and
DIVIDEND available to other assembly modules.
23. SEGMENT- Used to indicate the start of a logical segment.
24. SHORT - Used to tell the assembler that only a 1-byte displacement is needed to
code a jump instruction.
Macros:
A macro is a group of instructions. The macro assembler generates the code in the
program each time the macro is “called”. Macros can be defined by the MACRO and
ENDM assembler directives. Creating a macro is very similar to creating a new opcode
that can be used in the program, as shown below.
Example:
INIT MACRO
MOV AX, @DATA
MOV DS, AX
MOV ES, AX
ENDM
It is important to note that macro sequences execute faster than procedures because
there are no CALL and RET instructions to execute. The assembler places the macro
instructions in the program each time it is invoked. This process is known as
Macro expansion.
WHILE:
In a macro, the WHILE statement is used to repeat a macro sequence until the expression
specified with it is true. Like REPEAT, end of loop is specified by ENDM statement.
The WHILE statement allows the use of relational operators in its expressions.
The table-1 shows the relational operators used with WHILE statements.
OPERATOR FUNCTION
EQ Equal
NE Not equal
LE Less than or equal
LT Less than
GE Greater than or equal
GT Greater than
NOT Logical inversion
AND Logical AND
OR Logical OR
Table: Shows the relational operators used with WHILE statements.
UNIT-III
Assembly Language Programming with 8086
main endp
; Other Procedures
Segments are declared using directives. The following directives are used to specify the
following segments:
Stack
Data
Code
Stack Segment:
Used to set aside storage for the stack
Stack addresses are computed as offsets into this segment
Use: .stack followed by a value that indicates the size of the stack
Data Segment:
Used to set aside storage for variables.
Constants are defined within this segment in the program source.
Variable addresses are computed as offsets from the start of this segment
Use: .data followed by declarations of variables or definitions of constants.
Code Segment:
The code segment contains executable instructions, macros, and calls to procedures.
Use: .code followed by a sequence of program statements
A utility program called an assembler is used to translate assembly language statements into
the target computer's machine code.
It has been proved that any logic problem can be solved with only sequence, choice
(e.g., if-then-else) and repetition (do-while). This is known as the structured program
theorem.
Sequential Structures:
Sequential structures are structures that are stepped through sequentially. These are
also called sequences. Basic arithmetic, logical, and bit operations are in this category.
Data moves and copies are sequences.
Branching Structures:
Branching structures consist of direct and indirect jumps (including the infamous
“GOTO”), conditional jumps (IF), nested IFs, and case (or switch) structures.
Loop Structures:
The basic looping structures are DO iterative, do WHILE, and do UNTIL. An infinite loop is
one that has no exit. Normally, infinite loops are programming errors, but event loops and
task schedulers are examples of intentional infinite loops.
Conditional Statement in Assembly Language Program. IF-.ELSE-.ENDIF Statement
The conditional statements are implemented in the assembly language program using
the .IF, .ELSE, .ENDIF structure found in higher-level languages. Only MASM version
6.x supports this; earlier versions of the assembler do not support the .IF statement.
Here is the general format for the .IF conditional statement.
As shown above, every .IF directive must have a matching .ENDIF to terminate a
tested condition. ELSE is optional; it provides an alternate action. The assembler also
allows the use of relational operators with the .IF statement.
Like DO-WHILE statement in higher level language, the assembler supports .WHILE- .ENDW
statement. The WHILE statement is used with a condition to begin the loop, and the .ENDW
statement ends the loop.
.BREAK and .CONTINUE statements function in the same manner as in a C-language program.
The .BREAK statement is used to break out of the .WHILE loop.
.REPEAT – .UNTIL statements allow a series of instructions to be executed repeatedly
until some condition occurs. The .REPEAT defines the start of the loop and .UNTIL
defines the end of the loop. A .UNTIL statement has a condition; when the condition is
true, the loop terminates.
The conditional assembly statements are implemented in macros using IF – ELSE – ENDIF
structure found in higher level languages. Here is the general format for the IF family of
conditional statements.
Every IF directive must have a matching ENDIF to terminate a tested condition. ELSE
is optional.
REPEAT Statement
In a macro, the REPEAT statement is used to repeat a macro sequence for a fixed number of times.
The repetition count is specified immediately after the REPEAT statement as shown in the
program. The statements within the REPEAT and the first ENDM are repeated 26 times.
WHILE Statement
In macro, the WHILE statement is used to repeat macro sequence until the expression specified
with it is true. Like REPEAT, end of loop is specified by ENDM statement. The WHILE
statement allows the use of relational operators in its expression.
FOR Statement
A FOR statement in the macro repeats the macro sequence for a list of data. For example, if we
pass two arguments to the macro then in the first iteration the FOR statement gives the macro
sequence using first argument and in the second iteration it gives the macro sequence using
second argument. Like WHILE statement, end of FOR is indicated by ENDM statement.
ORG 100h ; this directive required for a simple 1 segment .com program.
MOV AX, 0B800h ; set AX to hexadecimal value of B800h.
MOV DS, AX ; copy value of AX to DS.
MOV CL, 'A' ; set CL to ASCII code of 'A', it is 41h.
MOV CH, 1101_1111b ; set CH to binary value.
MOV BX, 15Eh ; set BX to 15Eh.
MOV [BX], CX ; copy contents of CX to memory at B800:015E
RET ; returns to operating system.
The stack contains a set of sequentially stored data words, with the last item appearing on
top of the stack.
It is a top-down data structure whose elements are accessed using the stack pointer (SP),
which is decremented by two when a data word is stored on the stack and
incremented by two when a data word is retrieved from the stack back to a CPU
register.
The process of storing the data in the stack is called “Pushing into” the stack and the
process of transferring the data back from the stack to the CPU register is known as
“Popping off” the stack.
The stack is Last-In-First-Out (LIFO) data segment i.e., the data which is pushed last
will be on top of the stack and will be popped off the stack first.
The stack pointer is a 16-bit register that contains the offset address of the memory
location in the stack segment.
The stack segment may have a memory block of a maximum of 64 Kbytes locations
and it may be overlapped with any other segment.
Stack Segment register (SS) contains the base address of the stack segment in the
memory.
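The push/pop behaviour described above can be sketched as a small Python model (illustrative only, not 8086 code; the initial SP value 0100H is an arbitrary assumption):

```python
# Minimal model of the 8086 stack: a word PUSH decrements SP by 2,
# a POP increments it by 2; the last word pushed is popped first (LIFO).

class Stack8086:
    def __init__(self, sp=0x0100):
        self.memory = {}   # offset -> 16-bit word within the stack segment
        self.sp = sp       # stack pointer (offset into the stack segment)

    def push(self, word):
        self.sp -= 2                   # SP is decremented before the store
        self.memory[self.sp] = word & 0xFFFF

    def pop(self):
        word = self.memory[self.sp]    # the word is read, then SP is incremented
        self.sp += 2
        return word

stack = Stack8086()
stack.push(0x1234)
stack.push(0x4232)
assert stack.pop() == 0x4232   # last in, first out
assert stack.pop() == 0x1234
assert stack.sp == 0x0100      # SP is restored after balanced push/pop
```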
Interrupts and Interrupt service routines
An interrupt is a method of creating a temporary halt during program execution, allowing
peripheral devices to gain the attention of the microprocessor. The microprocessor responds to an
interrupt with an ISR (Interrupt Service Routine), which is a short program that instructs the
microprocessor on how to handle the interrupt.
The following image shows the types of interrupts in an 8086 microprocessor −
Hardware Interrupts
Hardware interrupt is caused by any peripheral device by sending a signal through a specified
pin to the microprocessor.
The 8086 has two hardware interrupt pins, NMI and INTR. NMI is a non-maskable
interrupt and INTR is a maskable interrupt of lower priority. A third associated pin,
INTA, is used for interrupt acknowledge.
NMI (Non-Maskable Interrupt)
It is a single non-maskable interrupt pin (NMI) having higher priority than the maskable
interrupt request pin (INTR), and it is a type 2 interrupt.
When this interrupt is activated, these actions take place:
Completes the current instruction that is in progress.
Pushes the Flag register values on to the stack.
Pushes the CS (code segment) value and IP (instruction pointer) value of the return
address on to the stack.
IP is loaded from the contents of the word location 00008H.
CS is loaded from the contents of the next word location 0000AH.
Interrupt flag and trap flag are reset to 0.
INTR
The INTR is a maskable interrupt: the microprocessor will respond to it only if interrupts
have been enabled using the set interrupt flag (STI) instruction, and it can be masked
(disabled) using the clear interrupt flag (CLI) instruction.
The INTR interrupt is activated by an I/O port. If interrupts are enabled and NMI is inactive,
the microprocessor first completes the current instruction and then sends ‘0’ on the INTA pin twice.
The first ‘0’ on INTA informs the external device to get ready, and during the second ‘0’
the microprocessor receives the 8-bit interrupt type, say X, from the programmable interrupt controller.
These actions are taken by the microprocessor:
First completes the current instruction.
Activates INTA output and receives the interrupt type, say X.
Flag register value, CS value of the return address and IP value of the return address
are pushed on to the stack.
IP value is loaded from the contents of word location X × 4.
CS is loaded from the contents of the next word location (X × 4 + 2).
Interrupt flag and trap flag are reset to 0.
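The vector-table arithmetic above (type X gives IP at X × 4 and CS at the next word) can be checked with a short Python sketch:

```python
def vector_addresses(interrupt_type):
    """Return the (IP word address, CS word address) in the 8086 interrupt
    vector table for a given interrupt type (each entry occupies 4 bytes)."""
    ip_addr = interrupt_type * 4
    cs_addr = ip_addr + 2
    return ip_addr, cs_addr

# NMI is type 2: IP comes from 00008H and CS from 0000AH,
# matching the locations given for NMI above.
assert vector_addresses(2) == (0x00008, 0x0000A)
```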
Software Interrupts
Some instructions are inserted at desired positions in the program to create interrupts.
These interrupt instructions can be used to test the working of various interrupt handlers.
Passing parameters to procedures
Parameters can be passed to and from a procedure using:
Register
Memory
Pointers
Stack
Passing parameters using registers-
.model small
.data
MULTIPLICAND DW 1234H
MULTIPLIER DW 4232H
.code
MOV AX, MULTIPLICAND
MOV BX, MULTIPLIER
CALL MULTI
:
:
MULTI PROC NEAR
MUL BX ; Multiply AX by the multiplier passed in BX (result in DX:AX)
RET
MULTI ENDP
:
:
END
The disadvantage of using registers to pass parameters is that the number of registers limits the
number of parameters you can pass.
Passing parameters using memory-
In the cases where few parameters have to be passed to and from a procedure, registers are
convenient. But, in cases when we need to pass a large number of parameters to procedure, we
use memory. This memory may be a dedicated section of general memory or a part of it.
.model small
.data
MULTIPLICAND DW 1234H ; Storage for multiplicand value
MULTIPLIER DW 4232H ; Storage for multiplier value
MULTIPLICATION DW ? ; Storage for multiplication result
.code
MOV AX, @Data
MOV DS, AX
:
:
CALL MULTI
:
:
MULTI PROC NEAR
MOV AX, MULTIPLICAND
MOV BX, MULTIPLIER
:
:
MOV MULTIPLICATION, AX ; Store the multiplication value in named memory location
RET
MULTI ENDP
END
Passing parameter using pointers-
A parameter passing method which overcomes the disadvantage of using data item names (i.e.
variable names) directly in a procedure is to use registers to pass the procedure pointers to the
desired data.
.model small
.data
MULTIPLICAND DB 12H ; Storage for multiplicand value
MULTIPLIER DB 42H ; Storage for multiplier value
MULTIPLICATION DW ? ; Storage for multiplication result
.code
MOV AX, @Data
MOV DS, AX
MOV SI, OFFSET MULTIPLICAND
MOV DI, OFFSET MULTIPLIER
MOV BX, OFFSET MULTIPLICATION
CALL MULTI
:
:
MULTI PROC NEAR
:
:
MOV AL, [SI] ; Get multiplicand value pointed by SI in accumulator
MOV BL, [DI] ; Get multiplier value pointed by DI in BL
:
:
MOV [BX], AX ; Store result in location pointed out by BX
RET
MULTI ENDP
END
Passing parameters using stack-
In order to pass the parameters using stack we push them on the stack before the call for the
procedure in the main program. The instructions used in the procedure read these parameters
from the stack. Whenever stack is used to pass parameters it is important to keep a track of
what is pushed on the stack and what is popped off the stack in the main program.
.model small
.data
MULTIPLICAND DW 1234H
MULTIPLIER DW 4232H
.code
MOV AX, @data
MOV DS, AX
:
:
PUSH MULTIPLICAND
PUSH MULTIPLIER
CALL MULTI
:
:
MULTI PROC NEAR
PUSH BP
MOV BP, SP ; Copies offset of SP into BP
MOV AX, [BP + 6] ; MULTIPLICAND value is available at
; [BP + 6] and is passed to AX
MUL WORD PTR [BP + 4] ; MULTIPLIER value is passed
POP BP
RET 4 ; Return and discard the two word parameters (SP is incremented by 4 after the return address is popped)
MULTI ENDP ; End procedure
END
MACROS:
A Macro is a set of instructions grouped under a single unit. It is another method for
implementing modular programming in the 8086 microprocessors (The first one was using
Procedures).
The Macro differs from the Procedure in that, instead of calling and returning
control as with procedures, the assembler inserts the macro's code into the program every time
and wherever a call to the Macro is made.
And a call to Macro is made just by mentioning the name of the Macro:
It is optional to pass the parameters in the Macro. If you want to pass them to your macros, you
can simply mention them all in the very first statement of the Macro just after the directive:
MACRO.
The advantage of using Macro is that it avoids the overhead time involved in calling and
returning (as in the procedures). Therefore, the execution of Macros is faster as compared to
procedures. Another advantage is that there is no need for accessing stack or providing any
separate memory to it for storing and returning the address locations while shifting the
processor controls in the program.
But it should be noted that every time you call a macro, the assembler places the entire set
of macro instructions into the mainline program at the point from which the call to the Macro
is made. This is known as macro expansion. Because of this, program code which uses
macros takes more memory space than code which uses procedures for implementing the
same task with the same set of instructions. Hence, it is better to use macros for
small instruction sequences containing only a few instructions to execute.
UNIT-IV
Hardware Implementation
The hardware consists of two registers A and B to store the magnitudes, and two flip-flops As and
Bs to store the corresponding signs. The result can be stored in A and As, which act as an
accumulator. Subtraction is performed by adding A to the 2’s complement of B. The output carry is
transferred to the flip-flop E, and an overflow during the add operation is stored in the
add-overflow flip-flop AVF. When m = 0, the output of B is transferred to the parallel adder
without any change, along with an input carry of 0.
The output of the parallel adder is then equal to A + B, which is an add operation. When m = 1, the
content of register B is complemented and transferred to the parallel adder along with an input carry of 1.
Therefore, the output of the parallel adder is equal to A + B’ + 1 = A – B, which is a subtract operation.
Hardware Algorithm
fig: flowchart for add and subtract operations
As and Bs are compared by an exclusive-OR gate. If the output is 0 the signs are identical; if it is 1 the signs are different.
For an add operation, identical signs dictate that the magnitudes be added; for a
subtract operation, different signs dictate that the magnitudes be added.
The magnitudes are added with the microoperation EA ← A + B.
The two magnitudes are subtracted if the signs are different for an add operation or identical for a
subtract operation. The magnitudes are subtracted with the microoperation EA ← A + B’ + 1.
If E = 1, then A ≥ B and the number in A is the correct result (it is checked for zero so that a
zero result is made positive, As = 0). If E = 0, then A < B, so we take the 2’s complement of A.
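As a hedged illustration of the flowchart, the sign and magnitude logic can be simulated in Python (an 8-bit sketch, not the hardware; the function name and register widths are assumptions, and the overflow flip-flop AVF is not modelled):

```python
def addsub(As, A, Bs, B, op, bits=8):
    """Signed-magnitude add/subtract per the flowchart: signs are compared
    with XOR; equal signs (after negating Bs for a subtract) add the
    magnitudes, different signs subtract them via EA = A + B' + 1."""
    mask = (1 << bits) - 1
    if op == 'sub':
        Bs ^= 1                          # subtract = add the negated operand
    if As == Bs:                         # identical signs: add magnitudes
        return As, (A + B) & mask        # (overflow into AVF is ignored here)
    EA = A + ((~B) & mask) + 1           # different signs: EA = A + B' + 1
    E, A = EA >> bits, EA & mask         # E is the end carry
    if E == 0:                           # A < B: take 2's complement of A
        A = ((~A) & mask) + 1
        As ^= 1                          # the result takes the other sign
    if A == 0:
        As = 0                           # avoid a negative zero
    return As, A

assert addsub(0, 25, 0, 17, 'sub') == (0, 8)   # 25 - 17 = +8
assert addsub(0, 17, 0, 25, 'sub') == (1, 8)   # 17 - 25 = -8
assert addsub(1, 5, 0, 5, 'add') == (0, 0)     # -5 + 5 = +0 (As = 0)
```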
Multiplication
Hardware Implementation and Algorithm
Generally, the multiplication of two fixed-point binary numbers in signed-magnitude representation is
performed by a process of successive shift and add operations. The process consists of looking at the
successive bits of the multiplier (least significant bit first). If the multiplier bit is 1, the multiplicand is
copied down; otherwise, 0’s are copied. The numbers
copied down in successive lines are shifted one position to the left and finally, all the numbers are
added to get the product.
But, in digital computers, an adder for the summation of only two binary numbers is used, and
the partial product is accumulated in a register. Similarly, instead of shifting the multiplicand to the
left, the partial product is shifted to the right. The hardware for the multiplication of signed-
magnitude data is shown in the figure below.
Initially, the multiplier is stored in the Q register and the multiplicand in the B register. The A register is
used to store the partial product and the sequence counter (SC) is set to a number equal to the number of
bits in the multiplier. The sum of A and B forms the partial product, and both are shifted to the right using
the statement “shr EAQ” as shown in the hardware algorithm. The flip-flops As, Bs and Qs store the signs
of A, B and Q respectively. A binary ‘0’ is inserted into the flip-flop E during the shift right.
Hardware Algorithm
Example: multiplicand B = 10111 (23), multiplier Q = 10011 (19)

                          E  A      Q      SC
Multiplier in Q           0  00000  10011  101
Qn = 1; add B                10111
First partial product     0  10111
shr EAQ                   0  01011  11001  100
Qn = 1; add B                10111
Second partial product    1  00010
shr EAQ                   0  10001  01100  011
Qn = 0; shr EAQ           0  01000  10110  010
Qn = 0; shr EAQ           0  00100  01011  001
Qn = 1; add B                10111
Fifth partial product     0  11011
shr EAQ                   0  01101  10101  000
Final product in AQ = 0110110101
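The trace above can be reproduced with a short Python simulation of the shift-and-add hardware (an illustrative model, not 8086 code; the 5-bit width matches the worked example):

```python
def multiply(B, Q, bits=5):
    """Shift-and-add multiplication of unsigned magnitudes, mirroring the
    hardware: if Qn = 1 add B into A, then shift E, A, Q right (shr EAQ)."""
    A, E = 0, 0
    mask = (1 << bits) - 1
    for _ in range(bits):                # SC counts down from `bits` to 0
        if Q & 1:                        # Qn = 1: add the multiplicand
            EA = A + B
            E, A = EA >> bits, EA & mask
        # shr EAQ: E moves into the msb of A, lsb of A into msb of Q
        Q = (Q >> 1) | ((A & 1) << (bits - 1))
        A = (A >> 1) | (E << (bits - 1))
        E = 0                            # a binary 0 enters E on the shift
    return (A << bits) | Q               # the product sits in AQ

# the worked example: 10111 (23) x 10011 (19) = 0110110101 (437)
product = multiply(0b10111, 0b10011)
assert product == 437
assert format(product, '010b') == '0110110101'
```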
Booth Algorithm
The algorithm used to multiply binary integers in signed 2’s complement form is called the Booth
multiplication algorithm. It works on the principle that a string of 0’s in the multiplier requires no addition,
only shifting, and that a string of 1’s running from bit weight 2^k down to 2^m can be treated as
2^(k+1) – 2^m (for example, +14 = 001110 has a string of 1’s from 2^3 to 2^1, so 2^4 – 2^1 = 14). The
product M × 14 can therefore be obtained by shifting the binary multiplicand M four times to the left and
subtracting M shifted left once.
According to booth algorithm, the rule for multiplication of binary integers in signed 2’s complement form are:
The multiplicand is subtracted from the partial product upon encountering the first least significant 1
in a string of 1’s in the multiplier.
The multiplicand is added to the partial product upon encountering the first 0 (provided that
there was a previous 1) in a string of 0’s in the multiplier.
The partial product does not change when the current multiplier bit is identical to the previous
multiplier bit.
• This algorithm is used for both the positive and negative numbers in signed 2’s complementform. The
hardware implementation of this algorithm is in figure below:
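The three rules can be sketched in Python as a simulation of Booth's algorithm (illustrative only; the 5-bit operand width is an assumption for the example):

```python
def booth_multiply(multiplicand, multiplier, bits=5):
    """Booth's algorithm on `bits`-bit 2's-complement operands.
    Bit pair (Qn, Qn+1) = (1, 0): subtract the multiplicand;
    (0, 1): add it; (0, 0) or (1, 1): arithmetic shift only."""
    mask = (1 << bits) - 1
    A, Q, Qn1 = 0, multiplier & mask, 0
    M = multiplicand & mask
    for _ in range(bits):
        if (Q & 1, Qn1) == (1, 0):       # first 1 of a string of 1's
            A = (A - M) & mask
        elif (Q & 1, Qn1) == (0, 1):     # first 0 after a string of 1's
            A = (A + M) & mask
        # arithmetic shift right of A, Q, Qn+1 (sign bit of A is kept)
        Qn1 = Q & 1
        Q = (Q >> 1) | ((A & 1) << (bits - 1))
        A = (A >> 1) | (A & (1 << (bits - 1)))
    product = (A << bits) | Q
    if product & (1 << (2 * bits - 1)):  # interpret AQ as signed
        product -= 1 << (2 * bits)
    return product

assert booth_multiply(7, 3) == 21        # works for positive operands
assert booth_multiply(-9, 13) == -117    # and for negative ones
assert booth_multiply(-6, -7) == 42
```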
Example.
Consider that the multiplicand bits are b1 and b0 and the multiplier bits are a1 and a0. The partial
product is c3c2c1c0. The multiplication of two bits such as a0 and b0 produces a binary 1 if both bits are 1;
otherwise, it produces a binary 0. This is identical to an AND operation and can be implemented with
AND gates as shown in the figure.
Division Algorithm
The division of two fixed-point signed numbers can be done by a process of successive compare, shift and
subtract operations. When it is implemented in digital computers, instead of shifting the divisor to the right,
the dividend or the partial remainder is shifted to the left. The subtraction can be obtained by adding the number A
to the 2’s complement of number B. The information about the relative magnitudes of the numbers
can be obtained from the end carry.
Hardware Implementation
Division Algorithm
The divisor is stored in register B and a double-length dividend is stored in registers A and Q. The dividend is
shifted to the left and the divisor is subtracted by adding its 2’s complement value. If E = 1, then A ≥
B. In this case, a quotient bit 1 is inserted into Qn and the partial remainder is shifted to the left to repeat the
process. If E = 0, then A < B. In this case, the quotient bit Qn remains zero and the value of B is added to
restore the partial remainder in A to the previous value. The partial remainder is shifted to the left and the
process continues until the sequence counter reaches 0. The registers E, A and Q are shifted to the left
with 0 inserted into Qn, and the previous value of E is lost, as shown in the flowchart for the division algorithm.
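This restoring-division loop can be sketched in Python (a minimal model of the steps above, not the hardware; the 4-bit divisor width and operand values are assumptions):

```python
def divide(dividend, divisor, bits=4):
    """Restoring division of an unsigned 2*bits-bit dividend by a bits-bit
    divisor: shift AQ left, try A - B; if the result is non-negative
    (E = 1) set Qn = 1, otherwise restore A and leave Qn = 0."""
    mask = (1 << bits) - 1
    A, Q, B = (dividend >> bits) & mask, dividend & mask, divisor & mask
    assert A < B, "divide overflow: quotient would not fit in `bits` bits"
    for _ in range(bits):                # SC counts down from `bits`
        # shl EAQ: A and Q shift left as one double-length register
        A = (A << 1) | (Q >> (bits - 1))
        Q = (Q << 1) & mask
        if A >= B:                       # E = 1 after A - B: quotient bit 1
            A -= B
            Q |= 1
        # else E = 0: B would be added back (restore), quotient bit stays 0
    return Q, A                          # quotient in Q, remainder in A

# 147 / 11 -> quotient 13, remainder 4
assert divide(147, 11) == (13, 4)
```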
Add/Subtract Rule
The steps in addition (FA) or subtraction (FS) of floating-point numbers (s1, e1, f1) and (s2, e2, f2) are as
follows.
1. Unpack the sign, exponent, and fraction fields. Handle special operands such as zero, infinity, or
NaN (not a number).
2. Shift the significand of the number with the smaller exponent right by |e1 – e2| bits.
3. Set the result exponent er to max(e1, e2).
4. If the instruction is FA and s1 = s2, or if the instruction is FS and s1 ≠ s2, then add the
significands; otherwise subtract them.
5. Count the number z of leading zeros. A carry can make z = -1. Shift the result significand left z
bits, or right 1 bit if z = -1.
6. Round the result significand, and shift right and adjust z if there is rounding overflow, which is
a carry-out of the leftmost digit upon rounding.
7. Adjust the result exponent by er = er - z, check for overflow or underflow, and pack the result
sign, biased exponent, and fraction bits into the result word.
Multiplication and division are somewhat easier than addition and subtraction, in that no alignment
of significands is needed.
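The alignment and normalization steps (2 to 5 and 7) can be sketched in Python on toy (sign, exponent, integer-significand) triples; rounding and the special operands of steps 1 and 6 are omitted, so this is only an assumption-laden sketch of the rule, not an IEEE implementation:

```python
def fp_add(x, y):
    """Add two numbers given as (sign, exponent, integer significand),
    where the value is (-1)**sign * significand * 2**exponent."""
    (s1, e1, f1), (s2, e2, f2) = x, y
    # step 2: shift the significand with the smaller exponent right
    if e1 < e2:
        f1 >>= (e2 - e1); e1 = e2
    else:
        f2 >>= (e1 - e2); e2 = e1
    # step 4: add if the signs agree, otherwise subtract
    m1 = f1 if s1 == 0 else -f1
    m2 = f2 if s2 == 0 else -f2
    m = m1 + m2
    s, f, e = (0 if m >= 0 else 1), abs(m), e1   # step 3: er = max(e1, e2)
    # steps 5 and 7: renormalize and adjust the exponent accordingly
    while f != 0 and f % 2 == 0:
        f //= 2; e += 1
    return s, e, f

assert fp_add((0, 0, 3), (0, 0, 5)) == (0, 3, 1)   # 3 + 5 = 8 = 1 * 2**3
assert fp_add((0, 2, 3), (1, 0, 4)) == (0, 3, 1)   # 12 + (-4) = 8
```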
BCD Adder:
A BCD adder is a 4-bit binary adder that is capable of adding two 4-bit words having a BCD (binary-coded
decimal) format. The result of the addition is a BCD-format 4-bit output word, representing the
decimal sum of the addend and augend, and a carry that is generated if this sum exceeds a decimal
value of 9. Decimal addition is thus possible using these devices.
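One BCD adder stage can be sketched in Python to show the add-6 correction applied when the binary sum exceeds 9 (an illustrative model of the rule above, not a gate-level design):

```python
def bcd_add_digit(a, b, carry_in=0):
    """Add two 4-bit BCD digits; if the binary sum exceeds 9, add the
    correction 6 and raise the decimal carry out."""
    s = a + b + carry_in
    if s > 9:
        return (s + 6) & 0b1111, 1   # corrected digit, carry out
    return s, 0

assert bcd_add_digit(5, 3) == (8, 0)     # 5 + 3 = 8, no carry
assert bcd_add_digit(8, 9) == (7, 1)     # 8 + 9 = 17 -> digit 7, carry 1
assert bcd_add_digit(9, 9, 1) == (9, 1)  # 9 + 9 + 1 = 19 -> digit 9, carry 1
```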
Input-Output Organization: Peripheral Devices, Input-Output Interface, Asynchronous Data Transfer, Modes of
Transfer, Priority Interrupt, Direct Memory Access, Input-Output Processor (IOP), Intel 8089 IOP
Input-output subsystems
The Input/output organization of computer depends upon the size of computer and the peripherals connected to
it. The I/O Subsystem of the computer provides an efficient mode of communication between the central
system and the outside environment.
The most common input output devices are: Monitor, Keyboard, Mouse, Printer, Magnetic tapes
Input Output Interface provides a method for transferring information between internal storage and external
I/O devices. Peripherals connected to a computer need special communication links for interfacing them with
the central processing unit. The purpose of the communication link is to resolve the differences that exist between
the central computer and each peripheral. In this case the interface receives one item of data from the
peripheral and places it in its buffer register.
I/O Versus Memory Bus
In addition to communicating with I/O, the processor must communicate with the memory unit. Like the I/O bus,
the memory bus contains data, address and read/write control lines. There are 3 ways that computer buses can
be used to communicate with memory and I/O:
1. Use two Separate buses, one for memory and other for I/O.
2. Use one common bus for both memory and I/O but separate control lines for each.
3. Use one common bus for memory and I/O with common control lines.
Asynchronous data transfer between two independent units requires that control signals be transmitted
between the communicating units so that each can indicate to the other when it sends data. These two methods
can achieve this asynchronous way of data transfer:
o Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the transfer has to
occur.
o Handshaking: In this method, each data item being transferred is accompanied by a
control signal that indicates the presence of data on the bus. The unit receiving the data item responds with
another signal to acknowledge receipt of the data.
The strobe pulse and handshaking methods of asynchronous data transfer are not restricted to I/O transfers.
They are used extensively on numerous occasions requiring the transfer of data between two independent
units. In what follows, we consider the transmitting unit as the source and the receiving unit as the destination.
The strobe control method of asynchronous data transfer employs a single control line to time each transfer.
This control line is known as the strobe, and it may be activated either by the source or by the destination,
depending on which initiates the transfer.
Source initiated strobe: In the below block diagram, you can see that strobe is initiated by source, and as
shown in the timing diagram, the source unit first places the data on the data bus.
Destination-initiated strobe: In the block diagram below, the strobe is initiated by the destination, and in
the timing diagram the destination unit first activates the strobe pulse, informing the source to provide the data.
The source unit responds by placing the requested binary information on the data bus. The data must be valid and
must remain on the bus long enough for the destination unit to accept it.
The falling edge of the strobe pulse can be used to trigger a destination register. The destination unit then
disables the strobe, and finally the source removes the data from the data bus after a predetermined time interval.
In this case, the strobe may be a memory-read control signal from the CPU to a memory unit. The CPU initiates the read
operation to inform the memory, which is the source unit, to place the selected word onto the data bus.
Modes of Transfer
In the programmed I/O technique, the CPU is responsible for extracting data from memory for output and for
storing data in memory on input, as shown in the figure.
Vectored Interrupt: In a vectored interrupt, the source that interrupts the CPU
provides the branch information. This information is called the interrupt
vector.
Non-vectored Interrupt: In a non-vectored interrupt, the branch address is
assigned to a fixed location in memory.
The CPU may be placed in an idle state in a variety of ways. One common method extensively used in
microprocessor is to disable the buses through special control signals such as:
Bus Request (BR)
Bus Grant (BG)
These are the two control signals in the CPU that facilitate the DMA transfer. The Bus Request (BR) input
is used by the DMA controller to request that the CPU relinquish control of the buses. When this input is active,
the CPU terminates the execution of the current instruction and places the address bus, data bus and read/write
lines into a high-impedance state. The high-impedance state means that the output is disconnected.
The CPU activates the Bus Grant (BG) output to inform the external DMA that it can now take control of
the buses to conduct memory transfers without processor intervention.
When the DMA terminates the transfer, it disables the Bus Request (BR) line. The CPU then disables the Bus
Grant (BG) output, takes control of the buses, and returns to its normal operation.
DMA Controller:
The DMA controller needs the usual circuits of an interface to communicate with the CPU and I/O
device. The DMA controller has three registers:
Address Register
Word Count Register
Control Register
Address Register: Address Register contains an address to specify the desired location in memory.
Word Count Register: The WC register holds the number of words to be transferred. The register is decremented
by one after each word transfer and internally tested for zero.
Control Register: Control Register specifies the mode of transfer
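The roles of the address and word-count registers during a transfer can be sketched as a toy Python model (illustrative only; the start address, data values and one-location-per-word addressing are arbitrary assumptions):

```python
def dma_transfer(memory, start_address, words):
    """Model of a DMA burst: the address register supplies successive
    memory locations; the word-count register is decremented after each
    word and tested for zero to terminate the transfer."""
    address_reg = start_address          # address register
    word_count = len(words)              # word count register
    while word_count != 0:
        memory[address_reg] = words[len(words) - word_count]
        address_reg += 1                 # point at the next location
        word_count -= 1                  # tested for zero to end the transfer
    return address_reg

memory = {}
end = dma_transfer(memory, 0x2000, [10, 20, 30])
assert memory[0x2000] == 10 and memory[0x2002] == 30
assert end == 0x2003                     # transfer stopped when WC reached 0
```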
The unit communicates with the CPU via the data bus and control lines. The registers in the DMA are selected
by the CPU through the address bus by enabling the DS (DMA select) and RS (Register select) inputs. The RD
(read) and WR (write) inputs are bidirectional.
When the BG (Bus Grant) input is 0, the CPU can communicate with the DMA registers through the data bus
to read from or write to the DMA registers. When BG =1, the DMA can communicate directly with the
memory by specifying an address in the address bus and activating the RD or WR control.
DMA Transfer:
The CPU communicates with the DMA through the address and data buses as with any interface
unit. The DMA has its own address, which activates the DS and RS lines. The CPU initializes
the DMA through the data bus. Once the DMA receives the start control command, it can
transfer between the peripheral and the memory.When BG = 0 the RD and WR are input lines
allowing the CPU to communicate with the internal DMA registers. When BG=1, the RD and
WR are output lines from the DMA controller to the random access memory to specify the read
or write operation of data.
A memory unit is an essential component in any digital computer since it is needed for storing
programs and data.
1. The memory unit that establishes direct communication with the CPU is called Main
Memory. The main memory is often referred to as RAM (Random Access Memory).
2. The memory units that provide backup storage are called Auxiliary Memory. For
instance, magnetic disks and magnetic tapes are the most commonly used auxiliary
memories.
Apart from the basic classifications of a memory unit, the memory hierarchy consists of all the
storage devices available in a computer system, ranging from the slow but high-capacity
auxiliary memory to the relatively faster main memory.
Auxiliary Memory:
A magnetic disk is a digital computer memory that uses a magnetization process to write,
rewrite and access data. For example, hard drives, zip disks, and floppy disks.
Magnetic tape is a storage medium that allows for data archiving, collection, and backup for
different kinds of data.
Main Memory:
The main memory in a computer system is often referred to as Random Access Memory
(RAM). This memory unit communicates directly with the CPU and with auxiliary memory
devices through an I/O processor.
The programs that are not currently required in the main memory are transferred into auxiliary
memory to provide space for currently used programs and data.
I/O Processor:
The primary function of an I/O Processor is to manage the data transfers between auxiliary
memories and the main memory.
Cache Memory:
The data or contents of the main memory that are used frequently by CPU are stored in the
cache memory so that the processor can easily access that data in a shorter time. Whenever the
CPU requires accessing memory, it first checks the required data into the cache memory. If the
data is found in the cache memory, it is read from the fast memory. Otherwise, the CPU moves
onto the main memory for the required data.
Main Memory
The main memory acts as the central storage unit in a computer system. It is a relatively large
and fast memory which is used to store programs and data during the run time operations.
The primary technology used for the main memory is based on semiconductor integrated
circuits. The integrated circuits for the main memory are classified into two major units: RAM and
ROM chips. The RAM integrated circuit chips are further classified into two possible operating
modes, static and dynamic.
The primary compositions of a static RAM are flip-flops that store the binary information. The
nature of the stored information is volatile, i.e. it remains valid as long as power is applied to
the system. The static RAM is easy to use and takes less time performing read and write
operations as compared to dynamic RAM.
The dynamic RAM exhibits the binary information in the form of electric charges that are
applied to capacitors. The capacitors are integrated inside the chip by MOS transistors. The
dynamic RAM consumes less power and provides large storage capacity in a single memory
chip.
RAM chips are available in a variety of sizes and are used as per the system requirement. The
following block diagram demonstrates the chip interconnection in a 128 * 8 RAM chip.
A 128 * 8 RAM chip has a memory capacity of 128 words of eight bits (one byte) per
word. This requires a 7-bit address and an 8-bit bidirectional data bus.
The 8-bit bidirectional data bus allows the transfer of data either from memory to CPU
during a read operation or from CPU to memory during a write operation.
The read and write inputs specify the memory operation, and the two chip select (CS)
control inputs are for enabling the chip only when the microprocessor selects it.
The bidirectional data bus is constructed using three-state buffers.
The output generated by three-state buffers can be placed in one of the three possible
states which include a signal equivalent to logic 1, a signal equal to logic 0, or a high-
impedance state.
Note: The logic 1 and 0 are standard digital signals whereas the high-impedance state
behaves like an open circuit, which means that the output does not carry a signal and has
no logic significance.
The following function table specifies the operations of a 128 * 8 RAM chip.
From the functional table, we can conclude that the unit is in operation only when CS1 = 1
and CS2 = 0. The bar on top of the second select variable indicates that this input is enabled
when it is equal to 0.
The primary component of the main memory is RAM integrated circuit chips, but a portion of
memory may be constructed with ROM chips.
A ROM memory is used for keeping programs and data that are permanently resident in the
computer.
Apart from the permanent storage of data, the ROM portion of main memory is needed for
storing an initial program called a bootstrap loader. The primary function of the bootstrap
loader program is to start the computer software operating when power is turned on.
ROM chips are also available in a variety of sizes and are also used as per the system
requirement. The following block diagram demonstrates the chip interconnection in a 512 * 8
ROM chip.
A ROM chip has a similar organization as a RAM chip. However, a ROM can only
perform read operation; the data bus can only operate in an output mode.
The 9-bit address lines in the ROM chip specify any one of the 512 bytes stored in it.
The value for chip select 1 and chip select 2 must be 1 and 0 for the unit to operate.
Otherwise, the data bus is said to be in a high-impedance state.
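The address-line counts quoted for these chips follow directly from the number of words, as a quick Python check shows:

```python
from math import log2

def address_lines(words):
    """Number of address lines needed to select one of `words` locations."""
    return int(log2(words))

assert address_lines(128) == 7   # the 128 x 8 RAM chip: 7-bit address
assert address_lines(512) == 9   # the 512 x 8 ROM chip: 9-bit address
```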
Auxiliary Memory
An Auxiliary memory is known as the lowest-cost, highest-capacity and slowest-access storage
in a computer system. It is where programs and data are kept for long-term storage or when not
in immediate use. The most common examples of auxiliary memories are magnetic tapes and
magnetic disks.
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic
coated with magnetized materials. Usually, both sides of the disks are used to carry out
read/write operations. However, several disks may be stacked on one spindle with read/write
head available on each surface.
The following image shows the structural representation for a magnetic disk.
The memory bits are stored in the magnetized surface in spots along the concentric
circles called tracks.
The concentric circles (tracks) are commonly divided into sections called sectors.
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and backup for
different kinds of data. The magnetic tape is constructed using a plastic strip coated with a
magnetic recording medium.
The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or nine
bits are recorded simultaneously to form a character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound.
However, they cannot be started or stopped fast enough between individual characters. For this
reason, information is recorded in blocks referred to as records.
Associative Memory
An associative memory can be considered as a memory unit whose stored data can be identified
for access by the content of the data itself rather than by an address or memory location.
From the block diagram, we can say that an associative memory consists of a memory array
and logic for 'm' words with 'n' bits per word.
The functional registers like the argument register A and key register K each have n bits, one
for each bit of a word. The match register M consists of m bits, one for each memory word.
The words which are kept in the memory are compared in parallel with the content of the
argument register.
The key register (K) provides a mask for choosing a particular field or key in the argument
word. If the key register contains a binary value of all 1's, then the entire argument is compared
with each memory word. Otherwise, only those bits in the argument that have 1's in their
corresponding position of the key register are compared. Thus, the key provides a mask for
identifying a piece of information which specifies how the reference to memory is made.
The following diagram can represent the relation between the memory array and the external
registers in an associative memory.
The cells present inside the memory array are marked by the letter C with two subscripts. The
first subscript gives the word number and the second specifies the bit position in the word. For
instance, the cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array, provided
that Kj = 1. This process is done for all columns j = 1, 2, ..., n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the
argument and the word do not match, Mi is cleared to 0.
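The masked comparison above can be sketched in Python (an illustrative model; the 8-bit words, argument and key values are arbitrary assumptions):

```python
def associative_match(words, argument, key):
    """Set match bit Mi = 1 for every stored word whose unmasked bits
    (positions where the key register K holds 1) equal the argument's."""
    return [1 if (word ^ argument) & key == 0 else 0 for word in words]

words = [0b10111000, 0b10011100, 0b01011010]
# compare only the upper four bits (K masks out the lower four)
M = associative_match(words, argument=0b10010000, key=0b11110000)
assert M == [0, 1, 0]   # only the second word matches in the key field
```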
Cache Memory
The data or contents of the main memory that are used frequently by CPU are stored in the
cache memory so that the processor can easily access that data in a shorter time. Whenever the
CPU needs to access memory, it first checks the cache memory. If the data is not found in
cache memory, then the CPU moves into the main memory.
Cache memory is placed between the CPU and the main memory. The block diagram for a
cache memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the speed of CPU
components.
When the CPU needs to access memory, the cache is examined. If the word is found in
the cache, it is read from the fast memory.
If the word addressed by the CPU is not found in the cache, the main memory is
accessed to read the word.
A block of words containing the one just accessed is then transferred from main memory to cache memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the one just accessed.
The performance of the cache memory is frequently measured in terms of a quantity
called hit ratio.
When the CPU refers to memory and finds the word in cache, it is said to produce a hit.
If the word is not found in the cache, it is in main memory and it counts as a miss.
The ratio of the number of hits divided by the total CPU references to memory (hits
plus misses) is the hit ratio.
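The hit-ratio definition above is simple arithmetic and can be checked directly. The address trace and cache contents below are made up purely for illustration.

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / total CPU references to memory (hits + misses)."""
    return hits / (hits + misses)

def count_hits(references, cache):
    """Count hits and misses for a reference trace against a set of cached addresses."""
    hits = sum(1 for addr in references if addr in cache)
    return hits, len(references) - hits

refs = [5, 9, 5, 3, 5, 9, 1, 5]   # hypothetical address trace
cache = {5, 9}                    # hypothetical cache contents

h, m = count_hits(refs, cache)
print(hit_ratio(h, m))            # 0.75
```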
Parallel Processing
A parallel processing system can carry out simultaneous data-processing to achieve faster execution time. For instance, while an instruction is being processed in the ALU component of the CPU, the next instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer processing capability
and increase its throughput, i.e. the amount of processing that can be accomplished during a
given interval of time.
A parallel processing system can be achieved by having a multiplicity of functional units that
perform identical or different operations simultaneously. The data can be distributed among
various multiple functional units.
The following diagram shows one possible way of separating the execution unit into eight
functional units operating in parallel.
The operation performed in each functional unit is indicated in each block of the diagram:
The adder and integer multiplier perform the arithmetic operations with integer
numbers.
The floating-point operations are separated into three circuits operating in parallel.
The logic, shift, and increment operations can be performed concurrently on different
data. All units are independent of each other, so one number can be shifted while
another number is being incremented.
Pipelining
The term pipelining refers to a technique of decomposing a sequential process into sub-operations, with each sub-operation being executed in a dedicated segment that operates concurrently with all other segments.
The most important characteristic of a pipeline technique is that several computations can be
in progress in distinct segments at the same time. The overlapping of computation is made
possible by associating a register with each segment in the pipeline. The registers provide
isolation between each segment so that each can operate on distinct data simultaneously.
Let us consider an example of combined multiplication and addition operation to get a better
understanding of the pipeline organization.
The combined multiplication and addition operation is done with a stream of numbers such as:
Ai * Bi + Ci for i = 1, 2, 3, ..., 7
The operation to be performed on the numbers is decomposed into sub-operations with each
sub-operation to be implemented in a segment within a pipeline.
The sub-operations performed in each segment of the pipeline are defined as:
R1 ← Ai, R2 ← Bi (input Ai and Bi)
R3 ← R1 * R2, R4 ← Ci (multiply and input Ci)
R5 ← R3 + R4 (add Ci to product)
The following block diagram represents the combined as well as the sub-operations performed
in each segment of the pipeline.
Registers R1, R2, R3, and R4 hold the data and the combinational circuits operate in a particular
segment.
The output generated by the combinational circuit in a given segment is applied to the input register of the next segment. For instance, from the block diagram, we can see that register R3 is used as one of the input registers for the combinational adder circuit.
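The clock-by-clock behaviour of this pipeline can be sketched in software. The staging below follows the register names R1 through R5 from the block diagram; the simulation style and the function name `pipeline_multiply_add` are assumptions made for illustration.

```python
def pipeline_multiply_add(A, B, C):
    """Simulate the three-segment pipeline computing Ai*Bi + Ci.
    Segment 1: R1 <- Ai, R2 <- Bi
    Segment 2: R3 <- R1*R2, R4 <- Ci
    Segment 3: R5 <- R3 + R4
    """
    n = len(A)
    R1 = R2 = R3 = R4 = None
    results = []
    # Segments are updated last-to-first so each consumes what the
    # previous segment latched on the preceding clock pulse.
    for clock in range(n + 2):          # n operands + 2 cycles to drain
        if R3 is not None:
            results.append(R3 + R4)     # Segment 3: R5 <- R3 + R4
        if R1 is not None:
            R3, R4 = R1 * R2, C[clock - 1]   # Segment 2
        else:
            R3 = R4 = None
        if clock < n:
            R1, R2 = A[clock], B[clock]      # Segment 1
        else:
            R1 = R2 = None
    return results

print(pipeline_multiply_add([1, 2, 3], [4, 5, 6], [7, 8, 9]))   # [11, 18, 27]
```

Note that the first result emerges after the pipeline fills (three clock pulses), and thereafter one result is produced per clock, which is the point of the overlap.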
In general, the pipeline organization is applicable for two areas of computer design which
includes:
1. Arithmetic Pipeline
2. Instruction Pipeline
Arithmetic Pipeline
Arithmetic Pipelines are mostly used in high-speed computers. They are used to implement
floating-point operations, multiplication of fixed-point numbers, and similar computations
encountered in scientific problems.
To understand the concepts of arithmetic pipeline in a more convenient way, let us consider an
example of a pipeline unit for floating-point addition and subtraction.
The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers defined as:
X = A * 2^a
Y = B * 2^b
where A and B are two fractions that represent the mantissas and a and b are the exponents. For the worked example below, decimal values are used:
X = 0.9504 * 10^3
Y = 0.8200 * 10^2
The combined operation of floating-point addition and subtraction is divided into four
segments. Each segment contains the corresponding suboperation to be performed in the given
pipeline. The suboperations that are shown in the four segments are:
We will discuss each suboperation in a more detailed manner later in this section.
The following block diagram represents the suboperations performed in each segment of the
pipeline.
Note: Registers are placed after each suboperation to store the intermediate results.
1. Compare exponents by subtraction:
The exponents are compared by subtracting them to determine their difference. The larger
exponent is chosen as the exponent of the result.
The difference of the exponents, i.e., 3 - 2 = 1, determines how many times the mantissa associated with the smaller exponent must be shifted to the right.
2. Align the mantissas:
The mantissa associated with the smaller exponent is shifted right according to the difference of exponents determined in segment one:
X = 0.9504 * 10^3
Y = 0.08200 * 10^3
3. Add mantissas:
Z = X + Y = 1.0324 * 10^3
4. Normalize the result:
Z = 0.10324 * 10^4
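The four suboperations can be sketched end to end in software. The code below mirrors the decimal worked example; the (mantissa, exponent) pair representation and the function name `fp_add` are assumptions made for illustration.

```python
def fp_add(x, y):
    """Add two numbers given as (mantissa, exponent) pairs, value = m * 10**e."""
    (a, ea), (b, eb) = x, y
    # Segment 1: compare exponents by subtraction; keep the larger exponent.
    if ea < eb:
        (a, ea), (b, eb) = (b, eb), (a, ea)
    diff = ea - eb
    # Segment 2: align -- shift the mantissa with the smaller exponent right.
    b = b / (10 ** diff)
    # Segment 3: add mantissas.
    z = a + b
    # Segment 4: normalize the result back into the range [0.1, 1).
    e = ea
    while abs(z) >= 1:
        z, e = z / 10, e + 1
    return round(z, 5), e

print(fp_add((0.9504, 3), (0.8200, 2)))   # (0.10324, 4)
```

In the hardware pipeline each of these four steps runs in its own segment with a latch register after it, so four different additions can be in flight at once.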
Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as well.
Most of the digital computers with complex instructions require instruction pipeline to carry
out operations like fetch, decode and execute instructions.
In general, the computer needs to process each instruction with the following sequence of steps: fetch the instruction from memory, decode the instruction, calculate the effective address, fetch the operands from memory, execute the instruction, and store the result.
Each step is executed in a particular segment, and there are times when different segments may
take different times to operate on the incoming information. Moreover, there are times when
two or more segments may require memory access at the same time, causing one segment to
wait until another is finished with the memory.
The organization of an instruction pipeline will be more efficient if the instruction cycle is
divided into segments of equal duration. One of the most common examples of this type of
organization is a Four-segment instruction pipeline.
A four-segment instruction pipeline combines two or more of these steps into a single segment. For instance, the decoding of the instruction can be combined with the calculation of the effective address into one segment.
The following block diagram shows a typical example of a four-segment instruction pipeline.
The instruction cycle is completed in four segments.
Segment 1:
The instruction fetch segment can be implemented using a first-in, first-out (FIFO) buffer.
Segment 2:
The instruction fetched from memory is decoded in the second segment, and eventually, the
effective address is calculated in a separate arithmetic circuit.
Segment 3:
An operand from memory is fetched in the third segment.
Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
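The overlap among the four segments can be sketched as a space-time schedule. The segment names below (FI, DA, FO, EX) are conventional labels for fetch, decode/address, operand fetch, and execute; the function name and the no-stall assumption are illustrative.

```python
# In an ideal pipeline with no stalls, instruction i (0-based) occupies
# segment s during clock pulse i + s + 1.
SEGMENTS = ["FI", "DA", "FO", "EX"]

def schedule(n_instructions):
    """Return {clock: [(instruction, segment), ...]} for an ideal 4-stage pipeline."""
    table = {}
    for i in range(n_instructions):
        for s, name in enumerate(SEGMENTS):
            table.setdefault(i + s + 1, []).append((i + 1, name))
    return table

timetable = schedule(3)
# n instructions through k segments complete in k + (n - 1) clocks.
print(max(timetable))     # 6
print(timetable[4])       # [(1, 'EX'), (2, 'FO'), (3, 'DA')]
```

At clock pulse 4 the pipeline is full: instruction 1 executes while instruction 2 fetches its operand and instruction 3 is being decoded, which is exactly the overlap the text describes.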
RISC Pipeline
RISC stands for Reduced Instruction Set Computer. The RISC approach was introduced so that instructions execute as fast as one per clock cycle, and the RISC pipeline helps simplify the design of the computer architecture.
The contrast with earlier designs relates to what is known as the Semantic Gap: the difference between the operations provided in high-level languages (HLLs) and those provided in computer architectures. To narrow this gap, the conventional response of computer architects was to add layers of complexity to newer architectures, increasing the number and complexity of instructions together with the number of addressing modes. The architectures that resulted from this "add more complexity" approach are known as Complex Instruction Set Computers (CISC).
The RISC goal of one instruction per clock cycle cannot always be achieved, because not every instruction can be fetched from memory and executed in a single clock cycle under all circumstances.
The way to approach one instruction per clock cycle is to initiate a new instruction on every clock cycle and to pipeline the processor, thereby pursuing the objective of single-cycle instruction execution.
The RISC compiler translates the high-level language program into a machine language program. Issues such as data conflicts and branch penalties are managed by RISC processors in cooperation with the compiler, which identifies and reduces the delays these issues cause.
Principles of RISCs Pipeline
The RISC pipeline is organized around several design principles.
Vector Processor
A vector processor is a central processing unit that can execute an entire vector of input data with a single instruction.
The elements of the vector are stored at successive memory addresses so that the data can be accessed sequentially.
It holds a single control unit but has multiple execution units that perform the same operation
on different data elements of the vector.
Unlike scalar processors, which operate on only a single pair of data at a time, a vector processor operates on multiple pairs of data. Scalar code can be converted into vector code; this conversion process is known as vectorization.
Such instructions are called single-instruction, multiple-data (SIMD) or vector instructions.
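The idea of vectorization can be sketched as follows: a scalar loop over element pairs is rewritten as a single whole-vector operation. Plain Python lists stand in for vector registers here; the function names are made up for the example.

```python
def scalar_add(a, b):
    # Scalar code: one pair of operands processed per "instruction".
    c = []
    for i in range(len(a)):
        c.append(a[i] + b[i])
    return c

def vector_add(a, b):
    # Vectorized form: conceptually one instruction applied to every element.
    return [x + y for x, y in zip(a, b)]

a, b = [1, 2, 3, 4], [10, 20, 30, 40]
print(vector_add(a, b))                    # [11, 22, 33, 44]
print(scalar_add(a, b) == vector_add(a, b))   # True
```

On real vector hardware the second form needs one instruction issue instead of one per element, which is where the speedup comes from.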
Architecture:
Once an instruction is fetched, the instruction processing unit (IPU) determines whether the fetched instruction is scalar or vector in nature. If it is scalar, the instruction is transferred to the scalar register and scalar processing is performed.
When the instruction is vector in nature, it is fed to the vector instruction controller. The controller first decodes the vector instruction and then determines the address of the vector operand in memory.
It then signals the vector access controller to fetch the required operand from memory. Once the operand is fetched, it is provided to the instruction register so that it can be processed by the vector processor.
When multiple vector instructions are present, the vector instruction controller provides them to the task system. If the task system indicates that a vector task is very long, the processor divides the task into subvectors.
These subvectors are fed to the vector processor, which uses several pipelines to execute the instruction over the operands fetched from memory at the same time. The various vector instructions are scheduled by the vector instruction controller.
In memory to memory architecture, source operands, intermediate and final results are
retrieved (read) directly from the main memory. For memory to memory vector instructions,
the information of the base address, the offset, the increment, and the vector length must be
specified in order to enable streams of data transfers between the main memory and pipelines.
The processors like TI-ASC, CDC STAR-100, and Cyber-205 have vector instructions in
memory to memory formats. The main points about memory to memory architecture are:
There is no limitation of size
Speed is comparatively slow in this architecture
In register to register architecture, operands and results are retrieved indirectly from the main memory through the use of a large number of vector registers or scalar registers. The
processors like Cray-1 and the Fujitsu VP-200 use vector instructions in register to register
formats. The main points about register to register architecture are:
Register to register architecture has limited size.
Speed is very high as compared to the memory to memory architecture.
The hardware cost is high in this architecture.
Array Processors
An array processor performs computations on large arrays of data. There are two types of array processors: the attached array processor and the SIMD array processor. These are explained below.
Attached Array Processor:
To improve the performance of the host computer in numerical computation tasks, an auxiliary processor is attached to it.
The local memory of the attached processor is connected to the main memory. The host computer is a general-purpose computer, and the attached processor is a back-end machine driven by the host.
The array processor is connected through an I/O controller to the computer, and the computer treats it as an external interface.
SIMD array processor:
A SIMD array processor is a computer with multiple processing units operating in parallel. Both types of array processors manipulate vectors, but their internal organization is different.
The processing units are synchronized to perform the same operation under the control of a common control unit, thus providing a single instruction stream, multiple data stream (SIMD) organization. As shown in the figure, a SIMD machine contains a set of identical processing elements (PEs), each having a local memory M.
Each PE includes:
ALU
Floating point arithmetic unit
Working registers
The master control unit controls the operation of the PEs. Its function is to decode each instruction and determine how the instruction is to be executed. If the instruction is a scalar or program control instruction, it is executed directly within the master control unit.
Main memory is used for storage of the program while each PE uses operands stored in its
local memory.
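The SIMD organization above can be sketched as follows: one control unit broadcasts the same operation to every PE, and each PE applies it to operands in its own local memory. The function name and the operand layout are assumptions made for illustration.

```python
def simd_broadcast(op, local_memories):
    """The master control unit broadcasts one operation; each processing
    element (PE) applies it to the operand pair in its local memory M."""
    return [op(*memory) for memory in local_memories]

# Four PEs, each holding a pair of local operands:
locals_ = [(1, 2), (3, 4), (5, 6), (7, 8)]
print(simd_broadcast(lambda x, y: x + y, locals_))   # [3, 7, 11, 15]
```

A single broadcast thus produces one result per PE in the same step: a single instruction stream acting on multiple data streams.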