
DIGITAL NOTES

ON
COMPUTER ORGANIZATION & MICROPROCESSORS
(R20A1201)

B.TECH II YEAR I SEM

DEPARTMENT OF INFORMATION TECHNOLOGY

MALLA REDDY COLLEGE OF ENGINEERING & TECHNOLOGY


(Autonomous Institution – UGC, Govt. of India)
(Affiliated to JNTUH, Hyderabad, Approved by AICTE - Accredited by NBA & NAAC – ‘A’ Grade - ISO 9001:2015 Certified)
Maisammaguda, Dhulapally (Post Via. Hakimpet), Secunderabad – 500100, Telangana State, INDIA.

L/T/P/C
3/-/-/3
(R20A1201) COMPUTER ORGANIZATION & MICROPROCESSORS

COURSE OBJECTIVES:
Students should be able:
1. To understand basic components of computers and architecture of 8086 microprocessor.
2. To learn to classify the instruction formats and various addressing modes of 8086
microprocessor.
3. To know how to represent the data and understand how computations are
performed at machine level.
4. To have knowledge of the memory organization and I/O Organization.
5. To understand the parallelism both in terms of single and multiple processors.

UNIT - I
Digital Computers: Introduction, Block diagram of Digital Computer, Definition of
Computer Organization, Computer Design and Computer Architecture.
Basic Computer Organization and Design: Instruction codes, Computer Registers,
Computer instructions, Timing and Control, Instruction cycle, Memory Reference Instructions,
Input – Output and Interrupt, Complete Computer Description.
Micro Programmed Control: Control memory, Address sequencing, micro program
example, design of control unit.

UNIT - II
Central Processing Unit: The 8086 Processor Architecture, Register organization, Physical
memory organization, General Bus Operation, I/O Addressing Capability, Special Processor Activities,
Minimum and Maximum mode system and timings.
8086 Instruction Set and Assembler Directives-Machine language instruction formats,
Addressing modes, Instruction set of 8086, Assembler directives and operators.

UNIT - III
Assembly Language Programming with 8086- Machine level programs, Machine coding
the programs, Programming with an assembler, Assembly Language example programs. Stack
structure of 8086, Interrupts and Interrupt service routines, Interrupt cycle of 8086, Interrupt
programming, Passing parameters to procedures, Macros, Timings and Delays.
UNIT - IV
Computer Arithmetic: Introduction, Addition and Subtraction, Multiplication Algorithms,
Division Algorithms, Floating - point Arithmetic operations.
Input-Output Organization: Peripheral Devices, Input-Output Interface, Asynchronous
data transfer, Modes of Transfer, Priority Interrupt, Direct memory Access, Input – Output
Processor (IOP), Intel 8089 IOP.

UNIT - V
Memory Organization: Memory Hierarchy, Main Memory, Auxiliary memory, Associative
Memory, Cache Memory.
Pipeline and Vector Processing: Parallel Processing, Pipelining, Arithmetic Pipeline,
Instruction Pipeline, RISC Pipeline, Vector Processing, Array Processors.

TEXT BOOKS:
1. Computer Organization and Architecture, William Stallings, 9th Edition, Pearson.
2. Microprocessors and Interfacing, D V Hall, SSSP Rao, 3rd edition, McGraw Hill
India Education Private Ltd.

REFERENCE BOOKS:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky: Computer Organization,
5th Edition, Tata McGraw Hill, 2002.
2. David A. Patterson, John L. Hennessy: Computer Organization and Design –
The Hardware/Software Interface ARM Edition, 4th Edition, Elsevier, 2009.

COURSE OUTCOMES:
Students will be able:

• To identify the basic components and the design of CPU, ALU and Control Unit.
• To interpret memory hierarchy and describe the impact on computer
cost/performance.
• To express instruction level parallelism and pipelining for high performance
processor design.
• To represent the instruction set, instruction formats and addressing modes
of 8086.
• To write assembly language programs to solve problems.

UNIT-1
Lecture Notes
Digital Computers: Introduction, Block diagram of Digital Computer, Definition of Computer Organization,
Computer Design and Computer Architecture.
Basic Computer Organization and Design: Instruction codes, Computer Registers, Computer instructions,
Timing and Control, Instruction cycle, Memory Reference Instructions, Input – Output and Interrupt,
Complete Computer Description.
Micro Programmed Control: Control memory, Address sequencing, micro program example,
design of control unit.

Introduction: A digital computer can be considered as a digital system that performs various computational
tasks. The first electronic digital computer was developed in the late 1940s and was used primarily for
numerical computations. By convention, digital computers use the binary number system, which has two
digits: 0 and 1. A binary digit is called a bit. A computer system is subdivided into two functional entities:
hardware and software. The hardware consists of all the electronic components and electromechanical devices that
comprise the physical entity of the device. The software of the computer consists of the instructions and data that
the computer manipulates to perform various data-processing tasks.
• The Central Processing Unit (CPU) contains an arithmetic and logic unit for manipulating data, a number
of registers for storing data, and a control circuit for fetching and executing instructions.
• The memory unit of a digital computer contains storage for instructions and data.
• The Random Access Memory (RAM) is used for real-time processing of the data.
• The Input-Output devices generate inputs from the user and display the final results to the user.
• The Input-Output devices connected to the computer include the keyboard, mouse, terminals, magnetic
disk drives, and other communication devices.

Computer Organization:
Computer Organization is the realization of what is specified by the computer architecture. It deals with how
operational attributes are linked together to meet the requirements specified by the computer architecture. Some
organizational attributes are hardware details, control signals, and peripherals.

EXAMPLE: Say you are in a company that manufactures cars. The design and specification of the car come
under computer architecture (abstract, programmer's view), while making its parts piece by piece and connecting
together the different components of that car, keeping the basic design in mind, comes under computer
organization (physical and visible).

Computer Architecture:

Computer Architecture deals with the operational attributes of the computer, or of the processor to be specific. It
deals with details like physical memory, the ISA (Instruction Set Architecture) of the processor, the number of bits
used to represent the data types, the input/output mechanism and the technique for addressing memory.

Computer Architecture | Computer Organization
----------------------+----------------------
Computer Architecture is concerned with the structure and behaviour of a computer system as seen by the user. | Computer Organization is concerned with the way hardware components are connected together to form a computer system.
It acts as the interface between hardware and software. | It deals with the components of a system and their connections.
Computer Architecture helps us to understand the functionalities of a system. | Computer Organization tells us how exactly all the units in the system are arranged and interconnected.
A programmer can view architecture in terms of instructions, addressing modes and registers. | Organization expresses the realization of architecture.
While designing a computer system, architecture is considered first. | An organization is done on the basis of architecture.
Computer Architecture deals with high-level design issues. | Computer Organization deals with low-level design issues.
Architecture involves logic (instruction sets, addressing modes, data types, cache optimization). | Organization involves physical components (circuit design, adders, signals, peripherals).

Basic Computer Organization and Design:

Instruction Codes
Computer instructions are the basic components of a machine language program. They are also known as macro
operations, since each one is comprised of a sequence of micro operations. Each instruction initiates a sequence
of micro operations that fetch operands from registers or memory, possibly perform arithmetic, logic, or shift
operations, and store results in registers or memory.

Instructions are encoded as binary instruction codes. Each instruction code contains an operation code,
or opcode, which designates the overall purpose of the instruction (e.g. add, subtract, move, input, etc.). The
number of bits allocated for the opcode determines how many different instructions the architecture supports.
In addition to the opcode, many instructions also contain one or more operands, which indicate where in
registers or memory the data required for the operation is located. For example, an add instruction requires two
operands, and a not instruction requires one.
 15      12 11       6 5        0
+----------+----------+----------+
|  Opcode  | Operand  | Operand  |
+----------+----------+----------+
The opcode and operands are most often encoded as unsigned binary numbers in order to minimize the number
of bits used to store them. For example, a 4-bit opcode encoded as a binary number could represent up to 16
different operations.

The control unit is responsible for decoding the opcode and operand bits in the instruction register, and then
generating the control signals necessary to drive all other hardware in the CPU to perform the sequence of
micro operations that comprise the instruction.

Basic Computer Instruction Format:


The Basic Computer has a 16-bit instruction code similar to the examples described above. It supports direct
and indirect addressing modes.
How many bits are required to specify the addressing mode?
 15  14    12 11                    0
+---+--------+-----------------------+
| I |   OP   |        ADDRESS        |
+---+--------+-----------------------+
I = 0: direct
I = 1: indirect

Computer Instructions
All Basic Computer instruction codes are 16 bits wide. There are 3 instruction code formats:
Memory-reference instructions take a single memory address as an operand, and have the format:
 15  14    12 11                    0
+---+--------+-----------------------+
| I |   OP   |        Address        |
+---+--------+-----------------------+
If I = 0, the instruction uses direct addressing. If I = 1, addressing is indirect. How many memory-reference
instructions can exist?
Register-reference instructions operate solely on the AC register, and have the following format:
 15  14    12 11                    0
+---+--------+-----------------------+
| 0 |  111   |          OP           |
+---+--------+-----------------------+
How many register-reference instructions can exist? How many memory-reference instructions can coexist with
register-reference instructions?

Input/output instructions have the following format:
 15  14    12 11                    0
+---+--------+-----------------------+
| 1 |  111   |          OP           |
+---+--------+-----------------------+
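
The three formats above can be told apart from the bit fields alone. The following is a minimal Python sketch of that decoding, not part of the original notes; the function name `decode_instruction` and the returned dictionary layout are hypothetical and chosen only for illustration.

```python
def decode_instruction(word):
    """Split a 16-bit Basic Computer instruction word into its fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    i_bit = (word >> 15) & 0x1   # bit 15: addressing-mode bit I
    opcode = (word >> 12) & 0x7  # bits 14-12: 3-bit operation code
    low12 = word & 0x0FFF        # bits 11-0: address or operation bits

    if opcode != 0b111:
        # Any opcode other than 111 is a memory-reference instruction.
        kind = "memory-reference"
        mode = "indirect" if i_bit else "direct"
    elif i_bit == 0:
        kind = "register-reference"
        mode = None
    else:
        kind = "input/output"
        mode = None
    return {"kind": kind, "I": i_bit, "opcode": opcode, "low12": low12, "mode": mode}


# Example words (arbitrary values chosen for illustration).
for word in (0x1234, 0x9234, 0x7800, 0xF800):
    print(hex(word), decode_instruction(word))
```

Because opcode 111 is reserved for the register-reference and input/output formats, the sketch treats only the remaining opcode values as memory-reference instructions.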
Timing and Control
All sequential circuits in the Basic Computer CPU are driven by a master clock, with the exception of the INPR
register. At each clock pulse, the control unit sends control signals to control inputs of the bus, the registers, and
the ALU.

Control unit design and implementation can be done by two general methods:
• A hardwired control unit is designed from scratch using traditional digital logic design techniques to
produce a minimal, optimized circuit. In other words, the control unit is like an ASIC (application-
specific integrated circuit).

• A micro-programmed control unit is built from some sort of ROM. The desired control signals are
simply stored in the ROM, and retrieved in sequence to drive the micro operations needed by a
particular instruction.

Micro programmed control:

Micro programmed control is a control mechanism that generates control signals by using a memory called
control storage (CS), which contains the control signals. Although micro programmed control seems to
be advantageous to CISC machines, since CISC requires systematic development of sophisticated control
signals, there is no intrinsic difference between micro programmed control and hardwired control.
Hard-wired control:
Hardwired control is a control mechanism that generates control signals by using an appropriate finite state machine
(FSM). The pair of "microinstruction register" and "control storage address register" can be regarded as a "state
register" for the hardwired control. Note that the control storage can be regarded as a kind of combinational
logic circuit. We can assign any 0, 1 values to each output corresponding to each address, which can be
regarded as the input for a combinational logic circuit. In other words, the contents of the control storage
form a truth table.

Instruction Cycle
In this chapter, we examine the sequences of micro operations that the Basic Computer goes through for each
instruction. Here, you should begin to understand how the required control signals for each state of the CPU are
determined, and how they are generated by the control unit.
The CPU performs a sequence of micro operations for each instruction. The sequence for each instruction of the
Basic Computer can be refined into 4 abstract phases:
1. Fetch instruction
2. Decode
3. Fetch operand
4. Execute
Program execution can be represented as a top-down design:

1. Program execution
   a. Instruction 1
      i. Fetch instruction
      ii. Decode
      iii. Fetch operand
      iv. Execute
   b. Instruction 2
      i. Fetch instruction
      ii. Decode
      iii. Fetch operand
      iv. Execute
   c. Instruction 3 ...

Program execution begins with:

PC ← address of first instruction, SC ← 0
After this, the SC is incremented at each clock cycle until an instruction is completed, and then it is cleared to
begin the next instruction. This process repeats until a HLT instruction is executed, or until the power is shut off.

Instruction Fetch and Decode:

The instruction fetch and decode phases are the same for all instructions, so the control
functions and micro operations will be independent of the instruction code. Everything that
happens in this phase is driven entirely by timing variables T0, T1 and T2. Hence, all control
inputs in the CPU during fetch and decode are functions of these three variables alone.
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
T2: D0-7 ← decoded IR(12-14), AR ← IR(0-11), I ← IR(15)
For every timing cycle, we assume SC ← SC + 1 unless it is stated that SC ← 0.
The operation D0-7 ← decoded IR(12-14) is not a register transfer like most of our micro operations, but is
actually an inevitable consequence of loading a value into the IR register. Since the IR outputs 12-14 are directly
connected to a decoder, the outputs of that decoder will change as soon as the new values of IR(12-14) propagate
through the decoder.
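
The three timing steps T0 to T2 can be traced in software. The Python sketch below is only an illustration of the register transfers listed above, not a model of the real hardware; the function name `fetch_and_decode`, the use of a dictionary for memory, and the 12-bit width assumed for PC and AR (taken from the 12-bit address field) are assumptions.

```python
def fetch_and_decode(memory, pc):
    """Run timing steps T0-T2 once and return the resulting register values."""
    # T0: AR <- PC
    ar = pc & 0x0FFF
    # T1: IR <- M[AR], PC <- PC + 1
    ir = memory[ar] & 0xFFFF
    pc = (pc + 1) & 0x0FFF
    # T2: D0..D7 <- decoded IR(12-14), AR <- IR(0-11), I <- IR(15)
    opcode = (ir >> 12) & 0x7
    d = [1 if n == opcode else 0 for n in range(8)]  # one-hot decoder outputs
    ar = ir & 0x0FFF
    i_bit = (ir >> 15) & 0x1
    return {"PC": pc, "AR": ar, "IR": ir, "I": i_bit, "D": d}


# Example: one instruction word stored at address 0x100 (arbitrary values).
state = fetch_and_decode({0x100: 0x9234}, pc=0x100)
print(state)
```

Exactly one of the eight decoder outputs D0 to D7 is 1 after T2, which is why later control functions can be written in terms of a single D line.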

In hardware development, unlike serial software development, it is often advantageous to
perform work that may not be necessary. Since we can perform multiple micro operations at the same time,
we might as well do everything that might be useful at the earliest possible time. For example, loading AR
with the address field from IR at T2 is only useful if the instruction is a memory-reference instruction. We
won't know this until T3, but there is no reason to wait since there is no harm in loading AR immediately.
Input-Output and Interrupt:

The Basic Computer I/O consists of a simple terminal with a keyboard and a printer/monitor.
The keyboard is connected serially (1 data wire) to the INPR register. INPR is a shift register capable of shifting
in external data from the keyboard one bit at a time. INPR outputs are connected in parallel to the ALU.
                  Shift enable
                       |
                       v
+----------+   1   +--------+
| Keyboard |---/-->| INPR  <|--- serial I/O clock
+----------+       +--------+
                       |
                       / 8
                       |
                       v
                   +--------+
                   |  ALU   |
                   +--------+
                       |
                       / 16
                       v
                   +--------+
                   |  AC   <|--- CPU master clock
                   +--------+

I/O Operations:
Since input and output devices are not under the full control of the CPU (I/O events are asynchronous), the
CPU must somehow be told when an input device has new input ready to send, and when an output device is ready
to receive more output. The FGI flip-flop is set to 1 after a new character is shifted into INPR. This is done by
the I/O interface, not by the control unit. This is an example of an asynchronous input event (not synchronized
with or controlled by the CPU).

The FGI flip-flop must be cleared after transferring the INPR to AC. This must be done as a
micro operation controlled by the CU, so we must include it in the CU design. The FGO flip-flop is
set to 1 by the I/O interface after the terminal has finished displaying the last character sent. It must
be cleared by the CPU after transferring a character into OUTR. Since the keyboard controller only
sets FGI and the CPU only clears it, a JK flip-flop is convenient:

                          +-------+
Keyboard controller ----->| J   Q |----->
                          |       |
        +--------\        |       |
        |         ) OR >->|> FGI  |
        +--------/        |       |
                          |       |
CPU --------------------->| K     |
                          +-------+
How do we control the CK input on the FGI flip-flop? (Assume leading-edge triggering.)
There are two common methods for detecting when I/O devices are ready, namely software polling and
interrupts. These two methods are discussed in the following sections.
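
As a rough illustration of software polling, the Python sketch below models the FGI handshake described above: the I/O interface sets the flag, and the CPU checks it, copies INPR and clears it. The class `InputInterface` and the function `poll_input` are hypothetical names, and loading INPR into the low 8 bits of AC is an assumption made for this sketch.

```python
class InputInterface:
    """Toy model of INPR and the FGI flag (register names follow the text)."""

    def __init__(self):
        self.inpr = 0  # 8-bit input register
        self.fgi = 0   # input flag: 1 means a new character is waiting

    def key_pressed(self, char):
        # Performed by the I/O interface, asynchronously to the CPU.
        self.inpr = ord(char) & 0xFF
        self.fgi = 1


def poll_input(interface, ac):
    """One polling step by the CPU; returns the (possibly updated) AC."""
    if interface.fgi == 1:
        ac = (ac & 0xFF00) | interface.inpr  # assumed: low byte of AC <- INPR
        interface.fgi = 0                    # the CPU clears the flag
    return ac


keyboard = InputInterface()
ac = poll_input(keyboard, 0)   # nothing waiting, AC unchanged
keyboard.key_pressed("A")
ac = poll_input(keyboard, ac)  # character is copied and FGI is cleared
print(hex(ac), keyboard.fgi)
```

With polling, the CPU repeats the check in a loop and wastes time while no key is pressed; an interrupt lets the flag itself signal the CPU.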
Micro Programmed Control:

Control Memory:
Control memory is a random access memory (RAM) consisting of addressable storage registers. It is
primarily used in mini and mainframe computers. It is used as a temporary storage for data. Access to control
memory data requires less time than to main memory; this speeds up CPU operation by reducing the number of
memory references for data storage and retrieval. Access is performed as part of a control section sequence
while the master clock oscillator is running. The control memory addresses are divided into two groups: a task
mode and an executive (interrupt) mode.

Words stored in control memory are addressed via the address select logic for each of the register
groups. There can be up to five register groups in control memory. These groups select a register for fetching
data for programmed CPU operation, or for display or storage of data via a maintenance console or
equivalent. During programmed CPU operations, these registers are accessed directly by
the CPU logic. Data routing circuits are used by control memory to interconnect the registers used in control
memory. Some of the registers contained in a control memory that operate in the task and the executive modes
include the following: accumulators, indexes, monitor clock status indicating registers, and interrupt data
registers.

• The control unit in a digital computer initiates sequences of micro operations.
• The complexity of the digital system is derived from the number of sequences that are performed.
• When the control signals are generated by hardware, the control unit is hardwired.
• In a bus-oriented system, the control signals that specify micro operations are groups of bits that
select the paths in multiplexers, decoders, and ALUs.
• The control unit initiates a series of sequential steps of micro operations.
• The control variables can be represented by a string of 1’s and 0’s called a control word.
• A micro programmed control unit is a control unit whose binary control variables are stored in memory.
• A sequence of microinstructions constitutes a micro program.
• The control memory can be a read-only memory.
• Dynamic microprogramming permits a micro program to be loaded and uses a writable control memory.
• A computer with a micro programmed control unit will have two separate memories: a main memory and
a control memory.
• The micro program consists of microinstructions that specify various internal control signals for execution of
register micro operations.
• These microinstructions generate the micro operations to:
  o fetch the instruction from main memory
  o evaluate the effective address
  o execute the operation
  o return control to the fetch phase for the next instruction
• The control memory address register specifies the address of the microinstruction.
• The control data register holds the microinstruction read from memory.
• The microinstruction contains a control word that specifies one or more micro operations for the data processor.
• The location of the next microinstruction may or may not be the next in sequence.

Address Sequencing:
Each machine instruction is executed through the application of a sequence of microinstructions.
Clearly, we must be able to sequence these; the collection of microinstructions which implements a
particular machine instruction is called a routine.
The micro programmed control unit (MCU) typically determines the address of the first microinstruction
which implements a machine instruction based on that instruction's opcode. Upon machine power-up, the
control address register (CAR) should contain the address of the first microinstruction to be executed.
The MCU must be able to execute microinstructions sequentially (e.g., within routines), but must also
be able to "branch" to other microinstructions as required; hence, the need for a sequencer.

The microinstructions executed in sequence can be found sequentially in the control memory (CM), or can be
found by branching to another location within the CM. Sequential retrieval of microinstructions can be done by
simply incrementing the current CAR contents; branching requires determining the desired control word (CW)
address, and loading that into the CAR.
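
The choice between incrementing the CAR and branching can be sketched in a few lines of Python. This is an illustrative model only: the tuple layout used for a microinstruction (control word, branch flag, branch address, end flag) and the names `next_car` and `run_microprogram` are assumptions, not a format defined in these notes.

```python
def next_car(car, branch, branch_address):
    """Return the next CAR value: the branch target if branching, else CAR + 1."""
    return branch_address if branch else car + 1


def run_microprogram(control_memory, start, max_steps=100):
    """Follow microinstructions from `start` until one is marked as the end."""
    car = start
    trace = []
    for _ in range(max_steps):
        control_word, branch, branch_address, end = control_memory[car]
        trace.append((car, control_word))
        if end:
            break
        car = next_car(car, branch, branch_address)
    return trace


# A hypothetical 4-word control memory: word 1 branches over word 2.
cm = [
    ("AR<-PC", False, 0, False),
    ("IR<-M[AR]", True, 3, False),
    ("skipped", False, 0, False),
    ("execute", False, 0, True),
]
print(run_microprogram(cm, start=0))
```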

CAR: Control Address Register
Control ROM: control memory (CM); holds CWs
opcode: opcode field from machine instruction
mapping logic: hardware which maps opcode into microinstruction address
branch logic: determines how the next CAR value will be determined from all the various possibilities
multiplexors: implement the choice of branch logic for the next CAR value
incrementer: generates CAR + 1 as a possible next CAR value
SBR: used to hold return address for subroutine-call branch operations

Conditional branches are necessary in the micro program. We must be able to
perform some sequences of micro-ops only when certain situations or conditions exist (e.g., for
conditional branching at the machine instruction level); to implement these, we need to be able to
conditionally execute or avoid certain microinstructions within routines.

Subroutine branches are helpful to have at the micro program level. Many routines contain
identical sequences of microinstructions; putting them into subroutines allows those routines to be
shorter, thus saving memory.

Mapping of opcodes to microinstruction addresses can be done very
simply. When the CM is designed, a "required" length is determined for the machine instruction
routines (i.e., the length of the longest one). This is rounded up to the next power of 2, yielding a
value k such that 2^k microinstructions will be sufficient to implement any routine. The routine for a
given opcode can then start at CM address opcode × 2^k.

Alternately, the n-bit opcode value can be used as the "address" input of a 2^n × M ROM; the contents of the
selected "word" in the ROM will be the desired M-bit CAR address for the beginning of the routine
implementing that instruction. (This technique allows for variable-length routines in the CM.) We choose
between all the possible ways of generating CAR values by feeding them all into a multiplexor bank, and
implementing special branch logic which will determine how the muxes will pass on the next address to the
CAR.
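
The two mapping schemes described above can be compared with a small Python sketch. It assumes, for the fixed-length scheme, that the routine for an opcode starts at CM address opcode × 2^k; the function names `map_by_shift` and `map_by_rom` and the example ROM contents are hypothetical.

```python
def map_by_shift(opcode, k):
    """Fixed-length routines: each opcode owns a block of 2**k CM words."""
    return opcode << k  # same as opcode * 2**k


def map_by_rom(opcode, mapping_rom):
    """Variable-length routines: a small ROM stores each routine's start address."""
    return mapping_rom[opcode]


# With k = 2, every routine gets 4 control-memory words.
print([map_by_shift(op, 2) for op in range(4)])

# Hypothetical mapping ROM for a 2-bit opcode; routines have different lengths.
rom = [0, 4, 9, 11]
print([map_by_rom(op, rom) for op in range(4)])
```

The shift scheme needs no extra memory but wastes CM words on short routines, while the ROM scheme packs routines tightly at the cost of one more lookup.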
UNIT-II
Central Processing Unit

The 8086 Processor Architecture

The 8086 microprocessor is divided into two functional units:

• EU (Execution Unit)
• BIU (Bus Interface Unit)
Execution Unit (EU):
• The execution unit tells the BIU from where to fetch instructions and data, and then
decodes and executes those instructions.
• Its function is to control operations on data using the instruction decoder & ALU. EU
has no direct connection with system buses as shown in the above figure.
• It performs operations over data through BIU.
The functional parts of the 8086 microprocessor:
1. ALU: It handles all arithmetic and logical operations, like +, −, ×, /, OR, AND, NOT
operations.
2. Flag Register: It is a 16-bit register that behaves like a set of flip-flops, i.e. its bits change
their status according to the result stored in the accumulator.
It has 9 flags and they are divided into 2 groups − Conditional Flags and Control
Flags.

• Conditional Flags: They represent the result of the last arithmetic or logical
instruction executed. Following is the list of conditional flags:

1. Carry flag − This flag is set when an arithmetic operation produces a carry out of
the most significant bit (addition) or needs a borrow (subtraction), i.e. an
unsigned overflow.
2. Auxiliary flag − When an operation performed in the ALU results in a
carry/borrow from the lower nibble (i.e. D0 – D3) to the upper nibble (i.e. D4 – D7),
this flag is set, i.e. the carry given by bit D3 to D4 is the AF flag. The processor
uses this flag to perform binary to BCD conversion.
3. Parity flag − This flag is used to indicate the parity of the result, i.e. when the
lower order 8 bits of the result contain an even number of 1’s, the Parity
Flag is set. For an odd number of 1’s, the Parity Flag is reset.
4. Zero flag − This flag is set to 1 when the result of an arithmetic or logical
operation is zero, else it is set to 0.
5. Sign flag − This flag holds the sign of the result, i.e. when the result of the
operation is negative, the sign flag is set to 1, else it is set to 0.
6. Overflow flag − This flag is set when the result of a signed operation exceeds the
capacity of the destination.

• Control Flags: Control flags control the operations of the execution unit.
Following is the list of control flags:

1. Trap flag − It is used for single step control and allows the user to execute
one instruction at a time for debugging. If it is set, the program can be
run in single step mode.
2. Interrupt flag − It is an interrupt enable/disable flag, i.e. used to
allow/prohibit the interruption of a program. It is set to 1 for the interrupt enabled
condition and set to 0 for the interrupt disabled condition.
3. Direction flag − It is used in string operations. As the name suggests, when it is
set, string bytes are accessed from the higher memory address to the
lower memory address, and vice versa when it is cleared.

3. General purpose registers: There are 8 general purpose registers, i.e., AH, AL, BH,
BL, CH, CL, DH, and DL. These registers can be used individually to store 8-bit data
and can be used in pairs to store 16-bit data. The valid register pairs are AH and AL,
BH and BL, CH and CL, and DH and DL. They are referred to as AX, BX, CX, and DX
respectively.
a) AX register − It is also known as the accumulator register. It is used to store
operands for arithmetic operations.
b) BX register − It is used as a base register. It is used to store the starting base
address of the memory area within the data segment.
c) CX register − It is referred to as the counter. It is used in loop instructions to store
the loop counter.
d) DX register − This register is used to hold the I/O port address for I/O
instructions.
4. Stack Pointer Register: It is a 16-bit register which holds the offset, from the start
of the stack segment, of the memory location where a word was most recently stored on the
stack.
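
The conditional flags listed under the Flag Register can be worked out by hand for a single addition. The Python sketch below does this for an 8-bit add; it only illustrates the flag definitions and is not a model of the 8086 ALU. The function name `add8_with_flags` is hypothetical, and the overflow test (both operands have the same sign but the result has a different sign) is the usual rule for signed addition.

```python
def add8_with_flags(a, b):
    """Add two 8-bit values and report the six conditional flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                      # carry out of bit 7
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),    # carry from bit 3 into bit 4
        "PF": int(bin(result).count("1") % 2 == 0),   # even number of 1s in the result
        "ZF": int(result == 0),                       # result is zero
        "SF": (result >> 7) & 1,                      # copy of the sign bit
        # Signed overflow: operands share a sign that the result does not have.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


print(add8_with_flags(0x7F, 0x01))  # largest positive signed byte plus one
print(add8_with_flags(0xFF, 0x01))  # wraps around to zero
```

The two examples show why CF and OF are separate flags: the first addition overflows as a signed operation without producing a carry, and the second produces a carry without a signed overflow.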
BIU (Bus Interface Unit)
BIU takes care of all data and address transfers on the buses for the EU, like sending
addresses, fetching instructions from the memory, reading data from the ports and the
memory as well as writing data to the ports and the memory. EU has no direct connection
with the system buses, so this is done through the BIU. EU and BIU are connected with the
Internal Bus.
It has the following functional parts −
• Instruction queue − BIU contains the instruction queue. BIU gets up to 6 bytes of
next instructions and stores them in the instruction queue. When EU executes
instructions and is ready for its next instruction, it simply reads the instruction
from this instruction queue, resulting in increased execution speed.
• Fetching the next instruction while the current instruction executes is
called pipelining.
• Segment registers − BIU has 4 segment registers, i.e. CS, DS, SS & ES. They hold the
addresses of instructions and data in memory, which are used by the processor to
access memory locations. It also contains 1 pointer register IP, which holds the
address of the next instruction to be executed by the EU.
o CS − It stands for Code Segment. It is used for addressing a memory location
in the code segment of the memory, where the executable program is stored.
o DS − It stands for Data Segment. It consists of data used by the program and is
accessed in the data segment by an offset address or the content of another
register that holds the offset address.
o SS − It stands for Stack Segment. It handles memory to store data and
addresses during execution.
o ES − It stands for Extra Segment. ES is an additional data segment, which is
used by string instructions to hold the extra destination data.
• Instruction pointer − It is a 16-bit register used to hold the address of the next
instruction to be executed.
Register Organization:
A register is a very small amount of fast memory that is built into the CPU (or processor) in
order to speed up its operation. Registers are faster and more efficient than other memories
like RAM, ROM, external memory, etc., which is why registers occupy the top position in
the memory hierarchy model.
The 8086 microprocessor has a total of fourteen registers that are accessible to the
programmer. All these registers are 16 bits in size. The registers of 8086 are categorized into 5
different groups:
a) General Registers
b) Index Registers
c) Segment Registers
d) Pointer Registers
e) Status Registers

General Registers:

All general registers of the 8086 microprocessor can be used for arithmetic and logic
operations. All of these general registers can be used as either 8-bit or 16-bit registers. The
general registers are:

a) AX (Accumulator): AX is used as a 16-bit accumulator. The lower 8 bits of AX are
designated as AL and the higher 8 bits as AH. AL can be used as an 8-bit
accumulator for 8-bit operations. The accumulator is used in arithmetic, logic and data
transfer operations. For multiplication and division operations, one of the numbers
must be placed in AX or AL.

b) BX (Base Register): BX is a 16-bit register; BL indicates the lower 8 bits of BX
and BH indicates the higher 8 bits of BX. The register BX is used as an address register
to form the physical address in case of certain addressing modes (e.g. indexed and register
indirect).

c) CX (Count Register): The register CX is used as the default counter in case of string and
loop instructions. The count register is also used as a counter in string manipulation
and shift/rotate instructions.
d) DX (Data Register): DX is a general-purpose register which may be used as
an implicit operand or destination in case of a few instructions. The data register can also
be used to hold a port number in I/O operations.
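
The relationship between a 16-bit general register and its two 8-bit halves (for example AX, AH and AL) can be shown with a small Python sketch. The class name `Reg16` and its `high`/`low` properties are hypothetical names used only for illustration.

```python
class Reg16:
    """A 16-bit register whose two halves can be read and written as bytes."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    @property
    def high(self):
        # Upper 8 bits, e.g. AH for AX.
        return (self.value >> 8) & 0xFF

    @high.setter
    def high(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self):
        # Lower 8 bits, e.g. AL for AX.
        return self.value & 0xFF

    @low.setter
    def low(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


ax = Reg16(0x1234)
print(hex(ax.high), hex(ax.low))
ax.low = 0xFF      # writing AL leaves AH unchanged
print(hex(ax.value))
```

Writing one half never disturbs the other, which is what allows AL to serve as an 8-bit accumulator while AH holds unrelated data.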

Index Register:

The index registers can be used for arithmetic operations but their use is usually concerned
with the memory addressing modes of the 8086 microprocessor (indexed, base indexed and
relative base indexed addressing modes). The index registers are particularly useful for string
manipulation.

a) SI (Source Index): SI is a 16-bit register. This register is used to store the offset of
source data in the data segment. In other words, the Source Index register is used to point
to memory locations in the data segment.

b) DI (Destination Index): DI is a 16-bit register. The destination index register
performs the same function as SI. There is a class of instructions called string
operations that use DI to access memory locations in the Data or Extra Segment.

Segment Register:

The 8086 architecture uses the concept of segmented memory. The 8086 is able to access a
memory capacity of up to 1 megabyte. This 1 megabyte of memory is divided into 16 logical
segments. Each segment contains 64 Kbytes of memory. There are four segment registers to
access this 1 megabyte of memory. The segment registers of 8086 are:

a) CS (Code Segment): Code Segment (CS) is a 16-bit register that is used for
addressing memory locations in the code segment of the memory (64 KB), where the
executable program is stored. The CS register cannot be changed directly. The CS register
is automatically updated during far jump, far call and far return instructions.
b) SS (Stack Segment): Stack Segment (SS) is a 16-bit register that is used for addressing
the stack segment of the memory (64 KB), where stack data is stored. The SS register can be
changed directly using the POP instruction.
c) DS (Data Segment): Data Segment (DS) is a 16-bit register that points to the data
segment of the memory (64 KB), where the program data is stored. The DS register can be
changed directly using the POP and LDS instructions.
d) ES (Extra Segment): Extra Segment (ES) is a 16-bit register that also points to a data
segment of the memory (64 KB), where the program data is stored. The ES register can be
changed directly using the POP and LES instructions.

Pointer Register:

Pointer registers contain the offsets of data (variables, labels) and instructions from their
base segments (default segments). The 8086 microprocessor contains three pointer registers.

a) SP (Stack Pointer): The Stack Pointer register points to the program stack; that is, SP
stores the offset of the top of the stack within the stack segment.
b) BP (Base Pointer): The Base Pointer register also points into the stack segment. Unlike
SP, BP can also be used to access data in the other segments.
c) IP (Instruction Pointer): The Instruction Pointer is a register that holds the address
of the next instruction to be fetched from memory. It contains the offset of the next
instruction within the code segment rather than its actual (physical) address.
Status Register:

The status register is also called the flag register. The contents of the 8086 flag register
indicate the results of computations in the ALU. It also contains some flag bits that control
CPU operations.

The flag register is a 16-bit register in which only nine bits are implemented. Six of these
are status flags and the remaining three are control flags. The complete bit configuration of
the 8086 flag register is shown in the figure.

SF (Sign Flag): This flag represents the sign of the result.
0 - Result is positive
1 - Result is negative

ZF (Zero Flag): ZF is set if the result produced by an instruction is zero. Otherwise, ZF is
reset.

PF (Parity Flag): This flag is set to 1 if the lower byte of the result contains an even
number of 1's.
0 - Odd parity
1 - Even parity

CF (Carry Flag): This flag is set when there is a carry out of the MSB in the case of
addition, or a borrow in the case of subtraction.
0 - No carry/borrow
1 - Carry/borrow

TF (Trap Flag): If this flag is set, the processor enters the single-step execution mode. In
single-step mode, it executes an instruction and then jumps to a special service routine
that may determine the effect of executing the instruction. This type of operation is very
useful for debugging programs.

IF (Interrupt Flag): If this flag is set, the maskable interrupts are recognized by the CPU;
otherwise they are ignored.

DF (Direction Flag): This is used by string manipulation instructions.
0 - The string is processed from the lowest address towards the highest address, i.e.,
auto-incrementing mode.
1 - The string is processed from the highest address towards the lowest address, i.e.,
auto-decrementing mode.

AC (Auxiliary Carry Flag): This flag is set when there is a carry out of the lowest nibble
(i.e., out of bit 3) during addition, or a borrow into the lowest nibble (i.e., into bit 3)
during subtraction.

OF (Overflow Flag): This flag is set if an overflow occurs, i.e., if the result of a signed
operation is too large to fit in the destination register.
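
The way the six status flags follow from a result can be sketched in a few lines of Python. This is a minimal sketch for an 8-bit ADD only; the function name add8_flags and the dictionary layout are assumptions made for illustration and are not part of the 8086 itself.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) as an 8-bit ADD would set them."""
    full = a + b
    result = full & 0xFF
    flags = {
        "CF": int(full > 0xFF),                            # carry out of bit 7
        "ZF": int(result == 0),                            # result is zero
        "SF": (result >> 7) & 1,                           # copy of the MSB
        "PF": int(bin(result).count("1") % 2 == 0),        # even number of 1's
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),       # carry out of bit 3
        # Signed overflow: both operands have the same sign, the result differs.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


# 7FH + 01H = 80H: no carry, but the signed result overflows (+127 + 1).
print(add8_flags(0x7F, 0x01))
# FFH + 01H = 00H with a carry out of the MSB.
print(add8_flags(0xFF, 0x01))
```

The two calls show that CF and OF are independent: the first sets OF without CF, the second sets CF without OF.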

Memory Organization
The 8086 is a 16-bit processor with a 20-bit address bus, so it can support 2^20 bytes
(1 Mbyte) of external memory over the address range 00000H to FFFFFH. The 8086 organizes
memory as individual bytes of data. The 8086 can access any two consecutive bytes as a
word of data. The lower-addressed byte is the least significant byte of the word, and the
higher-addressed byte is its most significant byte.
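
The byte ordering described above can be sketched as follows. This is a minimal sketch: the bytearray named memory and the helpers write_word and read_word are hypothetical names standing in for a small slice of the 1 Mbyte address space.

```python
memory = bytearray(16)  # hypothetical slice of memory, addresses 00H to 0FH


def write_word(mem, addr, value):
    """Store a 16-bit word: low byte at addr, high byte at addr + 1."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def read_word(mem, addr):
    """Rebuild a 16-bit word from two consecutive bytes."""
    return mem[addr] | (mem[addr + 1] << 8)


write_word(memory, 0x0C, 0x225A)
print(hex(memory[0x0C]), hex(memory[0x0D]))  # low byte first, then high byte
print(hex(read_word(memory, 0x0C)))
```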

Figure: Part of 1 Mbyte Memory

The above figure shows that the storage location at address 00009H contains the value 7H,
while the location at address 00010H contains the value 7DH. The 16-bit word 225AH is
stored in locations 0000CH and 0000DH.

A word of data at an even-address boundary (i.e., the address of its least significant byte
is even) is called an aligned word. A word of data at an odd-address boundary is called a
misaligned word, as shown in the figure below.

Figure: Aligned and misaligned word

Four locations are needed to store a double word. A double word whose least significant
byte address is a multiple of 4 (e.g., 0H, 4H, 8H, ...) is called an aligned double word. A
double word at an address that is not a multiple of 4 is called a misaligned double word, as
shown in the figure below.

Figure: Aligned and misaligned double word

a) Memory Segmentation: The address bus of the 8086 is 20 bits wide and can address
1 Mbyte of physical memory, but not all of this memory is active at one time. The
1 Mbyte of memory is partitioned into 16 parts called segments. The size of each
segment is 64 Kbytes (65,536 bytes).

Only four of these segments are active at a time:

 Code segment holds the program instruction codes
 Stack segment is used to store interrupt and subroutine return addresses
 Data segment stores data for the program
 Extra segment is an extra data segment (often used for shared data)
 Each of these segments is addressed by an address stored in the corresponding
segment register: CS (code segment), SS (stack segment), DS (data segment), and
ES (extra segment).
 These registers contain a 16-bit base address that points to the lowest-addressed
byte of the segment. Because the segment registers cannot store 20 bits, they store
only the upper 16 bits.
 The BIU takes care of this problem by appending four 0's as the low-order bits of
the segment register contents. In effect, this multiplies the segment register
contents by 16.

The segment registers are user accessible, which means that the programmer can change the
content of segment registers through software.

b) Programming model:

How can a 20-bit address be obtained if there are only 16-bit registers? Since the largest
register is only 16 bits wide (64K), physical addresses have to be calculated. These
calculations are done in hardware within the microprocessor.

The 16-bit contents of a segment register give the starting (base) address of a particular
segment. To address a specific memory location within a segment we need an offset address.
The offset address is also 16 bits wide and is provided by one of the associated pointer or
index registers.

Figure: Software model of 8086 microprocessor

To be able to program a microprocessor, one does not need to know all of its hardware
architectural features. What is important to the programmer is being aware of the various
registers within the device and to understand their purpose, functions, operating capabilities,
and limitations.

The above figure illustrates the software architecture of the 8086 microprocessor. From this
diagram, we see that it includes fourteen 16-bit internal registers: the instruction pointer (IP),
four data registers (AX, BX, CX, and DX), two pointer registers (BP and SP), two index
registers (SI and DI), four segment registers (CS, DS, SS, and ES) and the status register (SR),
with nine of its bits implemented as status and control flags.

The point to note is that a segment must begin at an address divisible by 16. Also note that
the four segments need not be defined separately. It is allowable for all four segments to
overlap completely (CS = DS = ES = SS).

c) Logical and Physical Address: Offsets within a segment can range from 0000H to
FFFFH. This corresponds to the 64-Kbyte length of the segment. An address within a
segment is called an offset or logical address.

A logical address gives the displacement from the base address of the segment to the desired
location within it, as opposed to its "real" address, which maps directly anywhere into the
1 Mbyte memory space. This "real" address is called the physical address.

What is the difference between the physical and the logical address?
The physical address is 20 bits long and corresponds to the actual binary code output by the
BIU on the address bus lines. The logical address is an offset from location 0 of a given
segment.

Addresses should also be written clearly on paper. To specify the logical address XXXX in
the stack segment, use the convention SS:XXXX, which is equal to [SS] * 16 + XXXX.

Logical address is in the form of: Base Address: Offset


The offset is the displacement of the memory location from the starting location of the
segment. To calculate the physical address of a memory location, the BIU uses the following
formula:

Physical Address = Base Address of Segment * 16 + Offset
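
The formula can be checked with a short calculation. This is a minimal sketch; the function name physical_address is an assumption, and the final mask models the fact that the 8086 has only 20 address lines, so a sum above FFFFFH wraps around.

```python
def physical_address(segment, offset):
    """20-bit physical address = segment * 16 + offset (wraps at 1 Mbyte)."""
    return ((segment << 4) + offset) & 0xFFFFF


# Shifting left by 4 bits is the same as appending four 0's (multiplying by 16).
print(hex(physical_address(0x1000, 0x0020)))  # segment 1000H, offset 0020H

# Different logical addresses can name the same physical location.
print(physical_address(0x1000, 0x0020) == physical_address(0x1002, 0x0000))
```

Because segments may start at any address divisible by 16, the second comparison holds: 1000H:0020H and 1002H:0000H both refer to physical address 10020H.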


d) Physical memory organization: The 8086's 1 Mbyte memory address space is
divided into two independent 512-Kbyte banks: the low (even) bank and the high
(odd) bank. Data bytes associated with an even address (00000H, 00002H, etc.)
reside in the low bank, and those with odd addresses (00001H, 00003H, etc.) reside
in the high bank.

Address bits A1 through A19 select the storage location that is to be accessed. They
are applied to both banks in parallel. A0 and bank high enable (BHE) are used as
bank-select signals.

The four different cases that happen during accessing data:

Case 1: When a byte of data at an even address (such as X) is to be accessed:

 A0 is set to logic 0 to enable the low bank of memory.
 BHE is set to logic 1 to disable the high bank.

Case 2: When a byte of data at an odd address (such as X+1) is to be accessed:

 A0 is set to logic 1 to disable the low bank of memory.
 BHE is set to logic 0 to enable the high bank.

Case 3: When a word of data at an even address (aligned word) is to be accessed:

 A0 is set to logic 0 to enable the low bank of memory.
 BHE is set to logic 0 to enable the high bank.

Case 4: When a word of data at an odd address (misaligned word) is to be accessed, the
8086 needs two bus cycles to access it:

a) During the first bus cycle, the odd byte of the word (in the high bank) is addressed:

 A0 is set to logic 1 to disable the low bank of memory.
 BHE is set to logic 0 to enable the high bank.

b) During the second bus cycle, the even byte of the word (in the low bank) is addressed:

 A0 is set to logic 0 to enable the low bank of memory.
 BHE is set to logic 1 to disable the high bank.
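
The four cases can be summarised in one small function. This is a minimal sketch: the function name bus_cycles and the tuple layout (address, A0, BHE) are assumptions chosen for illustration, with BHE written as its active-low logic level (0 enables the high bank).

```python
def bus_cycles(address, size):
    """Return the bus cycles, as (address, A0, BHE) tuples, needed for an access.

    size is 1 for a byte and 2 for a word. A0 = 0 enables the low (even) bank
    and BHE = 0 enables the high (odd) bank.
    """
    if size not in (1, 2):
        raise ValueError("size must be 1 (byte) or 2 (word)")
    if size == 1:
        a0 = address & 1
        return [(address, a0, 1 - a0)]      # cases 1 and 2: one bank only
    if address % 2 == 0:
        return [(address, 0, 0)]            # case 3: both banks, one cycle
    # Case 4: the odd byte first (high bank), then the even byte (low bank).
    return [(address, 1, 0), (address + 1, 0, 1)]


print(bus_cycles(0x1000, 1))  # byte at an even address
print(bus_cycles(0x1001, 1))  # byte at an odd address
print(bus_cycles(0x1000, 2))  # aligned word
print(bus_cycles(0x1001, 2))  # misaligned word: two cycles
```

The length of the returned list is the number of bus cycles, which is why aligned words are preferred for speed.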

General Bus Operation


 The 8086 has a combined address and data bus, commonly referred to as a
time-multiplexed address and data bus. The main reason for multiplexing address
and data over the same pins is to make maximum use of the processor pins, which
allows the use of a standard 40-pin DIP package.

 The bus can be demultiplexed using a few latches and transceivers whenever
required.

 Basically, all the processor bus cycles consist of at least four clock cycles. These
are referred to as T1, T2, T3 and T4. The address is transmitted by the processor
during T1. It is present on the bus for only one cycle.

 The address latch enable (ALE) pulse is issued during T1. The negative edge of this
ALE pulse is used to separate the address from the data or status information. In
maximum mode, the status lines S0, S1 and S2 are used to indicate the type of
operation.

 Status bits S3 to S7 are multiplexed with the higher-order address bits and the BHE
signal.

 The address is valid during T1, while status bits S3 to S7 are valid during T2 through
T4.

General Bus Cycle For 8086


Minimum Mode 8086 System:

 The 8086 microprocessor is operated in minimum mode by strapping its MN/MX pin
to logic 1.
 In this mode, all the control signals are given out by the microprocessor chip itself.
There is a single microprocessor in the minimum mode system.
 The remaining components in the system are latches, transceivers, clock generator,
memory and I/O devices.
 Latches are generally buffered-output D-type flip-flops such as the 74LS373 or 8282. They
are used for separating the valid address from the multiplexed address/data signals
and are controlled by the ALE signal generated by the 8086.
 Transceivers are bidirectional buffers, sometimes called data amplifiers. They are
required to separate the valid data from the time-multiplexed address/data signals.
They are controlled by two signals, namely DEN and DT/R.
 The DEN signal indicates that valid data is present on the bus and enables the
transceivers, while DT/R indicates the direction of data, i.e. from or to the processor.
 The system contains memory for the monitor and user program storage. Usually,
EPROMs are used for monitor storage, while RAM is used for user program storage. A
system may also contain I/O devices.
 The opcode fetch and read cycles are similar. Hence the timing diagram can be
divided into two parts: the timing diagram for the read cycle and the timing diagram
for the write cycle.
 The read cycle begins in T1 with the assertion of the address latch enable (ALE) signal
and also the M/IO signal. On the negative-going edge of ALE, the valid address is
latched on the local bus.
 The BHE and A0 signals address the low byte, the high byte or both bytes. From T1 to
T4, the M/IO signal indicates a memory or I/O operation.
 At T2, the address is removed from the local bus and is sent to the output. The bus is
then tristated. The read (RD) control signal is also activated in T2.
 The read (RD) signal causes the addressed device to enable its data bus drivers. After
RD goes low, the valid data is available on the data bus.
 The addressed device will drive the READY line high. When the processor returns the
read signal to the high level, the addressed device will again tristate its bus drivers.
 A write cycle also begins with the assertion of ALE and the emission of the address.
 The M/IO signal is again asserted to indicate a memory or I/O operation. In T2, after
sending the address in T1, the processor sends the data to be written to the addressed
location.
 The data remains on the bus until the middle of the T4 state. WR becomes active at the
beginning of T2 (unlike RD, which is somewhat delayed in T2 to provide time for
floating the bus).
 The BHE and A0 signals are used to select the proper byte or bytes of the memory or I/O
word to be read or written.
 The M/IO, RD and WR signals indicate the type of data transfer as specified in the
table below.
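
The usual minimum-mode encoding of these three signals can be sketched as a lookup. This is a minimal sketch: the names TRANSFER_TYPE and transfer_type are hypothetical, and RD and WR are treated as active-low signals (0 means asserted), while M/IO is 1 for memory and 0 for I/O.

```python
# Keys are (M/IO, RD, WR) logic levels; RD and WR are active low.
TRANSFER_TYPE = {
    (0, 0, 1): "I/O read",
    (0, 1, 0): "I/O write",
    (1, 0, 1): "memory read",
    (1, 1, 0): "memory write",
}


def transfer_type(m_io, rd, wr):
    """Classify a minimum-mode bus cycle from its M/IO, RD and WR levels."""
    return TRANSFER_TYPE.get((m_io, rd, wr), "no transfer")


print(transfer_type(1, 0, 1))
print(transfer_type(0, 1, 0))
```

Combinations with both RD and WR high carry no data transfer, which is why the lookup falls back to "no transfer".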
Hold Response sequence:

 The HOLD pin is checked at the leading edge of each clock pulse. If it is received active
by the processor before T4 of the previous cycle or during the T1 state of the current
cycle, the CPU activates HLDA in the next clock cycle, and for succeeding bus cycles
the bus will be given to the requesting master.

 The processor does not regain control of the bus until the requesting master drops the
HOLD pin low.

 When the request is dropped by the requesting master, HLDA is dropped by the
processor at the trailing edge of the next clock.

Figure: Hold Response Timing Cycle

Maximum Mode 8086 System:

 In the maximum mode, the 8086 is operated by strapping the MN/MX pin to ground.
 In this mode, the processor drives the status signals S2, S1 and S0. Another chip, called
the bus controller, derives the control signals using this status information.
 In the maximum mode, there may be more than one microprocessor in the system
configuration. The components in the system are the same as in the minimum mode
system.
 The basic function of the bus controller chip, the 8288, is to derive control signals like
RD and WR (for memory and I/O devices), DEN, DT/R, ALE etc. using the
information made available by the processor on the status lines.
 The bus controller chip has input lines S2, S1, S0 and CLK. These inputs to the 8288 are
driven by the CPU.
 It derives the outputs ALE, DEN, DT/R, MRDC, MWTC, AMWC, IORC, IOWC and
AIOWC. The AEN, IOB and CEN pins are especially useful for multiprocessor
systems.
 AEN and IOB are generally grounded. The CEN pin is usually tied to +5V. The
significance of the MCE/PDEN output depends upon the status of the IOB pin.
 The INTA pin is used to issue two interrupt acknowledge pulses to the interrupt
controller or to an interrupting device.
 IORC and IOWC are the I/O read command and I/O write command signals respectively.
These signals enable an I/O interface to read or write the data from or to the addressed
port.
 MRDC and MWTC are the memory read command and memory write command signals
respectively and may be used as memory read or write signals.
 All these command signals instruct the memory to accept or send data from or to the
bus.
 The only difference between the timing diagrams of the minimum mode and the
maximum mode is the status signals used and the available control and advanced
command signals.
 S0, S1 and S2 are set at the beginning of the bus cycle. The 8288 bus controller will output
a pulse on ALE and apply the required signal to its DT/R pin during T1.
 In T2, the 8288 will set DEN = 1, thus enabling the transceivers, and for an input it will
activate MRDC or IORC. These signals remain active until T4.
 For an output, AMWC or AIOWC is activated from T2 to T4 and MWTC or
IOWC is activated from T3 to T4.
 The status bits S0 to S2 remain active until T3 and become passive during T3 and T4.
 If the READY input is not activated before T3, wait states will be inserted between T3 and
T4.
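
How the bus controller turns the three status lines into command signals can be sketched as a table lookup. This is a minimal sketch using the standard 8086 status encoding; the names STATUS_DECODE and decode_status are hypothetical, and the command lists name only the 8288 outputs mentioned above.

```python
# Keys are (S2, S1, S0); values are (bus cycle type, 8288 command outputs used).
STATUS_DECODE = {
    (0, 0, 0): ("interrupt acknowledge", ["INTA"]),
    (0, 0, 1): ("read I/O port", ["IORC"]),
    (0, 1, 0): ("write I/O port", ["IOWC", "AIOWC"]),
    (0, 1, 1): ("halt", []),
    (1, 0, 0): ("code access", ["MRDC"]),
    (1, 0, 1): ("read memory", ["MRDC"]),
    (1, 1, 0): ("write memory", ["MWTC", "AMWC"]),
    (1, 1, 1): ("passive", []),
}


def decode_status(s2, s1, s0):
    """Return the bus cycle type and command signals for a status code."""
    return STATUS_DECODE[(s2, s1, s0)]


print(decode_status(1, 0, 1))
print(decode_status(0, 1, 0))
```

The all-ones code is the passive state, which matches the status bits becoming passive at the end of each bus cycle.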
8086 Instruction Set and Assembler Directives
 A machine language instruction format has one or more fields associated with it.
 The first field is called the operation code field or op-code field, which indicates the
type of operation to be performed by the CPU.
 The instruction format also contains other fields known as operand fields.
 The CPU executes the instruction using the information that resides in these fields.
 There are six general formats of instructions in the 8086 instruction set.
 The length of an instruction may vary from 1 byte to 6 bytes.

The instruction formats are described as follows:

1. One Byte Instruction:

 This format is only one byte long and may have implied data or register
operands.
 The least significant 3 bits of the opcode are used for specifying the register
operand, if any.
 Otherwise, all 8 bits form the opcode and the operands are implied.

2. Register to Register:

 This format is 2 bytes long.
 The first byte of the code specifies the operation code and the width of the
operand, specified by the ‘w’ bit.
 The second byte of the code shows the register operands and the R/M field, as
shown below:

 The register represented by the REG field is one of the operands.
 The R/M field specifies another register or memory location, i.e. the other
operand.

3. Register to/from memory with no displacement:

 This format is also 2 bytes long and is similar to the Register to Register format
except for the MOD field, as shown.

 The MOD field shows the mode of addressing. The MOD, R/M, REG and
‘W’ fields are described in Table 2.2.

4. Register to/from Memory with Displacement:

 This type of instruction format contains 1 or 2 additional bytes for the
displacement along with the 2-byte format of the register to/from memory without
displacement. The format is as shown below.

5. Immediate Operand to Register:

 In this format, the first byte as well as the 3 bits of the second byte that are
used for the REG field in the register-to-register format are used for the
opcode.
 It also contains one or two bytes of immediate data. The complete instruction
format is as shown below.

6. Immediate Operand to Memory with 16-bit displacement:

 This type of instruction format requires 5 or 6 bytes for coding.
 The first 2 bytes contain the information regarding the OPCODE, MOD and R/M
fields. The remaining bytes contain 2 bytes of displacement and 1 or 2 bytes of
data, as shown.
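
The MOD, REG and R/M fields of the second instruction byte can be pulled apart with a few shifts and masks. This is a minimal sketch using the standard 8086 field encodings; the function name decode_modrm and the table names are assumptions made for illustration, and displacement bytes are only named, not read.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # used when w = 1
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]    # used when w = 0
RM_BASE = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_modrm(byte, w):
    """Split the second instruction byte into MOD (2 bits), REG (3), R/M (3)."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    regs = REG16 if w else REG8
    if mod == 0b11:                       # R/M names a register
        rm_name = regs[rm]
    elif mod == 0b00 and rm == 0b110:     # special case: direct address
        rm_name = "[disp16]"
    else:                                 # memory, with 0, 1 or 2 displacement bytes
        disp = {0b00: "", 0b01: "+disp8", 0b10: "+disp16"}[mod]
        rm_name = "[" + RM_BASE[rm] + disp + "]"
    return mod, regs[reg], rm_name


# MOV BX, AX assembles to 89H C3H; C3H = 11 000 011.
print(decode_modrm(0xC3, 1))
# MOV AX, [BX] assembles to 8BH 07H; 07H = 00 000 111.
print(decode_modrm(0x07, 1))
```

MOD = 11 corresponds to the register-to-register format, MOD = 00 to the format with no displacement, and MOD = 01 or 10 to the formats with an 8-bit or 16-bit displacement.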
Addressing Modes of 8086:
 An addressing mode indicates a way of locating data or operands. Depending upon the
data type used in the instruction and the memory addressing modes, an instruction
may belong to one or more addressing modes, or it may not belong to any of the
addressing modes.
 The addressing mode describes the types of operands and the way they are accessed
for executing an instruction. According to the flow of instruction execution, the
instructions may be categorized as:

1. Sequential control flow instructions and
2. Control transfer instructions.

 Sequential control flow instructions are instructions which, after execution,
transfer control to the next instruction appearing immediately after them (in
sequence) in the program.

For example, the arithmetic, logic, data transfer and processor control
instructions are sequential control flow instructions.

 The control transfer instructions, on the other hand, transfer control to some
predefined address or to an address specified in the instruction after their
execution.

For example, the INT, CALL, RET and JUMP instructions fall under this
category.

The addressing modes for Sequential flow instructions are explained as follows:

1. Immediate addressing mode: In this type of addressing, immediate data is a part of
the instruction and appears in the form of a successive byte or bytes.

Example: MOV AX, 0005H.

In the above example, 0005H is the immediate data. The immediate data may be 8-bit
or 16-bit in size.

2. Direct addressing mode: In the direct addressing mode, a 16-bit memory address
(offset) is directly specified in the instruction as a part of it.

Example: MOV AX, [5000H].

3. Register addressing mode: In the register addressing mode, the data is stored in a
register and is referred to using that particular register. All the registers, except IP,
may be used in this mode.

Example: MOV BX, AX


4. Register Indirect addressing mode: Sometimes, the address of the memory location
which contains the data or operand is determined in an indirect way, using the offset
registers. This mode of addressing is known as register indirect mode.
In this addressing mode, the offset address of the data is in the BX, SI or DI
register. The default segment is either DS or ES.

Example: MOV AX, [BX].

5. Indexed addressing mode: In this addressing mode, the offset of the operand is stored
in one of the index registers. DS and ES are the default segments for the index registers
SI and DI respectively.

Example: MOV AX, [SI]

Here, the data is available at the offset address stored in SI, in DS.

6. Register relative addressing mode: In this addressing mode, the data is available at
an effective address formed by adding an 8-bit or 16-bit displacement to the content
of any one of the registers BX, BP, SI and DI in the default segment (either DS or ES).

Example: MOV AX, 50H [BX]

7. Based indexed addressing mode: In this addressing mode, the effective address of the
data is formed by adding the content of a base register (either BX or BP) to the
content of an index register (either SI or DI). The default segment register may
be ES or DS.

Example: MOV AX, [BX][SI]

8. Relative based indexed: The effective address is formed by adding an 8-bit or 16-bit
displacement to the sum of the contents of one of the base registers (BX or BP) and
one of the index registers, in a default segment.

Example: MOV AX, 50H [BX] [SI]
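
The memory addressing modes above differ only in which terms are added to form the 16-bit effective address (offset). This is a minimal sketch: the register values in regs are made-up sample data, and the names effective_address and default_segment are hypothetical. The sketch also assumes the standard 8086 rule that an address using BP as its base defaults to the stack segment.

```python
regs = {"BX": 0x1000, "BP": 0x2000, "SI": 0x0020, "DI": 0x0030}  # sample values


def effective_address(regs, base=None, index=None, disp=0):
    """Effective address = base + index + displacement, truncated to 16 bits."""
    ea = disp
    if base is not None:
        ea += regs[base]
    if index is not None:
        ea += regs[index]
    return ea & 0xFFFF


def default_segment(base=None):
    """Assumed rule: BP-based addresses use SS, all others use DS."""
    return "SS" if base == "BP" else "DS"


print(hex(effective_address(regs, disp=0x5000)))                      # MOV AX, [5000H]
print(hex(effective_address(regs, base="BX")))                        # MOV AX, [BX]
print(hex(effective_address(regs, index="SI")))                       # MOV AX, [SI]
print(hex(effective_address(regs, base="BX", disp=0x50)))             # MOV AX, 50H [BX]
print(hex(effective_address(regs, base="BX", index="SI")))            # MOV AX, [BX][SI]
print(hex(effective_address(regs, base="BX", index="SI", disp=0x50))) # MOV AX, 50H [BX][SI]
```

The physical address is then obtained by combining this offset with the segment register, as in the segment * 16 + offset formula.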

The control transfer instructions:


 The addressing modes depend upon whether the destination location is within the
same segment or in a different one. They also depend upon the method of passing
the destination address to the processor. Basically, there are two addressing modes
for the control transfer instructions, viz. intersegment and intrasegment addressing
modes.
 If the location to which the control is to be transferred lies in a segment other
than the current one, the mode is called intersegment mode. If the destination
location lies in the same segment, the mode is called intrasegment mode.

Addressing Modes for control transfer instructions:


1. Intersegment
· Intersegment direct
· Intersegment indirect
2. Intrasegment
· Intrasegment direct
· Intrasegment indirect

Intersegment direct:
In this mode, the address to which the control is to be transferred is in a different segment.
This addressing mode provides a means of branching from one code segment to another
code segment. Here, the CS and IP of the destination address are specified directly in the
instruction.

Example: JMP 5000H, 2000H; jump to effective address 2000H in segment 5000H.

Intersegment indirect:
 In this mode, the address to which the control is to be transferred lies in a different
segment and is passed to the instruction indirectly, i.e. as the contents of a memory
block of four bytes holding IP (LSB), IP (MSB), CS (LSB) and CS (MSB)
sequentially.
 The starting address of the memory block may be referred to using any of the addressing
modes, except immediate mode.
Example: JMP [2000H].
 Jump to an address in the other segment specified at effective address 2000H in DS.
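
Reading the four-byte block in that order can be sketched as follows. This is a minimal sketch: the dictionary memory holds made-up sample bytes at offsets 2000H to 2003H, and read_far_pointer is a hypothetical helper name.

```python
# Sample memory block at offset 2000H: IP(LSB), IP(MSB), CS(LSB), CS(MSB).
memory = {0x2000: 0x00, 0x2001: 0x20, 0x2002: 0x00, 0x2003: 0x50}


def read_far_pointer(mem, addr):
    """Return (CS, IP) from four consecutive bytes starting at addr."""
    ip = mem[addr] | (mem[addr + 1] << 8)
    cs = mem[addr + 2] | (mem[addr + 3] << 8)
    return cs, ip


cs, ip = read_far_pointer(memory, 0x2000)
print(hex(cs), hex(ip))
```

With these sample bytes the jump lands at offset 2000H in segment 5000H, the same destination as the intersegment direct example.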

Intrasegment direct mode:

 In this mode, the address to which the control is to be transferred lies in the same
segment in which the control transfer instruction lies and appears directly in the
instruction as an immediate displacement value.
 In this addressing mode, the displacement is computed relative to the content of the
instruction pointer.
 The effective address to which the control will be transferred is given by the sum of
the 8-bit or 16-bit displacement and the current content of IP. In the case of a jump
instruction, if the signed displacement (d) is of 8 bits (i.e. -128 <= d <= +127), it is
termed a short jump, and if it is of 16 bits (i.e. -32768 <= d <= +32767), it is termed
a long jump.
Example: JMP SHORT LABEL.
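
The target of an intrasegment direct jump can be worked out by sign-extending the displacement and adding it to IP. This is a minimal sketch: to_signed and jump_target are hypothetical helper names, and next_ip is assumed to be the value of IP after the jump instruction has been fetched, i.e. the offset of the following instruction.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def jump_target(next_ip, disp, bits):
    """New IP = IP + signed displacement, kept within 16 bits."""
    mask = (1 << bits) - 1
    return (next_ip + to_signed(disp & mask, bits)) & 0xFFFF


# A 2-byte short jump located at offset 0100H leaves IP = 0102H.
print(hex(jump_target(0x0102, 0x10, 8)))   # forward by 16 bytes
print(hex(jump_target(0x0102, 0xFE, 8)))   # displacement -2: jumps to itself
```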

Intrasegment indirect mode:

 In this mode, the address to which the control is to be transferred is in the same
segment in which the control transfer instruction lies, but it is passed to the instruction
indirectly. Here, the branch address is found as the content of a register or a memory
location.
 This addressing mode may be used in unconditional branch instructions.

Example: JMP [BX]; jump to the effective address stored in BX.

Instruction set of 8086:


The instruction set of the 8086 microprocessor is classified into 7 groups:

 Data transfer instructions
 Arithmetic and logical instructions
 Program control transfer instructions
 Machine control instructions
 Shift / rotate instructions
 Flag manipulation instructions
 String instructions

Data transfer instructions:

Data transfer instructions, as the name suggests, transfer data from memory to an
internal register, from an internal register to memory, from one register to another
register, from an input port to an internal register, from an internal register to an
output port, etc.

1. MOV Instruction:

It is a general-purpose instruction to transfer a byte or word from register to register,
memory to register, register to memory, or with immediate addressing.

General Form: MOV Destination, Source

Here the source and destination need to be of the same size, that is, both 8-bit
or both 16-bit. The MOV instruction does not affect any flags.

Example:

MOV BX, 00F2H      Load the immediate number 00F2H into the BX register
MOV CL, [2000H]    Copy the 8-bit content of the memory location at a displacement
                   of 2000H from the data segment base to the CL register
MOV [589H], BX     Copy the 16-bit content of the BX register to the memory location
                   at a displacement of 589H from the data segment base

2. PUSH Instruction:

The PUSH instruction decrements the stack pointer by two and copies the word
from the source to the location where the stack pointer now points. The source must
be word-size data. The source can be a general-purpose register, a segment register
or a memory location.

The PUSH instruction first pushes the most significant byte to SP-1, then the least
significant byte to SP-2.

The PUSH instruction does not affect any flags.

Example:

PUSH CX            Decrement SP by 2 and copy the content of CX to the stack
PUSH DS            Decrement SP by 2 and copy DS to the stack

3. POP Instruction:

The POP instruction copies a word from the stack location pointed to by the stack
pointer to the destination. The destination can be a general-purpose register, a
segment register or a memory location. After the content is copied, the stack
pointer is automatically incremented by two.

The execution pattern is similar to that of the PUSH instruction.

Example:

POP CX             Copy a word from the top of the stack to CX and increment SP by 2
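
The effect of PUSH and POP on SP and on stack memory can be traced with a small model. This is a minimal sketch: the class name Stack8086, its dictionary-based memory and the starting value SP = 0100H are assumptions for illustration; segment handling (SS) is left out.

```python
class Stack8086:
    """Toy model of the 8086 stack: SP is an offset, memory is byte-addressed."""

    def __init__(self, sp=0x0100):
        self.mem = {}
        self.sp = sp

    def push(self, word):
        """Decrement SP by 2, then store the word (high byte at the higher address)."""
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = word & 0xFF
        self.mem[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        """Read the word at SP, then increment SP by 2."""
        low = self.mem[self.sp]
        high = self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return low | (high << 8)


stack = Stack8086()
stack.push(0x1234)                      # like PUSH CX with CX = 1234H
print(hex(stack.sp), hex(stack.mem[0x00FF]), hex(stack.mem[0x00FE]))
print(hex(stack.pop()), hex(stack.sp))  # like POP CX
```

After the push, the most significant byte sits at the old SP-1 and the least significant byte at the old SP-2, as described for PUSH.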

4. IN & OUT Instructions:

The IN instruction copies data from a port to the accumulator. If 8 bits are read, the
data goes to AL, and if 16 bits are read, it goes to AX. Similarly, the OUT instruction
is used to copy data from the accumulator to an output port.

Both IN and OUT instructions can use direct and indirect addressing modes.

Example:

IN AL, 0F8H        Copy a byte from port 0F8H to AL
MOV DX, 30F8H      Copy the port address into DX
IN AL, DX          Move 8-bit data from port 30F8H
IN AX, DX          Move 16-bit data from port 30F8H
OUT 047H, AL       Copy the contents of AL to 8-bit port 047H
MOV DX, 30F8H      Copy the port address into DX
OUT DX, AL         Move 8-bit data to port 30F8H
OUT DX, AX         Move 16-bit data to port 30F8H

5. XCHG Instruction:

The XCHG instruction exchanges the contents of the destination and source. Here the
destination and source can be register and register or register and memory location,
but XCHG cannot interchange the values of 2 memory locations.

General Format: XCHG Destination, Source

Example:

XCHG BX, CX        Exchange the word in CX with the word in BX
XCHG AL, CL        Exchange the byte in CL with the byte in AL
XCHG AX, SUM[BX]   Here the physical address is DS * 16 + SUM + [BX]. The content at
                   this physical address and the content of AX are interchanged

Arithmetic and Logic Instructions:


The arithmetic and logical group of instructions includes:

1. ADD Instruction: The ADD instruction adds the current contents of the destination
to those of the source and stores the result in the destination. Here we can use registers
and/or memory locations. The AF, CF, OF, PF, SF, and ZF flags are affected.

General format: ADD Destination, Source

Example:

 ADD AL, 0FH: Add the immediate content 0FH to the content of AL and
store the result in AL
 ADD AX, BX; AX <= AX+BX
 ADD AX, 0100H – IMMEDIATE
 ADD AX, BX – REGISTER
 ADD AX, [SI] – REGISTER INDIRECT OR INDEXED
 ADD AX, [5000H] – DIRECT
 ADD [5000H], 0100H – IMMEDIATE
 ADD 0100H – DESTINATION AX (IMPLICIT)

2. ADC: ADD WITH CARRY: This instruction performs the same operation as the ADD
instruction, but adds the carry flag bit (which may be set as a result of the previous
calculation) to the result. All the condition code flags are affected by this instruction.
Examples of this instruction, along with the modes, are as follows:

Example:
 ADC AX, BX – REGISTER
 ADC AX, [SI] – REGISTER INDIRECT OR INDEXED
 ADC AX, [5000H] – DIRECT
 ADC [5000H], 0100H – IMMEDIATE
 ADC 0100H – IMMEDIATE (AX IMPLICIT)
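
A typical use of ADC is adding numbers wider than 16 bits: ADD handles the low words and ADC handles the high words together with the carry. This is a minimal sketch of that idea; add16 and add32 are hypothetical helper names, with add16 standing in for ADD (carry_in = 0) or ADC (carry_in = CF).

```python
def add16(a, b, carry_in=0):
    """16-bit addition returning (result, carry flag)."""
    total = a + b + carry_in
    return total & 0xFFFF, total >> 16


def add32(x, y):
    """Add two 32-bit numbers as two 16-bit steps, the way ADD + ADC would."""
    low, cf = add16(x & 0xFFFF, y & 0xFFFF)     # ADD on the low words
    high, cf = add16(x >> 16, y >> 16, cf)      # ADC on the high words
    return (high << 16) | low, cf


print(add32(0x0001FFFF, 0x00000001))   # the carry ripples into the high word
print(add32(0xFFFFFFFF, 0x00000001))   # the final carry is left in CF
```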

3. SUB Instruction: The SUB instruction subtracts the contents of the source from the
current contents of the destination and stores the result in the destination. Here we can
use registers and/or memory locations. The AF, CF, OF, PF, SF, and ZF flags are affected.

General Format: SUB Destination, Source

Example:
 SUB AL, 0FH: Subtract the immediate content 0FH from the content of AL
and store the result in AL
 SUB AX, BX; AX <= AX-BX
 SUB AX, 0100H – IMMEDIATE (DESTINATION AX)
 SUB AX, BX – REGISTER
 SUB AX, [5000H] – DIRECT
 SUB [5000H], 0100H – IMMEDIATE

4. SBB: SUBTRACT WITH BORROW: The subtract with borrow instruction
subtracts the source operand and the borrow flag (CF), which may reflect the result of
the previous calculations, from the destination operand. Subtraction with borrow here
means subtracting 1 from the difference obtained by SUB if the carry (borrow) flag is
set.

The result is stored in the destination operand. All the condition code flags are
affected by this instruction. Examples of this instruction are as follows:

Example:
 SBB AX, 0100H – IMMEDIATE (DESTINATION AX)
 SBB AX, BX – REGISTER
 SBB AX, [5000H] – DIRECT
 SBB [5000H], 0100H – IMMEDIATE
5. CMP: COMPARE: The instruction compares the source operand, which may be a
register, immediate data or a memory location, with a destination operand that
may be a register or a memory location. For comparison, it subtracts the source
operand from the destination operand but does not store the result anywhere. The
flags are affected depending upon the result of the subtraction. If both of the operands
are equal, the zero flag is set. If the source operand is greater than the destination
operand (treating both as unsigned numbers), the carry flag is set; otherwise the
carry flag is reset.

Example:
• CMP BX,0100H – IMMEDIATE
• CMP AX,0100H – IMMEDIATE
• CMP [5000H],0100H – DIRECT
• CMP BX,[SI] – REGISTER INDIRECT OR INDEXED
• CMP BX,CX – REGISTER
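
The flag behaviour of CMP can be sketched with a small Python model (not 8086 code). The function name cmp16 is hypothetical; it treats both operands as unsigned 16-bit values and returns only ZF and CF, leaving the operands untouched as CMP does.

```python
def cmp16(dest, src):
    """Model of CMP dest, src for unsigned 16-bit operands: returns (ZF, CF)."""
    diff = (dest - src) & 0xFFFF   # the difference is computed but never stored
    zf = int(diff == 0)            # set when the operands are equal
    cf = int(src > dest)           # set when a borrow is needed
    return zf, cf

print(cmp16(0x0100, 0x0100))  # (1, 0)
print(cmp16(0x0050, 0x0100))  # (0, 1)
print(cmp16(0x0200, 0x0100))  # (0, 0)
```

A conditional jump placed after CMP then tests these flags, for example JZ for equality or JC for "destination below source".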

6. INC & DEC Instruction: INC and DEC instructions are used to increment and
decrement the content of the specified destination by one. AF, OF, PF, SF and ZF
flags are affected; CF is not changed.

Example:
• INC AL ; AL <= AL + 1
• INC AX ; AX <= AX + 1
• DEC AL ; AL <= AL – 1
• DEC AX ; AX <= AX – 1

7. AND Instruction: This instruction logically ANDs each bit of the source
byte/word with the corresponding bit in the destination and stores the result in
destination. The source can be an immediate number, register or memory location; the
destination can be a register or memory location.

The CF and OF flags are both made zero, PF, ZF, SF are affected by the operation and
AF is undefined.

General Format: AND Destination, Source

Example:
• AND BL, AL ; suppose BL = 1000 0110 and AL = 1100 1010, then after the
operation BL = 1000 0010.
• AND CX, AX ; CX <= CX AND AX
• AND CL, 08H ; CL <= CL AND (0000 1000)

8. OR Instruction: This instruction logically ORs each bit of the source byte/word
with the corresponding bit in the destination and stores the result in destination. The
source can be an immediate number, register or memory location; the destination can
be a register or memory location.
The CF and OF flags are both made zero, PF, ZF, SF are affected by the operation
and AF is undefined.

General Format: OR Destination, Source

Example:
• OR BL, AL ; suppose BL = 1000 0110 and AL = 1100 1010, then
after the operation BL = 1100 1110.
• OR CX, AX ; CX <= CX OR AX
• OR CL, 08H ; CL <= CL OR (0000 1000)

9. NOT Instruction: The NOT instruction complements (inverts) the
contents of an operand register or a memory location, bit by bit. The
examples are as follows:

Example:
• NOT AX (BEFORE AX = (0000 0000 0000 1011)2 = (000B)16, AFTER EXECUTION
AX = (1111 1111 1111 0100)2 = (FFF4)16).
• NOT [5000H]

10. XOR Instruction: The XOR operation is again carried out in a similar way to the
AND and OR operation. The constraints on the operands are also similar. The XOR
operation gives a high output, when the 2 input bits are dissimilar. Otherwise, the
output is zero. The example instructions are as follows:

Example:
 XOR AX,0098H
 XOR AX, BX
 XOR AX, [5000H]
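
The bit patterns used in the AND and OR examples can be checked with a short Python model (not 8086 code). The names logic8, not8 and parity_flag are hypothetical, the model is limited to 8-bit operands, and AF is left out because it is undefined after these operations.

```python
def parity_flag(value):
    """PF is 1 when the low byte contains an even number of 1 bits."""
    return int(bin(value & 0xFF).count("1") % 2 == 0)

def logic8(op, dest, src):
    """Return (result, flags) for an 8-bit AND, OR or XOR."""
    results = {"AND": dest & src, "OR": dest | src, "XOR": dest ^ src}
    result = results[op] & 0xFF
    flags = {
        "CF": 0,                    # always cleared by logical instructions
        "OF": 0,                    # always cleared by logical instructions
        "ZF": int(result == 0),
        "SF": result >> 7,          # copy of the most significant bit
        "PF": parity_flag(result),
    }
    return result, flags

def not8(value):
    """Model of NOT on a byte: every bit is inverted."""
    return ~value & 0xFF

bl, al = 0b10000110, 0b11001010
print(format(logic8("AND", bl, al)[0], "08b"))  # 10000010
print(format(logic8("OR", bl, al)[0], "08b"))   # 11001110
print(format(logic8("XOR", bl, al)[0], "08b"))  # 01001100
```

XOR of a register with itself always gives zero and sets ZF, which is why XOR AX, AX is a common way to clear a register.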

Program Control Transfer Instructions

There are 2 types of such instructions. They are:


 Unconditional transfer instructions – CALL, RET, JMP
 Conditional transfer instructions – J condition

Unconditional Transfer Instructions

1. CALL Instruction: The CALL instruction is used to transfer execution to a
subprogram or procedure. There are two types of CALL instructions, near and far.

A near CALL is a call to a procedure which is in the same code segment as the
CALL instruction. When the 8086 encounters a near call, it decrements SP by 2 and
copies the offset of the next instruction after the CALL onto the stack. It then loads
IP with the offset of the procedure to start execution of the procedure.
A far CALL is a call to a procedure residing in a different segment. Here the value of
CS and the offset of the next instruction are both saved on the stack. The 8086 then
branches to the procedure by loading CS with the base of the segment containing the
procedure and IP with the offset of the first instruction of the procedure.

Example:
Near Call
CALL PRO              ; PRO is the name of the procedure
CALL CX               ; CX contains the offset of the first instruction of the
                      ; procedure, that is, the content of IP is replaced with
                      ; the content of CX
Far Call
CALL DWORD PTR [BX]   ; New values for CS and IP are fetched from four memory
                      ; locations in the DS. The new value for IP is fetched
                      ; from [BX] and [BX+1], the new CS is fetched from
                      ; [BX+2] and [BX+3].

2. RET Instruction: RET instruction will return execution from a procedure to the
next instruction after the CALL instruction in the calling program. If it was a near
call, then IP is replaced with the value at the top of the stack. If it was a far call,
then another POP of the stack is required. This second value popped from the stack is
put in CS, thus resuming the execution of the calling program.

General format: RET

Example:
P1 PROC        ; procedure declaration
    MOV AX
    RET        ; return to caller
P1 ENDP
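
The stack traffic of near and far CALL and RET can be sketched with a small Python model (not 8086 code). The class name Cpu and its method names are hypothetical, and the stack is modelled as a dictionary from SP offsets to 16-bit words rather than as real memory.

```python
class Cpu:
    """Tiny model of CS, IP, SP and a stack of 16-bit words."""

    def __init__(self, cs, ip, sp):
        self.cs, self.ip, self.sp = cs, ip, sp
        self.stack = {}                     # SP offset -> 16-bit word

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF    # SP is decremented by 2 first
        self.stack[self.sp] = word

    def pop(self):
        word = self.stack[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF
        return word

    def call_near(self, target_ip, return_ip):
        self.push(return_ip)                # only the return offset is saved
        self.ip = target_ip

    def call_far(self, target_cs, target_ip, return_ip):
        self.push(self.cs)                  # CS is saved first
        self.push(return_ip)                # then the return offset
        self.cs, self.ip = target_cs, target_ip

    def ret_near(self):
        self.ip = self.pop()

    def ret_far(self):
        self.ip = self.pop()                # first pop restores IP
        self.cs = self.pop()                # second pop restores CS

cpu = Cpu(cs=0x1000, ip=0x0010, sp=0x0100)
cpu.call_far(0x2000, 0x0000, return_ip=0x0015)
print(hex(cpu.cs), hex(cpu.ip), hex(cpu.sp))  # 0x2000 0x0 0xfc
cpu.ret_far()
print(hex(cpu.cs), hex(cpu.ip), hex(cpu.sp))  # 0x1000 0x15 0x100
```

A far call uses four bytes of stack and a near call two, so a procedure must be left with the RET type that matches the way it was called.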

3. JMP Instruction: This is also called the unconditional jump instruction, because the
processor always jumps to the specified location rather than continuing with the
instruction after the JMP. Jumps can be near jumps, when the target address is in the
same segment as the JMP instruction (a short jump is a near jump whose target lies
within -128 to +127 bytes), or far jumps, when it is in a different segment.

General Format: JMP <target address>

Conditional transfer instructions – J condition: Conditional jumps are always short
jumps in 8086. Here the jump is taken only if the specified condition is satisfied. If
the condition is not satisfied, then the execution proceeds in the normal way.

Example: There are many conditional jump instructions like

• JC: Jump on carry (CF=set)
• JNC: Jump on no carry (CF=reset)
• JZ: Jump on zero (ZF=set)
• JO: Jump on overflow (OF=set)
• JNO: Jump on no overflow (OF=reset)

Iteration Control Instructions: These instructions are used to execute a series of
instructions some number of times. The number is specified in the CX register, which is
automatically decremented in the course of iteration. The destination address for
the jump must be in the range of -128 to +127 bytes.

Example: Instructions here are:

• LOOP: loop through the set of instructions until CX is 0
• LOOPE/LOOPZ: here the set of instructions is repeated until CX=0 or ZF=0
• LOOPNE/LOOPNZ: here repeated until CX=0 or ZF=1
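
The termination conditions above can be written out as a small Python model (not 8086 code). The names loop_taken and count_iterations are hypothetical. The model decrements CX first and then decides whether the jump back to the loop label is taken.

```python
def loop_taken(cx, zf=None, kind="LOOP"):
    """Decrement CX, then decide whether the jump back is taken.

    Returns (new CX, taken). zf is only consulted for LOOPE and LOOPNE.
    """
    cx = (cx - 1) & 0xFFFF
    if kind == "LOOP":
        taken = cx != 0
    elif kind == "LOOPE":       # also written LOOPZ
        taken = cx != 0 and zf == 1
    elif kind == "LOOPNE":      # also written LOOPNZ
        taken = cx != 0 and zf == 0
    else:
        raise ValueError("unknown loop instruction: " + kind)
    return cx, taken

def count_iterations(cx):
    """Count how many times a loop body ending in LOOP is executed."""
    executed = 0
    while True:
        executed += 1                   # the body runs before LOOP is reached
        cx, taken = loop_taken(cx)
        if not taken:
            return executed

print(count_iterations(5))  # 5
print(count_iterations(0))  # 65536
```

Because CX is decremented before it is tested, starting with CX = 0 wraps to FFFFH and the body runs 65536 times, so CX should be checked before entering such a loop.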

Machine Control Instructions


1. HLT Instruction: The HLT instruction will cause the 8086 microprocessor to stop
fetching and executing instructions. The 8086 will enter a halt state. The processor
leaves this halt state upon an interrupt signal on the INTR pin or NMI pin, or a reset
signal on the RESET input.

General Form: HLT

2. WAIT Instruction: When this instruction is executed, the 8086 enters into an idle
state. This idle state continues until the TEST input pin is driven low (TEST is an
active-low input) or a valid interrupt signal is received. WAIT affects no flags. It is
generally used to synchronize the 8086 with a peripheral device or coprocessor.

3. ESC Instruction: This instruction is used to pass instructions to a coprocessor like
the 8087. There is a 6-bit instruction for the coprocessor embedded in the ESC
instruction. In most cases the 8086 treats ESC as a NOP, but in some cases the
8086 will access data items in memory for the coprocessor.

4. LOCK Instruction: In multiprocessor environments, the different microprocessors
share a system bus, which is needed to access external devices like disks. The LOCK
instruction is given as a prefix when a processor needs exclusive access to
the system bus for a particular instruction. It affects no flags.

5. NOP Instruction: The NOP instruction performs no operation other than
the fetching and decoding of the instruction itself. It takes 3 clock cycles. NOP is
used to fill in time delays or to reserve space for instructions while troubleshooting.
NOP affects no flags.

Shift/Rotation Instruction

Shift instructions move the binary data to the left or right by shifting them within the
register or memory location. They can also perform multiplication by powers of 2
(2^n, shift left) and division by powers of 2 (2^-n, shift right).
There are two types of shift, logical and arithmetic; the latter is used with signed
numbers and the former with unsigned numbers.

Fig. Shift Operation

Rotate, on the other hand, rotates the information in a register or memory location
either from one end to the other or through the carry flag.

Fig: Rotate Operations

SHL/SAL Instruction

Both instructions shift each bit to the left, place the MSB in CF and make the LSB 0.
The destination can be of byte size or of word size, and it can be a register or a memory
location. The number of shifts is indicated by the count.
CF, OF, PF, SF and ZF are affected; AF is undefined.

General Format: SAL/SHL Destination, count


1. SHR Instruction: This instruction shifts each bit in the specified destination to the
right and 0 is stored in the MSB position. The LSB is shifted into the carry flag. The
destination can be of byte size or of word size, and it can be a register or a memory
location. The number of shifts is indicated by the count.

CF, OF, PF, SF and ZF are affected; AF is undefined.

General Format: SHR Destination, count

Example:
• MOV BL, 0B7H ; BL is made B7H
• SHR BL, 1 ; shift the content of BL one place to the right (BL = 5BH, CF = 1)

2. ROL Instruction: This instruction rotates all the bits in a specified byte or word to
the left some number of bit positions. The MSB is placed in the new LSB and in CF. The
destination can be of byte size or of word size, and it can be a register or a memory
location. The number of rotations is indicated by the count.

Only the CF and OF flags are affected.

General Format: ROL destination, count

Example:
3. ROR Instruction: This instruction rotates all the bits in a specified byte or word to
the right some number of bit positions. The LSB is placed in the new MSB and in CF.
The destination can be of byte size or of word size, and it can be a register or a
memory location. The number of rotations is indicated by the count.

Only the CF and OF flags are affected.

General format: ROR Destination, count

Example:
• MOV BL, 0B7H ; BL is made B7H
• ROR BL, 1 ; rotate the content of BL one place to the right (BL = DBH, CF = 1)

4. RCR Instruction: This instruction rotates all the bits in a specified byte or word to
the right some number of bit positions along with the carry flag. The LSB is placed in
the new CF and the previous carry is placed in the new MSB. The destination can be of
byte size or of word size, and it can be a register or a memory location. The number of
rotations is indicated by the count.

Only the CF and OF flags are affected.

General Format: RCR Destination, count

Example:
• MOV BL, 0B7H ; BL is made B7H
• RCR BL, 1 ; rotate the content of BL one place to the right through the carry flag
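
The difference between the shift and rotate instructions is easiest to see on one value. The following Python sketch (a model, not 8086 code) applies a single-bit SHL, SHR, ROL, ROR and RCR to the byte B7H used in the examples. The function names are hypothetical, each returns (result, CF), and operands are assumed to be in the range 00H to FFH.

```python
def shl8(value):
    """Shift left: the MSB goes to CF and 0 enters the LSB."""
    return (value << 1) & 0xFF, value >> 7

def shr8(value):
    """Logical shift right: the LSB goes to CF and 0 enters the MSB."""
    return value >> 1, value & 1

def rol8(value):
    """Rotate left: the MSB becomes the new LSB and is copied to CF."""
    msb = value >> 7
    return ((value << 1) & 0xFF) | msb, msb

def ror8(value):
    """Rotate right: the LSB becomes the new MSB and is copied to CF."""
    lsb = value & 1
    return (value >> 1) | (lsb << 7), lsb

def rcr8(value, cf):
    """Rotate right through carry: old CF enters the MSB, the LSB goes to CF."""
    return (value >> 1) | (cf << 7), value & 1

bl = 0xB7  # 1011 0111
print(hex(shl8(bl)[0]), hex(shr8(bl)[0]))        # 0x6e 0x5b
print(hex(rol8(bl)[0]), hex(ror8(bl)[0]))        # 0x6f 0xdb
print(hex(rcr8(bl, 0)[0]), hex(rcr8(bl, 1)[0]))  # 0x5b 0xdb
```

With CF = 0 beforehand RCR gives the same byte as SHR, and with CF = 1 it gives the same byte as ROR for this operand, which shows that the carry flag acts as a ninth bit in the rotation.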

Flag Manipulation Instruction

1. STC Instruction: This instruction sets the carry flag. It does not affect any other flag.
2. CLC Instruction: This instruction resets the carry flag to zero. CLC does not affect
any other flag.
3. CMC Instruction: This instruction complements the carry flag. CMC does not affect
any other flag.
4. STD Instruction: This instruction is used to set the direction flag to one so that SI
and/or DI can be decremented automatically after execution of string instruction. STD
does not affect any other flag.
5. CLD Instruction: This instruction is used to reset the direction flag to zero so that SI
and/or DI can be incremented automatically after execution of string instruction. CLD
does not affect any other flag.
6. STI Instruction: This instruction sets the interrupt flag to 1. This enables INTR
interrupt of the 8086. STI does not affect any other flag.
7. CLI Instruction: This instruction resets the interrupt flag to 0. Due to this the 8086
will not respond to an interrupt signal on its INTR input. CLI does not affect any other
flag.
String Instructions

1. MOVS/MOVSB/MOVSW: These instructions copy a word or byte from a location
in the data segment to a location in the extra segment. The offset of the source is in
SI and that of the destination is in DI. For multiple word/byte transfers the count is
stored in the CX register.

When the direction flag is 0, SI and DI are incremented and when it is 1, SI and DI are
decremented.

MOVS affects no flags. MOVSB is used for byte sized movements while MOVSW is
for word sized.

Example:

CLD              ; clear the direction flag to auto increment SI and DI
MOV DS, AX       ; initialize data segment register to 0 (AX is assumed to hold 0)
MOV ES, AX       ; initialize extra segment register to 0
MOV SI, 2000H    ; load the offset of string1 in SI
MOV DI, 2400H    ; load the offset of string2 in DI
MOV CX, 04H      ; load length of the string in CX
REP MOVSB        ; repeat MOVSB, decrementing CX, until CX is 0
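
What REP MOVSB does with SI, DI and CX can be sketched in Python (a model, not 8086 code). The function name rep_movsb is hypothetical, and a single 64 KB bytearray stands in for memory because DS and ES hold the same value in the example.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Copy CX bytes from offset SI to offset DI; returns final (SI, DI, CX)."""
    step = -1 if df else 1              # DF = 0 increments, DF = 1 decrements
    while cx != 0:
        memory[di] = memory[si]         # MOVSB copies one byte
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1                         # REP decrements CX after each copy
    return si, di, cx

memory = bytearray(0x10000)
memory[0x2000:0x2004] = b"8086"
si, di, cx = rep_movsb(memory, si=0x2000, di=0x2400, cx=4)
print(bytes(memory[0x2400:0x2404]), hex(si), hex(di), cx)  # b'8086' 0x2004 0x2404 0
```

After the transfer SI and DI point just past the copied block, so a following string instruction continues from where this one stopped.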

2. REP/REPE/REPZ/REPNE/REPNZ: REP is a prefix used with string instructions; it
repeats the instruction, decrementing CX each time, until the termination condition
shown below becomes true.

Example (termination conditions):
• REP → CX=0
• REPE/REPZ → CX=0 OR ZF=0
• REPNE/REPNZ → CX=0 OR ZF=1

3. LODS/LODSB/LODSW: This instruction copies a byte from a string location
pointed to by SI to AL or a word from a string location pointed to by SI to AX.
LODS does not affect any flags. LODSB copies a byte and LODSW copies a word.

4. STOS/STOSB/STOSW: The STOS instruction is used to store a byte/word contained
in AL/AX at the offset contained in the DI register (in the extra segment). STOS does
not affect any flags. After copying the content, DI is automatically incremented or
decremented, based on the value of the direction flag.

Example:
• MOV DI, OFFSET D_STRING ; load DI with the destination address
• STOS D_STRING ; the assembler uses the string name to determine byte or
word; if byte then AL is used and if of word size, AX is used.

5. CMPS/CMPSB/CMPSW: CMPS is used to compare strings, byte wise or word
wise. The comparison is effected by subtracting the content pointed to by DI from that
pointed to by SI. The AF, CF, OF, PF, SF and ZF flags are affected by this instruction,
but neither operand is changed.

Example:

MOV SI, OFFSET F_STRING    ; point to first string
MOV DI, OFFSET S_STRING    ; point to second string
MOV CX, 0AH                ; set the counter to 0AH
CLD                        ; clear direction flag to auto increment
REPE CMPSB                 ; compare repeatedly till unequal or counter = 0
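
The way REPE CMPSB stops can be sketched in Python (a model, not 8086 code). The function name repe_cmpsb is hypothetical. It assumes DF = 0, takes the two strings as byte sequences instead of SI and DI offsets, and models only ZF.

```python
def repe_cmpsb(first, second, cx, zf=1):
    """Compare bytes until a mismatch or CX = 0; returns (remaining CX, ZF).

    If CX is 0 on entry nothing is compared and ZF keeps the value passed in.
    """
    index = 0
    while cx != 0:
        zf = int(first[index] == second[index])  # CMPSB sets ZF on equality
        index += 1
        cx -= 1                                  # CX is decremented every time
        if zf == 0:                              # REPE stops on a mismatch
            break
    return cx, zf

print(repe_cmpsb(b"ABCDEFGHIJ", b"ABCDEFGHIJ", 0x0A))  # (0, 1)
print(repe_cmpsb(b"ABCDEFGHIJ", b"ABCDXFGHIJ", 0x0A))  # (5, 0)
```

After the instruction, ZF = 1 means the strings matched over the whole length, while ZF = 0 means a mismatch occurred; the remaining value of CX shows how far the comparison got.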

Assembler Directive
There are some instructions in the assembly language program which are not a part of
processor instruction set. These instructions are instructions to the assembler, linker
and loader. These are referred to as pseudo-operations or as assembler directives. The
assembler directives enable us to control the way in which a program assembles and
lists. They act during the assembly of a program and do not generate any executable
machine code.

There are many specialized assembler directives used in 8086 assembly
language programming:

1. ASSUME:

It is used to tell the assembler the name of the logical segment to use for a specified
segment register.

E.g.: ASSUME CS: CODE tells the assembler that the instructions for a program are in
a logical segment named CODE.

2. DB-Define Byte:

The DB directive is used to reserve a byte or bytes of memory locations in the available
memory. While preparing the EXE file, this directive directs the assembler to allocate
the specified number of memory bytes to the said data type that may be a constant,
variable, string, etc. Another option of this directive also initializes the reserved
memory bytes with the ASCII codes of the characters specified as a string.

The following examples show how the DB directive is used for different purposes:
• RANKS DB 01H,02H,03H,04H
This statement directs the assembler to reserve four memory locations for a list
named RANKS and initialize them with the four values specified above.
• MESSAGE DB "GOOD MORNING"
This makes the assembler reserve a number of bytes of memory equal to the
number of characters in the string named MESSAGE and initialize those locations with
the ASCII equivalents of these characters.
• VALUE DB 50H
This statement directs the assembler to reserve one memory byte for the variable
named VALUE and initialize it with 50H. To reserve 50H memory bytes and leave them
uninitialized, the DUP operator is used: VALUE DB 50H DUP (?).

3. DD – Define Double word


It is used to declare a double word type variable or to reserve memory
locations that can be accessed as double word.
Example:
ARRAY_POINTER DD 25629261H declares a double word named
ARRAY_POINTER

4. DQ- Define Quad Word


This directive is used to direct the assembler to reserve 4 words (8 bytes) of
memory for the specified variable and may initialize it with the specified values.

5. DT – Define Ten Bytes


The DT directive directs the assembler to define the specified variable
requiring 10 bytes for its storage and initialize the 10 bytes with the specified
values. The directive may be used for variables involved in heavy numerical
calculations, generally processed by numeric coprocessors.

Example:
PACKED_BCD DT 11223344556677889900 declares an array that is 10 bytes in
length.

6. DW – Define Word
The DW directive serves the same purpose as the DB directive, but it
makes the assembler reserve the specified number of memory words (16-bit) instead
of bytes.
Some examples are given to explain this directive.

 WORDS DW 1234H, 4567H, 78ABH, 045CH

This makes the assembler reserve four words in memory (8 bytes), and
initialize the words with the specified values in the statements. During
initialization, the lower bytes are stored at the lower memory addresses, while
the upper bytes are stored at the higher addresses.

 NUMBER1 DW 1245H
This makes the assembler reserve one word in memory.
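
The memory layout produced by DB, DW and DD can be sketched in Python (a model of the byte order, not assembler output). The helper names db, dw and dd are hypothetical; they use the standard struct module with the little-endian format that the 8086 uses.

```python
import struct

def db(*values):
    """Bytes laid down by DB: one byte per value."""
    return bytes(values)

def dw(*values):
    """Bytes laid down by DW: each word is stored low byte first."""
    return struct.pack("<%dH" % len(values), *values)

def dd(*values):
    """Bytes laid down by DD: each double word is stored low byte first."""
    return struct.pack("<%dI" % len(values), *values)

print(db(0x01, 0x02, 0x03, 0x04).hex())          # 01020304
print(dw(0x1234, 0x4567, 0x78AB, 0x045C).hex())  # 34126745ab785c04
print(dd(0x25629261).hex())                      # 61926225
print(len(db(*b"GOOD MORNING")))                 # 12
```

The word 1234H therefore occupies two consecutive bytes, 34H at the lower address and 12H at the higher one.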

7. END – End of Program

The END directive marks the end of an assembly language program.
When the assembler comes across the END directive, it ignores any source lines
that follow. Hence the END statement must be the last statement in the file; it must
not appear in the middle of the program, and no useful program statement should
follow it.

8. ENDP – END Procedure

Used along with the name of the procedure to indicate the end of a
procedure.

Example:

SQUARE_ROOT PROC    ; start of procedure SQUARE_ROOT
    ...
SQUARE_ROOT ENDP    ; end of procedure

9. ENDS – End of Segment

This directive marks the end of a logical segment. The logical segments are
named with the SEGMENT directive and associated with segment registers using the
ASSUME directive. The names appear with the ENDS directive as prefixes to mark the
end of those particular segments. Whatever the contents of a segment are, they should
appear in the program before ENDS. Any statement appearing after ENDS is not part
of the segment.
The structure shown below explains this more clearly.

DATA SEGMENT
    ...
DATA ENDS

ASSUME CS: CODE, DS: DATA

CODE SEGMENT
    ...
CODE ENDS

END

10. EQU-Equate: Used to give a name to some value or symbol. Each time the
assembler finds the given name in the program, it will replace the name with the
value.
Example:
CORRECTION_FACTOR EQU 03H
MOV AL, CORRECTION_FACTOR

11. EVEN - Tells the assembler to increment the location counter to the next
even address if it is not already at an even address.

Used because the 8086 can read a word from an even address in one bus cycle, whereas a
word at an odd address needs two bus cycles.

12. EXTRN - Tells the assembler that the names or labels following the directive
are in some other assembly module.
Example: if a procedure is in a program module assembled at a different time
from the module that contains the CALL instruction, this directive is used to tell the
assembler that the procedure is external.

13. GLOBAL - Can be used in place of a PUBLIC directive or in place of an
EXTRN directive.
It is used to make a symbol defined in one module available to other modules.

Example: GLOBAL DIVISOR makes the variable DIVISOR public so that it
can be accessed from other modules.

14. GROUP - Used to tell the assembler to group the logical segments named after
the directive into one logical group segment, allowing the contents of all the
segments to be accessed from the same group segment base.

Example: SMALL_SYSTEM GROUP CODE, DATA, STACK_SEG

15. INCLUDE - Used to tell the assembler to insert a block of source code from
the named file into the current source module. This will shorten the source code.

16. LABEL - Used to give a name to the current value in the location counter. This
directive is followed by a term that specifies the type you want associated with that
name.

Example: ENTRY_POINT LABEL FAR
         NEXT: MOV AL, BL

17. NAME - Used to give a specific name to each assembly module when programs
consisting of several modules are written.

Example: NAME PC_BOARD

18. OFFSET- Used to determine the offset or displacement of a named data item or
procedure from the start of the segment which contains it.

Example: MOV BX, OFFSET PRICES

19. ORG- The location counter is set to 0000 when the assembler starts reading a
segment. The ORG directive allows setting a desired value at any point in the
program.
Example: ORG 2000H

20. PROC- Used to identify the start of a procedure.

Example: SMART_DIVIDE PROC FAR identifies the start of a procedure named
SMART_DIVIDE and tells the assembler that the procedure is far

21. PTR- Used to assign a specific type to a variable or to a label.

Example: INC BYTE PTR [BX] tells the assembler that we want to
increment the byte pointed to by BX

22. PUBLIC- Used to tell the assembler that a specified name or label will be accessed
from other modules.

Example: PUBLIC DIVISOR, DIVIDEND makes the two variables DIVISOR and
DIVIDEND available to other assembly modules.

23. SEGMENT- Used to indicate the start of a logical segment.

Example: CODE SEGMENT indicates to the assembler the start of a logical segment
called CODE

24. SHORT- Used to tell the assembler that only a 1 byte displacement is
needed to code a jump instruction.

Example: JMP SHORT NEARBY_LABEL


25. TYPE - Used to tell the assembler to determine the type of a specified variable.

Example: ADD BX, TYPE WORD_ARRAY is used where we want to
increment BX to point to the next word in an array of words.

Macros:

A macro is a group of instructions. The macro assembler generates the code in the
program each time the macro is "called". Macros are defined using the MACRO and ENDM
assembler directives. Creating a macro is very similar to creating a new opcode that can
be used in the program, as shown below.

Example:
INIT MACRO
MOV AX, @DATA
MOV DS, AX
MOV ES, AX
ENDM

It is important to note that macro sequences execute faster than procedures because
there are no CALL and RET instructions to execute. The assembler places the macro
instructions in the program each time the macro is invoked. This process is known as
macro expansion.

WHILE:

In a macro, the WHILE statement is used to repeat the macro sequence as long as the
expression specified with it is true. Like REPEAT, the end of the loop is specified by
an ENDM statement. The WHILE statement allows relational operators to be used in its
expression.

The table below shows the relational operators used with WHILE statements.

OPERATOR    FUNCTION
EQ          Equal
NE          Not equal
LE          Less than or equal
LT          Less than
GE          Greater than or equal
GT          Greater than
NOT         Logical inversion
AND         Logical AND
OR          Logical OR
Table: Shows the relational operators used with WHILE statements.

UNIT-III
Assembly Language Programming with 8086

Machine level programs


An assembly language program generally has the following structure:

.model small        ; Select a memory model.
.stack stack_size   ; Define the stack size
.data               ; Variable and array declarations
                    ; Declare variables at this level
.code
main proc
                    ; Write the program main code at this level
main endp
                    ; Other procedures
                    ; Always organize your program into procedures
end main            ; To mark the end of the source file

The Model Directive – Segment Directives:


The model directive specifies the total amount of memory the program would take. In other
words, it gives information on how much memory the assembler would allocate for the
program. This depends on the size of the data and the size of the program or code.

Segments are declared using directives. The following directives are used to specify the
following segments:
 Stack
 Data
 Code

Stack Segment:
 Used to set aside storage for the stack
 Stack addresses are computed as offsets into this segment
 Use: .stack followed by a value that indicates the size of the stack

Data Segment:
 Used to set aside storage for variables.
 Constants are defined within this segment in the program source.
 Variable addresses are computed as offsets from the start of this segment
 Use: .data followed by declarations of variables or definitions of constants.

Code Segment:
• The code segment contains executable instructions, macros and calls to procedures.
• Use: .code followed by a sequence of program statements

Machine Coding the Program:


Assembly languages are a family of low-level languages for programming computers,
microprocessors, microcontrollers, and other (usually) integrated circuits. They implement a
symbolic representation of the numeric machine codes and other constants needed to program
a particular CPU architecture.

A program written in assembly language consists of a series of instructions that, when
translated by an assembler, correspond to a stream of executable instructions that can be
loaded into memory and executed.

A utility program called an assembler is used to translate assembly language statements into
the target computer's machine code.

There are three basic kinds of control structures:


1. Sequences
2. Branching
3. Loops

It has been proved that any logic problem can be solved with only sequence, choice (e.g.,
if-then-else) and repetition (do-while). This is called the structured program theorem.

Sequential Structures:

Sequential structures are structures that are stepped through sequentially. These are also
called sequences. Basic arithmetic, logical, and bit operations are in this
category. Data moves and copies are sequences.

Branching Structures:

Branching structures consist of direct and indirect jumps (including the infamous "GOTO"),
conditional jumps (IF), nested ifs, and case (or switch) structures.

Loop Structures:

The basic looping structures are DO iterative, do WHILE, and do UNTIL. An infinite loop is
one that has no exit. Normally, infinite loops are programming errors, but event loops and
task schedulers are examples of intentional infinite loops.

Conditional Statement in Assembly Language Program

.IF - .ELSE - .ENDIF Statement

Conditional statements are implemented in an assembly language program using the .IF,
.ELSE, .ENDIF structure found in higher level languages. Only MASM version 6.x supports
this; earlier versions of the assembler do not support the .IF statement. Here is the
general format for the .IF conditional statement.

.IF condition
    ; statements executed when the condition is true
.ELSE
    ; statements executed when the condition is false
.ENDIF

As shown above, every .IF directive must have a matching .ENDIF to terminate a tested
condition. .ELSE is optional; it provides an alternate action. The assembler also allows
relational operators to be used with the .IF statement.

.WHILE - .ENDW Statement

Like the DO-WHILE statement in higher level languages, the assembler supports the
.WHILE - .ENDW statement. The .WHILE statement is used with a condition to begin the
loop, and the .ENDW statement ends the loop.

.BREAK and .CONTINUE Statements

.BREAK and .CONTINUE statements function in the same manner as break and continue in a
C-language program. The .BREAK statement is used to break out of the .WHILE loop.

.REPEAT - .UNTIL Statement

.REPEAT – .UNTIL statements allow a series of instructions to be executed repeatedly until
some condition occurs. .REPEAT defines the start of the loop and .UNTIL defines the end of
the loop. The .UNTIL statement has a condition. When the condition is true the loop is
terminated.

Conditional Assembly Statement in Macros

IF - ELSE - ENDIF Statement

The conditional assembly statements are implemented in macros using the IF – ELSE – ENDIF
structure found in higher level languages. Here is the general format for the IF family of
conditional statements.

IF expression
    ; statements assembled when the expression is true
ELSE
    ; statements assembled otherwise
ENDIF

Every IF directive must have a matching ENDIF to terminate a tested condition. ELSE is
optional.

REPEAT Statement

In a macro, the REPEAT statement is used to repeat a macro sequence a fixed number of
times. The repetition count is specified immediately after the REPEAT statement. For
example, with a count of 26, the statements between the REPEAT and the first ENDM are
repeated 26 times.

WHILE Statement

In a macro, the WHILE statement is used to repeat the macro sequence as long as the
expression specified with it is true. Like REPEAT, the end of the loop is specified by an
ENDM statement. The WHILE statement allows relational operators to be used in its
expression.

FOR Statement

A FOR statement in the macro repeats the macro sequence for a list of data. For example, if we
pass two arguments to the macro then in the first iteration the FOR statement gives the macro
sequence using first argument and in the second iteration it gives the macro sequence using
second argument. Like WHILE statement, end of FOR is indicated by ENDM statement.

Assembly Language example programs:

A program that demonstrates the use of MOV instruction:

ORG 100h ; this directive required for a simple 1 segment .com program.
MOV AX, 0B800h ; set AX to hexadecimal value of B800h.
MOV DS, AX ; copy value of AX to DS.
MOV CL, 'A' ; set CL to ASCII code of 'A', it is 41h.
MOV CH, 1101_1111b ; set CH to binary value.
MOV BX, 15Eh ; set BX to 15Eh.
MOV [BX], CX ; copy contents of CX to memory at B800:015E
RET ; returns to operating system.
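
The address B800:015E in the comment above is a segment:offset pair. As a minimal Python sketch (a model, not 8086 code), the 20-bit physical address is the segment value shifted left by four bits plus the offset, and the word in CX is stored low byte first. The names physical_address and store_word are hypothetical, and memory is modelled as a dictionary.

```python
def physical_address(segment, offset):
    """20-bit physical address: segment * 16 + offset."""
    return ((segment << 4) + offset) & 0xFFFFF

def store_word(memory, address, word):
    """Store a 16-bit word low byte first, as MOV [BX], CX does."""
    memory[address] = word & 0xFF
    memory[(address + 1) & 0xFFFFF] = (word >> 8) & 0xFF

memory = {}
ds, bx = 0xB800, 0x015E
cx = (0b11011111 << 8) | ord("A")   # CH = DFh, CL = 41h
address = physical_address(ds, bx)
store_word(memory, address, cx)
print(hex(address), hex(memory[address]), hex(memory[address + 1]))  # 0xb815e 0x41 0xdf
```

The byte from CL lands at B815EH and the byte from CH at B815FH.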

Stack Structure of 8086:

• The stack is a block of memory that may be used for temporarily storing the contents
of the registers inside the CPU.

• The stack contains a set of sequentially arranged data items, with the last item
appearing on top of the stack.

• It is a top-down data structure whose elements are accessed using the stack pointer (SP),
which gets decremented by two when a data word is stored in the stack and gets
incremented by two when a data word is retrieved from the stack back to the CPU
register.

• The process of storing the data in the stack is called "pushing into" the stack and the
process of transferring the data back from the stack to the CPU register is known as
"popping off" the stack.

• The stack is a Last-In-First-Out (LIFO) data structure, i.e., the data which is pushed
last will be on top of the stack and will be popped off the stack first.

• The stack pointer is a 16-bit register that contains the offset address of the memory
location in the stack segment.

• The stack segment may have a memory block of a maximum of 64 Kbytes
and it may be overlapped with any other segment.

• The stack segment register (SS) contains the base address of the stack segment in the
memory.
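
The points above can be sketched with a small Python model (not 8086 code). The class name Stack8086 is hypothetical; memory is modelled as a dictionary from 20-bit physical addresses to bytes, and the top of the stack is taken to be at SS * 16 + SP.

```python
class Stack8086:
    """Model of the 8086 stack addressed through SS and SP."""

    def __init__(self, ss, sp):
        self.ss, self.sp = ss, sp
        self.memory = {}                         # physical address -> byte

    def _address(self, offset):
        return ((self.ss << 4) + offset) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF         # SP is decremented by two
        self.memory[self._address(self.sp)] = word & 0xFF
        self.memory[self._address((self.sp + 1) & 0xFFFF)] = word >> 8

    def pop(self):
        low = self.memory[self._address(self.sp)]
        high = self.memory[self._address((self.sp + 1) & 0xFFFF)]
        self.sp = (self.sp + 2) & 0xFFFF         # SP is incremented by two
        return (high << 8) | low

stack = Stack8086(ss=0x2000, sp=0x0100)
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp))     # 0xfc
print(hex(stack.pop()))  # 0xabcd
print(hex(stack.pop()))  # 0x1234
```

The word pushed last comes off first, and SP returns to its starting value once every push has been matched by a pop.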
Interrupts and Interrupt service routines
An interrupt is a method of creating a temporary halt during program execution that allows
peripheral devices to access the microprocessor. The microprocessor responds to that interrupt
with an ISR (Interrupt Service Routine), which is a short program to instruct the
microprocessor on how to handle the interrupt.
The following image shows the types of interrupts we have in an 8086 microprocessor −

Hardware Interrupts
Hardware interrupt is caused by any peripheral device by sending a signal through a specified
pin to the microprocessor.
The 8086 has two hardware interrupt pins, i.e. NMI and INTR. NMI is a non-maskable
interrupt and INTR is a maskable interrupt having lower priority. One more interrupt pin
associated is INTA called interrupt acknowledge.
NMI (Non-Maskable Interrupt)
It is a single non-maskable interrupt pin (NMI) having higher priority than the maskable
interrupt request pin (INTR), and it is a type 2 interrupt.
When this interrupt is activated, these actions take place:
 Completes the current instruction that is in progress.
 Pushes the Flag register values on to the stack.
 Pushes the CS (code segment) value and IP (instruction pointer) value of the return
address on to the stack.
 IP is loaded from the contents of the word location 00008H.
 CS is loaded from the contents of the next word location 0000AH.
 Interrupt flag and trap flag are reset to 0.
INTR
The INTR is a maskable interrupt because the microprocessor will be interrupted only if
interrupts are enabled using the set interrupt flag (STI) instruction. It is disabled using
the clear interrupt flag (CLI) instruction.
The INTR interrupt is activated by an I/O port. If the interrupt is enabled and no NMI is
pending, then the microprocessor first completes the current instruction and sends '0' on
the INTA pin twice. The first '0' informs the external device to get ready, and during the
second '0' the microprocessor receives the 8-bit interrupt type, say X, from the
programmable interrupt controller.
These actions are taken by the microprocessor:
• First completes the current instruction.
• Activates INTA output and receives the interrupt type, say X.
• Flag register value, CS value of the return address and IP value of the return address
are pushed on to the stack.
• IP value is loaded from the contents of word location X × 4.
• CS is loaded from the contents of the next word location.
• Interrupt flag and trap flag are reset to 0.

Software Interrupts
Some instructions are inserted at the desired position into the program to create interrupts.
These interrupt instructions can be used to test the working of various interrupt handlers. It
includes:

INT- Interrupt instruction with type number


It is a 2-byte instruction. The first byte is the op-code and the second byte is the
interrupt type number. There are 256 interrupt types in this group (type 0 to type 255).
Its execution includes the following steps −
 Flag register value is pushed on to the stack.
 CS value of the return address and IP value of the return address are pushed on to the
stack.
 IP is loaded from the contents of the word location ‘type number’ × 4
 CS is loaded from the contents of the next word location. 
 Interrupt Flag and Trap Flag are reset to 0
The vector for a type 0 interrupt starts at address 00000H, for type 1 at 00004H, for type 2
at 00008H, and so on. The first five pointers are dedicated interrupt pointers:
 TYPE 0 interrupt represents division by zero situation.
 TYPE 1 interrupt represents single-step execution during the debugging of a program. 
 TYPE 2 interrupt represents non-maskable NMI interrupt.
 TYPE 3 interrupt represents break-point interrupt.
 TYPE 4 interrupt represents overflow interrupt.
Interrupt types 5 to 31 are reserved for use in more advanced microprocessors, and types 32
to 255 are available for hardware and software interrupts.
INT 3-Break Point Interrupt Instruction
It is a 1-byte instruction with op-code CCH. It is inserted into a program so that when the
processor reaches it, normal execution stops and the break-point procedure is executed.
Its execution includes the following steps:
 Flag register value is pushed on to the stack.
 CS value of the return address and IP value of the return address are pushed on to the
stack.
 IP is loaded from the contents of the word location 3×4 = 0000CH
 CS is loaded from the contents of the next word location. 
 Interrupt Flag and Trap Flag are reset to 0
INTO - Interrupt on overflow instruction
It is a 1-byte instruction with the mnemonic INTO and op-code CEH. As the name suggests, it is
a conditional interrupt instruction: it takes effect only when the overflow flag is set to 1,
in which case it branches to the interrupt handler for type 4. If the overflow flag is 0,
execution continues with the next instruction.
Its execution includes the following steps:
 Flag register values are pushed on to the stack.
 CS value of the return address and IP value of the return address are pushed on to the
stack.
 IP is loaded from the contents of word location 4×4 = 00010H
 CS is loaded from the contents of the next word location. 
 Interrupt flag and Trap flag are reset to 0
Interrupt Vector Table
 The first 1 Kbyte (1024 locations) of the 8086 physical address space, 00000H to 003FFH,
is set aside as a table for storing the starting addresses of Interrupt Service
Procedures (ISPs).
 Since 4 bytes are required to store the starting address of an ISP, the table can hold 256
interrupt vectors.
 The starting address of an ISP is often called the Interrupt Vector or Interrupt Pointer.
Therefore, the table is referred to as the Interrupt Vector Table.
 In this table, the IP value is stored in the low word of the vector and the CS value in the
high word.
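The vector lookup described above can be sketched in Python. This is only an illustration of the address arithmetic: the function names are made up for this note, and the CS and IP values in the test are arbitrary.

```python
def vector_address(type_number):
    """Return (ip_address, cs_address) of the vector for an interrupt type."""
    if not 0 <= type_number <= 255:
        raise ValueError("interrupt type must be 0..255")
    ip_address = type_number * 4      # low word of the vector holds IP
    cs_address = ip_address + 2       # high word of the vector holds CS
    return ip_address, cs_address


def physical_address(cs, ip):
    """20-bit physical address of the service procedure: CS * 16 + IP."""
    return ((cs << 4) + ip) & 0xFFFFF


for t in (0, 2, 3, 4, 255):
    ip_addr, cs_addr = vector_address(t)
    print(f"type {t:3d}: IP at {ip_addr:05X}H, CS at {cs_addr:05X}H")
```

Type 255 lands at 003FCH and 003FEH, the last two words of the 1 Kbyte table.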
Passing parameters to procedures: Parameter passing to procedures can be done in the
following ways:

 Register
 Memory
 Pointers
 Stack

Passing parameters using registers:


The data to be passed is stored in the registers and these registers are accessed in the procedure
to process the data.

.model small
.data
MULTIPLICAND DW 1234H
MULTIPLIER DW 4232H
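.code
MOV AX, @data
MOV DS, AX ; Initialise DS so that the data variables can be addressed
MOV AX, MULTIPLICAND ; First parameter is passed in AX
MOV BX, MULTIPLIER ; Second parameter is passed in BX
CALL MULTI
:
:
MULTI PROC NEAR
MUL BX ; DX:AX = AX * BX, using the values passed in the registers
RET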
MULTI ENDP
:
:
END

The disadvantage of using registers to pass parameters is that the number of registers limits the
number of parameters you can pass.
Passing parameters using memory-
In the cases where few parameters have to be passed to and from a procedure, registers are
convenient. But, in cases when we need to pass a large number of parameters to procedure, we
use memory. This memory may be a dedicated section of general memory or a part of it.
.model small
.data
MULTIPLICAND DW 1234H ; Storage for multiplicand value
MULTIPLIER DW 4232H ; Storage for multiplier value
MULTIPLICATION DW ? ; Storage for multiplication result
.code
MOV AX, @Data
MOV DS, AX
:
:
CALL MULTI
:
:
MULTI PROC NEAR
MOV AX, MULTIPLICAND
MOV BX, MULTIPLIER
:
:
MOV MULTIPLICATION, AX ; Store the multiplication value in named memory location
RET
MULTI ENDP
END
Passing parameter using pointers-
A parameter passing method that overcomes the disadvantage of using data item names (i.e.
variable names) directly in a procedure is to use registers to pass pointers to the desired
data to the procedure.

.model small
.data
MULTIPLICAND DB 12H ; Storage for multiplicand value
MULTIPLIER DB 42H ; Storage for multiplier value
MULTIPLICATION DW ? ; Storage for multiplication result
.code
MOV AX, @Data
MOV DS, AX
MOV SI, OFFSET MULTIPLICAND
MOV DI, OFFSET MULTIPLIER
MOV BX, OFFSET MULTIPLICATION
CALL MULTI
:
:
MULTI PROC NEAR
:
:
MOV AL, [SI] ; Get multiplicand value pointed by SI in accumulator
MOV CL, [DI] ; Get multiplier value pointed to by DI in CL (loading BL would overwrite the result pointer in BX)
:
:
MOV [BX], AX ; Store result in location pointed out by BX
RET
MULTI ENDP
END
Passing parameters using stack-
To pass parameters using the stack, we push them on the stack in the main program before the
call to the procedure. The instructions in the procedure read these parameters from the stack.
Whenever the stack is used to pass parameters, it is important to keep track of what is pushed
on the stack and what is popped off it, so that the stack stays balanced.

.model small
.data
MULTIPLICAND DW 1234H
MULTIPLIER DW 4232H
.code
MOV AX, @data
MOV DS, AX
:
:
PUSH MULTIPLICAND
PUSH MULTIPLIER
CALL MULTI
:
:
MULTI PROC NEAR
PUSH BP
MOV BP, SP ; Copies offset of SP into BP
MOV AX, [BP + 6] ; MULTIPLICAND value is available at
; [BP + 6] and is passed to AX
MUL WORD PTR [BP + 4] ; MULTIPLIER value is passed
POP BP
RET 4 ; Pops the return address, then adds 4 to SP to discard the two parameters
MULTI ENDP ; End procedure
END
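To see why the parameters are found at [BP + 6] and [BP + 4], the pushes can be replayed in a small Python model. This is a sketch only: the starting SP, the return IP and the old BP value are arbitrary assumptions, and the function name is made up for illustration.

```python
def stack_frame_after_prologue(multiplicand, multiplier, return_ip, old_bp, sp=0x0100):
    """Replay the pushes and return {offset_from_BP: value}."""
    memory = {}

    def push(value):
        nonlocal sp
        sp -= 2                     # the 8086 stack grows downward, one word at a time
        memory[sp] = value & 0xFFFF

    push(multiplicand)              # PUSH MULTIPLICAND
    push(multiplier)                # PUSH MULTIPLIER
    push(return_ip)                 # a near CALL pushes only the return IP
    push(old_bp)                    # PUSH BP
    bp = sp                         # MOV BP, SP
    return {address - bp: value for address, value in memory.items()}


frame = stack_frame_after_prologue(0x1234, 0x4232, return_ip=0x0010, old_bp=0x0000)
for offset in sorted(frame, reverse=True):
    print(f"[BP + {offset}] = {frame[offset]:04X}H")
```

The saved BP sits at [BP + 0] and the return address at [BP + 2], so the last parameter pushed is always at [BP + 4] for a near procedure.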
MACROS:

A Macro is a set of instructions grouped under a single unit. It is another method for
implementing modular programming in the 8086 microprocessors (The first one was using
Procedures).

A Macro differs from a Procedure in that there is no call and return of control: the assembler
inserts the Macro's code into the program every time and everywhere the Macro is invoked.

A Macro is defined in a program using the assembler directives MACRO (placed after the name of
the Macro, before the body) and ENDM (placed at the end of the Macro). All the instructions
that belong to the Macro lie between these two directives. The following is the syntax for
defining a Macro in the 8086 Microprocessor:

Macro_name MACRO [ list of parameters ]


Instruction 1
Instruction 2
-----------
-----------
-----------
Instruction n
ENDM

And a call to Macro is made just by mentioning the name of the Macro:

Macro_name [ list of parameters]

It is optional to pass the parameters in the Macro. If you want to pass them to your macros, you
can simply mention them all in the very first statement of the Macro just after the directive:
MACRO.

The advantage of using a Macro is that it avoids the overhead of calling and returning (as in
procedures). Therefore, Macros execute faster than procedures. Another advantage is that the
stack is not needed to store and restore a return address when control moves to and from the
Macro body.

However, every time a Macro is invoked, the assembler places the entire set of Macro
instructions in the mainline program at the point of invocation. This is known as Macro
expansion. Because of this, a program that uses Macros takes more memory than one that uses
procedures to perform the same task with the same instructions. Hence, Macros are best used
for short instruction sequences.
UNIT-IV

Computer Arithmetic: Introduction, Addition and Subtraction, Multiplication Algorithms, Division
Algorithms, Floating-point Arithmetic operations.

Input-Output Organization: Peripheral Devices, Input-Output Interface, Asynchronous data transfer, Modes of
Transfer, Priority Interrupt, Direct Memory Access, Input-Output Processor (IOP), Intel 8089 IOP

Addition and Subtraction:


The four basic computer arithmetic operations are addition, subtraction, multiplication and division.
Arithmetic operations in a digital computer manipulate data to produce results. To implement them,
arithmetic procedures (algorithms) and the circuits that carry them out must be designed. An algorithm is a
solution to a problem stated as a finite number of well-defined procedural steps. Algorithms
can be developed for the following types of data.
1. Fixed point binary data in signed magnitude representation
2. Fixed point binary data in signed 2’s complement representation.
3. Floating point representation
4. Binary Coded Decimal (BCD) data

Addition and Subtraction with signed magnitude


Consider two numbers having magnitude A and B. When the signed numbers are added or subtracted, there
can be 8 different conditions depending on the sign and the operation performed as shown in the table
below:
Operation      Add magnitudes   When A > B   When A < B   When A = B
(+A) + (+B)    +(A + B)         --           --           --
(+A) + (-B)    --               +(A - B)     -(B - A)     +(A - B)
(-A) + (+B)    --               -(A - B)     +(B - A)     +(A - B)
(-A) + (-B)    -(A + B)         --           --           --
(+A) - (+B)    --               +(A - B)     -(B - A)     +(A - B)
(+A) - (-B)    +(A + B)         --           --           --
(-A) - (+B)    -(A + B)         --           --           --
(-A) - (-B)    --               -(A - B)     +(B - A)     +(A - B)
From the table, we can derive an algorithm for addition and subtraction as follows:
Addition (Subtraction) Algorithm (the words in parentheses apply to subtraction):
 When the signs of A & B are identical (different), add the two magnitudes and attach the sign of A to the
result.
 When the signs of A & B are different (identical), compare the magnitudes and subtract the smaller number
from the larger. Choose the sign of the result to be the same as A if A > B, or the complement of the sign
of A if A < B. If the two magnitudes are equal, subtract B from A and make the sign of the result positive.
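The algorithm can be written out as a short Python sketch. Signs are encoded as 0 for plus and 1 for minus, magnitudes are ordinary non-negative integers, and the function name is an assumption made for this illustration. Register width and overflow are not modelled here.

```python
def sm_add_sub(a_sign, a_mag, b_sign, b_mag, subtract=False):
    """Add or subtract two signed-magnitude numbers; return (sign, magnitude)."""
    if subtract:
        b_sign ^= 1                        # A - B is the same as A + (-B)
    if a_sign == b_sign:
        return a_sign, a_mag + b_mag       # add magnitudes, attach the sign of A
    if a_mag > b_mag:
        return a_sign, a_mag - b_mag       # result takes the sign of A
    if a_mag < b_mag:
        return a_sign ^ 1, b_mag - a_mag   # result takes the complement of the sign of A
    return 0, 0                            # equal magnitudes: positive zero


print(sm_add_sub(0, 5, 1, 3))                  # (+5) + (-3)
print(sm_add_sub(1, 3, 0, 5))                  # (-3) + (+5)
print(sm_add_sub(1, 4, 1, 4, subtract=True))   # (-4) - (-4)
```

Each of the eight rows of the table corresponds to one path through the function.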

Hardware Implementation

fig: Hardware for signed magnitude addition and subtraction

The hardware consists of two registers A and B to store the magnitudes, and two flip-flops As and
Bs to store the corresponding signs. The result is stored in register A and flip-flop As, which together act
as an accumulator. Subtraction is performed by adding A to the 2's complement of B. The output carry is
transferred to flip-flop E. An overflow that occurs during an add operation is stored in the add-overflow
flip-flop AVF. When the mode control m = 0, the output of B is transferred to the adder without any change,
along with an input carry of 0.

The output of the parallel adder is then equal to A + B, which is an add operation. When m = 1, the
content of register B is complemented and transferred to the parallel adder along with an input carry of 1.
Therefore, the output of the parallel adder is equal to A + B' + 1 = A - B, which is a subtract operation.
Hardware Algorithm
fig: flowchart for add and subtract operations

As and Bs are compared by an exclusive-OR gate. If the output is 0 the signs are identical; if it is 1 the
signs are different.
 For an add operation identical signs dictate that the magnitudes be added, and for a subtract
operation different signs dictate that the magnitudes be added. Magnitudes are added with the
micro operation EA ← A + B. A carry of 1 in E after the addition indicates an overflow and is
transferred to AVF.
 The two magnitudes are subtracted if the signs are different for an add operation or identical for a
subtract operation. Magnitudes are subtracted with the micro operation EA ← A + B' + 1. E = 1
indicates A >= B, and the number in A is the correct result (if this number is 0, As is cleared to 0
so that the result is a positive zero). E = 0 indicates A < B, so we take the 2's complement of A
and complement the sign As.
Multiplication
Hardware Implementation and Algorithm
Generally, the multiplication of two fixed-point binary numbers in signed magnitude representation is
performed by a process of successive shift and add operations. The process consists of looking at
successive bits of the multiplier, least significant bit first. If the multiplier bit is 1, the multiplicand is
copied down; otherwise, 0's are copied down. The numbers copied down in successive lines are shifted one
position to the left from the previous number and finally, all the numbers are added to get the product.
In digital computers, however, an adder that sums only two binary numbers is used and
the partial product is accumulated in a register. Similarly, instead of shifting the multiplicand to the
left, the partial product is shifted to the right. The hardware for the multiplication of signed
magnitude data is shown in the figure below.

Initially, the multiplier is stored in the Q register and the multiplicand in the B register. The A register
holds the partial product, and the sequence counter (SC) is set to the number of bits in the multiplier.
The sum of A and B forms the partial product, and both A and Q are shifted to the right together with E using
the statement "shr EAQ", as shown in the hardware algorithm. The flip-flops As, Bs & Qs store the signs of
A, B & Q respectively. A binary 0 is inserted into flip-flop E during the shift right.

Hardware Algorithm

flowchart for multiply algorithm


Example: Multiply 23 by 19 using the multiply algorithm.
Multiplicand B = 10111           E    A       Q       SC
Multiplier in Q                  0    00000   10011   101 (5)
Iteration 1 (Qn = 1): add B           10111
   First partial product         0    10111
   shr EAQ                       0    01011   11001   100 (4)
Iteration 2 (Qn = 1): add B           10111
   Second partial product        1    00010
   shr EAQ                       0    10001   01100   011 (3)
Iteration 3 (Qn = 0): shr EAQ    0    01000   10110   010 (2)
Iteration 4 (Qn = 0): shr EAQ    0    00100   01011   001 (1)
Iteration 5 (Qn = 1): add B           10111
   Fifth partial product         0    11011
   shr EAQ                       0    01101   10101   000 (0)
Final product in AQ = 0110110101

The final product is in registers A and Q. Therefore, the product is 0110110101, which is 437 in decimal.
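The register transfers of the multiply algorithm can be traced with a short Python sketch. It models only the magnitudes in registers B, A, Q and flip-flop E, the sign flip-flops are left out, and the function name is an assumption made for this illustration.

```python
def multiply_magnitudes(b, q, n):
    """Shift-and-add multiply of two n-bit magnitudes; returns the 2n-bit product in AQ."""
    mask = (1 << n) - 1
    a, e = 0, 0
    for _ in range(n):                        # SC counts down n iterations
        if q & 1:                             # Qn = 1: EA <- A + B
            total = a + b
            e, a = total >> n, total & mask
        # shr EAQ: E moves into the MSB of A, the LSB of A moves into the MSB of Q
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (e << (n - 1))
        e = 0                                 # a 0 is shifted into E
    return (a << n) | q


product = multiply_magnitudes(0b10111, 0b10011, 5)   # 23 x 19
print(format(product, "010b"), product)
```

Because the partial product is shifted right after every step, the adder never has to be wider than the n-bit multiplicand plus the carry bit E.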

Booth Algorithm
The algorithm used to multiply binary integers in signed 2's complement form is called the Booth
multiplication algorithm. It works on the principle that a string of 0's in the multiplier requires no addition,
only shifting, and that a string of 1's in the multiplier from bit weight 2^k down to bit weight 2^m can be
treated as 2^(k+1) - 2^m. For example, +14 = 001110 has a string of 1's from 2^3 to 2^1 (k = 3, m = 1), so it
can be written as 2^4 - 2^1 = 16 - 2 = 14. The product M × 14 can therefore be obtained by shifting the
multiplicand M left four times and subtracting M shifted left once.
According to the Booth algorithm, the rules for multiplication of binary integers in signed 2's complement form are:
 The multiplicand is subtracted from the partial product on encountering the first least significant 1
in a string of 1's in the multiplier.
 The multiplicand is added to the partial product on encountering the first 0 (provided that
there was a previous 1) in a string of 0's in the multiplier.
 The partial product does not change when the multiplier bit is identical to the previous multiplier
bit.
This algorithm works for both positive and negative numbers in signed 2's complement form. The
hardware implementation of this algorithm is shown in the figure below:

The flowchart for booth multiplication algorithm is given below:


flowchart for booth multiplication algorithm

Numerical Example: Booth algorithm


BR=10111(Multiplicand)
QR=10011(Multiplier)
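The worked table for this example is not reproduced here, so the following Python sketch traces the three rules for BR = 10111 and QR = 10011. It is an illustration under stated assumptions: the register names AC and the extra bit Q(n+1) follow the usual textbook presentation of Booth hardware, the function names are made up, and the sketch does not handle the most negative multiplicand (10000 for n = 5), whose negation does not fit in n bits.

```python
def booth_multiply(br, qr, n):
    """Multiply two n-bit 2's complement bit patterns; return the 2n-bit product pattern."""
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    ac, q_extra = 0, 0                       # AC = 0 and Q(n+1) = 0 initially
    for _ in range(n):
        pair = (qr & 1, q_extra)
        if pair == (1, 0):                   # first 1 in a string of 1's: subtract
            ac = (ac - br) & mask            # AC <- AC + BR' + 1
        elif pair == (0, 1):                 # first 0 after a string of 1's: add
            ac = (ac + br) & mask            # AC <- AC + BR
        # arithmetic shift right of AC, QR and Q(n+1); the sign bit of AC is kept
        q_extra = qr & 1
        qr = (qr >> 1) | ((ac & 1) << (n - 1))
        ac = (ac >> 1) | (ac & sign_bit)
    return (ac << n) | qr


def to_signed(pattern, bits):
    """Interpret a bit pattern as a signed 2's complement integer."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


product = booth_multiply(0b10111, 0b10011, 5)
print(format(product, "010b"), to_signed(product, 10))
```

Here 10111 is -9 and 10011 is -13, so the 10-bit result 0001110101 is +117.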
Array Multiplier
The multiplication algorithms above check the bits of the multiplier one at a time and form partial products.
This is a sequential process that requires a sequence of add and shift micro operations, which is
time consuming. The multiplication of two binary numbers can also be done with
one micro operation by using a combinational circuit that provides the product all at once.

Example.
Consider that the multiplicand bits are b1 and b0 and the multiplier bits are a1 and a0. The
product is c3c2c1c0. The multiplication of two bits, such as a0 and b0, produces a binary 1 if both bits
are 1; otherwise it produces a binary 0. This is identical to the AND operation and can be implemented with
AND gates, as shown in the figure. The partial products are then added with two half adders.

2-bit by 2-bit array multiplier
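The gate-level structure can be checked with a few lines of Python. This is a sketch of the usual 2-bit by 2-bit arrangement (four AND gates and two half adders); the function names are assumptions made for illustration.

```python
def half_adder(x, y):
    """Return (sum, carry) of two bits."""
    return x ^ y, x & y


def array_multiply_2x2(b1, b0, a1, a0):
    """Combinational 2-bit by 2-bit multiplier; returns the product bits (c3, c2, c1, c0)."""
    c0 = a0 & b0                                   # first partial product, bit 0
    c1, carry1 = half_adder(a0 & b1, a1 & b0)      # add the two bits of weight 2
    c2, c3 = half_adder(a1 & b1, carry1)           # add the bit of weight 4 and the carry
    return c3, c2, c1, c0


print(array_multiply_2x2(1, 1, 1, 1))   # 3 x 3
```

All four product bits are available after the gate delays alone; no clocked add or shift steps are involved.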

Division Algorithm
The division of two fixed-point signed numbers can be done by a process of successive compare, shift and
subtract operations. When it is implemented in digital computers, instead of shifting the divisor to the right,
the dividend or the partial remainder is shifted to the left. The subtraction can be obtained by adding the
number A to the 2's complement of number B. The information about the relative magnitudes of the numbers is
then obtained from the end carry.
Hardware Implementation

Division Algorithm
The divisor is stored in register B and a double-length dividend is stored in registers A and Q. The dividend
is shifted to the left and the divisor is subtracted by adding its 2's complement value. If E = 1, then A >=
B. In this case, a quotient bit 1 is inserted into Qn and the partial remainder is shifted to the left to repeat
the process. If E = 0, then A < B. In this case, the quotient bit Qn remains 0 and the value of B is added to
restore the partial remainder in A to its previous value. The partial remainder is then shifted to the left and
the process continues until the sequence counter reaches 0. Registers E, A & Q are shifted to the left
with 0 inserted into Qn, and the previous value of E is lost, as shown in the flowchart for the division algorithm.

flowchart for division algorithm


This algorithm can be explained with the help of an example. Consider that the divisor is 10001 and the
dividend is 01110
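The restoring procedure can be traced with the Python sketch below. The example above gives only five dividend bits, so the sketch assumes they are the high half in A and that Q starts at 00000; that assumption, like the function name, is made for illustration only.

```python
def restoring_divide(a, q, b, n):
    """Divide the 2n-bit dividend held in A and Q by the n-bit divisor B.

    Returns (quotient, remainder). Raises OverflowError on divide overflow.
    """
    if b == 0 or a >= b:
        raise OverflowError("divide overflow: the quotient does not fit in n bits")
    mask = (1 << n) - 1
    for _ in range(n):
        a = (a << 1) | (q >> (n - 1))    # shl EAQ: the MSB of Q moves into A
        q = (q << 1) & mask              # a 0 is inserted into Qn
        a -= b                           # EA <- A + B' + 1
        if a >= 0:
            q |= 1                       # E = 1, so A >= B: quotient bit is 1
        else:
            a += b                       # E = 0, so A < B: restore the partial remainder
    return q, a


quotient, remainder = restoring_divide(0b01110, 0b00000, 0b10001, 5)
print(format(quotient, "05b"), format(remainder, "05b"))
```

Under that assumption the dividend is 0111000000 (448), the divisor is 17, and the sketch leaves the quotient 11010 (26) in Q and the remainder 00110 (6) in A.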
Restoring method
The method described above is the restoring method, in which the partial remainder is restored by adding the
divisor to the negative difference. Other methods:
Comparison method: A and B are compared prior to subtraction. If A >= B, B is subtracted from A; if
A < B, nothing is done. The partial remainder is then shifted left and the numbers are compared again.
The comparison can be made by inspecting the end carry out of the parallel adder before it is transferred to E.
Non-restoring method: In contrast to the restoring method, when A - B is negative, B is not added to restore A;
instead, the negative difference is shifted left and then B is added. Why does this work?
 In the flowchart for the restoring method, when A < B, we restore A by the operation A - B + B. The next
time around the loop, this number is shifted left (multiplied by 2) and B is subtracted again, which
gives 2 (A - B + B) - B = 2 A - B.
 In the non-restoring method, we leave A - B as it is. The next time around the loop, the number is shifted
left and B is added: 2 (A - B) + B = 2 A - B (same as above).
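A minimal Python sketch of the non-restoring idea follows. A is allowed to go negative as an ordinary Python integer instead of modelling the E flip-flop, and a single corrective addition is made at the end if the last remainder is negative. The function name and this integer model are assumptions made for illustration.

```python
def non_restoring_divide(a, q, b, n):
    """Non-restoring division of the 2n-bit dividend in A and Q by B; returns (quotient, remainder)."""
    if b == 0 or a >= b:
        raise OverflowError("divide overflow: the quotient does not fit in n bits")
    mask = (1 << n) - 1
    for _ in range(n):
        was_negative = a < 0
        a = 2 * a + (q >> (n - 1))      # shift A and Q left as one register
        q = (q << 1) & mask
        if was_negative:
            a += b                      # previous difference was negative: add B
        else:
            a -= b                      # otherwise subtract B as usual
        if a >= 0:
            q |= 1                      # non-negative result: quotient bit is 1
    if a < 0:
        a += b                          # final correction of the remainder
    return q, a


print(non_restoring_divide(0b01110, 0b00000, 0b10001, 5))
```

Each iteration performs exactly one addition or subtraction, whereas the restoring method may need two.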
Divide Overflow
The division algorithm may produce a quotient overflow, called divide overflow. The overflow occurs
if the number of bits in the quotient is more than the storage capacity of the register. The overflow flip-flop
DVF is set to 1 if the overflow occurs.
A divide overflow occurs if the value of the most significant half of the dividend is equal to or
greater than the value of the divisor. Similarly, the overflow occurs if the dividend is divided by 0.
The overflow may cause an error in the result, or it may stop the operation. When the overflow
stops the operation of the system, it is called a divide stop.
Arithmetic Operations on Floating-Point Numbers
The rules given here apply to the single-precision IEEE standard format. They specify only the
major steps needed to perform the four operations. Intermediate results for both mantissas and
exponents might require more than 24 and 8 bits, respectively, and overflow or underflow may occur.
These and other aspects of the operations must be carefully considered in designing an arithmetic unit that
meets the standard. If their exponents differ, the mantissas of floating-point numbers must be shifted
with respect to each other before they are added or subtracted. Consider a
decimal example in which we wish to add 2.9400 x 10^2 to 4.3100 x 10^4. We rewrite 2.9400 x 10^2
as 0.0294 x 10^4 and then perform addition of the mantissas to get 4.3394 x 10^4. The rule for
addition and subtraction can be stated as follows.

Add/Subtract Rule

The steps in addition (FA) or subtraction (FS) of floating-point numbers (s1, e1, f1) and (s2, e2, f2) are as
follows.

1. Unpack sign, exponent, and fraction fields. Handle special operands such as zero, infinity, or
NaN (not a number).
2. Shift the significand of the number with the smaller exponent right by |e1 - e2| bits.
3. Set the result exponent er to max(e1, e2).
4. If the instruction is FA and s1 = s2 or if the instruction is FS and s1 ≠ s2 then add the
significands; otherwise subtract them.
5. Count the number z of leading zeros. A carry can make z = -1. Shift the result significand left z
bits or right 1 bit if z = -1.
6. Round the result significand, and shift right and adjust z if there is rounding overflow, which is
a carry-out of the leftmost digit upon rounding.
7. Adjust the result exponent by er = er - z, check for overflow or underflow, and pack the result
sign, biased exponent, and fraction bits into the result word.
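The main steps can be sketched in Python with a 24-bit integer significand. This is a simplified illustration, not an IEEE-conforming adder: bits shifted out are truncated instead of rounded (step 6 is skipped), infinities and NaNs are not handled, exponents are unbiased and unbounded, and the helper names `unpack`, `pack` and `fp_add` are made up for this note.

```python
import math

P = 24  # significand width in bits, including the leading 1 (single precision)


def unpack(x):
    """Step 1: split a float into (sign, exponent, P-bit integer significand)."""
    if x == 0:
        return 0, 0, 0
    frac, exp = math.frexp(abs(x))               # abs(x) = frac * 2**exp, 0.5 <= frac < 1
    return (1 if x < 0 else 0), exp - 1, int(frac * (1 << P))


def pack(sign, exp, sig):
    """Rebuild a float from sign, unbiased exponent and integer significand."""
    return (-1.0 if sign else 1.0) * sig * 2.0 ** (exp - (P - 1))


def fp_add(x, y, subtract=False):
    s1, e1, m1 = unpack(x)
    s2, e2, m2 = unpack(y)
    if subtract:
        s2 ^= 1                                  # FS is FA with the second sign inverted
    if m1 == 0:
        return pack(s2, e2, m2)
    if m2 == 0:
        return pack(s1, e1, m1)
    if e1 < e2:                                  # keep the larger exponent in operand 1
        s1, e1, m1, s2, e2, m2 = s2, e2, m2, s1, e1, m1
    m2 >>= e1 - e2                               # step 2: align the smaller operand
    er = e1                                      # step 3: result exponent
    if s1 == s2:                                 # step 4: add or subtract significands
        m, sr = m1 + m2, s1
    else:
        m, sr = m1 - m2, s1
        if m < 0:
            m, sr = -m, s2
    if m == 0:
        return 0.0
    z = P - m.bit_length()                       # step 5: leading zeros, -1 after a carry
    m = m >> 1 if z < 0 else m << z
    er -= z                                      # step 7: adjust the exponent
    return pack(sr, er, m)


print(fp_add(294.0, 43100.0), fp_add(5.0, 3.0, subtract=True))
```

Adding 294.0 and 43100.0 requires a 7-bit alignment shift of the smaller operand, and subtracting 3.0 from 5.0 leaves one leading zero that the normalization step removes.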

Multiplication and division are somewhat easier than addition and subtraction, in that no alignment
of mantissas is needed.
BCD Adder:
A BCD adder is a 4-bit binary adder that is capable of adding two 4-bit words in BCD (binary-coded
decimal) format. The result of the addition is a 4-bit BCD output word, representing the
decimal sum of the addend and augend, and a carry that is generated if this sum exceeds a decimal
value of 9. Decimal addition is thus possible using these devices.
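One common way to build such an adder is to add the digits in binary and then add 6 (0110) whenever the sum exceeds 9. The Python sketch below assumes that correction scheme; the function names are made up for illustration.

```python
def bcd_digit_add(a, b, carry_in=0):
    """Add two BCD digits; return (sum_digit, carry_out)."""
    if not (0 <= a <= 9 and 0 <= b <= 9):
        raise ValueError("inputs must be valid BCD digits (0-9)")
    total = a + b + carry_in            # plain binary addition, range 0..19
    if total > 9:
        total += 6                      # add 0110 to skip the six unused 4-bit codes
        return total & 0xF, 1
    return total, 0


def bcd_add(x_digits, y_digits):
    """Add two equal-length lists of BCD digits, most significant digit first."""
    result, carry = [], 0
    for a, b in zip(reversed(x_digits), reversed(y_digits)):
        digit, carry = bcd_digit_add(a, b, carry)
        result.append(digit)
    return carry, result[::-1]


print(bcd_digit_add(7, 8))
print(bcd_add([2, 7], [5, 8]))
```

Cascading one such stage per decimal digit, with each carry feeding the next stage, gives a multi-digit decimal adder.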


Input-output subsystems
The input/output organization of a computer depends upon the size of the computer and the peripherals connected
to it. The I/O subsystem of the computer provides an efficient mode of communication between the central
system and the outside environment.
The most common input-output devices are: monitor, keyboard, mouse, printer and magnetic tape.
The input-output interface provides a method for transferring information between internal storage and external
I/O devices. Peripherals connected to a computer need special communication links for interfacing them with
the central processing unit. The purpose of the communication link is to resolve the differences that exist
between the central computer and each peripheral.

I/O device interface


The I/O bus consists of data lines, address lines and control lines. The I/O bus from the processor is attached to
all peripheral interfaces. To communicate with a particular device, the processor places a device address on
the address lines. Each interface decodes the address and control received from the I/O bus, interprets them for
the peripheral and provides signals for the peripheral controller. It also synchronizes the data flow and
supervises the transfer between peripheral and processor. Each peripheral has its own controller.
For example, the printer controller controls the paper motion and the print timing. The information on the
control lines is referred to as an I/O command. The commands are as follows:
Control command- A control command is issued to activate the peripheral and to inform it what to do.
Status command- A status command is used to test various status conditions in the interface and the peripheral.
Data Output command- A data output command causes the interface to respond by transferring data from the
bus into one of its registers.
Data Input command- The data input command is the opposite of the data output command. In this case the
interface receives an item of data from the peripheral and places it in its buffer register.

I/O Versus Memory Bus

In addition to communicating with I/O, the processor must communicate with the memory unit. Like the I/O bus,
the memory bus contains data, address and read/write control lines. There are 3 ways that computer buses can
be used to communicate with memory and I/O:
1. Use two Separate buses, one for memory and other for I/O.
2. Use one common bus for both memory and I/O but separate control lines for each.
3. Use one common bus for memory and I/O with common control lines.

Asynchronous Data Transfer.

Asynchronous data transfer between two independent units requires that control signals be transmitted
between the communicating units to indicate the time at which data is being sent. Two methods
can achieve this:

o Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the transfer has to
occur.
o Handshaking: Each data item being transferred is accompanied by a control signal that indicates the
presence of data on the bus. The unit receiving the data item responds with another signal to
acknowledge receipt of the data.

The strobe pulse and handshaking methods of asynchronous data transfer are not restricted to I/O transfers.
They are used extensively wherever data must be transferred between two independent
units. So, here we refer to the transmitting unit as the source and the receiving unit as the destination.

Asynchronous Data Transfer Methods


1. Strobe Control Method

The strobe control method of asynchronous data transfer employs a single control line to time each transfer.
This control line is known as a strobe, and it may be activated by either the source or the destination,
depending on which unit initiates the transfer.

Source initiated strobe: In the block diagram below, the strobe is initiated by the source. As shown in the
timing diagram, the source unit first places the data on the data bus and then activates the strobe to tell the
destination that valid data is available.

Destination initiated strobe: In the block diagram below, the strobe is initiated by the destination. As shown in
the timing diagram, the destination unit first activates the strobe pulse, informing the source to provide the data.
The source unit responds by placing the requested binary information on the data bus. The data must be valid and
remain on the bus long enough for the destination unit to accept it.
The falling edge of the strobe pulse can again be used to trigger a destination register. The destination unit then
disables the strobe. Finally, the source removes the data from the data bus after a predetermined time interval.
In this case, the strobe may be a memory-read control signal from the CPU to a memory unit. The CPU initiates the
read operation to inform the memory, which is the source unit, to place the selected word on the data bus.

Modes of Transfer

Programmed I/O Mode:


In this mode, data transfers are the result of I/O instructions written in the computer
program. Each data transfer is initiated by an instruction in the program. Normally
the transfer is between a CPU register and a peripheral device. Once a transfer has been
initiated, the CPU must monitor the interface to see when the next transfer can be made. The
instructions of the program keep close tabs on everything that takes place in the interface unit
and the I/O device.

The transfer of data requires three instructions:

1. Read the status register.
2. Check the status of the flag bit and branch to step 1 if not set or to step 3 if set.
3. Read the data register.

In this technique the CPU is responsible for fetching data from memory for output and for storing input data
in memory, as shown in the figure for Programmed I/O.
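The three-instruction loop can be simulated in Python to show how many status reads are spent waiting. The `Interface` class below is a hypothetical stand-in for a device interface that needs a fixed number of polls before each byte is ready; its names and the `delay` parameter are assumptions made for illustration.

```python
class Interface:
    """Hypothetical interface with a status flag and a data register."""

    def __init__(self, incoming, delay=3):
        self._incoming = list(incoming)
        self._delay = delay
        self._countdown = delay
        self.data_register = None

    def read_status(self):
        """Return 1 when a byte is ready; the device needs `delay` polls per byte."""
        if self.data_register is None and self._incoming:
            self._countdown -= 1
            if self._countdown <= 0:
                self.data_register = self._incoming.pop(0)
                self._countdown = self._delay
        return 1 if self.data_register is not None else 0

    def read_data(self):
        """Return the byte in the data register; reading it clears the flag."""
        value, self.data_register = self.data_register, None
        return value


def programmed_input(interface, count):
    """Read `count` bytes by polling; return (bytes_read, number_of_status_reads)."""
    received, polls = [], 0
    while len(received) < count:
        polls += 1
        if interface.read_status() == 0:          # steps 1 and 2: flag not set, poll again
            continue
        received.append(interface.read_data())    # step 3: read the data register
    return received, polls


data, polls = programmed_input(Interface([0x48, 0x49], delay=3), 2)
print(data, polls)
```

In this run two bytes cost six status reads; every poll that finds the flag clear is CPU time spent doing no useful work.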

Drawback of the Programmed I/O:


The main drawback of programmed I/O is that the CPU has to monitor the I/O
units all the time the program is executing. The CPU stays in a program loop
until the I/O unit indicates that it is ready for data transfer. This is a time-consuming process
that wastes CPU time.
Interrupt-Initiated I/O:
In this method the interrupt facility and special commands are used to let the interface inform
the CPU when the device is ready, instead of having the CPU monitor the flag. In the meantime the
CPU executes another program. When the interface determines that the device is ready for data
transfer, it generates an interrupt request and sends it to the computer.
When the CPU receives such a signal, it temporarily stops the execution of the current program,
branches to a service program to process the I/O transfer and, after completing it, returns
to the task it was originally performing.
In this type of I/O the computer does not check the flag. It continues to perform its task.
Whenever a device wants attention, it sends the interrupt signal to the CPU. The CPU then
deviates from what it was doing, stores the return address from the PC and branches to the
address of the service routine.
There are two ways of choosing the branch address:

Vectored Interrupt: In a vectored interrupt, the source that interrupts the CPU
provides the branch information. This information is called the interrupt
vector.
Non-vectored Interrupt: In a non-vectored interrupt, the branch address is
assigned to a fixed location in memory.

Direct Memory Access (DMA):


In Direct Memory Access (DMA) the interface transfers data into and out of the memory unit
through the memory bus. The transfer of data between a fast storage device such as a magnetic disk
and memory is often limited by the speed of the CPU. Removing the CPU from the path and
letting the peripheral device manage the memory buses directly improves the speed of
transfer. This transfer technique is called Direct Memory Access (DMA). During a DMA transfer,
the CPU is idle and has no control of the memory buses. A DMA controller takes over the buses
to manage the transfer directly between the I/O device and memory.

The CPU may be placed in an idle state in a variety of ways. One common method, extensively used in
microprocessors, is to disable the buses through special control signals:
Bus Request (BR)
Bus Grant (BG)
These two control signals in the CPU facilitate the DMA transfer. The Bus Request (BR) input
is used by the DMA controller to request that the CPU relinquish control of the buses. When this input is active,
the CPU terminates the execution of the current instruction and places the address bus, data bus and read/write
lines into a high-impedance state. A high-impedance state means that the output is disconnected.
The CPU then activates the Bus Grant (BG) output to inform the external DMA controller that it
can now take control of the buses to conduct memory transfers without the processor.
When the DMA controller terminates the transfer, it disables the Bus Request (BR) line. The CPU then disables
the Bus Grant (BG), takes control of the buses and returns to its normal operation.

The transfer can be made in several ways:

DMA Burst
Cycle Stealing
DMA Burst: In a DMA burst transfer, a block sequence consisting of a number of memory words is transferred
in a continuous burst while the DMA controller is master of the memory buses.
Cycle Stealing: Cycle stealing allows the DMA controller to transfer one data word at a time, after which
it must return control of the buses to the CPU.

DMA Controller:
The DMA controller needs the usual circuits of an interface to communicate with the CPU and the I/O
device. In addition, the DMA controller has three registers:
Address Register
Word Count Register
Control Register
Address Register: The address register contains an address to specify the desired location in memory. It is
incremented after each word is transferred.
Word Count Register: The word count (WC) register holds the number of words to be transferred. It is
decremented by one after each word transfer and internally tested for zero.
Control Register: The control register specifies the mode of transfer.

The unit communicates with the CPU via the data bus and control lines. The registers in the DMA are selected
by the CPU through the address bus by enabling the DS (DMA select) and RS (Register select) inputs. The RD
(read) and WR (write) inputs are bidirectional.
When the BG (Bus Grant) input is 0, the CPU can communicate with the DMA registers through the data bus
to read from or write to the DMA registers. When BG = 1, the DMA can communicate directly with the
memory by placing an address on the address bus and activating the RD or WR control.

DMA Transfer:
The CPU communicates with the DMA through the address and data buses as with any interface
unit. The DMA has its own address, which activates the DS and RS lines. The CPU initializes
the DMA through the data bus. Once the DMA receives the start control command, it can
begin transferring data between the peripheral and the memory. When BG = 0, the RD and WR lines
are inputs, allowing the CPU to communicate with the internal DMA registers. When BG = 1, RD and
WR are output lines from the DMA controller to the random access memory, specifying the read
or write operation for the data.
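
The register updates and the two transfer modes described above can be sketched in a few lines of Python. This is only an illustrative model, not the behaviour of any particular DMA chip: the class name DMAController, the function run_dma, and the use of Python lists for memory and for the peripheral are all assumptions made for the example.

```python
class DMAController:
    """Toy model of the three DMA registers (all names are hypothetical)."""

    def __init__(self, address, word_count, write_to_memory=True):
        self.address = address                  # address register
        self.word_count = word_count            # word count register
        self.write_to_memory = write_to_memory  # control register: direction

    def transfer_word(self, memory, device):
        """Move one word, then update the address and word count registers."""
        if self.write_to_memory:
            memory[self.address] = device.pop(0)   # peripheral -> memory
        else:
            device.append(memory[self.address])    # memory -> peripheral
        self.address += 1
        self.word_count -= 1
        return self.word_count == 0                # tested for zero


def run_dma(dma, memory, device, burst=True):
    """Return a log of which unit owns the buses in each bus cycle."""
    log = []
    done = dma.word_count == 0
    while not done:
        # BR = 1; the CPU answers with BG = 1 and floats its bus lines.
        log.append("DMA")
        done = dma.transfer_word(memory, device)
        if not burst and not done:
            # Cycle stealing: BR drops after each word, the CPU gets a cycle.
            log.append("CPU")
    return log


memory = [0] * 8
burst_log = run_dma(DMAController(2, 3), memory, [10, 20, 30], burst=True)
steal_log = run_dma(DMAController(5, 3), memory, [7, 8, 9], burst=False)
print(memory)     # [0, 0, 10, 20, 30, 7, 8, 9]
print(burst_log)  # ['DMA', 'DMA', 'DMA']
print(steal_log)  # ['DMA', 'CPU', 'DMA', 'CPU', 'DMA']
```

In burst mode the controller keeps the buses until the word count reaches zero, while in cycle stealing the CPU regains the buses between words.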

Intel 8089 Processor:


The Intel 8089 I/O processor is contained in a 40-pin integrated circuit package. Within the 8089 are two
independent units called channels. Each channel combines the general characteristics of a processor unit with
those of a direct memory access controller.
The 8089 is designed to function as an IOP in a microcomputer system where the Intel 8086 microprocessor is
used as the CPU. The 8086 CPU initiates an I/O operation by building a message in memory that describes the
function to be performed. The 8089 IOP reads the message from memory, carries out the operation, and notifies
the CPU when it has finished.
In contrast to the IBM 370 channel, which has only six basic I/O commands, the 8089 IOP has 50 basic
instructions that can operate on individual bits, on bytes, or on 16-bit words. The IOP can execute programs like a
CPU, except that the instruction set is specifically chosen to provide efficient input-output processing.
The instruction set includes general data transfer instructions, basic arithmetic and logic operations, conditional
and unconditional branch operations, and subroutine call and return capabilities. The set also includes special
instructions to initiate DMA transfers and issue an interrupt request to the CPU. It provides efficient data
transfer between any two components attached to the system bus, such as I/O to memory, memory to memory,
or I/O to I/O.
The 8086 functions as the CPU and the 8089 as the IOP. The two units share a common memory through a bus
controller connected to a system bus, which is called a "multibus" by Intel. The IOP uses a local bus to
communicate with various interface units connected to I/O devices. The CPU communicates with the IOP by
enabling the channel attention line. The select line is used by the CPU to select one of the two channels in the 8089.
The IOP gets the attention of the CPU by sending an interrupt request.
The CPU and IOP communicate with each other by writing messages for one another in system memory. The
CPU prepares the message area and signals the IOP by enabling the channel attention line. The IOP reads the
message, performs the required I/O functions, and executes the appropriate channel program. When the channel
has completed its program, it issues an interrupt request to the CPU.
UNIT-V
Memory Organization
Memory Hierarchy

A memory unit is an essential component in any digital computer since it is needed for storing
programs and data.

Typically, a memory unit can be classified into two categories:

1. The memory unit that establishes direct communication with the CPU is called Main
Memory. The main memory is often referred to as RAM (Random Access Memory).
2. The memory units that provide backup storage are called Auxiliary Memory. For
instance, magnetic disks and magnetic tapes are the most commonly used auxiliary
memories.

Apart from this basic classification, the memory hierarchy consists of all the
storage devices available in a computer system, ranging from the slow but high-capacity
auxiliary memory to the relatively faster main memory, and on to the even smaller and faster cache memory.

The following image illustrates the components in a typical memory hierarchy.

Auxiliary Memory:

Auxiliary memory is known as the lowest-cost, highest-capacity and slowest-access storage in
a computer system. Auxiliary memory provides storage for programs and data that are kept for
long-term storage or when not in immediate use. The most common examples of auxiliary
memories are magnetic tapes and magnetic disks.

A magnetic disk is a digital computer memory that uses a magnetization process to write,
rewrite and access data. Examples include hard drives, zip disks, and floppy disks.
Magnetic tape is a storage medium that allows for data archiving, collection, and backup for
different kinds of data.

Main Memory:

The main memory in a computer system is often referred to as Random Access Memory
(RAM). This memory unit communicates directly with the CPU and with auxiliary memory
devices through an I/O processor.

The programs that are not currently required in the main memory are transferred into auxiliary
memory to provide space for currently used programs and data.

I/O Processor:

The primary function of an I/O Processor is to manage the data transfers between auxiliary
memories and the main memory.

Cache Memory:

The contents of the main memory that are used frequently by the CPU are stored in the
cache memory so that the processor can access that data in a shorter time. Whenever the
CPU needs to access memory, it first checks the cache memory for the required data. If the
data is found in the cache memory, it is read from this fast memory. Otherwise, the CPU
accesses the main memory for the required data.

Main Memory
The main memory acts as the central storage unit in a computer system. It is a relatively large
and fast memory which is used to store programs and data during the run time operations.

The primary technology used for the main memory is based on semiconductor integrated
circuits. The integrated circuits for the main memory are classified into two major types:

1. RAM (Random Access Memory) integrated circuit chips
2. ROM (Read Only Memory) integrated circuit chips

RAM integrated circuit chips

RAM integrated circuit chips are available in two possible operating
modes, static and dynamic.

A static RAM consists essentially of flip-flops that store the binary information. The
stored information is volatile: it remains valid only as long as power is applied to
the system. The static RAM is easy to use and takes less time to perform read and write
operations than dynamic RAM.

The dynamic RAM stores the binary information in the form of electric charges that are
applied to capacitors. The capacitors are provided inside the chip by MOS transistors. The
dynamic RAM consumes less power and provides larger storage capacity in a single memory
chip.

RAM chips are available in a variety of sizes and are used as per the system requirement. The
following block diagram demonstrates the chip interconnection in a 128 * 8 RAM chip.

• A 128 * 8 RAM chip has a memory capacity of 128 words of eight bits (one byte) per
word. This requires a 7-bit address and an 8-bit bidirectional data bus.
• The 8-bit bidirectional data bus allows the transfer of data either from memory to CPU
during a read operation or from CPU to memory during a write operation.
• The read and write inputs specify the memory operation, and the two chip select (CS)
control inputs are for enabling the chip only when the microprocessor selects it.
• The bidirectional data bus is constructed using three-state buffers.
• The output generated by three-state buffers can be placed in one of three possible
states: a signal equivalent to logic 1, a signal equivalent to logic 0, or a
high-impedance state.

Note: The logic 1 and 0 are standard digital signals whereas the high-impedance state
behaves like an open circuit, which means that the output does not carry a signal and has
no logic significance.
The following function table specifies the operations of a 128 * 8 RAM chip.

From the function table, we can conclude that the unit is in operation only when CS1 = 1
and CS2 = 0. The bar on top of the second select variable indicates that this input is enabled
when it is equal to 0.
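
The chip-select condition can be expressed as a small Python function. This is a sketch under stated assumptions: the function name ram_chip_function and the result strings are made up for the example, and the handling of the cases where RD and WR are both 1 or both 0 is an assumption, since the text above only states when the chip is enabled.

```python
def ram_chip_function(cs1, cs2_bar, rd, wr):
    """Return (memory function, state of data bus) for a 128 x 8 RAM chip."""
    if not (cs1 == 1 and cs2_bar == 0):
        return ("inhibit", "high-impedance")      # chip not selected
    if rd == 1:
        return ("read", "output data from RAM")   # assumed: RD wins over WR
    if wr == 1:
        return ("write", "input data to RAM")
    return ("inhibit", "high-impedance")          # selected, but no operation


for inputs in [(0, 0, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 1, 0)]:
    print(inputs, ram_chip_function(*inputs))
```

Only the two middle input combinations select the chip (CS1 = 1, CS2 = 0), so only they produce a write or a read; in every other case the data bus stays in the high-impedance state.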

ROM integrated circuit

The primary component of the main memory is RAM integrated circuit chips, but a portion of
memory may be constructed with ROM chips.

A ROM memory is used for keeping programs and data that are permanently resident in the
computer.

Apart from the permanent storage of data, the ROM portion of main memory is needed for
storing an initial program called a bootstrap loader. The primary function of the bootstrap
loader program is to start the computer software operating when power is turned on.

ROM chips are also available in a variety of sizes and are also used as per the system
requirement. The following block diagram demonstrates the chip interconnection in a 512 * 8
ROM chip.
• A ROM chip is organized similarly to a RAM chip. However, a ROM can only
perform read operations; the data bus can only operate in an output mode.
• The 9-bit address lines in the ROM chip specify any one of the 512 bytes stored in it.
• The values of chip select 1 and chip select 2 must be 1 and 0 for the unit to operate.
Otherwise, the data bus is in a high-impedance state.

Auxiliary Memory
An Auxiliary memory is known as the lowest-cost, highest-capacity and slowest-access storage
in a computer system. It is where programs and data are kept for long-term storage or when not
in immediate use. The most common examples of auxiliary memories are magnetic tapes and
magnetic disks.

Magnetic Disks

A magnetic disk is a type of memory constructed using a circular plate of metal or plastic
coated with magnetizable material. Usually, both sides of the disk are used to carry out
read/write operations. Several disks may also be stacked on one spindle, with a read/write
head available on each surface.

The following image shows the structural representation for a magnetic disk.

• The memory bits are stored in the magnetized surface in spots along concentric
circles called tracks.
• The concentric circles (tracks) are commonly divided into sections called sectors.
Magnetic Tape

Magnetic tape is a storage medium that allows data archiving, collection, and backup for
different kinds of data. The magnetic tape is constructed using a plastic strip coated with a
magnetic recording medium.

The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or nine
bits are recorded simultaneously to form a character together with a parity bit.

Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound.
However, they cannot be started or stopped fast enough between individual characters. For this
reason, information is recorded in blocks referred to as records.

Associative Memory

An associative memory can be considered as a memory unit whose stored data can be identified
for access by the content of the data itself rather than by an address or memory location.

Associative memory is often referred to as Content Addressable Memory (CAM).

When a write operation is performed on associative memory, no address or memory location
is given to the word. The memory itself is capable of finding an empty unused location to store
the word.
On the other hand, when the word is to be read from an associative memory, the content of the
word, or part of the word, is specified. The words which match the specified content are located
by the memory and are marked for reading.

The following diagram shows the block representation of an Associative memory.

From the block diagram, we can say that an associative memory consists of a memory array
and logic for 'm' words with 'n' bits per word.

The functional registers like the argument register A and key register K each have n bits, one
for each bit of a word. The match register M consists of m bits, one for each memory word.

The words which are kept in the memory are compared in parallel with the content of the
argument register.

The key register (K) provides a mask for choosing a particular field or key in the argument
word. If the key register contains a binary value of all 1's, then the entire argument is compared
with each memory word. Otherwise, only those bits in the argument that have 1's in their
corresponding position of the key register are compared. Thus, the key provides a mask for
identifying a piece of information which specifies how the reference to memory is made.

The following diagram can represent the relation between the memory array and the external
registers in an associative memory.
The cells present inside the memory array are marked by the letter C with two subscripts. The
first subscript gives the word number and the second specifies the bit position in the word. For
instance, the cell Cij is the cell for bit j in word i.

A bit Aj in the argument register is compared with all the bits in column j of the array provided
that Kj = 1. This process is done for all columns j = 1, 2, ..., n.

If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the
argument and the word do not match, Mi is cleared to 0.
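
The match logic above can be sketched in Python using integers as n-bit words. The function name match_register and the sample words are assumptions chosen for illustration; a real associative memory compares all words in parallel in hardware, whereas this sketch simply loops over them.

```python
def match_register(words, argument, key, n):
    """Return M, one bit per word: 1 if all unmasked bits equal the argument."""
    mask = key & ((1 << n) - 1)
    # XOR marks the differing bits; AND with the key keeps only unmasked ones.
    return [1 if ((word ^ argument) & mask) == 0 else 0 for word in words]


words = [0b100111100, 0b101000001, 0b101111100]   # m = 3 words, n = 9 bits
A = 0b101111100                                   # argument register
K = 0b111000000                                   # key: compare leftmost 3 bits

print(match_register(words, A, K, n=9))            # [0, 1, 1]
print(match_register(words, A, 0b111111111, n=9))  # [0, 0, 1]
```

With the key set to 111000000 only the three leftmost bits are compared, so two words match; with a key of all 1's the entire argument is compared and only the identical word matches.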

Cache Memory
The contents of the main memory that are used frequently by the CPU are stored in the
cache memory so that the processor can access that data in a shorter time. Whenever the
CPU needs to access memory, it first checks the cache memory. If the data is not found in
cache memory, then the CPU accesses the main memory.

Cache memory is placed between the CPU and the main memory. The block diagram for a
cache memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the speed of CPU
components.

The basic operation of a cache memory is as follows:

• When the CPU needs to access memory, the cache is examined. If the word is found in
the cache, it is read from the fast memory.
• If the word addressed by the CPU is not found in the cache, the main memory is
accessed to read the word.
• A block of words containing the one just accessed is then transferred from main memory to cache
memory. The block size may vary from one word (the one just accessed) to about 16
words adjacent to the one just accessed.
• The performance of the cache memory is frequently measured in terms of a quantity
called hit ratio.
• When the CPU refers to memory and finds the word in cache, it is said to produce a hit.
• If the word is not found in the cache, it is in main memory and it counts as a miss.
• The ratio of the number of hits divided by the total CPU references to memory (hits
plus misses) is the hit ratio.
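
The hit ratio can be illustrated with a small Python simulation. This is a minimal sketch, not a model of any real cache: the function name simulate_cache, the block size of 4 words, the capacity of 2 blocks, and the first-in, first-out replacement of blocks are all assumptions made for the example.

```python
from collections import OrderedDict


def simulate_cache(addresses, block_size=4, num_blocks=2):
    """Return (hits, misses) for a sequence of word addresses."""
    cache = OrderedDict()          # block number -> True, oldest block first
    hits = misses = 0
    for address in addresses:
        block = address // block_size
        if block in cache:
            hits += 1              # word found in the cache
        else:
            misses += 1            # word read from main memory
            if len(cache) == num_blocks:
                cache.popitem(last=False)   # assumed FIFO replacement
            cache[block] = True    # bring the whole block into the cache
    return hits, misses


hits, misses = simulate_cache([0, 1, 2, 3, 4, 5, 0, 1, 8, 9, 4, 0])
hit_ratio = hits / (hits + misses)
print(hits, misses)          # 8 4
print(round(hit_ratio, 3))   # 0.667
```

Because a whole block is brought in on each miss, references to neighbouring words (addresses 1, 2, 3 after address 0) become hits.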

Pipeline and Vector Processing


Parallel Processing
Parallel processing can be described as a class of techniques that enable a system to
carry out simultaneous data-processing tasks in order to increase the computational speed of a computer
system.

A parallel processing system can carry out simultaneous data-processing to achieve faster
execution time. For instance, while an instruction is being processed in the ALU component of
the CPU, the next instruction can be read from memory.

The primary purpose of parallel processing is to enhance the computer processing capability
and increase its throughput, i.e. the amount of processing that can be accomplished during a
given interval of time.
A parallel processing system can be achieved by having a multiplicity of functional units that
perform identical or different operations simultaneously. The data can be distributed among
the multiple functional units.

The following diagram shows one possible way of separating the execution unit into eight
functional units operating in parallel.

The operation performed in each functional unit is indicated in each block of the diagram:

• The adder and integer multiplier perform arithmetic operations on integer
numbers.
• The floating-point operations are separated into three circuits operating in parallel.
• The logic, shift, and increment operations can be performed concurrently on different
data. All units are independent of each other, so one number can be shifted while
another number is being incremented.
Pipelining

The term Pipelining refers to a technique of decomposing a sequential process into
sub-operations, with each sub-operation being executed in a dedicated segment that operates
concurrently with all other segments.

The most important characteristic of a pipeline technique is that several computations can be
in progress in distinct segments at the same time. The overlapping of computation is made
possible by associating a register with each segment in the pipeline. The registers provide
isolation between each segment so that each can operate on distinct data simultaneously.

The structure of a pipeline organization can be represented simply by including an input
register for each segment followed by a combinational circuit.

Let us consider an example of combined multiplication and addition operation to get a better
understanding of the pipeline organization.

The combined multiplication and addition operation is done with a stream of numbers such as:

Ai * Bi + Ci for i = 1, 2, 3, ..., 7

The operation to be performed on the numbers is decomposed into sub-operations with each
sub-operation to be implemented in a segment within a pipeline.

The sub-operations performed in each segment of the pipeline are defined as:

R1 ← Ai, R2 ← Bi             Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci        Multiply and input Ci
R5 ← R3 + R4                 Add Ci to product

The following block diagram represents the combined as well as the sub-operations performed
in each segment of the pipeline.
Registers R1 through R5 hold the data, and the combinational circuits operate in a particular
segment.

The output generated by the combinational circuit in a given segment is applied to the input
register of the next segment. For instance, from the block diagram, we can see that the register
R3 is used as one of the input registers for the combinational adder circuit.
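
The clock-by-clock behaviour of this three-segment pipeline can be sketched in Python. The function name multiply_add_pipeline and the sample data are assumptions for illustration; the register names R1 to R5 follow the sub-operations listed above, and None stands for a register that holds no valid data yet.

```python
def multiply_add_pipeline(A, B, C):
    """Simulate the three-segment pipeline; return (results, clock pulses)."""
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None
    results = []
    clock = 0
    while len(results) < n:
        # All registers load on the same clock pulse, so compute every new
        # value from the old register contents before assigning any of them.
        new_R5 = R3 + R4 if R3 is not None else None     # segment 3: add
        new_R3 = R1 * R2 if R1 is not None else None     # segment 2: multiply
        new_R4 = C[clock - 1] if 1 <= clock <= n else None
        new_R1 = A[clock] if clock < n else None         # segment 1: input
        new_R2 = B[clock] if clock < n else None
        R1, R2, R3, R4, R5 = new_R1, new_R2, new_R3, new_R4, new_R5
        clock += 1
        if R5 is not None:
            results.append(R5)
    return results, clock


A = [1, 2, 3, 4, 5, 6, 7]
B = [2] * 7
C = [10] * 7
results, pulses = multiply_add_pipeline(A, B, C)
print(results)  # [12, 14, 16, 18, 20, 22, 24]
print(pulses)   # 9
```

The first result needs three clock pulses to pass through the three segments; after that, one result emerges on every pulse, so seven results take 7 + 2 = 9 pulses in this sketch.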

In general, the pipeline organization is applicable to two areas of computer design:

1. Arithmetic Pipeline
2. Instruction Pipeline

Arithmetic Pipeline

Arithmetic Pipelines are mostly used in high-speed computers. They are used to implement
floating-point operations, multiplication of fixed-point numbers, and similar computations
encountered in scientific problems.
To understand the concepts of arithmetic pipeline in a more convenient way, let us consider an
example of a pipeline unit for floating-point addition and subtraction.

The inputs to the floating-point adder pipeline are two normalized floating-point binary
numbers defined as:

X = A * 2^a
Y = B * 2^b

Where A and B are two fractions that represent the mantissas and a and b are the exponents.

For simplicity, the numerical example below uses decimal numbers:

X = 0.9504 * 10^3
Y = 0.8200 * 10^2

The combined operation of floating-point addition and subtraction is divided into four
segments. Each segment contains the corresponding suboperation to be performed in the given
pipeline. The suboperations that are shown in the four segments are:

1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.

We will discuss each suboperation in a more detailed manner later in this section.

The following block diagram represents the suboperations performed in each segment of the
pipeline.
Note: Registers are placed after each suboperation to store the intermediate results.
1. Compare exponents by subtraction:

The exponents are compared by subtracting them to determine their difference. The larger
exponent is chosen as the exponent of the result.

The difference of the exponents, i.e., 3 - 2 = 1, determines how many places the mantissa
associated with the smaller exponent must be shifted to the right.

2. Align the mantissas:

The mantissa associated with the smaller exponent is shifted according to the difference of
exponents determined in segment one.

X = 0.9504 * 10^3
Y = 0.08200 * 10^3

3. Add mantissas:

The two mantissas are added in segment three.

Z = X + Y = 1.0324 * 10^3

4. Normalize the result:

After normalization, the result is written as:

Z = 0.10324 * 10^4
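
The four suboperations can be traced with a short Python sketch that works in decimal, like the worked example. The function name fp_add is an assumption, and Python's decimal module is used only to keep the decimal digits exact; a hardware adder would operate on binary mantissas and would run the four segments concurrently on different operand pairs.

```python
from decimal import Decimal


def fp_add(mant_x, exp_x, mant_y, exp_y):
    """Add two decimal floating-point numbers given as (fraction, exponent)."""
    # Segment 1: compare the exponents by subtraction.
    diff = exp_x - exp_y
    exp_z = max(exp_x, exp_y)
    # Segment 2: align the mantissa that has the smaller exponent.
    if diff > 0:
        mant_y = mant_y.scaleb(-diff)   # shift right by diff decimal places
    elif diff < 0:
        mant_x = mant_x.scaleb(diff)
    # Segment 3: add the mantissas.
    mant_z = mant_x + mant_y
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(mant_z) >= 1:
        mant_z = mant_z.scaleb(-1)
        exp_z += 1
    while mant_z != 0 and abs(mant_z) < Decimal("0.1"):
        mant_z = mant_z.scaleb(1)
        exp_z -= 1
    return mant_z, exp_z


mantissa, exponent = fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)
print(mantissa, exponent)
```

For the operands of the worked example this returns a mantissa equal to 0.10324 with exponent 4, matching the normalized result.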
Instruction Pipeline

Pipeline processing can occur not only in the data stream but in the instruction stream as well.

Most digital computers with complex instructions use an instruction pipeline to overlap
operations such as fetching, decoding, and executing instructions.

In general, the computer needs to process each instruction with the following sequence of steps.

1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.

Each step is executed in a particular segment, and there are times when different segments may
take different times to operate on the incoming information. Moreover, there are times when
two or more segments may require memory access at the same time, causing one segment to
wait until another is finished with the memory.

The organization of an instruction pipeline will be more efficient if the instruction cycle is
divided into segments of equal duration. One of the most common examples of this type of
organization is a Four-segment instruction pipeline.

A four-segment instruction pipeline combines two or more of these steps into a single
segment. For instance, the decoding of the instruction can be combined with the
calculation of the effective address into one segment.

The following block diagram shows a typical example of a four-segment instruction pipeline.
The instruction cycle is completed in four segments.

Segment 1:

The instruction fetch segment can be implemented using a first in, first out (FIFO) buffer.

Segment 2:

The instruction fetched from memory is decoded in the second segment, and the
effective address is then calculated in a separate arithmetic circuit.

Segment 3:

An operand from memory is fetched in the third segment.

Segment 4:

The instructions are finally executed in the last segment of the pipeline organization.
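
How instructions overlap in such a pipeline can be shown with a space-time diagram. The sketch below assumes an ideal pipeline with no memory conflicts or branches; the function name space_time and the segment labels FI (fetch instruction), DA (decode and calculate address), FO (fetch operand) and EX (execute) are labels chosen for this example.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time(num_instructions, segments=SEGMENTS):
    """Return rows of the space-time diagram, one row per instruction."""
    k = len(segments)
    total_cycles = k + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        # Instruction i occupies segment s during clock cycle i + s.
        row = ["--"] * total_cycles
        for s, name in enumerate(segments):
            row[i + s] = name
        rows.append(row)
    return rows


for i, row in enumerate(space_time(4), start=1):
    print(f"I{i}: " + " ".join(row))
# I1: FI DA FO EX -- -- --
# I2: -- FI DA FO EX -- --
# I3: -- -- FI DA FO EX --
# I4: -- -- -- FI DA FO EX
```

Under these assumptions, four instructions finish in 4 + 4 - 1 = 7 clock cycles instead of the 16 cycles needed if each instruction had to complete before the next one started.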
RISC Pipeline
RISC stands for Reduced Instruction Set Computer. One of its main goals is to execute
instructions at a rate of one per clock cycle. The simpler instruction set also helps to simplify
the design of the computer architecture.
RISC is a response to what is known as the semantic gap, that is, the difference between the operations
provided in high-level languages (HLLs) and those provided in computer architectures.
The conventional response of computer architects to this gap was to add
layers of complexity to newer architectures, increasing the number and complexity of
instructions together with the number of addressing modes. The architectures
that resulted from this “add more complexity” approach are known as Complex
Instruction Set Computers (CISC).
The goal of executing one instruction per clock cycle cannot always be met,
because not every instruction can be fetched from memory and
executed in one clock cycle under all circumstances.
The way to approach one instruction per clock cycle is to start a new
instruction with each clock cycle and to pipeline the processor to achieve the objective of
single-cycle instruction execution.
A RISC compiler translates the high-level language program into a machine
language program. Rather than handling data conflicts and branch penalties entirely in
hardware, RISC processors depend on the ability
of the compiler to detect and reduce the delays caused by these problems.
Principles of RISCs Pipeline
The main principles of a RISC pipeline are as follows:

• Keep the most frequently accessed operands in CPU registers.
• Minimize register-to-memory operations.
• Use a large number of registers to improve operand referencing and decrease
processor-memory traffic.
• Optimize the design of instruction pipelines so that minimum compiler code
generation can be achieved.
• Use a simplified instruction set and leave out complex and unnecessary
instructions.

Let us consider a three-segment instruction pipeline that shows how a compiler can optimize
the machine language program to compensate for pipeline conflicts.
The instructions of a typical RISC processor are of three types:

• Data Manipulation Instructions: These manage the data in processor registers.
• Data Transfer Instructions: These are load and store instructions that use an
effective address obtained by adding the contents of two registers or a register
and a displacement constant provided in the instruction.
• Program Control Instructions: These instructions use register values and a constant
to evaluate the branch address, which is transferred to a register or the program counter
(PC).
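
One way a compiler can compensate for a pipeline conflict is to insert a no-operation instruction after a load whose result is needed by the very next instruction. The sketch below is hypothetical: the tuple format (operation, destination, sources), the function name insert_load_delays, and the assumption that loaded data arrives exactly one cycle too late for the following instruction are all made up for illustration and do not describe a specific processor.

```python
def insert_load_delays(program):
    """Insert a NOP after a LOAD whose result the next instruction reads."""
    scheduled = []
    for index, (op, dest, sources) in enumerate(program):
        scheduled.append((op, dest, sources))
        if op == "LOAD" and index + 1 < len(program):
            next_sources = program[index + 1][2]
            if dest in next_sources:
                # Assumed one-cycle load delay: fill the slot with a no-op.
                scheduled.append(("NOP", None, ()))
    return scheduled


program = [
    ("LOAD", "R1", ("A",)),        # R1 <- M[A]
    ("LOAD", "R2", ("B",)),        # R2 <- M[B]
    ("ADD", "R3", ("R1", "R2")),   # R3 <- R1 + R2
    ("STORE", "C", ("R3",)),       # M[C] <- R3
]
scheduled = insert_load_delays(program)
print([op for op, _, _ in scheduled])
# ['LOAD', 'LOAD', 'NOP', 'ADD', 'STORE']
```

A smarter compiler would try to move a useful, independent instruction into the delay slot instead of a no-op; the sketch only shows the simplest form of the idea.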

Vector Processor

A vector processor is a central processing unit that can operate on a
complete vector of input data with a single instruction.
The elements of a vector are stored at successive memory addresses so that
the data can be accessed sequentially.
The processor has a single control unit but multiple execution units that perform the same operation
on different data elements of the vector.

Unlike scalar processors, which operate on only a single pair of data, a vector processor operates
on multiple pairs of data. Scalar code can be converted into vector code; this
conversion process is known as vectorization.

Instructions that operate on vectors in this way are called single instruction, multiple data (SIMD)
or vector instructions.

Architecture:

The functional units of a vector computer are as follows:

• IPU or instruction processing unit
• Vector register
• Scalar register
• Scalar processor
• Vector instruction controller
• Vector access controller
• Vector processor

Because it has several functional pipelines, the vector processor can execute instructions over
the operands in parallel. Both data and instructions are present in memory at the desired memory
locations, so the instruction processing unit (IPU) first fetches the instruction from memory.

Once the instruction is fetched, the IPU determines whether it is scalar or
vector in nature. If it is scalar, the instruction is transferred to the scalar register
and scalar processing is then performed.

If the instruction is a vector instruction, it is fed to the vector instruction
controller. This controller first decodes the vector instruction and then
determines the address of the vector operand in memory.

It then signals the vector access controller that the respective
operand is required. The vector access controller fetches the desired operand from memory. Once
the operand is fetched, it is provided to the instruction register so that it can be processed
by the vector processor.

When multiple vector instructions are present, the vector instruction controller
passes them to the task system. If the task system shows
that a vector task is very long, the processor divides the task into subvectors.

These subvectors are fed to the vector processor, which uses several pipelines to
execute the instruction over the operands fetched from memory at the same time. The various
vector instructions are scheduled by the vector instruction controller.
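
The division of a long vector task into subvectors can be sketched as follows. This is a plain-Python illustration, not a description of real vector hardware: the function name vector_add and the maximum vector length of 4 elements are assumptions, and the inner loop stands in for what the hardware pipelines would do at the same time.

```python
def vector_add(a, b, max_vector_length=4):
    """Add two equal-length vectors one subvector at a time."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    result = []
    for start in range(0, len(a), max_vector_length):
        sub_a = a[start:start + max_vector_length]
        sub_b = b[start:start + max_vector_length]
        # One "vector instruction": the same operation on every element pair.
        result.extend(x + y for x, y in zip(sub_a, sub_b))
    return result


a = list(range(10))        # a long vector of 10 elements
b = [1] * 10
print(vector_add(a, b))    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With a maximum length of 4, the 10-element task is handled as subvectors of 4, 4, and 2 elements.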

Vector processor classification


According to where the operands are retrieved from, pipelined vector
computers are classified into two architectural configurations:
Memory to memory architecture:

In memory to memory architecture, source operands, intermediate results, and final results are
retrieved (read) directly from the main memory. For memory to memory vector instructions,
the base address, the offset, the increment, and the vector length must be
specified in order to enable streams of data transfers between the main memory and the pipelines.
Processors such as the TI-ASC, CDC STAR-100, and Cyber-205 have vector instructions in
memory to memory formats. The main points about memory to memory architecture are:
• There is no limitation of size.
• Speed is comparatively slow in this architecture.

Register to register architecture:

In register to register architecture, operands and results are retrieved indirectly from the main
memory through the use of a large number of vector registers or scalar registers.
Processors such as the Cray-1 and the Fujitsu VP-200 use vector instructions in register to register
formats. The main points about register to register architecture are:
• Register to register architecture has limited size.
• Speed is very high compared to the memory to memory architecture.
• The hardware cost is high in this architecture.
Array Processors
An array processor performs computations on large arrays of data. There are two types of array
processors: the attached array processor and the SIMD array processor. Both are explained
below.
Attached Array Processor:

An attached array processor is an auxiliary processor attached to a host computer to improve
its performance in numerical computational tasks.

The attached array processor has two interfaces:

1. An input-output interface to a common processor.
2. An interface with a local memory.

The local memory is connected to the main memory. The host computer is a general-purpose computer,
and the attached processor is a back-end machine driven by the host computer.
The array processor is connected to the computer through an I/O controller, and the computer
treats it as an external interface.
SIMD array processor:
An SIMD array processor is a computer with multiple processing units operating in parallel. Both
types of array processors manipulate vectors, but their internal organization is different.
The processing units are synchronized to perform the same operation under the control of a
common control unit, thus providing a single instruction stream, multiple data stream
(SIMD) organization. As shown in the figure, an SIMD array processor contains a set of identical
processing elements (PEs), each having a local memory M.
Each PE includes:
• ALU
• Floating point arithmetic unit
• Working registers

The master control unit controls the operations in the PEs. Its function is
to decode each instruction and determine how it is to be executed. If the instruction
is a scalar or program control instruction, it is executed directly within the master control
unit.
The main memory is used for storage of the program, while each PE uses operands stored in its
local memory.
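
The SIMD organization can be sketched in Python as a master control unit that sends one operation to a set of processing elements, each working on operands in its own local memory. The class names ProcessingElement and MasterControlUnit, the broadcast method, and the two supported operations are assumptions made for this example; the loop over PEs stands in for what the hardware does simultaneously.

```python
class ProcessingElement:
    """One PE with its own local memory M (a dict in this sketch)."""

    def __init__(self, local_memory):
        self.M = dict(local_memory)

    def execute(self, op, dest, src1, src2):
        operations = {"ADD": lambda x, y: x + y, "MUL": lambda x, y: x * y}
        self.M[dest] = operations[op](self.M[src1], self.M[src2])


class MasterControlUnit:
    """Decodes an instruction and sends it to every PE."""

    def __init__(self, pes):
        self.pes = pes

    def broadcast(self, op, dest, src1, src2):
        # Same instruction for every PE; each uses its own local operands.
        for pe in self.pes:
            pe.execute(op, dest, src1, src2)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
pes = [ProcessingElement({"a": x, "b": y}) for x, y in zip(a, b)]
MasterControlUnit(pes).broadcast("ADD", "c", "a", "b")
c = [pe.M["c"] for pe in pes]
print(c)  # [11, 22, 33, 44]
```

Each PE holds one element of each vector, so a single ADD instruction produces the whole vector sum c = a + b.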
