
Computer Architecture

DIGITAL NOTES
ON
COMPUTER ARCHITECTURE

TOPICS COVERED
 I/O Organization
 Data Transfer to Array Processing

MICROTEK COLLEGE OF MANAGEMENT & TECHNOLOGY


DEPARTMENT OF COMPUTER SCIENCE

PREPARED BY: SHUBHANSHU KUSHWAHA


INDEX
1. I/O Organization
2. Data Transfer to Array Processing


Peripheral Devices
A peripheral device is a device that provides input/output functions for a computer and serves as an auxiliary device without computing-intensive functionality.
Peripheral devices are generally not essential for the computer to perform its basic tasks; they can be thought of as enhancements to the user's experience. A peripheral device is connected to a computer system but is not part of the core computer architecture. The term peripheral is often used more loosely to refer to any device external to the computer case.

Classification of Peripheral devices

Peripheral devices are generally classified into three basic categories, given below:


1. Input Devices:
An input device converts incoming data and instructions into a pattern of
electrical signals in binary code that is comprehensible to a digital computer.
Example:
Keyboard, mouse, scanner, microphone etc.
2. Output Devices:
An output device generally reverses the input process, translating the digitized
signals into a form intelligible to the user. Output devices are also used for sending data
from one computer system to another. For some time punched-card and paper-tape readers were
extensively used for input, but these have now been supplanted by more efficient devices.
Example:
Monitors, headphones, printers etc.
3. Storage Devices:
Storage devices store the data that the system requires for performing any
operation. A storage device is one of the most essential parts of a computer system.
Example:
Hard disk, magnetic tape, Flash memory etc.

Advantages of Peripheral Devices

Peripheral devices provide additional features that make operating the system easier. These are
given below:
 They make it easy to supply input to the system.
 They provide specific forms of output.
 They include storage devices for storing information or data.
 They improve the efficiency of the system.


ASCII Alphanumeric Characters


ASCII stands for American Standard Code for Information Interchange. It is the standard binary
code used to represent alphanumeric characters. Alphanumeric characters are used for the transfer
of information to and from the I/O devices and the computer. The standard uses seven bits to code
128 characters. However, there is an additional bit on the left that is usually assigned 0. Therefore,
there are 8 bits in total.
The ASCII code consists of 94 printable characters and 34 nonprinting characters used for various
control operations. The printable characters are the 26 uppercase letters A through Z, the 26
lowercase letters a through z, the numerals 0 through 9, and 32 special printable characters such
as % and *.
The control characters are used to route the data and arrange the printed text into a prescribed
format.
A list of control characters is shown in the table.
Control Character and Description

Control Character Description

NUL Null

SOH Start of Heading

STX Start of Text

EOT End of Transmission

ENQ Enquiry

ACK Acknowledge

DLE Data Link Escape

ETB End of Transmission Block

EM End of medium

Types of Control Characters


There are three types of control characters that are as follows −

 Format Effectors − These control the layout of printing. They include the familiar typewriter
controls Back Space (BS), Horizontal Tabulation (HT), and Carriage Return (CR).


 Information Separators − These separate the data into divisions such as paragraphs and
pages. They include Record Separator (RS) and File Separator (FS).
 Communication Control − These are used during the transmission of text between remote terminals.
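
The 7-bit codes and the control/printable split described above can be checked directly in Python, whose built-in ord and format functions expose the ASCII code points (a small sketch, not part of the original notes):

```python
# Sketch: inspecting 7-bit ASCII codes and control characters in Python.
def ascii_info(ch):
    """Return the 7-bit code of a character and whether it is a control character."""
    code = ord(ch)
    assert code < 128, "not a 7-bit ASCII character"
    is_control = code < 32 or code == 127   # control characters occupy 0-31 and 127
    return code, is_control

print(ascii_info('A'))           # (65, False) - printable uppercase letter
print(ascii_info('\t'))          # (9, True)  - HT, Horizontal Tabulation
print(ascii_info('\r'))          # (13, True) - CR, Carriage Return
print(format(ord('A'), '07b'))   # 1000001, the 7-bit binary code for 'A'
```

With the extra leftmost bit set to 0, the stored pattern for 'A' is 01000001, the 8-bit form mentioned above.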

I/O Interface
The I/O interface provides a method by which data is transferred between internal storage and
external I/O devices. All the peripherals connected to a computer require special communication
links for interfacing them with the CPU.

I/O Bus and Interface Modules


The I/O bus is the route used for peripheral devices to interact with the computer processor. A typical
connection of the I/O bus to I/O devices is shown in the figure.

The I/O bus consists of data lines, address lines, and control lines. In any general-purpose computer,
a magnetic disk, printer, keyboard, and display terminal are commonly employed. Each
peripheral unit has an interface unit associated with it. Each interface decodes the address and
control received from the I/O bus, interprets them for the peripheral, and provides signals for the
peripheral controller. It also synchronizes the data flow and supervises the transfer of information
between peripheral and processor.
The I/O bus from the processor is linked to all peripheral interfaces. To communicate with a
particular device, the processor places a device address on the address lines. Each interface contains
an address decoder attached to the I/O bus that monitors the address lines.
When the interface recognizes its address, it activates the path between the bus lines
and the device that it controls. The interface disables the peripherals whose address does not
correspond to the address on the bus.
An interface receives any of the following four commands −


 Control − A control command is issued to activate the peripheral and to inform it what to do. This
control command depends on the peripheral, and each peripheral receives its own sequence of control
commands, depending on its mode of operation.
 Status − A status command is used to test various status conditions in the interface and the peripheral.
 Data Output − A data output command causes the interface to respond to the command by transferring
data from the bus into one of its registers.
 Data Input − The data input command is the opposite of the data output command. Here the
interface receives an item of data from the peripheral and places it in its buffer register.

Asynchronous Data Transfer


The internal operations in an individual unit of a digital system are synchronized using a clock pulse:
the clock pulse is applied to all registers within the unit, and all data transfers among internal registers
occur simultaneously during the occurrence of the pulse. Now, suppose two units of a
digital system, such as the CPU and an I/O interface, are designed independently.

If the registers in the I/O interface share a common clock with CPU registers, then transfer between
the two units is said to be synchronous. But in most cases, the internal timing in each unit is
independent of each other, so each uses its private clock for its internal registers. In this case, the
two units are said to be asynchronous to each other, and if data transfer occurs between them, this
data transfer is called Asynchronous Data Transfer.

Asynchronous data transfer between two independent units requires that control signals be
transmitted between the communicating units, so that each can indicate to the other when it sends
data. Two methods can achieve this asynchronous way of data transfer:

o Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the transfer
has to occur.
o Handshaking: This method accompanies each data item being transferred with a
control signal that indicates the presence of data on the bus. The unit receiving the data item responds
with another signal to acknowledge receipt of the data.

The strobe pulse and handshaking methods of asynchronous data transfer are not restricted to I/O
transfers. They are used extensively on numerous occasions requiring the transfer of data between
two independent units. Here we consider the transmitting unit as the source and the receiving unit as
the destination.

For example, the CPU is the source during an output or write transfer and the destination during an
input or read transfer.

Therefore, the control sequence during an asynchronous transfer depends on whether the transfer is
initiated by the source or by the destination.

So, while discussing each asynchronous data transfer method, we look at the control sequence in
both cases: when it is initiated by the source and when it is initiated by the destination. In this way,
each data transfer method can be further divided into two parts, source-initiated and destination-initiated.

Asynchronous Data Transfer Methods


The asynchronous data transfer between two independent units requires that control signals be
transmitted between the communicating units to indicate when they send the data. Thus, the two
methods can achieve the asynchronous way of data transfer.

1. Strobe Control Method

The strobe control method of asynchronous data transfer employs a single control line to time each
transfer. This control line is also known as a strobe, and it may be activated by either the source or the
destination, depending on which one initiates the transfer.

a. Source-initiated strobe: In the block diagram below, the strobe is initiated by the source;
as shown in the timing diagram, the source unit first places the data on the data bus.

After a brief delay to ensure that the data have settled to a stable value, the source activates a strobe pulse.
The information on the data bus and the strobe control signal remain in the active state for a sufficient time
to allow the destination unit to receive the data.
The destination unit uses the falling edge of the strobe to transfer the contents of the data bus into one of its
internal registers. The source removes the data from the data bus after it disables its strobe pulse. Thus, new
valid data will be available only after the strobe is enabled again.
In this case, the strobe may be a memory-write control signal from the CPU to a memory unit. The CPU places
the word on the data bus and informs the memory unit, which is the destination.

b. Destination-initiated strobe: In the block diagram below, the strobe is initiated by the
destination; in the timing diagram, the destination unit first activates the strobe pulse, informing
the source to provide the data.

The source unit responds by placing the requested binary information on the data bus. The data must
be valid and remain on the bus long enough for the destination unit to accept it.
The falling edge of the strobe pulse can again be used to trigger a destination register. The destination unit
then disables the strobe, and the source removes the data from the data bus after a predetermined
time interval.
In this case, the strobe may be a memory read control from the CPU to a memory unit. The CPU initiates
the read operation to inform the memory, which is a source unit, to place the selected word into the
data bus.
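
The source-initiated strobe sequence can be sketched as a toy Python model. The bus and strobe are modeled as plain variables (in hardware these are wires, and timing is enforced by the circuit); the numbered comments map to the events in the timing diagram:

```python
# Toy model of a source-initiated strobe transfer (illustration only).
class StrobeBus:
    def __init__(self):
        self.data = None     # data bus lines
        self.strobe = 0      # strobe control line
        self.latched = None  # destination's internal register

    def source_write(self, word):
        self.data = word     # 1. source places data on the bus
        self.strobe = 1      # 2. source activates the strobe pulse
        self.strobe = 0      # 3. falling edge: destination latches the bus contents
        self.latched = self.data
        self.data = None     # 4. source removes the data after disabling the strobe

bus = StrobeBus()
bus.source_write(0x2A)
print(hex(bus.latched))  # 0x2a
```

The destination-initiated variant would simply start with the destination raising the strobe to request the data before step 1.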

2. Handshaking Method

The strobe method has the disadvantage that the source unit that initiates the transfer has no way
of knowing whether the destination has received the data that was placed in the bus. Similarly, a
destination unit that initiates the transfer has no way of knowing whether the source unit has placed
data on the bus.

This problem is solved by the handshaking method, which introduces a second
control signal line that provides a reply to the unit that initiates the transfer.

In this method, one control line runs in the same direction as the data flow in the bus, from the source
to the destination. The source unit uses it to inform the destination unit whether there is valid data
on the bus.

The other control line runs in the opposite direction, from the destination to the source. The
destination unit uses it to inform the source whether it can accept data. Here too, the sequence
of control depends on the unit that initiates the transfer, that is, on whether the transfer is initiated
by the source or by the destination.


o Source-initiated handshaking: In the block diagram below, the two handshaking lines
are "data valid", which is generated by the source unit, and "data accepted", generated by the
destination unit.

The timing diagram shows the timing relationship of the exchange of signals between the two units.
The source initiates a transfer by placing data on the bus and enabling its data valid signal. The
destination unit then activates the data accepted signal after it accepts the data from the bus.
The source unit then disables its data valid signal, which invalidates the data on the bus.
After this, the destination unit disables its data accepted signal, and the system goes into its initial
state. The source unit does not send the next data item until after the destination unit shows readiness
to accept new data by disabling the data accepted signal.
This sequence of events is described in its sequence diagram, which shows the state in which
the system is present at any given time.
o Destination-initiated handshaking: In the block diagram below, the two handshaking
lines are "data valid", generated by the source unit, and "ready for data", generated by the destination
unit.
Note that the name of the signal generated by the destination unit has been changed from data
accepted to ready for data to reflect its new meaning.


Since the transfer is initiated by the destination, the source unit does not place data on the data bus until it
receives a ready for data signal from the destination unit. After that, the handshaking process is the same
as in the source-initiated case.
The sequence of events is shown in its sequence diagram, and the timing relationship between signals
is shown in its timing diagram. Therefore, the sequence of events in both cases would be identical.
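
The source-initiated handshake can be traced step by step. The signal names below are taken from the text; the simulation is only a sketch of the signal sequence, not of real timing:

```python
# Step-by-step trace of source-initiated handshaking. Each tuple records the
# event and the state of the (data_valid, data_accepted) control lines after it.
def source_initiated_handshake(word):
    trace = []
    bus = word                   # source places data on the bus
    trace.append(("source raises data_valid", 1, 0))
    accepted = bus               # destination latches the data from the bus
    trace.append(("destination raises data_accepted", 1, 1))
    bus = None                   # data on the bus is no longer valid
    trace.append(("source drops data_valid", 0, 1))
    trace.append(("destination drops data_accepted (initial state)", 0, 0))
    return accepted, trace

word, trace = source_initiated_handshake(0x55)
for event, valid, accepted in trace:
    print(f"data_valid={valid} data_accepted={accepted}  {event}")
print(hex(word))  # 0x55
```

The destination-initiated version prepends one event, the destination raising ready for data, before the source places anything on the bus.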

Advantages of Asynchronous Data Transfer


Asynchronous Data Transfer in computer organization has the following advantages, such as:

o It is more flexible: devices can exchange information at their own pace. In addition, each data
character is complete in itself, so even if one packet is corrupted, its predecessors and
successors are not affected.
o It does not require complex processing by the receiving device, and an irregular data stream does
not cause a crisis, since the device can keep up with it. This makes asynchronous transfer suitable
for applications where character data is generated irregularly.

Disadvantages of Asynchronous Data Transfer


There are also some disadvantages of using asynchronous data for transfer in computer organization,
such as:

o The success of these transmissions depends on the start bits and their recognition. Unfortunately, these
bits are easily susceptible to line interference, which can corrupt or distort them.


o A large portion of the transmitted data is taken up by control and identification header bits that carry
no useful information related to the transmitted data. This invariably means that more data packets need
to be sent.


I/O Interface Mode of Transfer


The method used to transfer information between internal storage and external I/O devices is
known as the I/O interface. The peripherals connected to a computer system are interfaced to the
CPU using special communication links, which resolve the differences between the CPU and the
peripherals. Special hardware components called interface units exist between the CPU and the
peripherals to supervise and synchronize all the input and output transfers.
Mode of Transfer:
The binary information received from an external device is usually stored in the memory unit,
and the information transferred from the CPU to an external device originates from the memory
unit. The CPU merely processes the information; the source or target is always the memory unit.
Data transfer between the CPU and the I/O devices may be done in different modes.
Data transfer to and from the peripherals may be done in any of the three possible ways
1. Programmed I/O.
2. Interrupt- initiated I/O.
3. Direct memory access( DMA).
Now let’s discuss each mode one by one.
Programmed I/O
Programmed I/O operations are the result of I/O instructions written in the computer program. Each
data item transfer is initiated by an instruction in the program. Usually, the transfer is between a CPU
register and the peripheral. This mode requires constant monitoring of the peripheral devices by the CPU.
Example of Programmed I/O: In this case, the I/O device does not have direct access to the
memory unit. A transfer from I/O device to memory requires the execution of several instructions
by the CPU, including an input instruction to transfer the data from device to the CPU and store
instruction to transfer the data from CPU to memory. In programmed I/O, the CPU stays in the
program loop until the I/O unit indicates that it is ready for data transfer. This is a time
consuming process since it needlessly keeps the CPU busy. This situation can be avoided by
using an interrupt facility. This is discussed below.
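
A minimal sketch of this polling loop, with a hypothetical device model (the Device class and its 3-cycle readiness period are invented for illustration):

```python
# Sketch of programmed I/O: the CPU busy-waits on a status flag, then moves the
# byte from device to CPU register to memory with separate steps.
class Device:
    def __init__(self, data):
        self._data, self.ready = list(data), False
        self._ticks = 0

    def tick(self):                   # device becomes ready every 3 "cycles"
        self._ticks += 1
        if self._ticks % 3 == 0 and self._data:
            self.ready = True

    def read(self):
        self.ready = False
        return self._data.pop(0)

memory, device = [], Device(b"IO")
while len(memory) < 2:
    device.tick()
    if device.ready:          # CPU loops here testing the flag (wasted cycles)
        reg = device.read()   # input instruction: device -> CPU register
        memory.append(reg)    # store instruction: CPU register -> memory
print(bytes(memory))  # b'IO'
```

Every iteration where the flag is not set is a wasted CPU cycle, which is exactly the overhead that interrupt-initiated I/O removes.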

Interrupt- initiated I/O


As we saw above, programmed I/O keeps the CPU busy unnecessarily. This situation can very well
be avoided by using an interrupt-driven method of data transfer. Special commands inform the
interface to issue an interrupt request signal whenever data is available from a device. In the
meantime, the CPU can proceed with the execution of another program, while the interface keeps
monitoring the device. Whenever the interface determines that the device is ready for data transfer,
it sends an interrupt request signal to the computer. Upon detecting the external interrupt signal,
the CPU momentarily stops the task it is performing, branches to a service program to process the
I/O transfer, and then returns to the task it was originally performing.

Note: Both programmed I/O and interrupt-driven I/O require the active intervention of the
processor to transfer data between memory and the I/O module, and any data transfer must
traverse a path through the processor. Thus both of these forms of I/O suffer from two inherent drawbacks:
 The I/O transfer rate is limited by the speed with which the processor can test and service a
device.
 The processor is tied up in managing an I/O transfer; a number of instructions must be executed
for each I/O transfer.

Direct Memory Access


The data transfer between fast storage media such as a magnetic disk and the memory unit is limited
by the speed of the CPU. We can instead allow the peripherals to communicate with the memory
directly over the memory buses, removing the intervention of the CPU. This type of data transfer
technique is known as DMA, or direct memory access. During DMA the CPU is idle and has no
control over the memory buses. The DMA controller takes over the buses to manage the transfer
directly between the I/O devices and the memory unit.

Bus Request: Used by the DMA controller to request that the CPU relinquish control of the
buses.
Bus Grant: Activated by the CPU to inform the DMA controller that the buses are in a
high-impedance state and that the requesting DMA controller can take control of them. Once the
DMA controller has taken control of the buses, it transfers the data. This transfer can take place in
several ways.
Types of DMA transfer using DMA controller:
Burst Transfer:
The DMA controller returns the bus only after the complete data transfer. A register is used as a byte
count; it is decremented for each byte transferred, and when the byte count reaches zero, the DMAC
releases the bus. When the DMAC operates in burst mode, the CPU is halted for the duration of the
data transfer.
The steps involved are:
1. Bus grant request time.
2. Transfer of the entire block of data at the transfer rate of the device, because the device is usually
slower than the speed at which data can be transferred to the CPU.
3. Release of control of the bus back to the CPU.
So, the total time taken to transfer N bytes
= Bus grant request time + N * (memory transfer rate) + Bus release control time.
Where,
X µsec = data transfer time or preparation time (words/block)
Y µsec = memory cycle time or transfer time (words/block)
% CPU idle (blocked) = (Y/(X+Y)) * 100
% CPU busy = (X/(X+Y)) * 100
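
A worked example of the burst-mode formulas above. All the numeric values (grant time, release time, block size, time per transfer, X, Y) are assumed for illustration:

```python
# Worked example of the burst-transfer timing formula (assumed numbers).
grant_time, release_time = 2.0, 1.0     # microseconds (assumed)
n_bytes, time_per_byte = 64, 0.5        # 0.5 us per memory transfer (assumed)

total = grant_time + n_bytes * time_per_byte + release_time
print(f"total burst time = {total} us")          # 2 + 64*0.5 + 1 = 35.0 us

# CPU utilisation split, with X = preparation time and Y = cycle time:
X, Y = 10.0, 2.0                        # microseconds (assumed)
print(f"% CPU idle = {Y / (X + Y) * 100:.1f}")   # 16.7
print(f"% CPU busy = {X / (X + Y) * 100:.1f}")   # 83.3
```

The idle and busy fractions always sum to 100%, which is a quick sanity check on the two formulas.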
Cycle Stealing:
An alternative method in which the DMA controller transfers one word at a time, after which it must
return control of the buses to the CPU. The CPU delays its operation for only one memory cycle,
allowing the direct memory I/O transfer to "steal" one memory cycle.
The steps involved are:
1. Buffer the byte into the buffer register.
2. Inform the CPU that the device has 1 byte to transfer (i.e., a bus grant request).
3. Transfer the byte (at system bus speed).
4. Release control of the bus back to the CPU.
Before moving on to transfer the next byte of data, the device performs step 1 again, so that the bus
isn't tied up and the transfer doesn't depend on the transfer rate of the device.
So, if T is the time taken for 1 byte of data transfer in cycle stealing mode,
T = time required for bus grant + 1 bus cycle to transfer data + time required to release the bus,
and N bytes take N x T.
In cycle stealing mode we follow a pipelining concept: while one byte is being transferred, the device
is preparing the next byte in parallel. When "the fraction of CPU time relative to the data transfer
time" is asked for, cycle stealing mode is used. Where,
X µsec = data transfer time or preparation time (words/block)
Y µsec = memory cycle time or transfer time (words/block)
% CPU idle (blocked) = (Y/X) * 100
% CPU busy = (X/Y) * 100
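
The per-byte time T and the CPU idle fraction can likewise be worked through with assumed numbers:

```python
# Worked example of the cycle-stealing timing (assumed numbers).
grant, bus_cycle, release = 0.5, 0.5, 0.25   # microseconds (assumed)
T = grant + bus_cycle + release              # time to move one byte
N = 64
print(f"per-byte time T = {T} us, total for {N} bytes = {N * T} us")
# T = 1.25 us, total = 80.0 us

# CPU idle fraction, with X = preparation time and Y = memory cycle time,
# following the formula given in the notes:
X, Y = 10.0, 2.0                             # microseconds (assumed)
print(f"% CPU idle = {Y / X * 100:.1f}")     # 20.0
```

Note the contrast with burst mode: the total bus time is larger (grant and release are paid per byte), but the CPU is blocked only one memory cycle at a time.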


Priority Interrupts (S/W Polling and Daisy Chaining)

In I/O Interface (Interrupt and DMA Mode), we discussed the concept behind interrupt-initiated
I/O. To summarize: when I/O devices are ready for I/O transfer, they generate an interrupt
request signal to the computer. The CPU receives this signal, suspends the instructions it is
currently executing, and then services that transfer request. But what if multiple devices
generate interrupts simultaneously? In that case, we need a way to decide which interrupt is to be
serviced first. In other words, we have to set a priority among all the devices for systematic interrupt
servicing. The scheme of defining the priority among devices, so as to know which one is to be
serviced first in case of simultaneous requests, is called a priority interrupt system. It can be
implemented with either software or hardware methods.
SOFTWARE METHOD – POLLING
In this method, all interrupts are serviced by branching to the same service program. This program
then checks with each device if it is the one generating the interrupt. The order of checking is
determined by the priority that has to be set. The device having the highest priority is checked first
and then devices are checked in descending order of priority. If the device is checked to be
generating the interrupt, another service program is called which works specifically for that
particular device. The structure will look something like this-
if (device[0].flag)
    device[0].service();
else if (device[1].flag)
    device[1].service();
...
else
    // raise error: no device requested service
The major disadvantage of this method is that it is quite slow. To overcome this, we can use a
hardware solution, one of which involves connecting the devices in series. This is called the
daisy-chaining method.
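
The C-like pseudocode above can be made runnable. This Python stand-in checks the devices in priority order (the device names are invented for illustration):

```python
# Runnable version of software polling: scan devices in priority order and
# return the highest-priority device whose interrupt flag is set.
def poll(devices):
    """Return the index of the highest-priority device requesting service."""
    for i, dev in enumerate(devices):     # index 0 = highest priority
        if dev["flag"]:
            return i
    raise RuntimeError("spurious interrupt: no device is requesting service")

devices = [
    {"name": "disk",     "flag": False},
    {"name": "keyboard", "flag": True},
    {"name": "printer",  "flag": True},
]
i = poll(devices)
print(devices[i]["name"])  # keyboard - printer also requests, but has lower priority
```

The loop makes the cost visible: in the worst case every device must be checked before the right service routine runs, which is why polling is slow compared to hardware priority schemes.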

Direct Memory Access (DMA) :
The DMA controller is a hardware device that allows I/O devices to access memory directly with
less participation of the processor. The DMA controller needs the usual circuits of an
interface to communicate with the CPU and the Input/Output devices.
Fig-1 below shows the block diagram of the DMA controller. The unit communicates with
the CPU through the data bus and control lines. The CPU selects a register within the DMA
controller through the address bus by enabling the DS (DMA select) and RS (register select)
inputs. RD (read) and WR (write) are bidirectional inputs. When the BG (bus grant) input is 0, the
CPU can communicate with the DMA registers. When BG is 1, the CPU has relinquished
the buses and the DMA controller can communicate directly with the memory.
DMA controller registers :
The DMA controller has three registers as follows.
 Address register – It contains the address to specify the desired location in memory.
 Word count register – It contains the number of words to be transferred.
 Control register – It specifies the transfer mode.
Note –
All registers in the DMA controller appear to the CPU as I/O interface registers. Therefore, the CPU
can both read from and write into the DMA registers under program control via the data bus.

Fig 1- Block Diagram

Explanation:
The CPU initializes the DMA controller by sending the following information through the data bus.
 The starting address of the memory block where the data is available (to read) or where
data is to be stored (to write).
 The word count, which is the number of words in the memory block to be read or
written.
 Control bits to specify the mode of transfer, such as read or write.
 A control bit to begin the DMA transfer.
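
These initialization steps can be sketched with a toy DMA controller model. The register names follow the text; the class itself and the memory layout are hypothetical:

```python
# Sketch of the CPU initialising a DMA controller's registers, then the
# controller moving words into memory while decrementing the word count.
class DMAController:
    def __init__(self, memory):
        self.memory = memory
        self.address = 0       # address register: where in memory to start
        self.word_count = 0    # word count register: how many words to move
        self.control = {}      # control register: transfer mode, start flag

    def start(self, source_words):
        # Transfer runs word by word; the count is decremented toward zero.
        for word in source_words[:self.word_count]:
            self.memory[self.address] = word
            self.address += 1
            self.word_count -= 1

mem = [0] * 8
dma = DMAController(mem)
dma.address, dma.word_count = 2, 3     # CPU writes the registers via the data bus
dma.control = {"mode": "write"}        # mode bits, then a start bit in hardware
dma.start([10, 20, 30])
print(mem)  # [0, 0, 10, 20, 30, 0, 0, 0]
```

When the word count reaches zero the real controller would release the bus (burst mode) or it would have released it after every word (cycle stealing).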


Input-Output Processor
The DMA mode of data transfer reduces the CPU's overhead in handling I/O operations. It
also allows parallelism between CPU and I/O operations. Such parallelism is necessary to avoid
wasting valuable CPU time while handling I/O devices whose speeds are much slower
than the CPU's. The concept of DMA operation can be extended to relieve the CPU
further from involvement in the execution of I/O operations. This gives rise to the
development of a special-purpose processor called the Input-Output Processor (IOP), or I/O
channel.
The Input-Output Processor (IOP) is just like a CPU that handles the details of I/O
operations. It is equipped with more facilities than a typical DMA controller. The IOP can
fetch and execute its own instructions, which are specifically designed for I/O transfers. In
addition to the I/O-related tasks, it can perform other processing tasks such as arithmetic, logic,
branching, and code translation. The main memory unit takes the pivotal role: the IOP
communicates with the processor by means of DMA.
The block diagram –

The Input-Output Processor is a specialized processor that loads and stores data in
memory while executing I/O instructions. It acts as an interface between the
system and its devices. It carries out a sequence of events to execute I/O operations and
then stores the results in memory.
Advantages –
 In IOP-based systems, the I/O devices can access main memory directly, without intervention
by the processor.
 It addresses the problems that arise in the direct memory access method.


Communication
Digital communication can be considered as the communication happening between
two (or more) devices in terms of bits. This transfer of data, either wirelessly or
through wires, can be either one bit at a time or an entire word at once (depending on the
width of the processor inside, i.e., 8-bit, 16-bit, etc.). Based on this, we have the
following classification: Serial Communication and Parallel Communication.

Serial Communication
Serial communication implies transferring data bit by bit, sequentially. This is the
most common form of communication used in the digital world. Contrary to parallel
communication, serial communication needs only one line for the data transfer.
Thereby, the cost of the communication line as well as the space required is reduced.

Parallel Communication
Parallel communication implies transferring several bits at a time, in parallel. This
mode comes to the rescue when speed rather than space is the main objective. The
transfer of data is at high speed, since all the bits of a word are placed on separate
lines simultaneously.

Parallel and Serial Communication (Interface)
MSB: Most Significant Bit; LSB: Least Significant Bit

Example:

For an 8-bit data transfer in serial communication, one bit is sent at a time. The
entire data is first fed into the serial port buffer, and from this buffer one bit is sent at
a time. Only after the last bit is received can the transferred data be forwarded for
processing. In parallel communication, a serial port buffer is not required: the number
of bus lines matches the length of the data, plus a synchronization line for synchronized
transmission of the data.

Thus we can state that, for the same frequency of transmission, serial
communication is slower than parallel communication.
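
The speed gap can be illustrated by counting clock ticks: serial sends one bit per tick over one line, while parallel sends all eight bits in one tick over eight lines (a sketch, MSB first):

```python
# Sketch contrasting serial and parallel transfer of one byte.
def serial_send(byte):
    ticks = []
    for i in range(7, -1, -1):          # MSB first, one bit per clock tick
        ticks.append((byte >> i) & 1)
    return ticks                         # 8 ticks for 8 bits, over 1 line

def parallel_send(byte):
    # One tick: all 8 bits placed on 8 lines simultaneously.
    return [[(byte >> i) & 1 for i in range(7, -1, -1)]]

data = 0b10110010
print(len(serial_send(data)), "ticks over 1 line")    # 8 ticks over 1 line
print(len(parallel_send(data)), "tick over 8 lines")  # 1 tick over 8 lines
```

Both schemes deliver the same bits; the trade-off is clock ticks against the number of physical lines.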

So, the question naturally arises-

Why is serial communication preferred over parallel?
While parallel communication is faster when the frequency of transmission is the same, it
is cumbersome over long distances. Also, alongside the data channels it must have a
synchronization channel or clock channel to keep the data synchronized.

In serial communication the data is sent sequentially and latched at the receiving end,
recovering the entire data from the data bus using a USART/UART (Universal Synchronous/
Asynchronous Receiver Transmitter) without any loss of synchronization. In parallel
communication, if even one wire takes more time to settle, the received data will be faulty.


Data Transfer Instructions


Data transfer instructions transfer data between memory and processor registers, between processor
registers and I/O devices, and from one processor register to another. There are eight commonly
used data transfer instructions. Each instruction is represented by a mnemonic symbol.
The table shows the eight data transfer instructions and their respective mnemonic symbols.

Name Mnemonic Symbols

Load LD

Store ST

Move MOV

Exchange XCH

Input IN

Output OUT

Push PUSH

Pop POP
The instructions can be described as follows −

 Load − The load instruction is used to transfer data from the memory to a processor register,
which is usually an accumulator.
 Store − The store instruction transfers data from processor registers to memory.
 Move − The move instruction transfers data from processor register to memory or memory
to processor register or between processor registers itself.
 Exchange − The exchange instruction swaps information either between two registers or
between a register and a memory word.
 Input − The input instruction transfers data between the processor register and the input
terminal.
 Output − The output instruction transfers data between the processor register and the
output terminal.
 Push and Pop − The push and pop instructions transfer data between a processor register
and memory stack.
All these instructions are associated with a variety of addressing modes. Some assembly language
instructions use different mnemonic symbols just to differentiate between the different addressing
modes.
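
A toy machine sketch of these instructions (the register names, addresses, and values below are invented for illustration; IN and OUT are omitted since they need a device model):

```python
# Toy machine demonstrating the data transfer instructions from the table.
class Machine:
    def __init__(self):
        self.mem = {0x10: 7, 0x11: 0}
        self.regs = {"AC": 0, "R1": 5}
        self.stack = []

    def LD(self, addr):      self.regs["AC"] = self.mem[addr]   # memory -> AC
    def ST(self, addr):      self.mem[addr] = self.regs["AC"]   # AC -> memory
    def MOV(self, dst, src): self.regs[dst] = self.regs[src]    # register -> register
    def XCH(self, a, b):                                        # swap two registers
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]
    def PUSH(self, reg):     self.stack.append(self.regs[reg])  # register -> stack
    def POP(self, reg):      self.regs[reg] = self.stack.pop()  # stack -> register

m = Machine()
m.LD(0x10)           # AC <- M[0x10] = 7
m.PUSH("AC")         # stack: [7]
m.MOV("AC", "R1")    # AC <- R1 = 5
m.ST(0x11)           # M[0x11] <- 5
m.POP("AC")          # AC <- 7 again
print(m.regs["AC"], m.mem[0x11])  # 7 5
```

Note how PUSH/POP preserve and restore the accumulator across the MOV/ST pair, which is exactly how a memory stack is used in practice.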


Data Manipulation Instructions


Data manipulation instructions perform operations on data and provide the computational
capabilities of the computer. The data manipulation instructions in a typical computer are usually
divided into three basic types as follows.
1. Arithmetic instructions
2. Logical and bit manipulation instructions
3. Shift instructions

Arithmetic instructions

The four basic arithmetic operations are addition, subtraction, multiplication, and division. Most
computers provide instructions for all four operations.
Typical Arithmetic Instructions –
 Increment (INC) − Example: INC B. Effect: B <- B + 1. Increments the contents of register B
by 1.
 Decrement (DEC) − Example: DEC B. Effect: B <- B - 1. Decrements the contents of register B
by 1.
 Add (ADD) − Example: ADD B. Effect: AC <- AC + B. Adds the contents of register B to the
contents of the accumulator and stores the result in the accumulator.
 Subtract (SUB) − Example: SUB B. Effect: AC <- AC - B. Subtracts the contents of register B
from the contents of the accumulator and stores the result in the accumulator.
 Multiply (MUL) − Example: MUL B. Effect: AC <- AC * B. Multiplies the contents of the
accumulator by the contents of register B and stores the result in the accumulator.
 Divide (DIV) − Example: DIV B. Effect: AC <- AC / B. Divides the contents of the accumulator
by the contents of register B and stores the quotient in the accumulator.
 Add with carry (ADDC) − Example: ADDC B. Effect: AC <- AC + B + Carry flag. Adds the
contents of register B and the carry flag to the contents of the accumulator and stores the
result in the accumulator.
 Subtract with borrow (SUBB) − Example: SUBB B. Effect: AC <- AC - B - Carry flag. Subtracts
the contents of register B and the carry flag from the contents of the accumulator and stores
the result in the accumulator.
 Negate, 2's complement (NEG) − Example: NEG B. Effect: B <- B' + 1. Negates a value by
taking the 2's complement of its single operand, i.e. it multiplies the operand by -1.
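The carry-related instructions are easiest to see in action on a fixed-width word. The following sketch assumes a hypothetical 8-bit accumulator machine (the names `ADD`, `ADDC`, and the state layout are illustrative, not a real ISA):

```python
# A minimal sketch (hypothetical 8-bit machine) of accumulator arithmetic
# with a carry flag, illustrating ADD, ADDC, and NEG.
MASK = 0xFF                      # 8-bit word

state = {"AC": 0, "CY": 0}       # accumulator and carry flag

def ADD(b):
    total = state["AC"] + b
    state["CY"] = 1 if total > MASK else 0   # carry out of bit 7
    state["AC"] = total & MASK

def ADDC(b):
    total = state["AC"] + b + state["CY"]    # include the incoming carry
    state["CY"] = 1 if total > MASK else 0
    state["AC"] = total & MASK

def NEG_value(b):
    return ((~b) + 1) & MASK                 # 2's complement: B <- B' + 1

state["AC"] = 0xF0
ADD(0x20)                 # 0xF0 + 0x20 = 0x110 -> AC = 0x10, CY = 1
low = state["AC"]
state["AC"] = 0x01
ADDC(0x02)                # 0x01 + 0x02 + carry -> the carry propagates
print(hex(low), state["CY"], hex(state["AC"]), hex(NEG_value(1)))
```

This is how multi-word addition works in practice: ADD produces the low word and a carry, and ADDC folds that carry into the next word.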


Logical and Bit Manipulation Instructions

Logical instructions perform binary operations on strings of bits stored in registers. They are
useful for manipulating individual bits or a group of bits.
Typical Logical and Bit Manipulation Instructions –
 Clear (CLR) − Example: CLR. Effect: AC <- 0. Sets the accumulator to 0.
 Complement (COM) − Example: COM A. Effect: AC <- (AC)'. Complements the accumulator.
 AND (AND) − Example: AND B. Effect: AC <- AC AND B. ANDs the contents of register B with the
contents of the accumulator and stores the result in the accumulator.
 OR (OR) − Example: OR B. Effect: AC <- AC OR B. ORs the contents of register B with the
contents of the accumulator and stores the result in the accumulator.
 Exclusive-OR (XOR) − Example: XOR B. Effect: AC <- AC XOR B. XORs the contents of register B
with the contents of the accumulator and stores the result in the accumulator.
 Clear carry (CLRC) − Example: CLRC. Effect: Carry flag <- 0. Sets the carry flag to 0.
 Set carry (SETC) − Example: SETC. Effect: Carry flag <- 1. Sets the carry flag to 1.
 Complement carry (COMC) − Example: COMC. Effect: Carry flag <- (Carry flag)'. Complements
the carry flag.
 Enable interrupt (EI) − Example: EI. Enables the interrupt.
 Disable interrupt (DI) − Example: DI. Disables the interrupt.
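The typical use of the logical instructions is to manipulate individual bits with masks. A short sketch on an 8-bit value (the mask values are purely illustrative):

```python
# A short sketch of how the logical instructions are typically used for
# bit manipulation on an 8-bit accumulator.
AC = 0b1011_0110

cleared = AC & 0b1111_0000   # AND with a mask clears selected bits (low nibble)
set_    = AC | 0b0000_1111   # OR with a mask sets selected bits
toggled = AC ^ 0b1111_1111   # XOR with all ones complements the word (like COM)

print(f"{cleared:08b} {set_:08b} {toggled:08b}")
```

AND selectively clears, OR selectively sets, and XOR selectively complements: these three masking idioms cover most bit-manipulation needs.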

Shift Instructions

Shifts are operations in which the bits of a word are moved to the left or right. Shift instructions
may specify either logical shifts, arithmetic shifts, or rotate-type operations.
Typical Shift Instructions –
Name Mnemonic
Logical shift right SHR

Logical shift left SHL

Arithmetic shift right SHRA

Arithmetic shift left SHLA

Rotate right ROR

Rotate left ROL

Rotate right through carry RORC

Rotate left through carry ROLC
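The difference between logical shifts, arithmetic shifts, and rotates can be sketched on an 8-bit word. Python integers have no fixed width, so masking with 0xFF models the register; the function names below are just lowercase versions of the mnemonics above:

```python
# A sketch of shift and rotate operations on an 8-bit word.
W = 8
MASK = 0xFF

def shr(x):  return (x & MASK) >> 1                      # logical right: 0 into MSB
def shl(x):  return (x << 1) & MASK                      # logical left: 0 into LSB
def shra(x): return ((x & MASK) >> 1) | (x & 0x80)       # arithmetic right: sign bit copied
def ror(x):  return ((x >> 1) | (x << (W - 1))) & MASK   # rotate right: LSB wraps to MSB
def rol(x):  return ((x << 1) | (x >> (W - 1))) & MASK   # rotate left: MSB wraps to LSB

x = 0b1000_0011
print(f"{shr(x):08b} {shl(x):08b} {shra(x):08b} {ror(x):08b} {rol(x):08b}")
```

Note that the arithmetic right shift preserves the sign bit while the logical right shift fills with zero; the rotates lose no bits at all.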


Program Control Instructions


Instructions of the computer are stored in consecutive memory locations and are fetched from
successive locations for processing and execution.
When an instruction is fetched from memory, the program counter is incremented by 1 so that it
points to the address of the next consecutive instruction. After a data transfer or data
manipulation instruction is executed, control returns to the fetch cycle with the program counter
holding the address of the next instruction to be fetched.
Data transfer and manipulation instructions specify the operations to perform on data, whereas
program control instructions specify conditions that can alter the content of the program counter.
A change in the content of the program counter causes a break in the normal sequence of
instruction execution. In this way, program control instructions control the flow of program
execution and can branch to different program segments.
Some of the program control instructions are listed in the table.
Program Control Instructions

Name Mnemonics

Branch BR

Jump JMP

Skip SKP

Call CALL

Return RET

Compare (by Subtraction) CMP

Test (by ANDing) TST


The branch is a one-address instruction. It is represented as BR ADR, where ADR is a mnemonic
for an address. The branch instruction transfers the value of ADR into the program counter. The
branch and jump instructions are often used interchangeably to mean the same thing; however,
they sometimes denote different addressing modes.
Conditional branch instructions, such as 'branch if positive' or 'branch if zero', specify the
condition under which the flow of execution is transferred. When the condition is met, the branch
address is loaded into the program counter.

Types of Program Control Instructions:


There are different types of Program Control Instructions:

25 | P a g e
Computer Architecture
1. Compare Instruction:
A compare instruction is similar to a subtract instruction, except that the result is
not stored anywhere; only the flags are set according to the result.
Example:
CMP R1, R2 ;

2. Unconditional Branch Instruction:


It causes an unconditional change of execution sequence to a new location.
Example:
High Level Code: goto L2;
Assembly Code: JUMP L2

3. Conditional Branch Instruction:


A conditional branch instruction is used to examine the values stored in the condition
code register to determine whether the specific condition exists and to branch if it does.
Example:
High Level Code: if (x==y) goto L1;
Assembly Code: BE R1, R2, L1 (the compiler allocates R1 for x and R2 for y)
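The interaction between a compare and a conditional branch can be sketched with a tiny interpreter. The instruction set here (CMP, BE, MOVI) is hypothetical and exists only to show how CMP sets a flag that a later branch tests:

```python
# A minimal sketch (hypothetical instruction set) of how CMP sets flags and a
# conditional branch (BE, branch-if-equal) redirects the program counter.
regs = {"R1": 5, "R2": 5, "R3": 0}
flags = {"Z": 0}

program = [
    ("CMP", "R1", "R2"),        # subtract R2 from R1, set Z flag, discard result
    ("BE", 3),                  # if Z == 1, branch to instruction index 3
    ("MOVI", "R3", 111),        # skipped when R1 == R2
    ("MOVI", "R3", 222),        # branch target
]

pc = 0
while pc < len(program):
    op, *args = program[pc]
    pc += 1                                   # PC normally advances sequentially
    if op == "CMP":
        flags["Z"] = 1 if regs[args[0]] - regs[args[1]] == 0 else 0
    elif op == "BE" and flags["Z"] == 1:
        pc = args[0]                          # branch: load target into the PC
    elif op == "MOVI":
        regs[args[0]] = args[1]

print(regs["R3"])
```

Because R1 equals R2, CMP sets Z to 1, BE loads its target into the program counter, and the instruction at index 2 is skipped.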

4. Subroutines:
A subroutine is a program fragment that lives in user space and performs a well-defined
task. It is invoked by another user program and returns control to the calling program
when finished.
Example:
CALL and RET
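How CALL and RET cooperate through the stack can be sketched as follows (a hypothetical machine where instruction indices stand in for addresses, and MARK/HALT are made-up instructions used only for tracing):

```python
# A sketch of how CALL and RET use the stack to save and restore the
# return address.
stack = []
pc = 0
trace = []

program = [
    ("CALL", 3),                     # 0: push return address (1), jump to 3
    ("MARK", "back in caller"),      # 1: executed after the subroutine returns
    ("HALT",),                       # 2: stop
    ("MARK", "inside subroutine"),   # 3: subroutine body
    ("RET",),                        # 4: pop return address into the PC
]

while True:
    op, *args = program[pc]
    pc += 1
    if op == "CALL":
        stack.append(pc)             # save address of the next instruction
        pc = args[0]                 # jump to the subroutine entry point
    elif op == "RET":
        pc = stack.pop()             # resume just after the CALL
    elif op == "MARK":
        trace.append(args[0])
    elif op == "HALT":
        break

print(trace)
```

Because the return address lives on a stack, subroutines can call other subroutines (or themselves) and each RET still finds the right place to resume.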

5. Halting Instructions:
 NOP Instruction – NOP means no operation. It causes no change in the processor state
other than an advancement of the program counter. It can be used to synchronize
timing.

 HALT – It brings the processor to an orderly halt; the processor remains in an idle
state until restarted by an interrupt, trace, reset, or external action.

6. Interrupt Instructions:
An interrupt is a mechanism by which an I/O device or an instruction can suspend the
normal execution of the processor and get itself serviced.

 RESET – It resets the processor. This may include setting any or all registers to an
initial value or setting the program counter to a standard starting location.
 TRAP – It is a non-maskable, edge- and level-triggered interrupt. TRAP has the highest
priority and is a vectored interrupt.
 INTR – It is a level-triggered, maskable interrupt. It has the lowest priority and can be
disabled by resetting the processor.


RISC and CISC


Reduced Instruction Set Architecture (RISC) –
The main idea behind RISC is to make the hardware simpler by using an instruction set composed of
a few basic operations for loading, evaluating, and storing: a load instruction loads data, a
store instruction stores data, and so on.

Complex Instruction Set Architecture (CISC) –


The main idea behind CISC is that a single instruction performs all the loading, evaluating, and
storing. For example, a multiplication instruction may itself load its operands, evaluate the
product, and store the result; hence the instruction is complex.

Both approaches try to increase the CPU performance


 RISC: Reduces the cycles per instruction at the cost of the number of instructions per program.

 CISC: The CISC approach attempts to minimize the number of instructions per program but at
the cost of an increase in the number of cycles per instruction.

Earlier, when programming was done in assembly language, there was a need to make each instruction
do more work, because assembly programming was tedious and error-prone; this led to the CISC
architecture. With the rise of high-level languages, dependency on assembly decreased, and the
RISC architecture prevailed.
Characteristic of RISC –
1. Simpler instructions, hence simple instruction decoding.
2. Each instruction fits in one word.
3. Each instruction takes a single clock cycle to execute.
4. More general-purpose registers.
5. Simple addressing modes.
6. Fewer data types.
7. Pipelining can be achieved easily.

Characteristic of CISC –
1. Complex instructions, hence complex instruction decoding.
2. Instructions are larger than one word.
3. An instruction may take more than a single clock cycle to execute.
4. Fewer general-purpose registers, as operations are performed in memory itself.
5. Complex addressing modes.
6. More data types.

Example – Suppose we have to add two 8-bit numbers:


 CISC approach: There will be a single instruction, such as ADD, which performs the whole
task.

 RISC approach: The programmer first writes a load instruction to bring the data into
registers, then applies a suitable operation, and finally stores the result in the desired
location.

So the add operation is divided into parts (load, operate, store), due to which RISC programs are
longer and require more memory, but the simpler instructions require fewer transistors.

Difference between RISC & CISC


RISC | CISC
Focus on software | Focus on hardware
Uses only a hardwired control unit | Uses both hardwired and microprogrammed control units
Transistors are used for more registers | Transistors are used for storing complex instructions
Fixed-size instructions | Variable-size instructions
Can perform only register-to-register arithmetic operations | Can perform REG-to-REG, REG-to-MEM, or MEM-to-MEM operations
Requires more registers | Requires fewer registers
Code size is large | Code size is small
An instruction executes in a single clock cycle | An instruction takes more than one clock cycle
An instruction fits in one word | Instructions are larger than one word

28 | P a g e
Computer Architecture

Pipelining
Pipelining defines the temporal overlapping of processing. Pipelines are essentially assembly
lines in computing that can be used either for instruction processing or, more generally, for
executing any complex operation. A pipeline can be used efficiently only for a sequence of the
same task, much like an assembly line.
A basic pipeline processes a sequence of tasks, such as instructions, according to the following
principle of operation −
Each task is subdivided into multiple successive subtasks. For instance, the execution of a
register-register instruction can be broken down into fetch, decode, execute, and writeback
stages (or fetch, decode, execute, memory access, and writeback).

A pipeline stage associated with each subtask executes the needed operations.

The same amount of time is available in each stage for performing its subtask.
All pipeline stages work just like an assembly line: each receives its input from the
previous stage and transfers its output to the next stage.
Finally, the basic pipeline operates clocked, in other words, synchronously. This means that each
stage accepts a new input at the beginning of the clock cycle, each stage has a single clock
cycle available for performing the required operations, and each stage delivers its result to
the next stage by the beginning of the subsequent clock cycle.
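The timing behaviour described above follows the classic formula: with k stages and n instructions (and no stalls), a pipelined machine needs k + (n - 1) cycles, while an unpipelined machine needs k * n. A small sketch:

```python
# Ideal pipeline timing (no stalls or hazards assumed).
def pipelined_cycles(k, n):
    return k + (n - 1)      # k cycles to fill the pipe, then one result per cycle

def unpipelined_cycles(k, n):
    return k * n            # each instruction runs through all k stages alone

k, n = 5, 100               # 5-stage pipeline, 100 instructions
print(unpipelined_cycles(k, n), pipelined_cycles(k, n),
      round(unpipelined_cycles(k, n) / pipelined_cycles(k, n), 2))
```

For long instruction streams the speedup approaches k (here, close to 5), which is why the advantage below is stated as "a factor of the number of stages".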


Advantages of Pipelining
 The cycle time of the processor is decreased, which improves instruction throughput.
Pipelining does not lower the time it takes to complete a single instruction; rather, it
increases the number of instructions that can be processed simultaneously ("at once") and
lowers the delay between completed instructions (the throughput).
 If pipelining is used, the CPU's arithmetic logic unit can be designed to be faster, though
more complex.
 Pipelining can speed up execution over an unpipelined core by a factor approaching the number
of stages (assuming the clock frequency also increases by a similar factor), provided the code
is well suited to pipelined execution.
 Pipelined CPUs frequently work at a higher clock frequency than the RAM clock frequency
(as of 2008 technology, RAMs operate at a low frequency compared with CPU frequencies),
increasing the computer's overall performance.

Vector/Array Processing
Array processors, also known as vector processors, perform computations on large arrays of data
and are thus used to improve the performance of the computer.
A vector processor is a central processing unit that can process a complete vector with a single
instruction. It is a complete unit of hardware resources that operates on a sequential set of
similar data elements in memory using a single instruction.
Scientific and research computations involve many operations that require extensive, high-power
computers; when run on a conventional computer, they may take days or weeks to complete. With
vector processing, science and engineering problems can be specified in terms of vectors and
matrices.

Features of Vector/Array Processing


There are various features of Vector Processing which are as follows −
 A vector is a structured set of elements. The elements in a vector are scalar quantities. A
vector operand includes an ordered set of n elements, where n is known as the length of the
vector.
 Two successive pairs of elements are processed in each clock period. During a single clock
period, the dual vector pipes and the dual sets of vector functional units allow two pairs
of elements to be processed.
As each pair of operations completes, the results are delivered to the appropriate elements
of the result register. The operation continues until the number of elements processed
equals the count specified by the vector length register.
 In parallel vector processing, more than two results are generated per clock cycle. Parallel
vector operations are started automatically under the following two circumstances −
o When successive vector instructions use different functional units and different vector
registers.
o When successive vector instructions use the result stream from one vector register as the
operand of another operation that uses a different functional unit. This process is known
as chaining.
 A vector processor performs better with longer vectors because the pipeline startup delay is
amortized over more elements.
 Vector processing decreases the overhead related to maintaining loop-control variables, which
makes it more efficient than scalar processing.
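The loop-control overhead mentioned above can be sketched by contrasting a scalar loop with a single vector "instruction". This is illustrative Python, not real vector hardware; the `vadd` helper stands in for a hardware vector-add:

```python
# A sketch of the idea behind a vector instruction: one operation is issued
# over whole vectors, versus a scalar loop issuing one add per element.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

# Scalar processing: one add per loop iteration, plus the loop-control
# overhead (increment, compare, branch) on every element.
scalar_result = []
for i in range(len(a)):
    scalar_result.append(a[i] + b[i])

# Vector processing: a single vector-add "instruction" over all elements;
# real hardware would stream element pairs through a pipelined adder.
def vadd(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

vector_result = vadd(a, b)
print(scalar_result == vector_result, vector_result)
```

Both produce the same result, but the vector form replaces per-element loop control with one operation whose length is governed by the vector length register.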
