
AMBO UNIVERSITY WOLISO CAMPUS

Department of Computer Science


Computer Organization and Architecture
Group Assignment

 Course Title: Computer organization and architecture


 Course Code: CoSc2022

Student name I.D. Number


 NATNAEL TSEGAYE ----------------------------------------------------- UGR/50257/13
 TAJU HASSEN ----------------------------------------------------- UGR/
 MIKIYAS MESELE ------------------------------------------------------ UGR/50226/13
 MEKDES ZERIHUN ------------------------------------------------------ UGR/
 MIKIYAS BEHAILU ------------------------------------------------------ UGR/50282/13
 TAMIRAT TADESSE ----------------------------------------------------- UGR/

Submitted to
Mr. Abdisa
Feb 2023

Introduction
In the assignment below we researched the given topics, gathered information,
and briefly explained what was asked. It covers multiprocessors – systems with
more than one processor – and related topics, including symmetric
multiprocessor systems, vector processors, array processors, interprocessor
communication, and arithmetic and instruction pipelines, all explained below.

Multiprocessor and its Characteristics
Multiprocessor
A multiprocessor system is defined as "a system with more than one processor", and, more
precisely, "a number of central processing units linked together to enable parallel processing to
take place".
The key objective of a multiprocessor is to boost a system's execution speed. The other
objectives are fault tolerance and application matching.
The term "multiprocessor" can be confused with the term "multiprocessing". While
multiprocessing is a type of processing in which two or more processors work together to
execute multiple programs simultaneously, multiprocessor refers to a hardware architecture
that allows multiprocessing.
Multiprocessor systems are classified according to how processor memory access is handled
and whether system processors are of a single type or various ones.

Multiprocessor system types


There are many types of multiprocessor systems:

 Loosely coupled multiprocessor system


 Tightly coupled multiprocessor system
 Homogeneous multiprocessor system
 Heterogeneous multiprocessor system
 Shared memory multiprocessor system
 Distributed memory multiprocessor system
 Uniform memory access (UMA) system
 cc–NUMA system
 Hybrid system – shared system memory for global data and local memory for local data

Loosely-coupled (distributed memory) multiprocessor system


In loosely-coupled multiprocessor systems, each processor has its own local memory,
input/output (I/O) channels, and operating system.
Processors exchange data over a high-speed communication network by sending messages via
a technique known as "message passing". Loosely-coupled multiprocessor systems are also
known as distributed-memory systems, as the processors do not share physical memory and
have individual I/O channels.
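As a rough illustration of message passing, the sketch below uses Python threads and queues to stand in for two loosely-coupled nodes; the node roles and the doubling task are illustrative assumptions, not taken from the text.

```python
import threading
import queue

# A rough sketch of message passing between two "nodes", each with only its
# own local state (standing in for local memory). The node roles and the
# doubling task are illustrative assumptions, not taken from the text.
def worker(inbox, outbox):
    msg = inbox.get()            # receive a message from the other node
    outbox.put(msg * 2)          # compute locally, then send the result back

def message_passing_demo(value):
    to_worker, from_worker = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(to_worker, from_worker))
    t.start()
    to_worker.put(value)         # "send" data over the communication channel
    result = from_worker.get()   # "receive" the reply
    t.join()
    return result
```

Note that the two workers never touch each other's variables; all coordination happens through the channel, as in a distributed-memory system.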

System characteristics
 These systems are able to perform multiple-instruction, multiple-data (MIMD) programming.
 This type of architecture allows parallel processing.
 The distributed memory is highly scalable.

Tightly-coupled (shared memory) multiprocessor system


A tightly-coupled multiprocessor system has a shared memory closely connected to the processors.
A symmetric multiprocessing system is a system with centralized shared memory called main
memory (MM) operating under a single operating system with two or more homogeneous
processors.
There are two types of systems:
 Uniform memory-access (UMA) system
o Heterogeneous multiprocessing system
o Symmetric multiprocessing system (SMP)
 NUMA system
Heterogeneous multiprocessor system
A heterogeneous multiprocessing system contains multiple, but not homogeneous, processing units –
central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), or any
type of application-specific integrated circuits (ASICs). The system architecture allows any accelerator –
for instance, a graphics processor – to operate at the same processing level as the system's CPU.

Symmetric multiprocessor system


A symmetric multiprocessor system (SMP) is a system with a pool of homogeneous processors running
under a single OS with a centralized, shared main memory. Each processor, executing different programs
and working on different sets of data, has the ability to share common resources (memory, I/O device,
interrupt system, and so on) that are connected using a system bus, a crossbar, or a mix of the two, or
an address bus and data crossbar.
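As a loose analogy for SMP, the sketch below uses identical Python threads over one shared list to mimic homogeneous processors working on different parts of a centralized shared memory; the summing task and worker count are illustrative assumptions.

```python
import threading

# Sketch: identical worker threads (standing in for homogeneous processors)
# operate on one shared list (the centralized main memory). The summing task
# and the worker count are illustrative assumptions.
def smp_sum(data, n_workers=2):
    partials = [0] * n_workers   # shared result slots, one per worker
    def worker(i):
        # every "processor" runs the same code on its own slice of shared memory
        partials[i] = sum(data[i::n_workers])
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)
```

Because every worker reads and writes the same Python objects, this mirrors the shared-main-memory arrangement rather than message passing.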

cc-NUMA system
It is known that the SMP system has limited scalability. To overcome this limitation, the architecture
called "cc-NUMA" (cache coherency–non-uniform memory access) is normally used. The main
characteristic of a cc-NUMA system is having shared global memory that is distributed to each node,
although the effective "access" a processor has to the memory of a remote component subsystem, or
"node", is slower compared to local memory access, which is why the memory access is "non-uniform".
A cc–NUMA system is a cluster of SMP systems – each called a "node", which can have a single
processor, a multi-core processor, or a mix of the two, of one or another kind of architecture –
connected via a high-speed "connection network" that can be a link such as a single or
double-reverse ring, a multi-ring, point-to-point connections, or a mix of these.

Interconnection Structures for Multiprocessors
In a multiprocessor, the processors must be able to share a set of main memory modules and
I/O devices. This sharing capability can be provided through interconnection structures.
The interconnection structures that are commonly used are the following:
 Time-shared/common bus
 Crossbar switch
 Multiport memory
 Multistage switching network
 Hypercube system

Time-shared/common bus
In a multiprocessor system, the time-shared bus interconnection provides a common
communication path connecting all the functional units, such as the processors, I/O
processors, and memory units.

Crossbar switch


If the number of buses in a common bus system is increased, a point is reached at which
there is a separate path available for each memory module. A crossbar switch (for a
multiprocessor) provides a separate path for each module.

Multiport memory
In a multiport memory system, the control, switching, and priority arbitration logic are
distributed throughout the crossbar switching matrix, which is distributed at the interfaces
to the memory modules.

Hypercube system
This is the binary n-cube architecture. Here we can connect 2^n processors, and each
processor forms a node of the cube. A node can also be a memory module or an I/O interface,
not necessarily a processor. The processor at a node has a direct communication path to
n other nodes (2^n nodes in total). There are 2^n distinct n-bit binary addresses.
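The addressing scheme can be sketched directly: in a binary n-cube, a node's direct neighbours are the nodes whose n-bit addresses differ from its own in exactly one bit. The helper below is an illustrative sketch.

```python
def hypercube_neighbors(node: int, n: int) -> list:
    """Direct neighbours of `node` in a binary n-cube (2**n nodes in total).

    Flipping each of the n bits of the node's address yields the n nodes it
    is directly connected to.
    """
    return [node ^ (1 << bit) for bit in range(n)]
```

For example, in a 3-cube the node with address 000 is directly connected to 001, 010 and 100.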

Conclusion
The interconnection structure can decide the overall system's performance in a multiprocessor
environment. Although a common bus system is easy and simple, the availability of only one
path is its major drawback: if the bus fails, the whole system fails. To overcome this and
improve overall performance, crossbar, multiport, hypercube, and multistage switching
networks evolved.

Inter Processor Communication and Synchronization


Inter-process communication or interprocess communication (IPC) refers specifically to the
mechanisms an operating system provides to allow the processes to manage shared data.
IPC is very important to the design process for microkernels and nanokernels, which reduce
the number of functionalities provided by the kernel. Those functionalities are then
obtained by communicating with servers via IPC, leading to a large increase in
communication when compared to a regular monolithic kernel. IPC interfaces generally
encompass variable analytic framework structures. These processes ensure compatibility
between the multi-vector protocols upon which IPC models rely.
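A pipe is one of the simplest IPC mechanisms an operating system provides. The sketch below, an illustrative assumption rather than anything from the text, writes bytes into a kernel pipe and reads them back; in practice the two ends would belong to different processes.

```python
import os

# A minimal sketch of a basic OS-provided IPC primitive, a pipe. Here both
# ends live in one process for brevity; in practice the write end and the
# read end would belong to different processes.
def pipe_roundtrip(payload):
    r, w = os.pipe()                  # kernel-managed channel: read end, write end
    os.write(w, payload)              # one process would write...
    os.close(w)
    data = os.read(r, len(payload))   # ...and another would read
    os.close(r)
    return data
```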

Interactions among processes


In a multi-process application these are the various degrees of interaction:
1. Competing processes: the processes themselves do not share anything, but the OS has to
share system resources such as disks, files, or printers among these processes
"competing" for them.
Co-operating processes: the results of one or more processes may be needed by another
process.
2. Co-operation by sharing: for example, sharing of an I/O buffer; this involves the
concept of a critical section. (indirect)
3. Co-operation by communication: typically no data is shared, but co-ordination
through synchronization becomes essential in certain applications. (direct)
Among the three kinds of interactions indicated by 1, 2 and 3 above:
1 is at the system level: the potential problems are deadlock and starvation.
2 is at the process level: the significant problem is realizing mutual exclusion.
3 is more a synchronization problem.

We will study mutual exclusion and synchronization here, and defer deadlock, and
starvation for a later time.
Race condition: the situation where several processes access and manipulate shared data
concurrently. The final value of the shared data depends upon which process finishes last.
To prevent race conditions, concurrent processes must be synchronized.
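The shared-counter sketch below illustrates why synchronization matters: each `counter += 1` is really a load, an add, and a store, and without a lock those steps from different threads can interleave and lose updates. The thread and iteration counts are arbitrary illustrative choices.

```python
import threading

# Sketch: several threads increment one shared counter. Each `counter += 1`
# is a load, an add and a store; without a lock those steps from different
# threads can interleave and lose updates. Counts here are arbitrary choices.
counter = 0
counter_lock = threading.Lock()

def increment(n, use_lock):
    global counter
    for _ in range(n):
        if use_lock:
            with counter_lock:
                counter += 1     # the whole read-modify-write is protected
        else:
            counter += 1         # unsynchronized: result is nondeterministic

def run_counter(n_threads=4, n=10_000, use_lock=True):
    global counter
    counter = 0
    ts = [threading.Thread(target=increment, args=(n, use_lock))
          for _ in range(n_threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counter
```

With the lock the final value is always `n_threads * n`; without it, the final value may fall short whenever updates interleave.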

Mutual exclusion problem


Successful use of concurrency among processes requires the ability to define critical sections
and enforce mutual exclusion.
• Critical section: the part of the process code that affects the shared resource.
• Mutual exclusion: exclusive use of a shared resource is provided by making its access
mutually exclusive among the processes that share the resource. This is also known as
the Critical Section (CS) problem.

Mutual exclusion
Any facility that provides mutual exclusion should meet these requirements:
1. No assumption regarding the relative speeds of the processes.
2. A process is in its CS for a finite time only.
3. Only one process allowed in the CS.
4. Process requesting access to CS should not wait indefinitely.
5. A process that is not in its CS must not block other processes from entering the CS.
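Requirement 3 can be checked mechanically: if entry to the CS is guarded by a lock, the number of threads observed inside it never exceeds one. The sketch below is an illustrative harness, not a formal proof; the thread count is an arbitrary choice.

```python
import threading

# Sketch: a lock enforcing requirement 3 (at most one thread inside the CS),
# with a counter recording the worst case observed. This is an illustrative
# harness, not a formal proof.
in_cs = 0
max_in_cs = 0
cs_lock = threading.Lock()

def enter_critical_section():
    global in_cs, max_in_cs
    with cs_lock:                        # mutual exclusion boundary
        in_cs += 1
        max_in_cs = max(max_in_cs, in_cs)
        in_cs -= 1                       # finite CS: do the work and leave

def run_cs_check(n_threads=8):
    ts = [threading.Thread(target=enter_critical_section)
          for _ in range(n_threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return max_in_cs                     # must never exceed 1
```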

Pipeline, Vector Processing and Array Processing


Pipeline (computing)
A pipeline, also known as a data pipeline, is a set of data processing elements connected in
series, where the output of one element is the input of the next one. The elements of a
pipeline are often executed in parallel or in time-sliced fashion. Some amount of buffer
storage is often inserted between elements.

Computer-related pipelines include:


 Instruction pipelines, such as the classic RISC pipeline, which are used in central processing
units (CPUs) and other microprocessors to allow overlapping execution of multiple
instructions with the same circuitry. The circuitry is usually divided up into stages and each
stage processes a specific part of one instruction at a time, passing the partial results to the
next stage. Examples of stages are instruction decode, arithmetic/logic and register fetch.
They are related to the technologies of superscalar execution, operand forwarding,
speculative execution and out-of-order execution.

 Graphics pipelines, found in most graphics processing units (GPUs), which consist of multiple
arithmetic units, or complete CPUs, that implement the various stages of common rendering
operations (perspective projection, window clipping, color and light calculation, rendering,
etc.).
 Software pipelines, which consist of a sequence of computing processes (commands,
program runs, tasks, threads, procedures, etc.), conceptually executed in parallel, with the
output stream of one process being automatically fed as the input stream of the next one.
The Unix system call pipe is a classic example of this concept.
 HTTP pipelining, the technique of issuing multiple HTTP requests through the same TCP
connection, without waiting for the previous one to finish before issuing a new one.

Some operating systems may provide UNIX-like syntax to string several program runs in a
pipeline, but implement the latter as simple serial execution, rather than true pipelining—
namely, by waiting for each program to finish before starting the next one.
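The idea of a software pipeline can be sketched with Python generators, where each stage consumes the previous stage's output stream much like commands joined by Unix pipes; the particular stages chosen here are illustrative.

```python
# Sketch of a software pipeline built from Python generators: each stage
# consumes the previous stage's output stream, like commands joined by Unix
# pipes. The particular stages (range, square, sum) are illustrative.
def numbers(n):
    yield from range(n)

def squares(stream):
    for x in stream:
        yield x * x

def total(stream):
    s = 0
    for x in stream:
        s += x
    return s

def pipeline(n):
    # conceptually: numbers | squares | total
    return total(squares(numbers(n)))
```

Because generators are lazy, each element flows through all the stages before the next one is produced, much like items moving through pipeline segments.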

Vector processing
A vector processor or array processor is a central processing unit (CPU) that implements an
instruction set where its instructions are designed to operate efficiently and effectively on large
one-dimensional arrays of data called vectors. This is in contrast to scalar processors, whose
instructions operate on single data items only, and in contrast to some of those same scalar
processors having additional single instruction, multiple data (SIMD) or SWAR Arithmetic Units.
Vector processors can greatly improve performance on certain workloads, notably numerical
simulation and similar tasks. Vector processing techniques also operate in video-game console
hardware and in graphics accelerators.
Vector machines appeared in the early 1970s and dominated supercomputer design through the 1970s
into the 1990s, notably the various Cray platforms. The rapid fall in the price-to-performance
ratio of conventional microprocessor designs led to a decline in vector supercomputers during the 1990s.
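The contrast with scalar processors can be sketched in plain Python: the scalar version issues one add per element, while the vector version is expressed over whole one-dimensional arrays at once (real vector hardware would execute the latter with vector instructions; the functions below are illustrative).

```python
# Sketch contrasting scalar and vector styles in plain Python: the scalar
# version performs one add per loop iteration, while the vector version is
# expressed over the whole one-dimensional arrays at once (real vector
# hardware would execute it with vector instructions).
def scalar_add(a, b):
    out = []
    for i in range(len(a)):          # one element per "instruction"
        out.append(a[i] + b[i])
    return out

def vector_add(a, b):
    # one conceptual "vector add" applied to entire vectors
    return [x + y for x, y in zip(a, b)]
```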

GPU- Modern graphics processing units (GPUs) include an array of shader pipelines which may be
driven by compute kernels, and can be considered vector processors (using a similar strategy for hiding
memory latencies). As shown in Flynn's 1972 paper, the key distinguishing factor of SIMT-based GPUs is
that they have a single instruction decoder-broadcaster, but the cores receiving and executing that same
instruction are otherwise reasonably normal: their own ALUs, their own register files, their own
Load/Store units and their own independent L1 data caches. Thus although all cores simultaneously
execute the exact same instruction in lock-step with each other they do so with completely different
data from completely different memory locations.

Pure (fixed) SIMD - also known as "Packed SIMD", SIMD within a Register (SWAR), and Pipelined
Processor in Flynn's Taxonomy. Common examples using SIMD with features inspired by Vector
processors include Intel x86's MMX, SSE and AVX instructions, AMD's 3DNow! extensions, ARM NEON,
Sparc's VIS extension, PowerPC's AltiVec and MIPS' MSA. In 2000, IBM, Toshiba and Sony collaborated to
create the Cell processor, which is also SIMD.

Predicated SIMD - also known as associative processing. Two notable examples which have per-element
(lane-based) predication are ARM SVE2 and AVX-512

Pure Vectors - as categorised in Duncan's taxonomy - these include the original Cray-1, RISC-V RVV and
SX-Aurora TSUBASA. Although memory-based the STAR-100 was also a Vector Processor.

Other CPU designs include multiple instructions for vector processing on multiple (vectorised) data
sets, typically known as MIMD (Multiple Instruction, Multiple Data) and realized with VLIW (Very Long
Instruction Word). The Fujitsu FR-V VLIW/vector processor combines both technologies.

Array processing
Array processing is a wide area of research in the field of signal processing that extends from the
simplest form of 1 dimensional line arrays to 2 and 3 dimensional array geometries. Array structure can
be defined as a set of sensors that are spatially separated, e.g. radio antenna and seismic arrays. The
sensors used for a specific problem may vary widely, for example microphones, accelerometers and
telescopes. However, many similarities exist, the most fundamental of which may be an assumption of
wave propagation. Wave propagation means there is a systemic relationship between the signal
received on spatially separated sensors. By creating a physical model of the wave propagation, or in
machine learning applications a training data set, the relationships between the signals received on
spatially separated sensors can be leveraged for many applications.

Some common problems that are solved with array processing techniques are:

 determine number and locations of energy-radiating sources


 enhance the signal-to-noise ratio (SNR) or the signal-to-interference-plus-noise ratio (SINR)
 track moving sources

Array processing metrics are often assessed in noisy environments. The model for noise may be either one
of spatially incoherent noise, or one with interfering signals following the same propagation physics.

There are four assumptions in array processing. The first assumption is that there is uniform propagation
in all directions of isotropic and non-dispersive medium. The second assumption is that for far field array
processing, the radius of propagation is much greater than size of the array and that there is plane wave
propagation. The third assumption is that the noise and signal are zero-mean and white, and that they
are uncorrelated. Finally, the last assumption is that there is no coupling and the calibration is perfect.

Radar and Sonar Systems:


The array processing concept is closely linked to radar and sonar systems, which represent the classical
applications of array processing. The antenna array is used in these systems to determine the location(s) of
source(s), cancel interference, and suppress ground clutter. Radar systems are used basically to detect objects by
using radio waves; the range, altitude, speed and direction of objects can be specified. Radar systems started as
military equipment and then entered the civilian world. In radar applications, different modes can be used; one of
these modes is the active mode. In this mode the antenna-array-based system radiates pulses and listens for the
returns. By using the returns, the estimation of parameters such as velocity, range and DOAs (directions of
arrival) of targets of interest becomes possible. Using passive far-field listening arrays, only the DOAs can be
estimated. Sonar systems (Sound Navigation and Ranging) use the sound waves that propagate under the water to
detect objects on or under the water surface. Two types of sonar systems can be defined: the active one and the passive one. In active

sonar, the system emits pulses of sound and listens to the returns that will be used to estimate parameters. In the
passive sonar, the system is essentially listening for the sounds made by the target objects.

Communications (wireless)
Communication can be defined as the process of exchanging of information between two or more parties. The last
two decades witnessed a rapid growth of wireless communication systems. This success is a result of advances in
communication theory and low power dissipation design process. In general, communication (telecommunication)
can be done by technological means through either electrical signals (wired communication) or electromagnetic
waves (wireless communication). Antenna arrays have emerged as a support technology to increase spectral
usage efficiency and enhance the accuracy of wireless communication systems by utilizing the spatial dimension
in addition to the classical time and frequency dimensions. Array processing and estimation techniques have been
used in wireless communication.

Medical applications
Array processing techniques have received much attention from medical and industrial applications. In medical
applications, the medical image processing field was one of the basic fields that use array processing. Other
medical applications that use array processing: diseases treatment, tracking waveforms that have information
about the condition of internal organs e.g. the heart, localizing and analyzing brain activity by using bio-magnetic
sensor arrays.

Array Processing for Speech Enhancement


Speech enhancement and processing represents another field that has been affected by the new era of array
processing. Most of the acoustic front end systems became fully automatic systems (e.g. telephones). However,
the operational environment of these systems contains a mix of other acoustic sources; external noises as well as
acoustic couplings of loudspeaker signals overwhelm and attenuate the desired speech signal. In addition to these
external sources, the strength of the desired signal is reduced due to the relatively large distance between the
speaker and the microphones.

Array Processing in Astronomy Applications


Astronomical environment contains a mix of external signals and noises that affect the quality of the desired
signals. Most of the array processing applications in astronomy are related to image processing. The array is used
to achieve a higher quality that is not achievable by using a single channel. The high image quality facilitates
quantitative analysis and comparison with images at other wavelengths. In general, astronomy arrays can be
divided into two classes: the beamforming class and the correlation class. Beamforming is a signal processing
technique that produces summed array beams from a direction of interest – used basically in directional signal
transmission or reception – the basic idea is to combine the elements in a phased array such that some signals
experience destructive interference and others experience constructive interference. Correlation arrays provide
images over the entire single-element primary beam pattern, computed off-line from records of all the possible
correlations between the antennas, pairwise.

Other applications
In addition to these applications, many applications have been developed based on array processing techniques:
Acoustic Beamforming for Hearing Aid Applications, Under-determined Blind Source Separation Using Acoustic
Arrays, Digital 3D/4D Ultrasound Imaging Array, Smart Antennas, Synthetic aperture radar, underwater acoustic
imaging, and Chemical sensor arrays...etc.

Parallel Processing, Arithmetic Pipeline and Instruction
Pipeline
Parallel processing (DSP implementation)
In digital signal processing (DSP), parallel processing is a technique duplicating function units to
operate different tasks (signals) simultaneously. Accordingly, we can perform the same
processing for different signals on the corresponding duplicated function units. Further, due to
the features of parallel processing, a parallel DSP design often contains multiple outputs,
resulting in higher throughput than a non-parallel design.
Parallel processing is a method of breaking up program tasks and running them simultaneously
on multiple microprocessors in order to speed up execution. Parallel processing may be
accomplished with a single computer that has two or more processors (CPUs) or with multiple
computer processors connected over a computer network. Parallel processing may also be
referred to as parallel computing.

Arithmetic Pipeline
An arithmetic pipeline divides an arithmetic problem into various sub-problems for execution in
various pipeline segments. It is used for floating-point operations, multiplication and various
other computations. The flow chart of an arithmetic pipeline for floating-point addition is
shown in the diagram.
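The classic four segments of a floating-point addition pipeline (compare exponents, align mantissas, add mantissas, normalize) can be sketched as below; the decimal (mantissa, exponent) representation is an illustrative simplification of real binary floating point.

```python
# Sketch of the four classic pipeline segments for floating-point addition,
# using a simplified decimal representation (mantissa, exponent) meaning
# m * 10**e. Real hardware works in binary; this is illustrative only.
def fp_pipeline_add(a, b):
    (ma, ea), (mb, eb) = a, b
    # Segment 1: compare the exponents
    shift = ea - eb
    # Segment 2: align the mantissas to the larger exponent
    if shift >= 0:
        mb, eb = mb / (10 ** shift), ea
    else:
        ma, ea = ma / (10 ** -shift), eb
    # Segment 3: add the mantissas
    m, e = ma + mb, ea
    # Segment 4: normalize the result so that 1 <= |mantissa| < 10
    while abs(m) >= 10:
        m, e = m / 10, e + 1
    while m != 0 and abs(m) < 1:
        m, e = m * 10, e - 1
    return (m, e)
```

In hardware each segment would be a separate stage, so while one pair of operands is being normalized, the next pair can already be having its exponents compared.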

Instruction Pipeline
In an instruction pipeline, a stream of instructions can be executed by overlapping the fetch,
decode and execute phases: while one instruction is in one phase, other instructions are in
other segments of the pipeline. Thus we can execute multiple instructions simultaneously.
The pipeline will be more efficient if the instruction cycle is divided into segments of
equal duration.
In the most general case, a computer needs to process each instruction in the following
sequence of steps:
 Fetch the instruction from memory (FI)
 Decode the instruction (DI)
 Calculate the effective address
 Fetch the operand from memory (FO)
 Execute the instruction (EX)
 Store the result in the proper place
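The benefit of equal-duration segments can be quantified: with k segments and n instructions, a pipeline needs k + (n − 1) clock cycles, against k·n cycles without overlap. The sketch below just encodes that arithmetic.

```python
# Sketch quantifying the overlap: with k equal-duration segments, the first
# instruction takes k cycles to flow through the pipe, after which one
# instruction completes every cycle.
def pipeline_cycles(k_stages, n_instructions):
    return k_stages + (n_instructions - 1)

def nonpipelined_cycles(k_stages, n_instructions):
    return k_stages * n_instructions
```

For example, 10 instructions through a 4-segment pipeline take 13 cycles instead of 40.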

Input-Output Organization
The input/output organization of a computer depends upon the size of the computer and the
peripherals connected to it. The I/O subsystem of the computer provides an efficient mode of
communication between the central system and the outside environment.

Input-output subsystem


The I/O subsystem of a computer provides an efficient mode of communication between the
central system and the outside environment, and handles all the input-output operations of the
computer system.

Peripheral devices
Input or output devices that are connected to the computer are called peripheral devices. These
devices are designed to read information into or out of the memory unit upon command from
the CPU and are considered to be part of the computer system. These devices are also called
peripherals.
There are three types of peripherals:
 Input peripherals – allow user input from the outside world to the computer. Examples:
keyboard, mouse, etc.
 Output peripherals – allow information output from the computer to the outside world.
Examples: printer, monitor, etc.
 Input-output peripherals – allow both input (from the outside world to the computer) as well
as output (from the computer to the outside world). Example: touch screen, etc.

Interfaces
An interface is a shared boundary between two separate components of the computer system,
which can be used to attach two or more components to the system for communication
purposes.
There are two types of interface:
 CPU interface
 I/O interface

Input-Output Interface
Peripherals connected to a computer need special communication links for interfacing with
CPU. In a computer system, there are special hardware components between the CPU and
peripherals to control or manage the input-output transfers. These components are called
input-output interface units because they provide communication links between processor bus

and peripherals. They provide a method for transferring information between internal system
and input-output devices.

Modes of I/O Data Transfer


Data transfer between the central unit and I/O devices can generally be handled in three modes,
which are given below: programmed I/O, interrupt-initiated I/O, and direct memory
access (DMA).

Programmed I/O
In programmed I/O, data transfers are the result of I/O instructions written in the computer program.
Each data item transfer is initiated by an instruction in the program.
Usually the program controls data transfer to and from the CPU and a peripheral. Transferring data
under programmed I/O requires constant monitoring of the peripherals by the CPU.
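The constant monitoring described above is a polling loop. The sketch below models it with a hypothetical `FakeDevice` (an illustrative stand-in, not a real device interface): the CPU repeatedly tests a ready flag and transfers one data item each time the device is ready.

```python
# Sketch of the polling loop behind programmed I/O. `FakeDevice` is a
# hypothetical stand-in for a peripheral's status and data registers, not a
# real device interface.
class FakeDevice:
    def __init__(self, data):
        self._data = list(data)
        self._ticks = 0

    @property
    def ready(self):
        # the device only becomes ready every few status polls
        self._ticks += 1
        return self._ticks % 3 == 0

    def read(self):
        return self._data.pop(0)

def programmed_io_read(device, n):
    received = []
    while len(received) < n:
        if device.ready:                 # CPU busy-waits on the status bit
            received.append(device.read())
        # otherwise the CPU just spins -- time lost to monitoring
    return received
```

Every unsuccessful poll is a cycle the CPU cannot spend on other work, which is exactly the inefficiency that interrupt-initiated I/O removes.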

Interrupt Initiated I/O


In the programmed I/O method the CPU stays in the program loop until the I/O unit indicates
that it is ready for data transfer. This is a time-consuming process because it keeps the
processor busy needlessly. This problem can be overcome by using interrupt-initiated I/O:
when the interface determines that the peripheral is ready for data transfer, it generates an
interrupt. After receiving the interrupt signal, the CPU stops the task it is processing,
services the I/O transfer, and then returns to its previous processing task.

Direct Memory Access


Removing the CPU from the path and letting the peripheral device manage the memory buses
directly would improve the speed of transfer. This technique is known as DMA. In this mode, the
interface transfers data to and from the memory through the memory bus. A DMA controller
manages the transfer of data between peripherals and the memory unit.
Many hardware systems use DMA such as disk drive controllers, graphic cards, network cards
and sound cards etc. It is also used for intra chip data transfer in multicore processors. In DMA,
CPU would initiate the transfer, do other operations while the transfer is in progress and
receive an interrupt from the DMA controller when the transfer has been completed.

Asynchronous data transfer and Mode of transfer


The internal operations in an individual unit of a digital system are synchronized using clock
pulse. It means clock pulse is given to all registers within a unit. And all data transfer among
internal registers occurs simultaneously during the occurrence of the clock pulse. Now, suppose
any two units of a digital system are designed independently, such as CPU and I/O interface.

If the registers in the I/O interface share a common clock with CPU registers, then transfer
between the two units is said to be synchronous. But in most cases, the internal timing in each
unit is independent of each other, so each uses its private clock for its internal registers. In this
case, the two units are said to be asynchronous to each other, and if data transfer occurs
between them, this data transfer is called Asynchronous Data Transfer.
But, the Asynchronous Data Transfer between two independent units requires that control
signals be transmitted between the communicating units so that the time can be indicated at
which they send data. These two methods can achieve this asynchronous way of data transfer:
 Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the
transfer has to occur.
 Handshaking: This method is commonly used to accompany each data item being
transferred with a control signal that indicates data in the bus. The unit receiving the data
item responds with another signal to acknowledge receipt of the data.
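The two-signal exchange can be sketched with threads and events standing in for the "data valid" and "data accepted" control lines; the bus dictionary and the signal names are illustrative assumptions.

```python
import threading

# Sketch of two-signal handshaking, with events standing in for the "data
# valid" and "data accepted" control lines; the bus dictionary and the signal
# names are illustrative assumptions.
data_valid = threading.Event()
data_accepted = threading.Event()
bus = {}

def source(value):
    bus["data"] = value         # place the data item on the "bus"
    data_valid.set()            # raise the valid/strobe line
    data_accepted.wait()        # hold the data until the destination acknowledges
    bus.pop("data")             # only now is it safe to remove the data

def destination(result):
    data_valid.wait()           # wait until valid data is on the bus
    result.append(bus["data"])  # latch it into a local "register"
    data_accepted.set()         # acknowledge receipt

def handshake(value):
    result = []
    t = threading.Thread(target=destination, args=(result,))
    t.start()
    source(value)
    t.join()
    return result[0]
```

Because the source waits for the acknowledgement before removing the data, neither side needs any assumption about the other's speed — the defining property of asynchronous transfer.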

Strobe Control Method


The Strobe Control method of asynchronous data transfer employs a single control line to time
each transfer. This control line is also known as a strobe, and it may be activated by either
the source or the destination, depending on which initiates the transfer.
 Source initiated strobe: In the below block diagram, you can see that strobe is initiated by source, and as
shown in the timing diagram, the source unit first places the data on the data bus.

After a brief delay to ensure that the data resolve to a stable

value, the source activates a strobe pulse. The information on the data bus and strobe control signal remains in the
active state for a sufficient time to allow the destination unit to receive the data. The destination unit uses a falling
edge of strobe control to transfer the contents of a data bus to one of its internal registers. The source removes
the data from the data bus after it disables its strobe pulse. Thus, new valid data will be available only after the
strobe is enabled again. In this case, the strobe may be a memory-write control signal from the CPU to a memory
unit. The CPU places the word on the data bus and informs the memory unit, which is the destination.

Advantages of Asynchronous Data Transfer


 It is more flexible, and devices can exchange information at their own pace. In addition,
individual data transfers are self-contained, so even if one packet is corrupted,
its predecessors and successors will not be affected.
 It does not require complex processes by the receiving device. Furthermore, it means that
inconsistency in data transfer does not result in a big crisis since the device can keep up
with the data stream. It also makes asynchronous transfers suitable for applications where
character data is generated irregularly.

Disadvantages of Asynchronous Data Transfer


 The success of these transmissions depends on the start bits being recognized correctly.
These bits are easily corrupted or distorted by line interference.
 A large portion of the transmitted data consists of control and identification bits (such as
start and stop bits) and thus carries no useful information. This invariably means that
more data packets need to be sent.

Modes of Transfer
 Binary information received from an external device is usually stored in memory for later processing.
Information transferred from the central computer into an external device originates in the memory unit.
 The CPU merely executes the I/O instructions and may accept the data temporarily, but the ultimate
source or destination is the memory unit.
 Data transfer between the central computer and I/O devices may be handled in a variety of modes.
 Some modes use the CPU as an intermediate path; others transfer the data directly to and from the
memory unit.
 Data transfer to and from peripherals may be handled in one of three possible modes.
 Programmed I/O operations are the result of I/O instructions written in the computer program.
 Each data item transfer is initiated by an instruction in the program.
 Usually, the transfer is between a CPU register and the peripheral. Other instructions are needed to
transfer the data between the CPU and memory. Transferring data under program control requires
constant monitoring of the peripheral by the CPU.
 Once a data transfer is initiated, the CPU is required to monitor the interface to see when a transfer can
again be made. It is up to the programmed instructions executed in the CPU to keep close tabs on
everything that is taking place in the interface unit and the I/O device.
 In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates that it is
ready for data transfer. This is a time-consuming process since it keeps the processor busy needlessly.
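The busy-wait behaviour of programmed I/O can be sketched as follows (the `Device` class is a stand-in invented for illustration; it simply becomes ready after a few status polls):

```python
class Device:
    """Simulated peripheral that becomes ready after `delay` polls."""

    def __init__(self, word, delay=3):
        self._word, self._delay = word, delay

    def ready(self):
        self._delay -= 1       # model the device slowly getting ready
        return self._delay <= 0

    def read(self):
        return self._word

def programmed_io_read(dev):
    polls = 0
    while not dev.ready():     # CPU spins in a program loop,
        polls += 1             # doing no useful work while it waits
    return dev.read(), polls

word, wasted_polls = programmed_io_read(Device(0x7F))
assert word == 0x7F and wasted_polls == 2
```

The wasted polls are exactly the cost an interrupt facility removes: instead of looping, the CPU runs other work until the interface signals readiness.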

 It can be avoided by using an interrupt facility and special commands to inform the interface to issue an
interrupt request signal when the data are available from the device. In the meantime the CPU can
proceed to execute another program.
 The interface meanwhile keeps monitoring the device. When the interface determines that the device is
ready for data transfer, it generates an interrupt request to the computer. Upon detecting the external
interrupt signal, the CPU momentarily stops the task it is processing, branches to a service program to
process the I/O transfer, and then returns to the task it was originally performing.
 Transfer of data under programmed I/O is between CPU and peripheral.
 In direct memory access (DMA), the interface transfers data into and out of the memory unit through the
memory bus. The CPU initiates the transfer by supplying the interface with the starting address and the
number of words needed to be transferred and then proceeds to execute other tasks.
 When the transfer is made, the DMA requests memory cycles through the memory bus.
 When the request is granted by the memory controller, the DMA transfers the data directly into memory.
The CPU merely delays its memory access operation to allow the direct memory I/O transfer.
 Since peripheral speed is usually slower than processor speed, I/O-memory transfers are infrequent
compared to processor access to memory.
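A DMA block transfer of the kind described above can be sketched as a simulation (function and variable names are invented; a real DMA controller does this in hardware by stealing cycles on the memory bus):

```python
def dma_transfer(memory, device_buffer, start_addr, count):
    """Copy `count` words from a device buffer into memory starting
    at `start_addr`; the CPU only supplied the address and the count."""
    for i in range(count):             # one stolen memory cycle per word
        memory[start_addr + i] = device_buffer[i]
    return count                       # then interrupt: transfer complete

ram = [0] * 16
transferred = dma_transfer(ram, [10, 20, 30], start_addr=4, count=3)
assert transferred == 3 and ram[4:7] == [10, 20, 30]
```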

Priority interrupts
 Data transfer between the CPU and an I/O device is initiated by the CPU. However, the CPU cannot
start the transfer unless the device is ready to communicate with the CPU.
 The readiness of the device can be determined from an interrupt signal. The CPU responds to the
interrupt request by storing the return address from PC into a memory stack and then the program
branches to a service routine that processes the required transfer.
 Some processors also push the current PSW (program status word) onto the stack and load a new
PSW for the service routine.
 In a typical application a number of I/O devices are attached to the computer, with each device
being able to originate an interrupt request. The first task of the interrupt system is to identify the
source of the interrupt.
 There is also the possibility that several sources will request service simultaneously. In this case the
system must also decide which device to service first.
 A priority interrupt is a system that establishes a priority over the various sources to determine
which condition is to be serviced first when two or more requests arrive simultaneously.
 The system may also determine which conditions are permitted to interrupt the computer while
another interrupt is being serviced.
 Higher-priority interrupt levels are assigned to requests which, if delayed or interrupted, could have
serious consequences.
 Devices with high-speed transfers such as magnetic disks are given high priority, and slow devices
such as keyboards receive low priority.
 When two devices interrupt the computer at the same time, the computer services the device, with
the higher priority first.
 Establishing the priority of simultaneous interrupts can be done by software or hardware.
 A polling procedure is used to identify the highest-priority source by software means.
 In this method there is one common branch address for all interrupts.
 The program that takes care of interrupts begins at the branch address and polls the interrupt
sources in sequence. The order in which they are tested determines the priority of each interrupt.
 The highest-priority source is tested first, and if its interrupt signal is on, control branches to a
service routine for this source.

 Otherwise, the next-lower-priority source is tested, and so on. Thus the initial service routine for all
interrupts consists of a program that tests the interrupt sources in sequence and branches to one of
many possible service routines.
 The particular service routine reached belongs to the highest-priority device among all devices that
interrupted the computer.
 The disadvantage of the software method is that if there are many interrupts, the time required to
poll them can exceed the time available to service the I/O device. In this situation a hardware
priority interrupt unit can be used to speed up the operation.
 A hardware priority-interrupt unit functions as an overall manager in an interrupt system
environment.
 It accepts interrupt requests from many sources, determines which of the incoming requests has the
highest priority, and issues an interrupt request to the computer based on this determination.
 To speed up the operation, each interrupt source has its own interrupt vector to access its own
service routine directly. Thus no polling is required because all the decisions are established by the
hardware priority-interrupt unit.
 The hardware priority function can be established by either a serial or a parallel connection of
interrupt lines. The serial connection is also known as the daisy-chaining method.
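The software polling method described above can be sketched as follows (the request lines and service routines are invented placeholders; position in the list encodes priority, with index 0 the highest):

```python
def poll_interrupts(request_lines, service_routines):
    """Test interrupt sources in fixed order; the first active
    source (the highest-priority one) gets serviced."""
    for source, active in enumerate(request_lines):
        if active:
            return service_routines[source]()
    return None                        # no interrupt pending

routines = [lambda: "disk", lambda: "printer", lambda: "keyboard"]
# Disk (highest priority) and keyboard request service simultaneously:
assert poll_interrupts([True, False, True], routines) == "disk"
assert poll_interrupts([False, False, True], routines) == "keyboard"
```

With many sources this scan itself costs time, which is the motivation for a hardware priority-interrupt unit with a separate vector per source.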
Direct memory access (DMA)
Direct memory access (DMA) is a feature of computer systems and allows certain
hardware subsystems to access main system memory independently of the
central processing unit (CPU).
Without DMA, when the CPU is using programmed input/output, it is typically fully occupied
for the entire duration of the read or write operation, and is thus unavailable to perform other
work. With DMA, the CPU first initiates the transfer, then it does other operations while the
transfer is in progress, and it finally receives an interrupt from the DMA controller (DMAC)
when the operation is done. This feature is useful at any time that the CPU cannot keep up with
the rate of data transfer, or when the CPU needs to perform work while waiting for a relatively
slow I/O data transfer. Many hardware systems use DMA, including disk drive controllers,
graphics cards, network cards and sound cards.

DMA is also used for intra-chip data transfer in multi-core processors. Computers that have
DMA channels can transfer data to and from devices with much less CPU overhead than
computers without DMA channels. Similarly, processing circuitry inside a multi-core processor
can transfer data to and from its local memory without occupying its processor time, allowing
computation and data transfer to proceed in parallel.

DMA can also be used for "memory to memory" copying or moving of data within memory.
DMA can offload expensive memory operations, such as large copies or scatter-gather
operations, from the CPU to a dedicated DMA engine. An implementation example is the I/O
Acceleration Technology. DMA is of interest in network-on-chip and in-memory computing
architectures.

Serial communication
Serial communication is the process of sending data one bit at a time, sequentially, over a
communication channel or computer bus. This is in contrast to parallel communication, where
several bits are sent as a whole, on a link with several parallel channels.

Serial communication is used for all long-haul communication and most computer networks,
where the cost of cable and synchronization difficulties make parallel communication
impractical. Serial computer buses are becoming more common even at shorter distances, as
improved signal integrity and transmission speeds in newer serial technologies have begun to
outweigh the parallel bus's advantages of simplicity (no need for a serializer and deserializer, or
SerDes) and to outstrip its disadvantages (clock skew, interconnect density). The migration from
PCI to PCI Express is an example.

Many serial communication systems were originally designed to transfer data over relatively
large distances through some sort of data cable.

Practically all long-distance communication transmits data one bit at a time, rather than in
parallel, because it reduces the cost of the cable. The cables that carry this data (other than
"the" serial cable) and the computer ports they plug into are usually referred to with a more
specific name, to reduce confusion.
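Sending one bit at a time can be illustrated with a common asynchronous framing scheme (one start bit, 8 data bits least-significant-first, one stop bit; the exact format here is an assumption for this sketch):

```python
def serialize(byte):
    """Frame one byte for serial transfer: start bit (0),
    8 data bits least-significant-first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def deserialize(bits):
    """Recover the byte, checking the start/stop framing bits."""
    assert bits[0] == 0 and bits[9] == 1
    return sum(bit << i for i, bit in enumerate(bits[1:9]))

frame = serialize(0x41)            # ASCII 'A'
assert len(frame) == 10            # 10 line bits carry 8 data bits
assert deserialize(frame) == 0x41
```

The two framing bits per byte are the overhead mentioned in the disadvantages of asynchronous transfer: 2 of every 10 bits on the line carry no payload.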

Serial buses
Many communication systems were generally designed to connect two integrated circuits on the same
printed circuit board, connected by signal traces on that board (rather than external cables).

Integrated circuits are more expensive when they have more pins. To reduce the number of pins in a
package, many ICs use a serial bus to transfer data when speed is not important. Examples of such
low-pin-count serial buses include SPI, I²C, UNI/O, and 1-Wire; high-speed serial buses such as PCI Express follow the same principle.

Conclusion
 A multiprocessor system is defined as "a system with more than one processor", and, more
precisely, "a number of central processing units linked together to enable parallel processing to take
place".
 In loosely-coupled multiprocessor systems, each processor has its own local memory, input/output
(I/O) channels, and operating system.
 A heterogeneous multiprocessing system contains multiple, but not homogeneous, processing units
– central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs),
or any type of application-specific integrated circuits (ASICs). The system architecture allows any
accelerator – for instance, a graphics processor – to operate at the same processing level as the
system's CPU.
 A symmetric multiprocessor system (SMP) is a system with a pool of homogeneous processors
running under a single OS with a centralized, shared main memory.
 The interconnection structure can decide a system's overall performance in a multiprocessor environment.
Although a common bus system is simple and easy to build, the availability of only one path is its
major drawback: if the bus fails, the whole system fails. To overcome this and improve overall
performance, crossbar, multiport, hypercube, and multistage switch networks evolved.
 Inter-process communication (IPC) refers specifically to the
mechanisms an operating system provides to allow processes to manage shared data.
 A pipeline, also known as a data pipeline, is a set of data processing elements connected in series,
where the output of one element is the input of the next one.
 A vector processor or array processor is a central processing unit (CPU) that implements an
instruction set where its instructions are designed to operate efficiently and effectively on large one-
dimensional arrays of data called vectors.
 Array processing is a wide area of research in the field of signal processing that extends from the
simplest form of 1 dimensional line arrays to 2 and 3 dimensional array geometries.
 Communication can be defined as the process of exchanging of information between two or more
parties. The last two decades witnessed a rapid growth of wireless communication systems.
 Parallel processing is a technique that duplicates functional units to operate on different tasks (signals)
simultaneously.
 An arithmetic pipeline divides an arithmetic problem into various subproblems for execution in
various pipeline segments.
 The input/output organization of a computer depends upon the size of the computer and the peripherals
connected to it.
 Input or output devices that are connected to the computer are called peripheral devices. These devices
are designed to read information into or out of the memory unit upon command from the CPU and
are considered to be part of the computer system.
 The internal operations in an individual unit of a digital system are synchronized using clock pulse. It
means clock pulse is given to all registers within a unit.
 Direct memory access (DMA) is a feature of computer systems and allows certain hardware
subsystems to access main system memory independently of the central processing unit (CPU).

