
Foundation of sequential programming

CSC 210
Lecturer in charge: ‘Bola Orogun(Mtech, MITPA)
BASIC MACHINE ARCHITECTURE

What is a computer?

A computer is a machine (hardware), usually electronic, that is capable of executing
a sequence of instructions from a stored program (software).

A microprocessor-based computer is called a microcomputer.



 Major components of a computer:

1- Microprocessor (CPU)
2- Memory (RAM and/or ROM)
3- Input/output devices
4- Clock (controls all devices)

Block Diagram of a Typical Machine

A microprocessor contains the following main components:

  ALU
Carries out the arithmetic and logical functions of the computer.

 Control Unit
Is the “brain” of the computer. It decodes instructions, determines which function is to
be performed on the data, and controls the flow of the program.

 Register File
These are high-speed storage locations. They hold the internal data that the
processor is currently using while executing programs.

Two kinds of registers exist:

1- Special purpose registers
Used by the CPU. The most important are the program counter (PC) and
the processor status register.

2- General purpose registers
Used by programmers.

MEMORY

Memory refers to the computer hardware integrated circuits that store information for
immediate use in a computer; it is synonymous with the term "primary storage".
Computer memory, such as random-access memory (RAM), operates at high speed, in
contrast to storage, which is slower to access but offers higher capacity. If needed,
contents of the computer memory can be transferred to secondary storage through a
memory management technique called "virtual memory". An archaic synonym for memory
is storage.

What does computer memory look like?


Below is an example picture of a 512 MB DIMM computer memory module. This
memory module connects to the memory slot on a computer motherboard.

How is memory used?


When a program such as your Internet browser is open, it is loaded from your hard drive
and placed into RAM, which allows that program to communicate with the processor at
higher speeds. Anything you save to your computer, such as a picture or video, is sent to
your hard drive for storage.

Why is memory important or needed for a computer?


Not all devices in a computer operate at the same speed, and computer memory
gives your computer a place to access data quickly. If the CPU had to wait for a
secondary storage device such as a hard disk drive, the computer would be much slower.

Main Memory (Primary storage)

A linear list of memory cells. Each cell holds a data word.
A word could be a byte, 2 bytes, 4 bytes, and so on.

address    content (bits b(p-1) ... b0)
0          .
1          .
2          .
.
.
.
n-3        .
n-2        .
n-1        .

memory unit of n cells

 All memories share two organizational features:-

1) Each information unit is the same size.

2) An information unit has a numbered address associated with it by which it can
be uniquely referenced.

A memory cell is characterized by two things:-

1) An address
2) Content
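
As a rough illustration of the two properties above, memory can be modelled as a list in which the index plays the role of the address and the stored value is the content. The class name `Memory`, its methods, and the 8-bit default word size are assumptions made for this sketch; they do not describe any particular machine.

```python
class Memory:
    """A linear list of n cells; every cell holds one word of `word_bits` bits."""

    def __init__(self, n_cells, word_bits=8):
        self.word_bits = word_bits
        self.cells = [0] * n_cells          # addresses run from 0 to n-1

    def _check(self, address):
        if not 0 <= address < len(self.cells):
            raise IndexError(f"address {address} is outside 0..{len(self.cells) - 1}")

    def read(self, address):
        self._check(address)
        return self.cells[address]

    def write(self, address, value):
        self._check(address)
        # Keep only the low `word_bits` bits so every cell stays the same size.
        self.cells[address] = value & ((1 << self.word_bits) - 1)


mem = Memory(n_cells=16)
mem.write(3, 0x41)
mem.write(4, 300)     # 300 does not fit in 8 bits, so only the low byte is kept
print(mem.read(3), mem.read(4))
```

Each cell is identified only by its address, and writing to one address leaves the content of every other address unchanged.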

 Layers of memory:

1. Registers: the fastest possible access (usually 1 CPU clock cycle). A few thousand
bytes in size
2. Primary cache: ~1 nanosecond
Secondary cache: 10 nanoseconds
3. Fast RAM: 70 nanoseconds or less
4. Fast disk: 10 milliseconds (10,000,000 nanoseconds)
5. CD/DVD: 160 milliseconds (160,000,000 nanoseconds)

Memory acronyms you should know:
1. ROM - Read Only Memory

ROM is memory that is "hard coded" from the manufacturer and never
changes, such as the instruction to "boot" up your computer. A ROM chip
is a non-volatile storage medium, which means it does not require a
constant source of power to retain the information stored on it.
A good example of ROM is the computer BIOS, a PROM chip that stores the
programming needed to begin the initial computer start up process.
ROM-type storage is still used and continues to be improved upon for better
performance and storage capacity.
2. RAM - Random Access Memory

RAM is the main memory you buy for any machine. It is often sold (today)
in 0.5, 1, 2 and 4 GB modules. Random Access Memory (RAM) is a
volatile memory (it loses any information it is holding when the power is
turned off) that stores information on an integrated circuit used by the
operating system, software, and hardware.

The basic memory cell on many computers holds one byte (8 bits). Such
machines are called byte-addressable. Other machines have larger storage units.

Components of the computer are connected by buses.
A bus in its simplest form is a set of wires used to carry information in the form of
electrical signals between the CPU and memory and between the CPU and I/O devices.

There are 3 buses in the system:

- Data Bus
- Address Bus
- Control (signal) Bus

Computer Architecture

Computer architecture is the internal structure of a digital computer, encompassing
the design and layout of its instruction set and storage registers. The architecture of a
computer is chosen with regard to the types of programs that will be run on it
(business, scientific, general-purpose, etc.).

The architecture of a computer system is the user-visible interface: the structure and
operation of the system as seen by the programmer. It includes:

- The instruction set the computer can obey
- The ways in which the instructions can specify the locations of data to be
processed
- The type and representation of data
- The format in which the instructions are stored in memory

History

The first documented computer architecture was in the correspondence between Charles
Babbage and Ada Lovelace, describing the analytical engine. When building the
computer Z1 in 1936, Konrad Zuse described in two patent applications for his future
projects that machine instructions could be stored in the same storage used for data, i.e.
the stored-program concept.

The term “architecture” in computer literature can be traced to the work of Lyle R.
Johnson, Frederick P. Brooks, Jr., and Mohammad Usman Khan, all members of the
Machine Organization department in IBM’s main research center in 1959. Johnson had
the opportunity to write a proprietary research communication about the Stretch, an IBM-
developed supercomputer for Los Alamos National Laboratory (at the time known as Los
Alamos Scientific Laboratory). To describe the level of detail for discussing the
luxuriously embellished computer, he noted that his description of formats, instruction
types, hardware parameters, and speed enhancements were at the level of “system
architecture”.

Advantages of a Microprocessor
  Low Cost
Microprocessors are available at low cost due to integrated circuit technology, which
reduces the cost of a computer system.

  High Speed
Microprocessor chips can work at very high speed due to the technology involved. They
are capable of executing millions of instructions per second.

  Small Size
Due to very large scale and ultra large scale integration technology, a microprocessor is
fabricated in a very small footprint. This reduces the size of the entire computer system.

  Versatile
Microprocessors are very versatile: the same chip can be used for a number of applications by
simply changing the program (instructions stored in the memory).

  Low Power Consumption
Microprocessors are usually manufactured using metal oxide semiconductor technology, in which
MOSFETs (Metal Oxide Semiconductor Field Effect Transistors) work in saturation and
cut-off modes. So the power consumption is very low compared to other technologies.

 Less Heat Generation
Compared to vacuum tube devices, semiconductor devices emit far less heat.

 Reliable
Microprocessors are very reliable; the failure rate is very low because semiconductor technology is used.

  Portable
Devices or computer systems built with microprocessors can be made portable due to the small
size and low power consumption.

Clock
Also called clock rate: the speed at which a microprocessor executes instructions. ... The
CPU requires a fixed number of clock ticks (or clock cycles) to execute each instruction.
The faster the clock, the more instructions the CPU can execute per second. Clock speeds
are expressed in megahertz (MHz) or gigahertz (GHz). The clock in a microprocessor serves to
coordinate the operations of the different parts of a microprocessor.
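
The relationship between clock rate and instruction rate can be sketched with a small calculation. The function name and the figures (a 2 GHz clock, 4 cycles per instruction) are made up for illustration, and the sketch assumes every instruction takes the same fixed number of cycles.

```python
def instructions_per_second(clock_hz, cycles_per_instruction):
    """Instructions completed per second when each one needs a fixed number of cycles."""
    return clock_hz / cycles_per_instruction


slow = instructions_per_second(1_000_000_000, 4)   # hypothetical 1 GHz clock
fast = instructions_per_second(2_000_000_000, 4)   # hypothetical 2 GHz clock
print(slow, fast)
```

Doubling the clock rate doubles the instruction rate only as long as the number of cycles per instruction stays the same.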

Computer architecture, like other architecture, is the art of determining the needs of the
user of a structure and then designing to meet those needs as effectively as possible
within economic and technological constraints.

Brooks went on to help develop the IBM System/360 (now called the IBM zSeries) line
of computers, in which “architecture” became a noun defining “what the user needs to
know”. Later, computer users came to use the term in many less-explicit ways.

The earliest computer architectures were designed on paper and then directly built into
the final hardware form. Later, computer architecture prototypes were physically built in
the form of a transistor–transistor logic (TTL) computer—such as the prototypes of the
6800 and the PA-RISC—tested, and tweaked, before committing to the final hardware
form. As of the 1990s, new computer architectures are typically "built", tested, and
tweaked—inside some other computer architecture in a computer architecture simulator;
or inside a FPGA as a soft microprocessor; or both—before committing to the final
hardware form.

Subcategories
The discipline of computer architecture has three main subcategories:

1. Instruction Set Architecture, or ISA. The ISA defines the machine code that a
processor reads and acts upon as well as the word size, memory address modes,
processor registers, and data type.

2. Microarchitecture, or computer organization, describes how a particular processor
will implement the ISA. The size of a computer's CPU cache, for instance, is an
issue that generally has nothing to do with the ISA.

3. System Design (implementation) includes all of the other hardware components
within a computing system. These include:
   1. Data processing other than the CPU, such as direct memory access (DMA)
   2. Other issues such as virtualization, multiprocessing, and software features.

The principal components or subsystems of a computer architecture, each of which could
be said to have an architecture of its own, are:

 input/output,
 storage
 communication
 control, and
 processing.

Why computer architecture?


The purpose is to design a computer that maximizes performance while keeping power
consumption in check, keeps costs low relative to the expected performance, and is
also very reliable. For this, many aspects have to be considered, including:

 instruction set design



 functional organization

 logic design, and

 implementation: The implementation involves integrated circuit design,
packaging, power, and cooling. Optimization of the design requires familiarity
with topics ranging from compilers and operating systems to logic design and packaging.

Instruction set architecture


An instruction set architecture (ISA) is the interface between the computer's software and
hardware and also can be viewed as the programmer's view of the machine. Computers
do not understand high-level programming languages such as Java, C++, or most
programming languages used. A processor only understands instructions encoded in
some numerical fashion, usually as binary numbers. Software tools, such as compilers,
translate those high level languages into instructions that the processor can understand.

Besides instructions, the ISA defines items in the computer that are available to a
program, e.g. data types, registers, addressing modes, and memory. Instructions locate
these available items with register indexes (or names) and memory addressing modes.

ISAs vary in quality and completeness. A good ISA compromises between:

- programmer convenience (how easy the code is to understand)
- size of the code (how much code is required to do a specific action)
- cost of the computer to interpret the instructions (more complexity means more
hardware needed to decode and execute the instructions), and
- speed of the computer (with more complex decoding hardware comes longer
decode time).

 A computer’s hardware has a small, fixed number of operations, or instructions that it
can perform. This is called the instruction set of the machine. It is defined by the
manufacturer.

 We will be concerned with the following issues:

- Operand storage in CPU
Where are operands kept other than in memory?

- Number of explicit operands named per instruction
How many operands are named explicitly in a typical instruction?

- Operand location (addressing modes)
Can any ALU instruction operand be located in memory, or must some or all of the
operands be in internal storage in the CPU? If an operand is located in memory, how is
the memory location specified?

- Operations
What operations are provided in the instruction set?

- Type and size of operands
What is the type and size of each operand and how is it specified?

Instruction Set

 The instruction set of a microprocessor is the collection of all the


machine instructions that the processor can obey.

 Types of instructions

- Data movement instructions
- Arithmetic/logic instructions
- Control (modify the execution order) instructions

Instruction Execution (that is, how instructions are executed or performed in a
computer system)

 A computer is controlled by a program, which is a sequence of instructions.


Each instruction specifies a single operation to be performed by the CPU

ex.
ADD R7, ALPHA

  Instructions are stored in memory along with the data on which they operate.
 The processor FETCHES an instruction from memory, INTERPRETS what function is to
be performed, and EXECUTES the function on its operands (data).
 This is called the instruction fetch/execute cycle.

[Figure: Instruction execution cycle, with the stages instruction fetch, operand fetch,
instruction execution, result store, and interrupt processing]

All instructions are composed of two components

1. An operation code (OPCODE) that specifies the function to be performed.

2. One or more operand specifiers that describe the locations of the information units on
which the operation is performed.
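
The fetch/execute cycle and the opcode/operand split can be sketched as a toy stored-program machine. Everything here is an assumption made for illustration: the opcode names (LOAD, ADD, STORE, HALT), the tuple encoding of instructions, and the memory layout do not belong to any real processor.

```python
def run(memory):
    """Fetch, decode and execute instructions until HALT; return the registers."""
    pc = 0                   # program counter: address of the next instruction
    registers = [0] * 8      # general purpose registers R0..R7
    while True:
        opcode, *operands = memory[pc]    # FETCH the instruction at address pc
        pc += 1
        if opcode == "LOAD":              # data movement: register <- memory
            reg, addr = operands
            registers[reg] = memory[addr]
        elif opcode == "ADD":             # arithmetic: register <- register + memory
            reg, addr = operands
            registers[reg] += memory[addr]
        elif opcode == "STORE":           # data movement: memory <- register
            reg, addr = operands
            memory[addr] = registers[reg]
        elif opcode == "HALT":            # control: stop the cycle
            return registers
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


ALPHA = 5   # address of the data word named ALPHA
memory = [
    ("LOAD", 7, 6),       # 0: R7 <- memory[6]
    ("ADD", 7, ALPHA),    # 1: R7 <- R7 + memory[ALPHA], in the spirit of ADD R7, ALPHA
    ("STORE", 7, 4),      # 2: memory[4] <- R7
    ("HALT",),            # 3: stop
    0,                    # 4: result goes here
    36,                   # 5: ALPHA
    6,                    # 6: first operand
]
registers = run(memory)
print(registers[7], memory[4])
```

Instructions and data sit in the same `memory` list, and the first element of each instruction tuple acts as the opcode while the remaining elements are the operand specifiers.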

A computer by itself can’t do anything. It needs software. A sequence of
instructions, called a program, makes the computer work.

 Microprocessors by themselves only react to patterns of electrical signals.

 These patterns comprise what we call a machine language, which consists of the
binary patterns of 0’s and 1’s that represent the signals the CPU understands.
 Machine language is machine dependent and varies from model to model of
computers.
ex.
0000 0100 0110 0111 1001 111 000 0100 1100 0000

on the register, it means

add the integer 6 to the integer stored at memory location 467

“1100 0000” is the code for add instruction on the register

 The machine designers decide on these codes and patterns when designing
the machine.


 Programming in machine language is very difficult and tedious.


A better way is needed to write programs, so assembly language and high-level
languages were introduced.
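
To show what an assembler saves the programmer from doing by hand, here is a minimal sketch that turns one line of assembly text into a binary pattern. The 16-bit instruction format (4-bit opcode, 4-bit register number, 8-bit value) and the opcode table are invented for this example; real machines define their own formats.

```python
# Hypothetical opcode table for a made-up 16-bit instruction format:
# 4-bit opcode | 4-bit register number | 8-bit value
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def assemble(line):
    """Translate text such as 'ADD R7, 6' into a 16-character string of bits."""
    mnemonic, rest = line.split(maxsplit=1)
    reg_text, value_text = [part.strip() for part in rest.split(",")]
    reg = int(reg_text.lstrip("Rr"))
    value = int(value_text)
    if not (0 <= reg < 16 and 0 <= value < 256):
        raise ValueError("operand out of range for this format")
    word = (OPCODES[mnemonic.upper()] << 12) | (reg << 8) | value
    return format(word, "016b")


print(assemble("ADD R7, 6"))
```

The programmer writes the readable mnemonic form, and the tool produces the bit pattern that the designers of this imaginary machine chose for it.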

Memory organization defines how instructions interact with the memory, and how
memory interacts with itself.

During design, emulation software (emulators) can run programs written in a proposed
instruction set. Modern emulators can measure size, cost, and speed to determine if a
particular ISA is meeting its goals.

Computer organization (Microarchitecture)

Computer organization helps optimize performance-based products. For example,
software engineers need to know the processing power of processors. They may need to
optimize software in order to gain the most performance for the lowest price. This can
require quite detailed analysis of the computer's organization. For example, in an SD card,
the designers might need to arrange the card so that the most data can be processed in the
fastest possible way.

Computer organization also helps plan the selection of a processor for a particular
project. Multimedia projects may need very rapid data access, while virtual machines
may need fast interrupts. Sometimes certain tasks need additional components as well.
For example, a computer capable of running a virtual machine needs virtual memory
hardware so that the memory of different virtual computers can be kept separated.
Computer organization and features also affect power consumption and processor cost.

Implementation
Once an instruction set and micro-architecture are designed, a practical machine must be
developed. This design process is called the implementation. Implementation is usually
not considered architectural design, but rather hardware design engineering.
Implementation can be further broken down into several steps:

 Logic Implementation designs the circuits required at a logic gate level



Circuit Implementation does transistor-level designs of basic elements (gates,
multiplexers, latches etc.) as well as of some larger blocks (ALUs, caches etc.)
that may be implemented at the logic gate level, or even at the physical level if the
design calls for it.

Physical Implementation draws physical circuits. The different circuit
components are placed in a chip floor plan or on a board and the wires connecting
them are created.

Design Validation tests the computer as a whole to see if it works in all situations
and all timings. Once the design validation process starts, the design at the logic
level is tested using logic emulators. However, this is usually too slow to run
realistic tests. So, after making corrections based on the first test, prototypes are
constructed using Field-Programmable Gate Arrays (FPGAs). Most hobby
projects stop at this stage. The final step is to test prototype integrated circuits.
Integrated circuits may require several redesigns to fix problems.

For CPUs, the entire implementation process is organized differently and is often referred
to as CPU design.

Design goals
The exact form of a computer system depends on the constraints and goals. Computer
architectures usually trade off standards and the following factors:

- power versus performance: performance is the amount of work accomplished by a
computer system, depending on the context.
- cost: the amount of money, time and power for production.
- memory capacity: the amount of memory that can be used by a computer device.
- latency: the amount of time that it takes for information from one node to travel
to the source.
- throughput: the amount of work that a computer can do in a given period of time.

Sometimes other considerations, such as features, size, weight, reliability, and
expandability are also factors.

The most common scheme does an in-depth power analysis and figures out how to keep
power consumption low, while maintaining adequate performance.

Performance
Modern computer performance is often described in IPC (instructions per cycle). This
measures the efficiency of the architecture at any clock frequency. Since a faster rate can
make a faster computer, this is a useful measurement. Older computers had IPC counts as
low as 0.1 instructions per cycle. Simple modern processors easily reach near 1.
Superscalar processors may reach three to five IPC by executing several instructions per
clock cycle.

Counting machine language instructions would be misleading because they can do
varying amounts of work in different ISAs. The "instruction" in the standard
measurements is not a count of the ISA's actual machine language instructions, but a unit
of measurement, usually based on the speed of the register computer architecture.

Many people used to measure a computer's speed by the clock rate (usually in MHz or
GHz). This refers to the cycles per second of the main clock of the CPU. However, this
metric is somewhat misleading, as a machine with a higher clock rate may not necessarily
have greater performance. As a result, manufacturers have moved away from clock speed
as a measure of performance.
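
A small calculation shows why clock rate alone can mislead. The two machines and their figures below are invented for this sketch; the only relationship used is that instructions per second equals clock rate multiplied by IPC.

```python
def throughput(clock_hz, ipc):
    """Instructions per second = cycles per second x instructions per cycle."""
    return clock_hz * ipc


machine_a = throughput(3_000_000_000, 0.8)   # hypothetical: 3 GHz clock, IPC of 0.8
machine_b = throughput(2_000_000_000, 1.5)   # hypothetical: 2 GHz clock, IPC of 1.5
print(machine_a, machine_b, machine_b > machine_a)
```

Under these assumed figures the machine with the lower clock rate completes more instructions per second, because it does more work in each cycle.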

Other factors influence speed, such as the mix of functional units, bus speeds, available
memory, and the type and order of instructions in the programs.
There are two main types of speed:

1. latency, and 2. throughput.


Latency is the time between the start of a process and its completion.

Throughput is the amount of work done per unit time. Interrupt latency is the guaranteed
maximum response time of the system to an electronic event (like when the disk drive
finishes moving some data).

Performance is affected by a very wide range of design choices: for example,
pipelining a processor usually makes latency worse, but makes throughput better.
Computers that control machinery usually need low interrupt latencies. These computers
operate in a real-time environment and fail if an operation is not completed in a specified
amount of time. For example, computer-controlled anti-lock brakes must begin braking
within a predictable, short time after the brake pedal is sensed or else failure of the brake
will occur.
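
The latency/throughput trade-off of pipelining can be made concrete with a simple timing model. The numbers are assumptions chosen for illustration: an unpipelined processor that needs 4 ns per instruction, and a 5-stage pipeline whose stages each take 1 ns (slightly more in total, to stand for the overhead that pipelining adds).

```python
def unpipelined_time(n_instructions, instruction_ns):
    """Each instruction must finish before the next one starts."""
    return n_instructions * instruction_ns


def pipelined_time(n_instructions, stages, stage_ns):
    """The first instruction passes through every stage; after that,
    one instruction completes per stage time."""
    return (stages + n_instructions - 1) * stage_ns


# Latency of a single instruction: the pipeline is slower (5 ns against 4 ns).
print(unpipelined_time(1, 4), pipelined_time(1, 5, 1))

# Time for 1000 instructions: the pipeline is much faster overall.
print(unpipelined_time(1000, 4), pipelined_time(1000, 5, 1))
```

The model ignores stalls and branches, so it shows only the ideal case in which the pipeline is always kept full.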

Benchmarking takes all these factors into account by measuring the time a computer
takes to run through a series of test programs. Although benchmarking shows strengths, it
shouldn't be how you choose a computer. Often the measured machines split on different
measures. For example, one system might handle scientific applications quickly, while
another might render video games more smoothly. Furthermore, designers may target and
add special features to their products, through hardware or software that permit a specific
benchmark to execute quickly but don't offer similar advantages to general tasks.

Power efficiency
Power efficiency is another important measurement in modern computers. A higher
power efficiency can often be traded for lower speed or higher cost. The typical
measurement when referring to power consumption in computer architecture is MIPS/W
(millions of instructions per second per watt).
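
The MIPS/W figure is a simple ratio, sketched below. The function name and the sample processors are hypothetical and only serve to show the arithmetic.

```python
def mips_per_watt(instructions_per_second, watts):
    """Millions of instructions per second delivered for each watt consumed."""
    return instructions_per_second / 1_000_000 / watts


desktop = mips_per_watt(2_000_000_000, 10)    # hypothetical: 2000 MIPS at 10 W
embedded = mips_per_watt(500_000_000, 1)      # hypothetical: 500 MIPS at 1 W
print(desktop, embedded)
```

In this made-up comparison the slower processor is the more power-efficient one, which is the kind of trade-off the measurement is meant to expose.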

Modern circuits have less power required per transistor as the number of transistors per
chip grows. This is because each transistor that is put in a new chip requires its own
power supply and requires new pathways to be built to power it. However the number of
transistors per chip is starting to increase at a slower rate. Therefore, power efficiency is
starting to become as important, if not more important than fitting more and more
transistors into a single chip. Recent processor designs have shown this emphasis as they
put more focus on power efficiency rather than cramming as many transistors into a
single chip as possible. In the world of embedded computers, power efficiency has long
been an important goal next to throughput and latency.

Von Neumann architecture


Most modern computers follow the von Neumann architecture. This architecture is
based on the stored-program concept. The program to be executed, along with the data on
which it operates, is stored in a memory device attached to the processor.

The von Neumann architecture, which is also known as the von Neumann model and
Princeton architecture, is a computer architecture based on the 1945 description by the
mathematician and physicist John von Neumann and others in the First Draft of a Report
on the EDVAC. This describes a design architecture for an electronic digital computer
with parts consisting of:

- a processing unit containing an arithmetic logic unit and processor registers;
- a control unit containing an instruction register and program counter;
- a memory to store both data and instructions;
- external mass storage;
- input and output mechanisms.

The meaning has evolved to be any stored-program computer in which an instruction
fetch and a data operation cannot occur at the same time because they share a common
bus. This is referred to as the von Neumann bottleneck and often limits the performance
of the system.

The design of a von Neumann architecture machine is simpler than that of a Harvard
architecture machine, which is also a stored-program system but has one dedicated set
of address and data buses for reading data from and writing data to memory, and another
set of address and data buses for instruction fetching.
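
The effect of sharing one bus can be sketched with a deliberately idealized cycle count. The model is an assumption made for this example: every instruction needs one bus cycle for its fetch plus one per data access, and on the Harvard machine a fetch can overlap fully with a data access because the two use separate buses.

```python
def von_neumann_cycles(data_accesses):
    """One shared bus: each fetch and each data access needs its own bus cycle."""
    return sum(1 + accesses for accesses in data_accesses)


def harvard_cycles(data_accesses):
    """Separate instruction and data buses: a fetch overlaps with a data access,
    so an instruction costs whichever of the two takes longer."""
    return sum(max(1, accesses) for accesses in data_accesses)


# Hypothetical program: number of data accesses made by each instruction.
program = [1, 0, 1, 1]
print(von_neumann_cycles(program), harvard_cycles(program))
```

Real processors complicate this picture with caches and pipelines, but the count shows where the extra bus cycles on a shared bus come from.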

A stored-program digital computer is one that keeps its program instructions, as well
as its data, in read-write, random-access memory (RAM). Stored-program computers
were an advancement over the program-controlled computers of the 1940s, such as the
Colossus and the ENIAC, which were programmed by setting switches and inserting
patch cables to route data and to control signals between various functional units. In the
vast majority of modern computers, the same memory is used for both data and program
instructions, and the von Neumann vs. Harvard distinction applies to the cache
architecture, not the main memory (split cache architecture).

History
The earliest computing machines had fixed programs. Some very simple computers still
use this design, either for simplicity or training purposes. For example, a desk calculator
(in principle) is a fixed program computer. It can do basic mathematics, but it cannot be
used as a word processor or a gaming console. Changing the program of a fixed-program
machine requires rewiring, restructuring, or redesigning the machine. The earliest
computers were not so much "programmed" as they were "designed". "Reprogramming",
when it was possible at all, was a laborious process, starting with flowcharts and paper
notes, followed by detailed engineering designs, and then the often-arduous process of
physically rewiring and rebuilding the machine. It could take three weeks to set up a
program on ENIAC and get it working.

With the proposal of the stored-program computer, this changed. A stored-program
computer includes, by design, an instruction set and can store in memory a set of
instructions (a program) that details the computation.

A stored-program design also allows for self-modifying code. One early motivation for
such a facility was the need for a program to increment or otherwise modify the address
portion of instructions, which had to be done manually in early designs. This became less
important when index registers and indirect addressing became usual features of machine
architecture. Another use was to embed frequently used data in the instruction stream
using immediate addressing. Self-modifying code has largely fallen out of favor, since it
is usually hard to understand and debug, as well as being inefficient under modern
processor pipelining and caching schemes.
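
The early use of self-modifying code described above, stepping through data by rewriting the address portion of an instruction, can be imitated on a toy machine. The instruction names (ADD, INCADDR, JUMPLT, HALT) and the three-field tuple format are invented for this sketch.

```python
def run(memory):
    """Toy accumulator machine whose program may rewrite its own instructions."""
    acc, pc = 0, 0
    while True:
        op, a, b = memory[pc]
        pc += 1
        if op == "ADD":            # acc <- acc + memory[a]
            acc += memory[a]
        elif op == "INCADDR":      # add 1 to the address field of the instruction at a
            target_op, target_a, target_b = memory[a]
            memory[a] = (target_op, target_a + 1, target_b)
        elif op == "JUMPLT":       # jump back to a while its address field is below b
            if memory[a][1] < b:
                pc = a
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown instruction {op!r}")


memory = [
    ("ADD", 4, 0),        # 0: the address field (4) is rewritten as the loop runs
    ("INCADDR", 0, 0),    # 1: modify instruction 0 so it points at the next data word
    ("JUMPLT", 0, 7),     # 2: repeat until instruction 0 points past the data
    ("HALT", 0, 0),       # 3
    10, 20, 30,           # 4..6: data to be summed
]
total = run(memory)
print(total, memory[0])
```

After the run, instruction 0 is no longer the instruction that was loaded, which is exactly why such programs are hard to read and debug; an index register would let the loop advance without touching the code.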

Capabilities
On a large scale, the ability to treat instructions as data is what makes assemblers,
compilers, linkers, loaders, and other automated programming tools possible. One can
"write programs which write programs". This has allowed a sophisticated self-hosting
computing ecosystem to flourish around von Neumann architecture machines.

Some high level languages such as LISP leverage the von Neumann architecture by
providing an abstract, machine-independent way to manipulate executable code at
runtime, or by using runtime information to tune just-in-time compilation (e.g. in the case
of languages hosted on the Java virtual machine, or languages embedded in web
browsers).

Development of the stored-program concept


The mathematician Alan Turing, who had been alerted to a problem of mathematical
logic by the lectures of Max Newman at the University of Cambridge, wrote a paper in
1936 entitled On Computable Numbers, with an Application to the
Entscheidungsproblem, which was published in the Proceedings of the London
Mathematical Society. In it he described a hypothetical machine which he called a
"universal computing machine", and which is now known as the "Universal Turing
machine". The hypothetical machine had an infinite memory that contained both
instructions and data. John von Neumann became acquainted with Turing while he was a
visiting professor at Cambridge in 1935, and also during Turing's PhD year at the
Institute for Advanced Study in Princeton, New Jersey during 1936 – 1937. In 1936,
Konrad Zuse also anticipated in two patent applications that machine instructions could
be stored in the same storage used for data.

Independently, J. Presper Eckert and John Mauchly, who were developing the ENIAC at
the Moore School of Electrical Engineering, at the University of Pennsylvania, wrote
about the stored-program concept in December 1943. In planning a new machine,
EDVAC, Eckert wrote in January 1944 that they would store data and programs in a new
addressable memory device, a mercury metal delay line memory. This was the first time
the construction of a practical stored-program machine was proposed. At that time, he
and Mauchly were not aware of Turing's work.

Von Neumann was involved in the Manhattan Project at the Los Alamos National
Laboratory, which required huge amounts of calculation. This drew him to the ENIAC
project, during the summer of 1944. There he joined into the ongoing discussions on the
design of this stored-program computer, the EDVAC. As part of that group, he wrote up a
description titled First Draft of a Report on the EDVAC based on the work of Eckert and
Mauchly. It was unfinished when his colleague Herman Goldstine circulated it with only
von Neumann's name on it, to the consternation of Eckert and Mauchly. The paper was
read by dozens of von Neumann's colleagues in America and Europe, and influenced the
next round of computer designs.

Jack Copeland considers that it is "historically inappropriate, to refer to electronic
stored-program digital computers as 'von Neumann machines'". His Los Alamos colleague Stan
Frankel said of von Neumann's regard for Turing's ideas:

Early von Neumann-architecture computers are


The First Draft described a design that was used by many universities and corporations to
construct their computers. Among these various computers, only ILLIAC and ORDVAC
had compatible instruction sets.

- ARC2 (Birkbeck, University of London) officially came online on May 12, 1948.
- Manchester Small-Scale Experimental Machine (SSEM), nicknamed "Baby"
(University of Manchester, England) made its first successful run of a stored
program on June 21, 1948.
- EDSAC (University of Cambridge, England) was the first practical stored-program
electronic computer (May 1949)
- Manchester Mark 1 (University of Manchester, England) Developed from the
SSEM (June 1949)
- CSIRAC (Council for Scientific and Industrial Research) Australia (November 1949)
- EDVAC (Ballistic Research Laboratory, Computing Laboratory at Aberdeen
Proving Ground 1951)
- ORDVAC (U-Illinois) at Aberdeen Proving Ground, Maryland (completed
November 1951)
- IAS machine at Princeton University (January 1952)
- MANIAC I at Los Alamos Scientific Laboratory (March 1952)
- ILLIAC at the University of Illinois (September 1952)
- BESM-1 in Moscow (1952)
- AVIDAC at Argonne National Laboratory (1953)
- ORACLE at Oak Ridge National Laboratory (June 1953)
- BESK in Stockholm (1953)
- JOHNNIAC at RAND Corporation (January 1954)
- DASK in Denmark (1955)
- WEIZAC at the Weizmann Institute of Science in Rehovot, Israel (1955)
- PERM in Munich (1956?)
- SILLIAC in Sydney (1956)

Von Neumann bottleneck

The shared bus between the program memory and data memory leads to the von
Neumann bottleneck, the limited throughput (data transfer rate) between the central
processing unit (CPU) and memory compared to the amount of memory. Because the
single bus can only access one of the two classes of memory at a time, throughput is
lower than the rate at which the CPU can work. This seriously limits the effective
processing speed when the CPU is required to perform minimal processing on large
amounts of data. The CPU is continually forced to wait for needed data to be transferred
to or from memory. Since CPU speed and memory size have increased much faster than
the throughput between them, the bottleneck has become more of a problem, a problem
whose severity increases with every newer generation of CPU.

The von Neumann bottleneck was described by John Backus in his 1977 ACM Turing
Award lecture. According to Backus:

Surely there must be a less primitive way of making big changes in the store than by
pushing vast numbers of words back and forth through the von Neumann bottleneck. Not
only is this tube a literal bottleneck for the data traffic of a problem, but, more
importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking
instead of encouraging us to think in terms of the larger conceptual units of the task at
hand. Thus programming is basically planning and detailing the enormous traffic of
words through the von Neumann bottleneck, and much of that traffic concerns not
significant data itself, but where to find it.

Mitigations
There are several known methods for mitigating the von Neumann performance
bottleneck. For example, the following all can improve performance:

- providing a cache between the CPU and the main memory
- providing separate caches or separate access paths for data and instructions (the
so-called Modified Harvard architecture)
- using branch predictor algorithms and logic
- providing a limited CPU stack or other on-chip scratchpad memory to reduce
memory access
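
The first mitigation in the list, a cache between the CPU and main memory, can be sketched by counting how many accesses still have to travel to main memory. The direct-mapped scheme (each address can live in exactly one cache line, chosen by `address % cache_lines`) is a simplification picked for this example, and the access trace is made up.

```python
def count_main_memory_reads(addresses, cache_lines):
    """Count accesses that miss a small direct-mapped cache and go to main memory."""
    cache = {}     # cache line number -> address currently held in that line
    reads = 0
    for address in addresses:
        line = address % cache_lines
        if cache.get(line) != address:   # miss: fetch over the bus and keep a copy
            reads += 1
            cache[line] = address
    return reads


# Hypothetical trace: a loop that touches the same four addresses 25 times.
trace = [0, 1, 2, 3] * 25
print(len(trace), count_main_memory_reads(trace, cache_lines=4))
```

With four cache lines only the first pass over the data reaches main memory; with a single line every access would miss, because each address evicts the previous one.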

The problem can also be sidestepped somewhat by using parallel computing, using for
example the non-uniform memory access (NUMA) architecture—this approach is
commonly employed by supercomputers. It is less clear whether the intellectual
bottleneck that Backus criticized has changed much since 1977. Backus's proposed
solution has not had a major influence. Modern functional programming and object-
oriented programming are much less geared towards "pushing vast numbers of words
back and forth" than earlier languages like FORTRAN were, but internally, that is still
what computers spend much of their time doing, even highly parallel supercomputers.

As of 1996, a database benchmark study found that three out of four CPU cycles were
spent waiting for memory. Researchers expect that increasing the number of
simultaneous instruction streams with multithreading or single-chip multiprocessing will
make this bottleneck even worse.

Self-modifying code
Aside from the von Neumann bottleneck, program modifications can be quite harmful,
either by accident or design. In some simple stored-program computer designs, a
malfunctioning program can damage itself, other programs, or the operating system,
possibly leading to a computer crash. Memory protection and other forms of access
control can usually protect against both accidental and malicious program modification.

Program modifications can be beneficial. The Von Neumann architecture allows for
encryption.
