A microprocessor is a silicon chip that contains a CPU. In the world of personal computers, the terms microprocessor and CPU are used interchangeably. At the heart of all personal computers and most workstations sits a microprocessor. Microprocessors also control the logic of almost all digital devices, from clock radios to fuel-injection systems for automobiles.

Three basic characteristics differentiate microprocessors:
• Instruction set: The set of instructions that the microprocessor can execute.
• Bandwidth: The number of bits processed in a single instruction.
• Clock speed: Given in megahertz (MHz), the clock speed determines how many instructions per second the processor can execute.

For both bandwidth and clock speed, the higher the value, the more powerful the CPU. For example, a 32-bit microprocessor that runs at 50 MHz is more powerful than a 16-bit microprocessor that runs at 25 MHz. In addition to bandwidth and clock speed, microprocessors are classified as being either RISC (reduced instruction set computer) or CISC (complex instruction set computer).

Microprocessors made by Intel Corporation form the foundation of all PCs. Models after the 8086 are often referred to by the last three digits (for example, the 286, 386, and 486). Many of the microprocessors come in different varieties that run at various clock rates. The 80486 architecture, for example, supports clock rates from 33 to 66 MHz. Because Intel discovered that it couldn't trademark its CPU numbers, it shifted to a naming scheme, starting with the Pentium processors. Intel's latest and sixth-generation chip is called the Pentium Pro.

All Intel microprocessors are backward compatible, which means that they can run programs written for a less powerful processor. The 80386, for example, can run programs written for the 8086, 8088, and 80286. The 80386 and later models, however, offer special programming features not available on previous models. Software written specifically for these processors, therefore, may not run on older microprocessors. The common architecture behind all Intel microprocessors is known as the x86 architecture.

Until the late 80s, Intel was essentially the only producer of PC microprocessors. Increasingly, however, Intel is facing competition from other manufacturers who produce "Intel-compatible" chips. These chips support the Intel instruction set and are often less expensive than Intel chips. In some cases, they also offer better performance. Two of the leading manufacturers of Intel-compatible chips are Cyrix and AMD.

Slot 1 processors have the microprocessor and level 2 cache memory mounted on a circuit board (or card), which is enclosed inside of a protective shell.

[Figure: a Slot 1 processor being inserted into a Slot 1 motherboard connection.]

CACHE: A special high-speed storage mechanism. It can be either a reserved section of main memory or an independent high-speed storage device.

The computer you are using to read this page uses a microprocessor to do its work. The microprocessor is the heart of any normal computer, whether it is a desktop machine, a server or a laptop. The microprocessor you are using might be a Pentium, a K6, a PowerPC, a Sparc or any of the many other brands and types of microprocessors, but they all do approximately the same thing in approximately the same way.

A microprocessor (also known as a CPU or central processing unit) is a complete computation engine that is fabricated on a single chip. The first microprocessor was the Intel 4004, introduced in 1971. The 4004 was not very powerful: all it could do was add and subtract, and it could only do that 4 bits at a time. But it was amazing that everything was on one chip. Prior to the 4004, engineers built computers either from collections of chips or from discrete components (transistors wired one at a time). The 4004 powered one of the first portable electronic calculators.

MICROPROCESSOR PROGRESSION

The first microprocessor to make it into a home computer was the Intel 8080, a complete 8-bit computer on one chip, introduced in 1974. The first microprocessor to make a real splash in the market was the Intel 8088, introduced in 1979 and incorporated into the IBM PC (which first appeared in 1981). If you are familiar with the PC market and its history, you know that the PC market moved from the 8088 to the 80286 to the 80386 to the 80486 to the Pentium to the Pentium II to the Pentium III to the Pentium 4. All of these microprocessors are made by Intel and all of them are improvements on the basic design of the 8088. The Pentium 4 can execute any piece of code that ran on the original 8088, but it does it about 5,000 times faster!

The following table will help you to understand the differences between the different processors that Intel has introduced over the years.

Name                   Date  Transistors   Microns  Clock speed  Data width            MIPS
8080                   1974  6,000         6        2 MHz        8 bits                0.64
8088                   1979  29,000        3        5 MHz        16 bits, 8-bit bus    0.33
80286                  1982  134,000       1.5      6 MHz        16 bits               1
80386                  1985  275,000       1.5      16 MHz       32 bits               5
80486                  1989  1,200,000     1        25 MHz       32 bits               20
Pentium                1993  3,100,000     0.8      60 MHz       32 bits, 64-bit bus   100
Pentium II             1997  7,500,000     0.35     233 MHz      32 bits, 64-bit bus   ~300
Pentium III            1999  9,500,000     0.25     450 MHz      32 bits, 64-bit bus   ~510
Pentium 4              2000  42,000,000    0.18     1.5 GHz      32 bits, 64-bit bus   ~1,700
Pentium 4 "Prescott"   2004  125,000,000   0.09     3.6 GHz      32 bits, 64-bit bus   ~7,000

Information about this table:
• The date is the year that the processor was first introduced. Many processors are reintroduced at higher clock speeds for many years after the original release date.
• Transistors is the number of transistors on the chip. You can see that the number of transistors on a single chip has risen steadily over the years.
• Microns is the width, in microns, of the smallest wire on the chip. For comparison, a human hair is 100 microns thick. As the feature size on the chip goes down, the number of transistors rises.
• Clock speed is the maximum rate that the chip can be clocked at. Clock speed will make more sense in the next section.
• Data width is the width of the ALU. An 8-bit ALU can add, subtract, multiply, etc. two 8-bit numbers, while a 32-bit ALU can manipulate 32-bit numbers. An 8-bit ALU would have to execute four instructions to add two 32-bit numbers, while a 32-bit ALU can do it in one instruction. In many cases, the external data bus is the same width as the ALU, but not always. The 8088 had a 16-bit ALU and an 8-bit bus, while the modern Pentiums fetch data 64 bits at a time for their 32-bit ALUs.
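The point about an 8-bit ALU needing four steps to add two 32-bit numbers can be sketched in Python. This is only an illustration: the function names add8 and add32_with_8bit_alu are made up for this example, and the model simply chains an 8-bit adder through its carry bit, which is an assumption about how such a processor would do the work.

```python
def add8(a, b, carry_in=0):
    """Model an 8-bit adder: return (8-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_with_8bit_alu(x, y):
    """Add two 32-bit values one byte at a time, lowest byte first.

    Returns (32-bit result, number of 8-bit additions performed).
    The carry out of the top byte is dropped, so the result wraps at 32 bits.
    """
    result, carry, steps = 0, 0, 0
    for i in range(4):
        byte_x = (x >> (8 * i)) & 0xFF
        byte_y = (y >> (8 * i)) & 0xFF
        partial, carry = add8(byte_x, byte_y, carry)
        result |= partial << (8 * i)
        steps += 1
    return result, steps


total, steps = add32_with_8bit_alu(0x0001FFFF, 0x00000001)
print(hex(total), steps)  # prints: 0x20000 4
```

A 32-bit ALU would produce the same result in a single addition, which is the difference the Data width column is meant to capture.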

MIPS stands for "millions of instructions per second" and is a rough measure of the performance of a CPU. Modern CPUs can do so many different things that MIPS ratings lose a lot of their meaning, but you can get a general sense of the relative power of the CPUs from this column.

From this table you can see that, in general, there is a relationship between clock speed and MIPS. The maximum clock speed is a function of the manufacturing process and delays within the chip. There is also a relationship between the number of transistors and MIPS. For example, the 8088 clocked at 5 MHz but only executed at 0.33 MIPS (about one instruction per 15 clock cycles). Modern processors can often execute at a rate of two instructions per clock cycle. That improvement is directly related to the number of transistors on the chip.
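The "one instruction per 15 clock cycles" figure follows from dividing the clock rate by the MIPS rating. A minimal sketch, using three rows of the table above (the function name cycles_per_instruction is just for this example):

```python
def cycles_per_instruction(clock_mhz, mips):
    """Average clock cycles spent per instruction: (cycles/s) / (instructions/s)."""
    return clock_mhz / mips


# (clock speed in MHz, MIPS) taken from the table
chips = {"8088": (5, 0.33), "80486": (25, 20), "Pentium": (60, 100)}

for name, (mhz, mips) in chips.items():
    cpi = cycles_per_instruction(mhz, mips)
    print(f"{name}: {cpi:.2f} cycles per instruction")
```

A value below 1 (as for the Pentium row) means that, on average, more than one instruction completes per clock cycle.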

MICROPROCESSOR LOGIC

To understand how a microprocessor works, it is helpful to look inside and learn about the logic used to create one. In the process you can also learn about assembly language (the native language of a microprocessor) and many of the things that engineers can do to boost the speed of a processor.

A microprocessor executes a collection of machine instructions that tell the processor what to do. Based on the instructions, a microprocessor does three basic things:
• Using its ALU (Arithmetic/Logic Unit), a microprocessor can perform mathematical operations like addition, subtraction, multiplication and division. Modern microprocessors contain complete floating point processors that can perform extremely sophisticated operations on large floating point numbers.
• A microprocessor can move data from one memory location to another.
• A microprocessor can make decisions and jump to a new set of instructions based on those decisions.

There may be very sophisticated things that a microprocessor does, but those are its three basic activities.

The following diagram shows an extremely simple microprocessor capable of doing those three things:

[Diagram: block diagram of a simple microprocessor.]

This is about as simple as a microprocessor gets. This microprocessor has:
• An address bus (that may be 8, 16 or 32 bits wide) that sends an address to memory
• A data bus (that may be 8, 16 or 32 bits wide) that can send data to memory or receive data from memory
• An RD (read) and WR (write) line to tell the memory whether it wants to get or set the addressed location
• A clock line that lets a clock pulse sequence the processor
• A reset line that resets the program counter to zero (or whatever) and restarts execution

Let's assume that both the address and data buses are 8 bits wide in this example.

Here are the components of this simple microprocessor:
• Registers A, B and C are simply latches made out of flip-flops.
• The address latch is just like registers A, B and C.
• The program counter is a latch with the extra ability to increment by 1 when told to do so, and also to reset to zero when told to do so.

• The ALU could be as simple as an 8-bit adder (see the section on adders in How Boolean Logic Works for details), or it might be able to add, subtract, multiply and divide 8-bit values. Let's assume the latter here.
• The test register is a special latch that can hold values from comparisons performed in the ALU. An ALU can normally compare two numbers and determine if they are equal, if one is greater than the other, etc. The test register can also normally hold a carry bit from the last stage of the adder. It stores these values in flip-flops and then the instruction decoder can use the values to make decisions.
• There are six boxes marked "3-State" in the diagram. These are tri-state buffers. A tri-state buffer can pass a 1, a 0 or it can essentially disconnect its output (imagine a switch that totally disconnects the output line from the wire that the output is heading toward). A tri-state buffer allows multiple outputs to connect to a wire, but only one of them to actually drive a 1 or a 0 onto the line.
• The instruction register and instruction decoder are responsible for controlling all of the other components.

Although they are not shown in this diagram, there would be control lines from the instruction decoder that would:
• Tell the A register to latch the value currently on the data bus
• Tell the B register to latch the value currently on the data bus
• Tell the C register to latch the value currently output by the ALU
• Tell the program counter register to latch the value currently on the data bus
• Tell the address register to latch the value currently on the data bus
• Tell the instruction register to latch the value currently on the data bus
• Tell the program counter to increment
• Tell the program counter to reset to zero
• Activate any of the six tri-state buffers (six separate lines)
• Tell the ALU what operation to perform
• Tell the test register to latch the ALU's test bits
• Activate the RD line
• Activate the WR line

Coming into the instruction decoder are the bits from the test register and the clock line, as well as the bits from the instruction register.

MICROPROCESSOR MEMORY

The previous section talked about the address and data buses, as well as the RD and WR lines. These buses and lines connect either to RAM or ROM, and generally to both. In our sample microprocessor, we have an address bus 8 bits wide and a data bus 8 bits wide. That means that the microprocessor can address 2^8 = 256 bytes of memory, and it can read or write 8 bits of the memory at a time. Let's assume that this simple microprocessor has 128 bytes of ROM starting at address 0 and 128 bytes of RAM starting at address 128.

ROM stands for read-only memory. A ROM chip is programmed with a permanent collection of pre-set bytes. The address bus tells the ROM chip which byte to get and place on the data bus. When the RD line changes state, the ROM chip presents the selected byte onto the data bus.
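The memory map assumed here (128 bytes of ROM at addresses 0 to 127, 128 bytes of RAM at addresses 128 to 255) can be modelled in a few lines of Python. This is a sketch only: the Memory class and its read and write methods are hypothetical names, and the choice to silently ignore writes to ROM is an assumption made for the example.

```python
ROM_SIZE = 128  # addresses 0..127
RAM_SIZE = 128  # addresses 128..255


class Memory:
    """8-bit address space: ROM in the lower half, RAM in the upper half."""

    def __init__(self, rom_bytes):
        if len(rom_bytes) > ROM_SIZE:
            raise ValueError("program does not fit in ROM")
        # Pad the unused part of the ROM with zeros.
        self.rom = bytes(rom_bytes) + bytes(ROM_SIZE - len(rom_bytes))
        self.ram = bytearray(RAM_SIZE)

    def read(self, address):
        """What the chip selected by the address puts on the data bus (RD)."""
        address &= 0xFF  # only 8 address lines exist
        if address < ROM_SIZE:
            return self.rom[address]
        return self.ram[address - ROM_SIZE]

    def write(self, address, value):
        """Store the byte on the data bus (WR); writes to ROM have no effect."""
        address &= 0xFF
        if address >= ROM_SIZE:
            self.ram[address - ROM_SIZE] = value & 0xFF


memory = Memory([3, 1, 4, 128])
memory.write(128, 42)
print(memory.read(0), memory.read(128))  # prints: 3 42
```

The top address bit alone decides which chip responds, which is why splitting the 256-byte space in half is a convenient choice.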

RAM stands for random-access memory. RAM contains bytes of information, and the microprocessor can read or write to those bytes depending on whether the RD or WR line is signaled. One problem with today's RAM chips is that they forget everything once the power goes off. That is why the computer needs ROM.

By the way, nearly all computers contain some amount of ROM (it is possible to create a simple computer that contains no RAM, and many microcontrollers do this by placing a handful of RAM bytes on the processor chip itself, but it is generally impossible to create one that contains no ROM). On a PC, the ROM is called the BIOS (Basic Input/Output System). When the microprocessor starts, it begins executing instructions it finds in the BIOS. The BIOS instructions do things like test the hardware in the machine, and then the BIOS goes to the hard disk to fetch the boot sector (see How Hard Disks Work for details). This boot sector is another small program, and the BIOS stores it in RAM after reading it off the disk. The microprocessor then begins executing the boot sector's instructions from RAM. The boot sector program will tell the microprocessor to fetch something else from the hard disk into RAM, which the microprocessor then executes, and so on. This is how the microprocessor loads and executes the entire operating system.

HOW MICROPROCESSORS WORK

Even the incredibly simple microprocessor shown in the previous example will have a fairly large set of instructions that it can perform. The collection of instructions is implemented as bit patterns, each one of which has a different meaning when loaded into the instruction register. Humans are not particularly good at remembering bit patterns, so a set of short words are defined to represent the different bit patterns. This collection of words is called the assembly language of the processor. An assembler can translate the words into their bit patterns very easily, and then the output of the assembler is placed in memory for the microprocessor to execute.

Here's the set of assembly language instructions that the designer might create for the simple microprocessor in our example:

    LOADA mem  - Load register A from memory address
    LOADB mem  - Load register B from memory address
    CONB con   - Load a constant value into register B
    SAVEB mem  - Save register B to memory address
    SAVEC mem  - Save register C to memory address
    ADD        - Add A and B and store the result in C
    SUB        - Subtract A and B and store the result in C
    MUL        - Multiply A and B and store the result in C
    DIV        - Divide A and B and store the result in C
    COM        - Compare A and B and store the result in test
    JUMP addr  - Jump to an address
    JEQ addr   - Jump, if equal, to address
    JNEQ addr  - Jump, if not equal, to address
    JG addr    - Jump, if greater than, to address
    JGE addr   - Jump, if greater than or equal, to address
    JL addr    - Jump, if less than, to address
    JLE addr   - Jump, if less than or equal, to address
    STOP       - Stop execution

If you have read How C Programming Works, then you know that this simple piece of C code will calculate the factorial of 5:

    a = 1;
    f = 1;
    while (a <= 5)
    {
        f = f * a;
        a = a + 1;
    }

At the end of the program's execution, the variable f contains the factorial of 5.

ASSEMBLY LANGUAGE

A C compiler translates this C code into assembly language. Assuming that RAM starts at address 128 in this processor, and ROM (which contains the assembly language program) starts at address 0, then for our simple microprocessor the assembly language might look like this:

    // Assume a is at address 128
    // Assume f is at address 129
    0   CONB 1      // a=1;
    1   SAVEB 128
    2   CONB 1      // f=1;
    3   SAVEB 129
    4   LOADA 128   // if a > 5 then jump to 17
    5   CONB 5
    6   COM
    7   JG 17
    8   LOADA 129   // f=f*a;
    9   LOADB 128
    10  MUL
    11  SAVEC 129
    12  LOADA 128   // a=a+1;
    13  CONB 1
    14  ADD
    15  SAVEC 128
    16  JUMP 4      // loop back to if
    17  STOP
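To check that this listing really computes the factorial of 5, it can be run on a small interpreter. The Python sketch below is an illustration under stated assumptions, not a model of the hardware: the function name run is made up, only the instructions used by this program are implemented, arithmetic is assumed to wrap at 8 bits, and the test register is represented as the sign of A minus B.

```python
PROGRAM = """
CONB 1
SAVEB 128
CONB 1
SAVEB 129
LOADA 128
CONB 5
COM
JG 17
LOADA 129
LOADB 128
MUL
SAVEC 129
LOADA 128
CONB 1
ADD
SAVEC 128
JUMP 4
STOP
"""


def run(source):
    """Execute the listing and return the final contents of memory."""
    program = [line.split() for line in source.strip().splitlines()]
    mem = [0] * 256
    a = b = c = 0
    test = 0  # result of the last COM: -1, 0 or 1 (assumed encoding)
    pc = 0    # counts instructions, matching the listing's line numbers
    while True:
        op, *args = program[pc]
        arg = int(args[0]) if args else None
        pc += 1
        if op == "LOADA":
            a = mem[arg]
        elif op == "LOADB":
            b = mem[arg]
        elif op == "CONB":
            b = arg
        elif op == "SAVEB":
            mem[arg] = b
        elif op == "SAVEC":
            mem[arg] = c
        elif op == "ADD":
            c = (a + b) & 0xFF
        elif op == "MUL":
            c = (a * b) & 0xFF
        elif op == "COM":
            test = (a > b) - (a < b)
        elif op == "JG":
            if test > 0:
                pc = arg
        elif op == "JUMP":
            pc = arg
        elif op == "STOP":
            return mem
        else:
            raise ValueError(f"instruction not implemented in this sketch: {op}")


memory = run(PROGRAM)
print(memory[129])  # prints: 120
```

After the run, address 129 (f) holds 120 and address 128 (a) holds 6, the value that made the comparison at line 6 succeed and ended the loop.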

ROM

So now the question is, "How do all of these instructions look in ROM?" Each of these assembly language instructions must be represented by a binary number. For the sake of simplicity, let's assume each assembly language instruction is given a unique number, like this:

    LOADA      - 1
    LOADB      - 2
    CONB       - 3
    SAVEB      - 4
    SAVEC mem  - 5
    ADD        - 6
    SUB        - 7
    MUL        - 8
    DIV        - 9
    COM        - 10
    JUMP addr  - 11
    JEQ addr   - 12
    JNEQ addr  - 13
    JG addr    - 14
    JGE addr   - 15
    JL addr    - 16
    JLE addr   - 17
    STOP       - 18

The numbers are known as opcodes.

In ROM, our little program would look like this:

    // Assume a is at address 128
    // Assume f is at address 129
    Addr  opcode/value
    0     3     // CONB 1
    1     1
    2     4     // SAVEB 128
    3     128
    4     3     // CONB 1
    5     1
    6     4     // SAVEB 129
    7     129
    8     1     // LOADA 128
    9     128
    10    3     // CONB 5
    11    5
    12    10    // COM
    13    14    // JG 17
    14    31
    15    1     // LOADA 129
    16    129
    17    2     // LOADB 128
    18    128
    19    8     // MUL
    20    5     // SAVEC 129
    21    129
    22    1     // LOADA 128
    23    128
    24    3     // CONB 1
    25    1
    26    6     // ADD
    27    5     // SAVEC 128
    28    128
    29    11    // JUMP 4
    30    8
    31    18    // STOP

You can see that seven lines of C code became 18 lines of assembly language, and that became 32 bytes in ROM.
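The translation from assembly words to ROM bytes is mechanical enough to sketch as a tiny two-pass assembler. The Python below is a hedged illustration (the name assemble is made up for this example). It assumes each instruction takes one byte for its opcode plus one byte for its operand, if any, and that jump operands in the listing are instruction numbers that must be converted to byte addresses, which is what the ROM dump above shows (JG 17 is stored as 31, JUMP 4 as 8).

```python
OPCODES = {
    "LOADA": 1, "LOADB": 2, "CONB": 3, "SAVEB": 4, "SAVEC": 5, "ADD": 6,
    "SUB": 7, "MUL": 8, "DIV": 9, "COM": 10, "JUMP": 11, "JEQ": 12,
    "JNEQ": 13, "JG": 14, "JGE": 15, "JL": 16, "JLE": 17, "STOP": 18,
}
JUMPS = {"JUMP", "JEQ", "JNEQ", "JG", "JGE", "JL", "JLE"}

LISTING = [
    "CONB 1", "SAVEB 128", "CONB 1", "SAVEB 129", "LOADA 128", "CONB 5",
    "COM", "JG 17", "LOADA 129", "LOADB 128", "MUL", "SAVEC 129",
    "LOADA 128", "CONB 1", "ADD", "SAVEC 128", "JUMP 4", "STOP",
]


def assemble(lines):
    """Translate assembly lines into a list of ROM byte values."""
    parsed = [line.split() for line in lines]

    # Pass 1: record the byte address at which each instruction starts.
    addresses, next_address = [], 0
    for parts in parsed:
        addresses.append(next_address)
        next_address += len(parts)  # opcode byte + optional operand byte

    # Pass 2: emit opcodes and operands, fixing up jump targets.
    rom = []
    for op, *args in parsed:
        rom.append(OPCODES[op])
        if args:
            value = int(args[0])
            rom.append(addresses[value] if op in JUMPS else value)
    return rom


rom = assemble(LISTING)
print(len(rom))  # prints: 32
print(rom)
```

The output matches the ROM listing byte for byte, including the two fixed-up jump targets.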

DECODING

The instruction decoder needs to turn each of the opcodes into a set of signals that drive the different components inside the microprocessor. Let's take the ADD instruction as an example and look at what it needs to do:

1. During the first clock cycle, we need to actually load the instruction. Therefore the instruction decoder needs to:
   • activate the tri-state buffer for the program counter
   • activate the RD line
   • activate the data-in tri-state buffer
   • latch the instruction into the instruction register
2. During the second clock cycle, the ADD instruction is decoded. It needs to do very little:
   • set the operation of the ALU to addition
   • latch the output of the ALU into the C register
3. During the third clock cycle, the program counter is incremented (in theory this could be overlapped into the second clock cycle).

Every instruction can be broken down as a set of sequenced operations like these that manipulate the components of the microprocessor in the proper order. Some instructions, like this ADD instruction, might take two or three clock cycles. Others might take five or six clock cycles.

MICROPROCESSOR PERFORMANCE

The number of transistors available has a huge effect on the performance of a processor. As seen earlier, a typical instruction in a processor like an 8088 took 15 clock cycles to execute. Because of the design of the multiplier, it took approximately 80 cycles just to do one 16-bit multiplication on the 8088. With more transistors, much more powerful multipliers capable of single-cycle speeds become possible.

More transistors also allow for a technology called pipelining. In a pipelined architecture, instruction execution overlaps. So even though it might take five clock cycles to execute each instruction, there can be five instructions in various stages of execution simultaneously. That way it looks like one instruction completes every clock cycle.

Many modern processors have multiple instruction decoders, each with its own pipeline. This allows for multiple instruction streams, which means that more than one instruction can complete during each clock cycle. This technique can be quite complex to implement, so it takes lots of transistors.
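The effect of pipelining on total cycle count can be worked out with simple arithmetic. The sketch below assumes an ideal pipeline with no stalls or branch penalties, which real processors do not achieve; the function names are made up for this example.

```python
def cycles_unpipelined(n_instructions, stages):
    """Each instruction runs all of its stages before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions, stages):
    """Ideal pipeline: the first instruction needs `stages` cycles to finish,
    and every later instruction finishes one cycle after the previous one."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


n, stages = 1000, 5
print(cycles_unpipelined(n, stages))  # prints: 5000
print(cycles_pipelined(n, stages))    # prints: 1004
```

For long instruction sequences the pipelined count approaches one cycle per instruction, even though each individual instruction still spends five cycles in flight.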

MICROPROCESSOR TRENDS

The trend in processor design has primarily been toward full 32-bit ALUs with fast floating point processors built in and pipelined execution with multiple instruction streams. The newest thing in processor design is 64-bit ALUs, and people are expected to have these processors in their home PCs in the next decade. There has also been a tendency toward special instructions (like the MMX instructions) that make certain operations particularly efficient, and the addition of hardware virtual memory support and L1 caching on the processor chip. All of these trends push up the transistor count, leading to the multi-million transistor powerhouses available today. These processors can execute about one billion instructions per second!

64-BIT PROCESSOR

Sixty-four-bit processors have been with us since 1992, and in the 21st century they have started to become mainstream. Both Intel and AMD have introduced 64-bit chips, and the Mac G5 sports a 64-bit processor. Sixty-four-bit processors have 64-bit ALUs, 64-bit registers, 64-bit buses and so on.

One reason why the world needs 64-bit processors is because of their enlarged address spaces. Thirty-two-bit chips are often constrained to a maximum of 2 GB or 4 GB of RAM access. That sounds like a lot, given that most home computers currently use only 256 MB to 512 MB of RAM. However, a 4-GB limit can be a severe problem for server machines and machines running large databases. And even home machines will start bumping up against the 2 GB or 4 GB limit pretty soon if current trends continue. A 64-bit chip has none of these constraints because a 64-bit RAM address space is essentially infinite for the foreseeable future: 2^64 bytes of RAM is about 17 billion gigabytes of RAM.

With a 64-bit address bus and wide, high-speed data buses on the motherboard, 64-bit machines also offer faster I/O (input/output) speeds to things like hard disk drives and video cards. These features can greatly increase system performance.

Servers can definitely benefit from 64 bits, but what about normal users? Beyond the RAM solution, it is not clear that a 64-bit chip offers "normal users" any real, tangible benefits at the moment. Sixty-four-bit chips can process data (very complex data featuring lots of real numbers) faster. People doing video editing and people doing photographic editing on very large images benefit from this kind of computing power. High-end games will also benefit, once they are re-coded to take advantage of 64-bit features. But the average user who is reading e-mail, browsing the Web and editing Word documents is not really using the processor in that way.
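The address-space limits quoted here are just powers of two. A quick check in Python, taking "GB" to mean 2^30 bytes (an assumption about the units; the function name addressable_gb is made up for this example):

```python
GB = 2 ** 30  # bytes per gigabyte, using the binary convention


def addressable_gb(address_bits):
    """Number of gigabytes reachable with the given number of address bits."""
    return 2 ** address_bits // GB


for bits in (31, 32, 64):
    print(f"{bits}-bit addresses: {addressable_gb(bits):,} GB")
```

This prints 2 GB for 31 bits, 4 GB for 32 bits and 17,179,869,184 GB for 64 bits. The 2 GB figure corresponds to systems that can use only 31 of the 32 address bits for a program's memory.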

CACHE

A cache is a small amount of memory which operates more quickly than main memory. Data is moved from the main memory to the cache, so that it can be accessed faster. Modern chip designers put several caches on the same die as the processor, and designers often allocate more die area to caches than to the CPU itself. Increasing chip performance is typically achieved by increasing the speed and efficiency of chip cache.

Cache works by storing a small subset of the external memory contents, typically out of its original order. Data and instructions that are being used frequently, such as a data array or a small instruction loop, are stored in the cache and can be read quickly without having to access the main memory. Cache runs at the same speed as the rest of the processor, which is typically much faster than the speed of the external RAM. This means that if data is in the cache, accessing it is faster than accessing memory.

Cache helps to speed up processors because it works on the principle of locality.

Principle of Locality

There are two types of locality, spatial and temporal. Modern computer programs are typically loop-based, and therefore we have two rules about locality:

Spatial Locality
When a data item is accessed, it is likely that data items in sequential memory locations will also be accessed. Consider the traversal of an array, or the act of storing local variables on a stack. In these cases, when one data item is accessed, it is a good idea to load the surrounding memory area into the cache at the same time.

Temporal Locality
When a data item is accessed, it is likely that the same data item will be accessed again. For instance, variables are typically read and written to in rapid succession. It is a good idea to keep recently used items in the cache, and not overwrite data that has been recently used.
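Both kinds of locality show up clearly in a toy cache simulation. The Python sketch below models a direct-mapped cache that loads a whole block of neighbouring addresses on each miss. The cache organisation, the sizes (8 lines of 4 bytes) and the function name hit_rate are all assumptions chosen for illustration, not a description of any particular processor.

```python
def hit_rate(addresses, num_lines=8, block_size=4):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    lines = [None] * num_lines  # block number currently held by each line
    hits = 0
    for address in addresses:
        block = address // block_size
        index = block % num_lines
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block  # miss: load the whole surrounding block
    return hits / len(addresses)


sequential = list(range(64))             # array traversal (spatial locality)
repeated = [0, 1, 2, 3] * 16             # same variables reused (temporal)
scattered = [i * 32 for i in range(64)]  # far-apart addresses, no locality

print(hit_rate(sequential))  # prints: 0.75
print(hit_rate(repeated))    # prints: 0.984375
print(hit_rate(scattered))   # prints: 0.0
```

The sequential traversal misses once per 4-byte block and then hits three times, the repeated accesses miss only on the very first access, and the scattered pattern never finds its data in the cache.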

There are a number of factors that affect the size of cache on a chip:

1. Moore's law provides an increasing number of transistors per chip. After around 1989, more transistors are available per chip than a designer can use to make a CPU. These extra transistors are easily converted to large caches.
2. Processor components become smaller as transistors become smaller. This means there is more area on the die for additional cache.
3. More cache means fewer delays in accessing data, and therefore better performance.

Because of these factors, chip caches tend to get larger and larger with each generation of chip.

If the cache misses, the processor will need to stall the current instruction until the cache can fetch the correct data from a higher level. The amount of time lost by the stall is dependent on a number of factors. The number of memory accesses in a particular program is denoted as A_m; some of those accesses will hit the cache, and the rest will miss the cache. The rate of misses, equal to the probability that any particular access will miss, is denoted r_m. The average amount of time lost for each miss is known as the miss penalty, and is denoted as P_m. We can calculate the amount of time wasted because of cache miss stalls, written here as T_m, as:

    T_m = A_m \times r_m \times P_m

Likewise, if we have the total number of instructions in a program, N, and the average number of misses per instruction, MPI, we can calculate the lost time as:

    T_m = N \times MPI \times P_m

If instead of lost time we measure the miss penalty in the amount of lost cycles, the calculation will instead produce the number of cycles lost to memory stalls, instead of the amount of time lost to memory stalls.

Read Stall Times

To calculate the amount of time lost to cache read misses, we can perform the same basic calculations as above:

    T_r = A_r \times r_r \times P_r

A_r is the average number of read accesses, r_r is the miss rate on reads, and P_r is the time or cycle penalty associated with a read miss.

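Write Stall Times

Determining the amount of time lost to write stalls is similar, but an additional additive term that represents stalls in the write buffer needs to be included:

    T_w = A_w \times r_w \times P_w + T_{wb}

Here A_w, r_w and P_w are the number of write accesses, the miss rate on writes and the write miss penalty, and T_{wb} is the amount of time lost because of stalls in the write buffer.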
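The stall-time calculations can be tried out with a few lines of Python. This sketch assumes each stall time is the product of the number of accesses, the miss rate and the miss penalty, with the write-buffer term added for writes. The function names and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def miss_stall(accesses, miss_rate, miss_penalty):
    """Time (or cycles) lost to misses: accesses x miss rate x penalty."""
    return accesses * miss_rate * miss_penalty


def miss_stall_from_mpi(instructions, misses_per_instruction, miss_penalty):
    """Same quantity, starting from instruction count and MPI."""
    return instructions * misses_per_instruction * miss_penalty


def write_stall(write_accesses, write_miss_rate, write_penalty, write_buffer_stall):
    """Write stalls add the time lost to stalls in the write buffer."""
    return miss_stall(write_accesses, write_miss_rate, write_penalty) + write_buffer_stall


# Hypothetical program: 1,000,000 reads with a 2% miss rate, 200,000 writes
# with a 5% miss rate, a 100-cycle miss penalty, 5,000 cycles of write-buffer stalls.
read_cycles = miss_stall(1_000_000, 0.02, 100)
write_cycles = write_stall(200_000, 0.05, 100, 5_000)
print(round(read_cycles), round(write_cycles))  # prints: 2000000 1005000
```

Because the penalty is given in cycles here, the results are cycles lost rather than seconds; dividing by the clock rate would convert them to time.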

LIST OF INTEL MICROPROCESSORS

1 The 4-bit processors
   1.1 Intel 4004: first single-chip microprocessor
   1.2 4040
2 The 8-bit processors
   2.1 8008
   2.2 8080
   2.3 8085
3 Microcontrollers
   3.1 Intel 8048
   3.2 MCS-48 Family
   3.3 Intel 8051
   3.4 MCS-51 Family
   3.5 MCS-96 Family
4 The bit-slice processor
   4.1 3000 Family
5 iPLDs: Intel Programmable Logic Devices
   5.1 PLDs Family
6 Signal Processor
   6.1 2900 Family
7 Digital Clocks Processor
   7.1 5000 Family
8 The 16-bit processors: origin of x86
   8.1 8086
   8.2 8088
   8.3 MCS-86 Family
   8.4 80186
   8.5 80188
   8.6 80286
9 32-bit processors: the non-x86 microprocessors
   9.1 iAPX 432
   9.2 i960 aka 80960
   9.3 i860 aka 80860
   9.4 XScale
10 32-bit processors: the 80386 range
   10.1 80386DX
   10.2 80386SX
   10.3 80376
   10.4 80386SL
   10.5 80386EX
11 32-bit processors: the 80486 range
   11.1 80486DX
   11.2 80486SX
   11.3 80486DX2
   11.4 80486SL
   11.5 80486DX4
12 32-bit processors: P5 microarchitecture
   12.1 Original Pentium
   12.2 Pentium with MMX Technology
13 32-bit processors: P6/Pentium M microarchitecture
   13.1 Pentium Pro
   13.2 Pentium II
   13.3 Celeron (Pentium II-based)
   13.4 Pentium III
   13.5 Pentium II and III Xeon
   13.6 Celeron (Pentium III Coppermine-based)
   13.7 Celeron (Pentium III Tualatin-based)
   13.8 Pentium M
   13.9 Celeron M
   13.10 Intel Core
   13.11 Dual-Core Xeon LV
14 32-bit processors: NetBurst microarchitecture
   14.1 Pentium 4
   14.2 Xeon
   14.3 Mobile Pentium 4-M
   14.4 Pentium 4 EE
   14.5 Pentium 4E
   14.6 Pentium 4F
15 64-bit processors: IA-64
   15.1 Itanium
   15.2 Itanium 2
16 64-bit processors: Intel 64 NetBurst microarchitecture
   16.1 Pentium 4F
   16.2 Pentium D
   16.3 Pentium Extreme Edition
   16.4 Xeon
17 64-bit processors: Intel 64 Core microarchitecture
   17.1 Xeon
   17.2 Intel Core 2
   17.3 Pentium Dual Core
   17.4 Celeron
   17.5 Celeron M
18 32-bit processors: Intel 32 Intel Atom
19 64-bit processors: Intel 64 Nehalem microarchitecture
   19.1 Intel Pentium
   19.2 Core i3
   19.3 Core i5
   19.4 Core i7
   19.5 Xeon