
Advanced Microprocessors

Intel 486 and Pentium

Features of Intel 486
• It is a 32-bit µp introduced in 1989, therefore:
  • Size of data bus is 32 bits
  • Size of address bus is 32 bits
• It is the second generation of 80x86 32-bit processors.
• Two versions, viz. SX and DX:
  • SX: math coprocessor is separate (not integrated on the chip)
  • DX: integrated math coprocessor using VLSI technology
• First chip in the 80x86 family with an on-chip floating point unit (coprocessor)
• Since the address bus is 32 bits wide, maximum physical memory accessible = 2^32 bytes = 4 GB
• Maximum virtual memory accessible = 2^46 bytes = 64 TB
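The memory limits above follow directly from the number of address bits. A minimal sketch of that arithmetic, assuming byte-addressable memory; the helper names `addressable_bytes` and `human_readable` are made up for illustration:

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_readable(size: int) -> str:
    """Express a byte count in the largest binary unit that divides it cleanly."""
    units = ["B", "KB", "MB", "GB", "TB"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024
        index += 1
    return f"{size} {units[index]}"


print(human_readable(addressable_bytes(32)))  # physical: 4 GB
print(human_readable(addressable_bytes(46)))  # virtual: 64 TB
```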

BY UBAID SAUDAGAR


Features of Intel 486 cont…
• Can execute 22 MIPS
• It is 3 to 5 times faster than the 80386.
• It is a 168-pin IC and can be operated at 25, 33, 50, 66, 75, and 100 MHz clock.
• 5-level pipelining is introduced, i.e. fetch, decode1, decode2, execute, register write back (writing the result).
• Size of prefetch queue is 32 bytes.
  Note: As the prefetch queue size increases, performance increases.
• Level 1 cache of 8 KB also added, i.e. inside the processor (first processor in the family to add on-chip cache)
• Level 2 cache added, i.e. external cache memory
• Complete MMU on chip.
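To see why a 5-stage pipeline helps, the idealized cycle count can be worked out as below. This is only a sketch under simplifying assumptions that are not in the slides: every stage takes exactly one clock and there are no stalls or flushes. The function names are hypothetical.

```python
# The five stages named in the slide, in order.
STAGES = ["fetch", "decode1", "decode2", "execute", "register write back"]


def unpipelined_cycles(instructions: int, stages: int = len(STAGES)) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int = len(STAGES)) -> int:
    """The first instruction needs `stages` clocks; each later one finishes
    one clock after its predecessor (ideal case, no stalls)."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


print(unpipelined_cycles(10))  # 50
print(pipelined_cycles(10))    # 14
```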

8085, 8086, 80186, 80286, 80386, 80486, why not 80586 but Pentium?
In the beginning AMD was a licensed second-source manufacturer for Intel, responsible for producing Intel-designed processors. After Intel realized the growing market in the processor design field, it started producing the processors on its own, and AMD got separated. As Intel launched its own processors 8086, 80186, 80286, 80386, 80486, AMD also launched similar processors named AMD 8086, AMD 80186, AMD 80286, AMD 80386, AMD 80486. A number could not be protected as a trademark, so to prevent this Intel decided to give the next chip a name it could protect and called it Pentium.

Features of Pentium
• It is a 32-bit microprocessor introduced in the year 1993.
• Size of data bus is 64 bits.
• Q: Pentium data bus size is greater than processor size (ALU size). This does not mean that it can execute 64-bit applications, as the register size is just the same, i.e. 32 bits. 64 bits of data can be transferred at a time, while 32 bits of data can be processed at a time.
• It is a 273-pin PGA (pin grid array).
• Super scalar architecture (dual 5-stage pipeline), i.e. ‘u’ pipeline and ‘v’ pipeline
  • This allows it to complete more than one instruction per clock cycle.
  • ‘u’ pipeline can handle any instruction, whereas ‘v’ pipeline can handle only simple, most common instructions.

Features of Pentium cont…
• Each pipeline has a separate 64 bytes of prefetch queue.
• Branch prediction logic is an intelligent system in Pentium which identifies the branch instructions in order to avoid flushing of bytes from the prefetch queue.
• Level 1 cache (dedicated cache, i.e. 8 KB data and 8 KB code (instruction))
• Level 2 cache outside
• Available at the clock speeds of 60, 66 MHz but later available at 90, 100, 120, 150, 200, 233 MHz
• Numeric data processor (coprocessor) is present on chip with the help of VLSI technology.
• Size of address bus is 32 bits. Therefore size of physical memory accessed = 2^32 bytes = 4 GB
• Accessible virtual memory is 64 TB

80486 and Pentium

PENTIUM
• Super scalar architecture
• Two pipelines, ‘u’ and ‘v’, both 5 stages
• Branch prediction logic used
• Two caches used, viz. data cache and code cache (instruction cache)
• 64-bit data bus
• Two instructions can be executed in parallel

80486
• Does not support super scalar architecture
• Single 5-stage pipeline
• Does not support branch prediction logic
• Unified cache used for both code and data
• 32-bit data bus
• Two instructions cannot be executed in parallel

Super Scalar Execution
With the help of super scalar execution the Pentium processor can execute 2 instructions simultaneously. Processors able to execute instructions in parallel are known as super scalar machines.
The Pentium processor is a superscalar machine, built around two general purpose integer pipelines. Both pipelines operate in parallel, allowing integer instructions to execute in a single clock in each pipeline. The process of issuing two instructions in parallel is termed “pairing.” The u-pipe can execute any instruction in the Intel architecture, whereas the v-pipe can execute “simple” instructions.
‘u’ pipeline is the primary pipeline. When two instructions are dependent on each other they cannot be executed simultaneously, hence the ‘u’ pipeline is given the preference.

Super Scalar Execution cont…
Instructions which are dependent on each other cannot be executed simultaneously, i.e. in parallel.
Eg:
    A = B + C
    C = A + B
In the above instructions, we notice that after the first equation the value of A changes. In the second equation C is dependent on A. Hence the two equations are dependent on each other and therefore cannot be executed simultaneously.
Eg:
    ADD AX, BX
    ADD AX, CX
In the above instructions we notice that the source operands are different but the destination operand is the same, i.e. AX; therefore even these cannot be executed in parallel.
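The two examples above can be checked mechanically. The following is a minimal sketch of a register-dependency test only; the `Instr` record and `can_pair` function are hypothetical names, and the real Pentium pairing rules include further conditions (such as the ‘v’ pipe accepting only simple instructions) that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str       # the instruction as written, for display only
    dest: str       # register or variable that is written
    sources: tuple  # registers or variables that are read


def can_pair(first: Instr, second: Instr) -> bool:
    """Return True if `second` has no register dependency on `first`."""
    read_after_write = first.dest in second.sources   # second reads first's result
    write_after_write = first.dest == second.dest     # both write the same place
    return not (read_after_write or write_after_write)


# A = B + C followed by C = A + B: the second reads A, which the first writes.
print(can_pair(Instr("A = B + C", "A", ("B", "C")),
               Instr("C = A + B", "C", ("A", "B"))))      # False

# ADD AX, BX followed by ADD AX, CX: both read and write AX.
print(can_pair(Instr("ADD AX, BX", "AX", ("AX", "BX")),
               Instr("ADD AX, CX", "AX", ("AX", "CX"))))  # False

# Independent registers: no dependency, so the pair could issue together.
print(can_pair(Instr("ADD AX, BX", "AX", ("AX", "BX")),
               Instr("ADD CX, DX", "CX", ("CX", "DX"))))  # True
```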

Super Scalar Execution cont…
[Figure: the instruction stream enters the Fetch stage and is then issued to two parallel pipelines.
 ‘u’ pipeline: D1, D2, Execute, RWB
 ‘v’ pipeline: D1, D2, Execute, RWB]

Super scalar execution cont…
1st stage: (prefetch)
It consists of 2 prefetch queues of 64 bytes each, and only one is active at a time. At each clock cycle it collects two instructions either from the code cache or from the external memory. If one of the fetched instructions is a jump, the Branch Target Buffer is activated, which predicts whether the branch is going to be taken or not. It gives the address from where to fetch the following instruction.
2nd stage: (D1)
• Generation of control word.
• Checks whether there is any data dependency.
• Barrel shifter is used to provide a fast shift mechanism for multiplication and division. Only the ‘u’ pipe has a barrel shifter. The entire 32 bits are shifted in one clock cycle; otherwise shifting takes 1 bit/cycle.
• So if a complex instruction enters D1 of the ‘v’ pipe, it cannot be executed by ALU ‘v’; hence D1 decides whether to forward such instructions or not.

Super scalar execution cont…
Stage 3: (D2)
• Control word is again decoded for final execution.
• Complete decode (identifies operand)
• Responsible for address calculation
Stage 4: (Execute)
‘u’ has a microprogrammed control unit whereas ‘v’ has a hardwired control unit. A hardwired control unit is faster than a microprogrammed one, but a microprogrammed control unit can handle complex instructions.

Super scalar execution cont…
5th stage: (RWB)
It writes back the instruction results to the destination registers.
If I1 and I2 are pairable then they will enter D1 ‘u’ and D1 ‘v’ simultaneously. They will also enter ALU ‘u’ and ALU ‘v’ simultaneously. But the execution time differs: I2 executes faster than I1. Although execution in the ‘v’ pipeline finishes first, it is not allowed to write its answer to the register until the ‘u’ pipeline answer is written, because ‘u’ is the primary pipeline. So the answer of I1, which comes from the ‘u’ pipe, is written first.

Pentium system architecture
[Block diagram. Labeled blocks: Code Cache; Branch prediction; Prefetch Buffers (2 buffers of 32 bytes each); U pipe Integer ALU (32 bits); V pipe Integer ALU (32 bits); Register Set; Floating point unit (Multiplier, Adder, Divider); Data Cache. One path is labeled 64 bits.]

Importance of Data and code cache
The cache is a high speed RAM (access time less than 10 ns) which is used to speed up access to memory and reduce traffic on the processor's buses. On-chip cache is used to feed instructions and data to the CPU's pipeline.
When an instruction or data is required from the main memory, the on-chip cache will be searched first. If the instruction or data is found in the cache then a copy is sent to the pipeline directly. If it is not available in the internal cache then the external cache is searched. If it is not found there either, then external memory is accessed.
The cache in the Pentium has been changed from the one found in the 80486 microprocessor. The Pentium contains two 8K byte cache memories instead of one as in the case of the 80486. There is an 8K byte instruction cache and an 8K byte data cache. The instruction cache stores only instructions, while the data cache stores data used by instructions.
The problem faced in the 80486 was that it had a unified cache: a program that was data intensive filled the cache very quickly, leaving very little space for instructions. This slowed down the execution speed of the 80486. But due to the separate caches in the Pentium this problem cannot occur.
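The search order described above (on-chip cache, then external cache, then main memory) can be sketched as follows. This is an illustration under stated assumptions, not the actual hardware behavior: caches are modeled as plain dictionaries with no size limit or replacement policy, copying a value into the caches on a miss is an assumption added here, and the function name `fetch` is hypothetical.

```python
def fetch(address, on_chip_cache, external_cache, main_memory):
    """Return (value, where_found), searching the levels in the order
    the slide describes."""
    if address in on_chip_cache:
        return on_chip_cache[address], "on-chip cache"
    if address in external_cache:
        value = external_cache[address]
        on_chip_cache[address] = value       # assumption: fill the inner level
        return value, "external cache"
    value = main_memory[address]
    external_cache[address] = value          # assumption: fill both levels
    on_chip_cache[address] = value
    return value, "main memory"


main_memory = {0x100: 42}
on_chip, external = {}, {}

print(fetch(0x100, on_chip, external, main_memory))  # (42, 'main memory')
print(fetch(0x100, on_chip, external, main_memory))  # (42, 'on-chip cache')
```

Keeping two separate `on_chip_cache` dictionaries, one for instructions and one for data, would mirror the Pentium's split cache: data accesses could then never push instructions out.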

Importance of code and data cache cont…
[Diagram: inside the CPU, the pipeline receives instructions and data from the on-chip cache; the on-chip cache is backed by the external cache, which in turn is backed by main memory.]

Branch prediction logic
Need: Whenever a jump occurs without branch prediction, we have to flush all the instructions in the pipeline as well as the prefetch queue (64 bytes). To avoid flushing such a large number of bytes, we need branch prediction logic which decides from where to fetch the next instruction.
Method: Pentium uses branch prediction logic implemented with a branch target buffer (BTB). The BTB has 256 entries and is 4-way set associative (i.e. 4 banks of 64 entries).
Imp: Prediction is based on the history of the instruction and not on the instruction itself.

Branch prediction logic cont…
The BTB stores a look-up table with two columns: source address and target address.
Assume a jump instruction comes in the ‘u’ pipeline, say
    3000 : JUMP 8000
Now the branch prediction logic checks if the source address is present in the BTB. If the instruction is coming for the first time then there is no history. In this case we take the decision that there is no jump (therefore we notice that branch prediction logic does not depend upon the instruction but upon its history). Now fetching will continue sequentially after the source address. Whether the jump is actually taken or not will be known only at the time of execution. If the prediction is wrong then the pipeline has to be flushed.

Branch prediction logic cont…
But in Pentium, rather than flushing the pipeline, the active prefetch queue is disabled and the second queue is used; the newly activated queue starts fetching from the target address. (This happens when it is assumed that there is no jump and the jump actually takes place, which the µp discovers in the execution stage.) Hence queue A gets deactivated, queue B gets activated, and fetching now proceeds from the target address.
Now a history entry is added in the BTB. If the same instruction comes a second time, the branch prediction logic will check the BTB; this time the entry is available, hence the next instructions will be fetched directly from the target address.
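The behavior walked through above, with no history on the first encounter and a BTB hit on the second, can be sketched as a small look-up table. This is a simplified illustration, not the Pentium's actual mechanism: the 256-entry, 4-way set-associative organization and any eviction are ignored, the class and function names are hypothetical, and the fall-through address 3001 is an assumed value chosen only for the example.

```python
class BranchTargetBuffer:
    """Look-up table from the source address of a jump to its target address."""

    def __init__(self):
        self.table = {}

    def predict(self, source):
        """Return the stored target, or None when there is no history
        (which the slides treat as 'predict no jump')."""
        return self.table.get(source)

    def record_taken(self, source, target):
        """Add history once the execute stage finds the jump was taken."""
        self.table[source] = target


def next_fetch_address(btb, source, fall_through):
    """Address the prefetcher should fetch from after the instruction at `source`."""
    target = btb.predict(source)
    return fall_through if target is None else target


btb = BranchTargetBuffer()

# First encounter of "3000 : JUMP 8000": no history, so fetching continues sequentially.
print(next_fetch_address(btb, 3000, fall_through=3001))  # 3001

# Execution reveals the jump was taken, so history is added.
btb.record_taken(3000, 8000)

# Second encounter: the BTB supplies the target address directly.
print(next_fetch_address(btb, 3000, fall_through=3001))  # 8000
```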

Floating point exceptions
In the Intel486 and Pentium processors, the FPU is built into the same chip as the processor, which allows further increases in speed. Also, more enhancements and speedup features have been added to the corresponding FPUs.
MS-DOS compatibility for exception (interrupt) handling has also been built in, with the NE bit in control register CR0 selecting the MS-DOS compatible mode if made zero. (NE=1 selects the native or internal mode, which generates Interrupt 16, which is the same as the native version of exception handling for the 80286 and 80287 and the Intel386 processors and Intel 387 math coprocessor.)
In MS-DOS compatible mode, the FERR# (Floating-point ERROR) output replaces the ERROR# signal from the previous generations, and is connected to a PIC. A new input signal, IGNNE# (IGNORE Numeric Error), is provided to allow the FPU exception handler to execute FPU instructions.
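The mode selection by the NE bit can be illustrated with a small bit test. This is only a sketch that works on an integer standing in for the CR0 value; NE is taken to be bit 5 of CR0, and the function name `fpu_exception_mode` and the returned strings are made up for illustration.

```python
CR0_NE = 1 << 5  # NE (Numeric Error) flag, taken here as bit 5 of CR0


def fpu_exception_mode(cr0: int) -> str:
    """Describe which FPU exception reporting mode a CR0 value selects."""
    if cr0 & CR0_NE:
        return "native mode: Interrupt 16"
    return "MS-DOS compatible mode: FERR# routed through the PIC"


print(fpu_exception_mode(0x00000000))  # NE = 0
print(fpu_exception_mode(CR0_NE))      # NE = 1
```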