Advanced Microprocessors

Intel 486 and Pentium
Features of Intel 486

- It is a 32-bit microprocessor (µP) introduced in 1989.
- Size of data bus is 32 bits.
- Size of address bus is 32 bits.
- It is the second generation of 80x86 32-bit processors.
- Two versions, viz. SX and DX:
  - SX: no usable on-chip math coprocessor (a separate 80487SX coprocessor is needed).
  - DX: math coprocessor integrated on the chip using VLSI technology.
- First 80x86 chip with an on-chip floating point unit (coprocessor).
- Since the address bus is 32 bits wide, maximum physical memory accessible = 2^32 bytes = 4 GB.
- Maximum virtual memory accessible = 2^46 bytes = 64 TB.
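The memory limits quoted above follow directly from the address widths. A minimal sketch of that arithmetic, where the constant and variable names are chosen only for this illustration:

```python
# Address-space sizes implied by the widths quoted in the notes.
PHYSICAL_ADDRESS_BITS = 32  # width of the address bus
VIRTUAL_ADDRESS_BITS = 46   # virtual address width quoted in the notes

GB = 2 ** 30  # bytes in one gigabyte (binary)
TB = 2 ** 40  # bytes in one terabyte (binary)

# Each extra address bit doubles the number of addressable bytes.
physical_gb = 2 ** PHYSICAL_ADDRESS_BITS // GB
virtual_tb = 2 ** VIRTUAL_ADDRESS_BITS // TB

print(physical_gb, "GB physical,", virtual_tb, "TB virtual")
```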
BY UBAID SAUDAGAR
Features of Intel 486 cont.

- Can execute 22 MIPS.
- It is 3 to 5 times faster than the 80386.
- It is a 168-pin IC and can be operated at 25, 33, 50, 66, 75, and 100 MHz clocks.
- 5-stage pipelining is introduced, i.e. fetch, decode 1, decode 2, execute, register write back (writing the result).
- Size of prefetch queue is 32 bytes. Note: a larger prefetch queue generally improves performance, because the execution unit waits less often for instruction fetches.
- Level 1 cache of 8 KB added inside the processor (first 80x86 processor with an on-chip cache).
- Level 2 cache supported, i.e. external cache memory.
- Complete MMU on chip.
8085, 8086, 80186, 80286, 80386, 80486: why not 80586 but Pentium?
In the early years AMD was a licensed second-source manufacturer of Intel's processors; it was never a part of the Intel organization. As the processor market grew, Intel stopped sharing its newer designs and the second-source arrangement ended.
Intel launched its processors as 8086, 80186, 80286, 80386 and 80486. AMD sold compatible processors under similar numbers, such as Am286, Am386 and Am486.
A number alone could not be protected as a trademark, so Intel gave the next chip a name it could trademark: Pentium.
Features of Pentium
- It is a 32-bit microprocessor introduced in 1993.
- Size of data bus is 64 bits, i.e. the Pentium data bus is wider than the processor size (ALU size).
- Superscalar architecture (dual 5-stage pipelines), i.e. u pipeline and v pipeline. This allows it to complete more than one instruction per clock cycle.
- The u pipeline can handle any instruction, whereas the v pipeline can handle only simple, most common instructions.
- The 64-bit data bus does not mean that it can execute 64-bit applications, as the register size is still 32 bits. It means that 64 bits of data can be transferred in the time a 32-bit bus transfers 32 bits.
- It is a 273-pin PGA (pin grid array).
Features of Pentium cont.
- There are two prefetch queues of 64 bytes each.
- Branch prediction logic is an intelligent system in the Pentium which identifies branch instructions and predicts their outcome, in order to avoid flushing bytes from the prefetch queue.
- Numeric data processor (coprocessor) is present on chip with the help of VLSI technology.
- Level 2 cache is outside the processor.
- Level 1 cache is dedicated, i.e. 8 KB for data and 8 KB for code (instructions).
- Available at clock speeds of 60 and 66 MHz initially, and later at 90, 100, 120, 150, 200 and 233 MHz.
- Size of address bus is 32 bits. Therefore maximum physical memory accessible = 2^32 bytes = 4 GB.
- Accessible virtual memory is 64 TB.
80486 and Pentium

PENTIUM                                             | 80486
Superscalar architecture                            | Does not support superscalar architecture
Two pipelines (u and v), both 5 stages              | Single 5-stage pipeline
Branch prediction logic used                        | Does not support branch prediction logic
Two caches used, viz. data cache and code cache     | Unified cache used for both code and data
64-bit data bus                                     | 32-bit data bus
Two instructions can be executed in parallel        | Two instructions cannot be executed in parallel
Super Scalar Execution

With the help of superscalar execution the Pentium processor can execute 2 instructions simultaneously. The u pipeline is the primary pipeline. When two instructions depend on each other they cannot be executed simultaneously; in that case the u pipeline is given preference and the first instruction issues alone in it.
Processors able to execute instructions in parallel are known as superscalar machines.
The Pentium processor is a superscalar machine, built around two general purpose integer pipelines. Both pipelines operate in parallel, allowing integer instructions to execute in a single clock in each pipeline.
The process of issuing two instructions in parallel is termed pairing. The u-pipe can execute any instruction in the Intel architecture, whereas the v-pipe can execute simple instructions.
Super Scalar Execution cont...

Instructions which are dependent on each other cannot be executed simultaneously, i.e. in parallel.
Eg:
    ADD AX, BX
    ADD AX, CX
In the above instructions the source operands are different (BX and CX) but the destination operand is the same, i.e. AX. The second ADD needs the result of the first, therefore they cannot be executed in parallel.
Eg:
    A = B + C
    C = A + B
In the above instructions, the value of A changes after the first equation. In the second equation C depends on A. Hence the two equations are dependent on each other and cannot be executed simultaneously.
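The dependency rule in these examples can be sketched as a small check. This is a simplified illustration only: the `Instr` type and `can_pair` function are hypothetical names, and the real Pentium pairing rules include further conditions (for example, both instructions must be simple) that are not modelled here.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: str                 # register written by the instruction
    sources: Tuple[str, ...]  # registers read by the instruction


def can_pair(first: Instr, second: Instr) -> bool:
    """Return True if `second` has no register dependency on `first`."""
    # The second instruction reads what the first one writes.
    read_after_write = first.dest in second.sources
    # Both instructions write the same register.
    write_after_write = first.dest == second.dest
    return not (read_after_write or write_after_write)


add_ax_bx = Instr("ADD AX, BX", dest="AX", sources=("AX", "BX"))
add_ax_cx = Instr("ADD AX, CX", dest="AX", sources=("AX", "CX"))
mov_dx_si = Instr("MOV DX, SI", dest="DX", sources=("SI",))

print(can_pair(add_ax_bx, add_ax_cx))  # both use AX, so not pairable
print(can_pair(add_ax_bx, mov_dx_si))  # no shared registers
```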
Super Scalar Execution cont

Instruction stream -> Fetch -> u pipeline: D1 -> D2 -> Execute -> RWB
                            -> v pipeline: D1 -> D2 -> Execute -> RWB
Super scalar execution cont

1st stage (prefetch):
- It consists of 2 prefetch queues of 64 bytes each, only one of which is active at a time.
- At each clock cycle it collects two instructions, either from the code cache or from external memory.
- If one of the fetched instructions is a jump, the Branch Target Buffer is activated, which predicts whether the branch is going to be taken or not. It gives the address from which to fetch the following instruction.
2nd stage (D1):
- Generation of control word.
- Checks whether there is any data dependency between the two instructions.
- A barrel shifter provides a fast shift mechanism for multiplication and division. The entire 32 bits are shifted in one clock cycle; without it, shifting proceeds at 1 bit per cycle. Only the u pipe has a barrel shifter.
- So if a complex instruction enters D1 of the v pipe, it cannot be executed by the v ALU; hence D1 decides whether to forward such instructions or not.

3rd stage (D2):
- Control word is decoded again for final execution.
- Complete decode (identifies operands).
- Responsible for address calculation.
4th stage (Execute):
- u has a microprogrammed control unit whereas v has a hardwired control unit.
- A hardwired control unit is faster than a microprogrammed one, but a microprogrammed control unit can handle complex instructions.

5th stage (RWB):
- It writes back the instruction results to the destination registers.
- Although the v pipeline may finish execution first, it is not allowed to write its result to the register until the u pipeline result is written, because the u pipeline is primary.
- If I1 and I2 are pairable then they enter D1 u and D1 v simultaneously. They also enter ALU u and ALU v simultaneously, but the execution time differs: I2 executes faster than I1. Even so, the result of I1, which comes from the u pipe, is written first.
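The write-back ordering rule can be sketched as a small timing model. The function name, the one-cycle write-back latency, and the choice to let the v result be written in the same cycle as the u result (with u ordered first) are assumptions made for this illustration, not details taken from the notes.

```python
from typing import Tuple


def write_back_cycles(u_exec_done: int, v_exec_done: int) -> Tuple[int, int]:
    """Return (u_write_cycle, v_write_cycle) for one paired issue.

    Assumption: write back takes place in the cycle after execution ends.
    """
    u_write = u_exec_done + 1
    # The v result is held back until the u result has been written.
    v_write = max(v_exec_done + 1, u_write)
    return u_write, v_write


# I2 (v pipe) finishes executing in cycle 4, I1 (u pipe) in cycle 5:
# the v result still waits for the u result.
print(write_back_cycles(u_exec_done=5, v_exec_done=4))
```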
Pentium system architecture

[Figure: Pentium block diagram. Blocks shown: code cache, branch prediction, prefetch buffers (2 buffers of 32 bytes each), U pipe with 32-bit integer ALU, V pipe with 32-bit integer ALU, register set, 64-bit data path, floating point unit (multiplier, adder, divider), data cache.]
Importance of Data and code cache

The cache is a high speed RAM (access time less than 10 ns) which is used to speed up access to memory and reduce traffic on the processor's buses.
The on-chip cache is used to feed instructions and data to the CPU's pipeline.
When an instruction or data is required from main memory, the on-chip cache is searched first. If the instruction or data is found in the cache then a copy is sent to the pipeline directly. If it is not available in the internal cache then the external cache is searched. If it is not found there either, external memory is accessed.
The cache in the Pentium has been changed from the one found in the 80486 microprocessor. The Pentium contains two 8 KB cache memories instead of one as in the 80486: an 8 KB instruction cache and an 8 KB data cache. The instruction cache stores only instructions, while the data cache stores data used by instructions.
The problem faced in the 80486 was its unified cache: a data-intensive program filled the cache with data very quickly, leaving very little space for instructions. This slowed down the execution speed of the 80486. With separate caches in the Pentium, data can no longer crowd instructions out of the cache.
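The search order described above (on-chip cache, then external cache, then main memory) can be sketched as follows. The caches are modelled as plain dictionaries with no size limit or eviction, and filling both caches on a miss is an assumption of this sketch; `fetch` is a hypothetical name.

```python
def fetch(address, l1_cache, l2_cache, main_memory):
    """Return (value, level_that_supplied_it) for one memory access."""
    if address in l1_cache:
        return l1_cache[address], "on-chip cache"
    if address in l2_cache:
        value, level = l2_cache[address], "external cache"
    else:
        value, level = main_memory[address], "main memory"
        l2_cache[address] = value  # assumption: external cache is filled on a miss
    l1_cache[address] = value      # assumption: on-chip cache is filled on a miss
    return value, level


main_memory = {0x1000: 0x90}
l1, l2 = {}, {}
print(fetch(0x1000, l1, l2, main_memory))  # first access goes to main memory
print(fetch(0x1000, l1, l2, main_memory))  # repeat access hits the on-chip cache
```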
Importance of code and data cache cont

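[Figure: memory hierarchy. Inside the CPU, the pipeline receives instructions and data from the on-chip cache, which is backed by the external cache and then by main memory.]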
Branch prediction logic

Need: Without branch prediction, whenever a jump occurs we have to flush all the instructions in the pipeline as well as the prefetch queue (64 bytes). To avoid flushing such a large number of bytes, we need branch prediction logic, which decides from where to fetch the next instruction.
Method: The Pentium implements branch prediction using a branch target buffer (BTB). The BTB has 256 entries and is 4-way set associative (i.e. 4 banks of 64 entries).
Imp: Prediction is based on the history of the instruction and not on the instruction itself.
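The BTB geometry quoted above works out as follows. How the hardware actually selects a set is not stated in these notes; using the low address bits, as `btb_set_index` does here, is an assumption made for illustration.

```python
BTB_ENTRIES = 256
BTB_WAYS = 4
BTB_SETS = BTB_ENTRIES // BTB_WAYS  # 64 sets, each holding 4 entries


def btb_set_index(branch_address: int) -> int:
    """Pick the set for a branch address (assumed: low address bits)."""
    return branch_address % BTB_SETS


print(BTB_SETS, btb_set_index(0x3005))
```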
Branch prediction logic cont

BTB stores a look-up table:
Source address    Target address
3000              8000
(entry for the instruction 3000 : JUMP 8000)
Assume a jump instruction comes in the u pipeline, say 3000 : JUMP 8000. The branch prediction logic checks whether the source address is present in the BTB. If the instruction is coming for the first time then there is no history. In this case the decision is that there is no jump (so branch prediction does not depend on the instruction itself but on its recorded history). Fetching therefore continues sequentially from the address following the jump. Whether the jump is actually taken or not becomes known only at the time of execution. If the prediction is wrong then the pipeline has to be flushed.
Branch prediction logic cont.

In the Pentium, rather than flushing after a wrong prediction, the active prefetch queue is disabled and the second queue is used. This happens when no jump was assumed but the jump is actually taken, which the µP discovers in the execution stage: queue A is deactivated, queue B is activated, and fetching resumes from the target address. A history entry is now added to the BTB. When the same instruction comes a second time, the branch prediction logic checks the BTB, finds the entry, and the next instructions are fetched directly from the target address.
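The behaviour described for the BTB (no history means predict no jump; a taken jump adds a history entry) can be sketched as a look-up table. This is a minimal model of the description in these notes, not of the actual hardware; the class and method names are hypothetical, and the addresses 3000 and 8000 are treated as hexadecimal here.

```python
class BranchTargetBuffer:
    """Look-up table from branch source address to target address."""

    def __init__(self):
        self.targets = {}  # source address -> target address

    def predict(self, source):
        """Return the predicted target, or None for 'predict no jump'."""
        return self.targets.get(source)

    def record_taken(self, source, target):
        """Add history once the execute stage finds the jump was taken."""
        self.targets[source] = target


btb = BranchTargetBuffer()
first = btb.predict(0x3000)        # first time: no history, predict no jump
btb.record_taken(0x3000, 0x8000)   # jump was actually taken, store history
second = btb.predict(0x3000)       # second time: fetch from the target
print(first, hex(second))
```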
Floating point exceptions

In the Intel486 and Pentium processors, further enhancements and speedup features have been added to the FPUs. The FPU is also built into the same chip as the processor, which allows further increases in speed.
MS-DOS compatibility for exception (interrupt) handling has also been built in: the NE bit in control register CR0 selects the MS-DOS compatible mode when it is zero. NE = 1 selects the native (internal) mode, which generates Interrupt 16; this is the same native exception handling used by the 80286 with the 80287 and by the Intel386 processor with the Intel387 math coprocessor.
In MS-DOS compatible mode, the FERR# (Floating-point ERRor) output replaces the ERROR# signal of the previous generations and is connected to a PIC. A new input signal, IGNNE# (IGNore Numeric Error), is provided to allow the FPU exception handler to execute FPU instructions.
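The mode selection by the NE bit can be sketched as a test on the CR0 value. NE is bit 5 of CR0; the function name and the returned strings are chosen only for this illustration.

```python
CR0_NE_BIT = 5  # position of the NE (Numeric Error) flag in CR0


def fpu_error_reporting(cr0: int) -> str:
    """Describe how FPU exceptions are reported for a given CR0 value."""
    if cr0 & (1 << CR0_NE_BIT):
        return "native: interrupt 16"
    return "MS-DOS compatible: FERR# to PIC"


print(fpu_error_reporting(0x00))  # NE = 0
print(fpu_error_reporting(0x20))  # NE = 1
```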