
Original Title: 8087 NDP Paper p174 Palmer


John Palmer

Intel Corporation

This paper describes a new device, the Intel 8087 Numeric Data Processor, with unprecedented speed, accuracy and capability. Its modified stack architecture and instruction set are explained and illustrative examples are included. The 8087, which conforms to the proposed IEEE Floating-Point Standard, is a coprocessor in the Intel 8086 family. It supports seven data types: three REAL, three INTEGER and one packed BCD format, and performs all necessary numeric operations from addition to logarithmic and trigonometric functions.

1.0 INTRODUCTION

The Intel 8087 is a high performance general purpose numeric data processor. It is a part of the Intel 8086 family and can be used with either the 8086 or the 8088 to extend their instruction sets by over 120 numeric data manipulation operations. The 8087 is not a peripheral but a coprocessor; it monitors the instruction stream and when an 8086/8088 ESCAPE instruction is read, the 8087 takes over the bus and interprets and executes the ESCAPE instruction as one of its own instructions. This tightly coupled coprocessing interface permits the 8087 to execute numeric instructions while the 8086 executes any others. The concurrent instruction execution increases the throughput of the system. Furthermore, the 8087 is the only chip that must be added to an 8086 (8088) system to provide numeric capability that exceeds software in speed by more than a factor of 100.

The 8087 is intended to be general purpose and satisfy a very wide range of needs for mathematical computation. It is fast enough for a great many scientific and statistical calculations; it is accurate enough for business and commercial computation; and it is precise enough for entirely new applications, most notably interval arithmetic [1]. The 8087 provides an unprecedented level of capability, safety and reliability with high performance and low cost and is a prime example of the almost incredible possibilities in combining software and architectural expertise with VLSI processing capability.

2.0 8087 OVERVIEW

The 8087 consists of a stack of registers for holding operands and results, a set of registers constituting its environment and a set of instructions.

The stack consists of eight registers, each 80 bits wide. Associated with the stack is a three bit stack pointer, TOP, and with each stack element a two bit tag. (Both the tags and TOP physically belong to the ENVIRONMENT but will be shown with the stack.) The stack elements are numbered relative to TOP (ST(i) means the ith stack element from the top of stack) as shown below.

[Figure: TAGS and STACK. Each stack element, ST(0) through ST(7), holds a SIGN, an EXPONENT and a SIGNIFICAND and has an associated tag.]

The tag field is used to detect uninitialized stack elements and to designate special values (e.g. zero) for microcode optimization.

The value represented in a register has 64 bits of precision and a range of about 10^±4900 (15 bit exponent). A more complete description of the register values will be given in Section 3.
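The relative addressing of the stack can be made concrete with a small model. The following is only a sketch in Python, not a description of the hardware: the class and method names and the tag labels ("empty", "valid", "zero") are hypothetical, and we assume that a push moves TOP down by one modulo eight.

```python
class RegisterStack:
    """Toy model of an eight-element register stack addressed relative to TOP."""

    SIZE = 8  # a three bit TOP selects one of eight registers

    def __init__(self):
        self.regs = [0.0] * self.SIZE
        self.tags = ["empty"] * self.SIZE  # hypothetical tag names
        self.top = 0

    def _index(self, i):
        # ST(i) is the i-th element counted from the top of stack.
        return (self.top + i) % self.SIZE

    def push(self, value):
        # Assumption: pushing decrements TOP (modulo 8) and fills the new top.
        new_top = (self.top - 1) % self.SIZE
        if self.tags[new_top] != "empty":
            raise OverflowError("stack overflow: all eight registers in use")
        self.top = new_top
        self.regs[new_top] = value
        self.tags[new_top] = "zero" if value == 0 else "valid"

    def pop(self):
        value = self.st(0)
        self.tags[self.top] = "empty"
        self.top = (self.top + 1) % self.SIZE
        return value

    def st(self, i):
        # The tag lets an uninitialized element be detected before use.
        idx = self._index(i)
        if self.tags[idx] == "empty":
            raise LookupError(f"ST({i}) is uninitialized")
        return self.regs[idx]


stack = RegisterStack()
stack.push(1.5)
stack.push(0.0)
print(stack.st(0), stack.st(1), stack.tags[stack.top])  # 0.0 1.5 zero
```

Because every reference goes through TOP, a push or pop renumbers all elements at once without moving any data.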


The 8087 environment consists of seven words as illustrated below.

[Figure: the environment words: STATUS WORD, CONTROL WORD, TAG WORD, INSTRUCTION ADDRESS and DATA ADDRESS.]

The STATUS word consists of the EXCEPTION flags (0-7) and the STATUS bits (8-15) where the meanings are (* indicates a field reserved for future use):

EXCEPTION FLAGS

I : invalid operation
D : denormalized operand
Z : division of nonzero by zero
O : overflow
U : underflow
P : inexact (precision)
N : indicates a pending interrupt

STATUS BITS

C : condition code, set by certain instructions (e.g. COMPARE)
TOP : stack pointer
B : indicates whether the 8087 is BUSY (used for synchronization)

The CONTROL WORD consists of EXCEPTION MASKS and CONTROL BITS. For each exception there is a mask. If the mask is reset, the exception is allowed to generate an interrupt (if M = 0); if the mask is set, the interrupt is suppressed and the 8087 executes a default exception handling procedure (on chip) and continues (the procedure will be explained in Section 3). The M mask is the 8087 interrupt enable/disable bit. The CONTROL BITS have the following meaning:

PC : precision control - results are rounded to one of three precisions: Temporary Real (64 bits), Long Real (53 bits), Real (24 bits).

RC : rounding control - results are rounded in one of four directions: unbiased round to nearest, round towards +∞, round towards -∞, round towards zero.

IC : infinity control - there are two types of infinity arithmetic provided: affine and projective.

The TAG word holds the tags, which describe the contents of the corresponding stack elements. The instruction and data pointers are the addresses of an instruction (and its referenced data if any) if it causes an exception that generates an interrupt.

There are four types of 8087 instructions: the CORE set, the EXTENDED set, the SPECIAL FUNCTION set and the ADMINISTRATIVE set. The core set includes load and store of the stack values and arithmetic operators: add, subtract, multiply, divide and compare. The extended set is for loading and storing three special formats (see Section 3). The special function set includes square root and transcendental function support. The administrative instructions are used for context switching and processor control. Most of the instructions will be described in more detail as the 8087 design goals are explained.

3.0 DESIGN GOALS

The 8087 is designed to achieve several major goals. First, the 8087 conforms to an improved and expanded version of Intel's standard for floating-point arithmetic [2]. Second, the 8087 provides significantly more capability than mainframe and minicomputer floating-point processors and consequently has applications beyond scientific computation. Third, the 8087 is convenient to use in assembly language and easy to generate code for in high level language. And finally the capabilities of VLSI are used to provide all this functionality with high performance and efficiency in a single device.

The Intel floating-point standard, called the REALMATH standard, was originally specified in 1977 [2] and implemented in several products (FPAL, SBC-310, FORTRAN-80, BASIC-80). At about that time an IEEE committee was formed to propose a floating-point standard for microprocessors. Intel was invited to participate and offered its standard for consideration.

At the time this paper was written it had become apparent that the majority of the committee had agreed on a revised and expanded version of Intel's standard [3]. The standard specifies data formats, rounding algorithms and exception handling.

The standard specifies and the 8087 supports three floating-point data types: Real (single precision), Long Real (double precision) and Temporary Real (extended precision). All formats are binary and each has a biased exponent. The values represented by the three formats are shown below.

[Figure: bit layouts of the Real, Long Real and Temporary Real formats.]


                    REAL              LONG REAL         TEMP REAL
TOTAL LENGTH        32 bits           64 bits           80 bits
EXPONENT LENGTH     8 bits            11 bits           15 bits
EXPONENT BIAS       127               1023              16383
VALUE               (row not legible in this copy)
INFINITY            e=11...1, f=0     e=11...1, f=0     e=11...1, i=1, f=0
NOT-A-NUMBER (NAN)  e=11...1, f≠0     e=11...1, f≠0     e=11...1, i=1, f≠0

The Temporary Real format (identical to the 8087 register format) is intended to hold intermediates and to support accurate Long Real calculations. It has an explicit leading bit (i) in the significand thus allowing unnormalized arithmetic. However, the algorithms are designed so that normalized operands will always yield normalized results.

The algorithms specified by the standard require that the completely precise result of an operation be rounded to the nearest representable number, breaking ties by rounding to the nearest even number. This default mode of rounding is called "unbiased round to nearest". There are optional "directed rounding" modes that are specified to yield

1. the nearest neighbor less than or equal to the true result.

2. the nearest neighbor greater than or equal to the true result.

The 8087 provides these rounding modes as controlled by a field (RC) in the CONTROL WORD.

The 8087, which does all calculations in Temporary Real format, has another field in the CONTROL word for specifying the precision to which a result is rounded (PC). Thus, the precision of results is independent of the precision of operands and, though held in Temporary Real format and benefitting from extended range, may be forced to Real, Long Real or Temporary Real. This control is provided for languages that do not allow extended precision intermediates and to allow the same code to be run under different precision settings as an aid to error estimation.

The standard also specifies that all exceptions must be detected and that an implementation should permit exception handling. The 8087 supports this by detecting six types of exceptions and by generating an interrupt if the exception is not masked. If an interrupt is generated, the interrupt procedure (exception handler) has available the exception flags, a pointer to the instruction causing the interrupt and a pointer to the datum if memory was addressed. The six exceptions, each of which has an associated "sticky" flag (once set it remains set until reset by software), are listed below.

1. I : invalid operation
   this exception is signaled by stack overflow or underflow, the use of a NAN as an operand and several other cases as listed in [3]

2. D : denormalized operand
   at least one operand is denormalized

3. Z : zero divisor
   the dividend is finite and nonzero while the divisor is zero

4. O : overflow
   the exponent of the result is too large for the destination's format

5. U : underflow
   the exponent of the result is too small for the destination's format

6. P : inexact result
   the delivered result is not equal to the completely precise result but has been rounded

Since the default response to overflow and zero divisor is to set the result to ∞, the 8087 supports two modes of infinity arithmetic:

1. affine - there are two infinities, one (-∞) less than all other numbers and one (+∞) greater

2. projective - there is only one infinity (the sign on ∞ is ignored) which closes the number system analogous to the point at ∞ on the Riemann sphere.

These two modes require the representation of two zeros (±0) which are "equal" in comparison and all other operations except division, where +1/+0 = +∞ and +1/-0 = -∞. The mode of infinity arithmetic is determined by a field (IC) in the CONTROL word.

There are instructions that support the standard by controlling rounding, precision and infinity arithmetic and by permitting complete exception handling. These instructions load and store either the control word or the entire environment and store the exception flags.

The features and instructions discussed above support the Intel floating-point (REALMATH) standard but additional capability is also desired.

3.2 Capability Extension

The 8087, by supporting the required and optional aspects of the standard and by supporting several features not mentioned by the standard, significantly extends the capabilities of the 8086 family beyond that expected from a typical floating-point processor. These extensions include additional data types, provision of exact arithmetic, support for interval arithmetic and special functions.
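The four rounding directions and the three precision settings can be illustrated in software. The following Python sketch is our own illustration, not the 8087 algorithm: the function name and the mode names are hypothetical, and exact rational arithmetic from the standard library stands in for the "completely precise result" that is rounded.

```python
import math
from fractions import Fraction

def round_to_bits(x, bits, mode="nearest"):
    """Round the exact value x to `bits` significant binary digits."""
    x = Fraction(x)
    if x == 0:
        return x
    mag = abs(x)
    # Find e with 2**e <= |x| < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    # After scaling, |x| lies in [2**(bits - 1), 2**bits).
    scale = Fraction(2) ** (bits - 1 - e)
    scaled = x * scale
    if mode == "nearest":
        n = round(scaled)        # Fraction rounds ties to the even integer
    elif mode == "down":
        n = math.floor(scaled)   # towards minus infinity
    elif mode == "up":
        n = math.ceil(scaled)    # towards plus infinity
    elif mode == "zero":
        n = math.trunc(scaled)
    else:
        raise ValueError(mode)
    return Fraction(n) / scale


tenth = Fraction(1, 10)
lo = round_to_bits(tenth, 24, "down")
hi = round_to_bits(tenth, 24, "up")
print(lo <= tenth <= hi, hi - lo == Fraction(1, 2**27))  # True True
```

Calling the function with 24, 53 or 64 bits corresponds to the Real, Long Real and Temporary Real precision settings; the two directed modes bracket the true result, which is the property interval arithmetic relies on.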

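The interplay of sticky flags and masks described in Section 3 can also be sketched in a few lines. This is a simplified model under our own assumptions: the class and method names are hypothetical, the interrupt enable bit M is left out, and the two outcomes are reported as strings rather than acted upon.

```python
EXCEPTIONS = ("I", "D", "Z", "O", "U", "P")

class ExceptionUnit:
    """Toy model of sticky exception flags and per-exception masks."""

    def __init__(self, masked=EXCEPTIONS):
        self.flags = set()          # sticky: entries stay until clear()
        self.masked = set(masked)

    def signal(self, name):
        """Record an exception and report which response applies."""
        if name not in EXCEPTIONS:
            raise ValueError(f"unknown exception {name!r}")
        self.flags.add(name)
        # A set mask selects the on-chip default response; a reset mask
        # lets the exception generate an interrupt.
        return "default" if name in self.masked else "interrupt"

    def clear(self):
        # Only software resets the flags.
        self.flags.clear()


unit = ExceptionUnit(masked={"P", "U"})
print(unit.signal("P"), unit.signal("Z"), sorted(unit.flags))
# default interrupt ['P', 'Z']
```

Because the flags are sticky, a program can run a long calculation with all exceptions masked and test the flags once at the end.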

The 8087 addresses seven different data types using all of the 8086 addressing modes. These data types are:

1. Real (32 bits)

2. Long Real (64 bits)

3. Temporary Real (80 bits)

4. Integer Word (16 bits 2's complement)

5. Integer (32 bit 2's complement)

6. Long Integer (64 bit 2's complement)

7. Packed BCD Integer (80 bits, 18 digits and sign)

All of the data types, when used as operands, are first converted (without rounding error) to Temporary Real and the result of the operation is also returned as Temporary Real. Thus the 8087 arithmetic unit only has to work with one kind of data. When results are desired in one of the other formats, they are automatically converted to that type before they are stored in memory.

The provision of exact arithmetic is accomplished by including the inexact exception (P) along with its mask. If a rounding error is committed, the correctly rounded result is delivered and the P flag is set. If the mask (PM) is zero an interrupt is generated, otherwise execution simply continues. This permits financial accounting functions to be performed without fear of roundoff error. Exact arithmetic is also useful in doing coefficient "preconditioning" [see 4].

Support for interval arithmetic is considered one of the most important features of the 8087. As stated by W. Kahan [5]:

"No other feature would enhance safe numerical computation more than the provision of INTERVAL as a data type in FORTRAN as readily accessible as INTEGER or REAL."

This new INTERVAL data type, which the 8087 supports through the rounding modes (RC) and the signed zeros and infinities, can be represented as an ordered pair: INTERVAL, I = [a,b]. If a ≤ b then I includes all numbers between a and b; but if a > b then I includes all numbers x where x ≥ a or x ≤ b. An illustration may help clarify the concept. Consider the set of numbers as a circle with the two cases described above pictured as

[Figure: two circles of numbers, one for the case a ≤ b and one for the case a > b.]

Start at a and proceed clockwise until b is reached; all numbers covered belong to I. The signs on zero and infinity permit us to have open or closed intervals when zero or infinity is an end point with the sign denoting which case pertains. If an endpoint is neither zero nor infinity then the interval is always closed. A complete definition of interval arithmetic cannot be given here; however, we can list some of its uses. In addition to its obvious ability to bound rounding errors, interval arithmetic can be used to estimate the effect of noise in data, to compute confidence intervals and to do worst-case analysis.

The 8087 provides several special instructions for efficient evaluation of many important mathematical functions with unprecedented accuracy. One of these instructions is square root. It overwrites the contents of the top of stack with its correctly rounded (according to RC and PC) square root. Besides being correctly rounded the square root operation is as fast as the divide instruction. Thus algorithms need not be contorted to remove square roots.

There are two instructions to aid in argument reduction for transcendental function evaluation: DECOMPOSE and REMAINDER. The decompose instruction overwrites the contents of the top of stack with the integral value of its exponent in Temporary Real format, decrements the stack pointer and loads into the new top of stack the value of the significand of the original stack top scaled between 1 and 2 (or -1 and -2 if negative). The operation is illustrated below.

[Figure: DECOMPOSE replaces the top of stack with its exponent and pushes its significand as the new top of stack.]

(If the original top of stack is zero then both results are zero.)

The remainder instruction is for reducing arguments of periodic functions to a primary range. It calculates the exact remainder (no roundoff error) of the top two stack elements:

REM = (TOP) modulo (next-of-TOP)

The remainder is returned to the stack top and the next-of-TOP ("divisor") is not changed. Since the execution of a floating-point remainder could be very lengthy, the remainder instruction is actually a primitive: the result is either the remainder or the partial remainder after a fixed number of steps. Thus to compute a remainder requires a software loop that terminates when |(TOP)| is less than |(TOP+1)|. Even by using remainder we will not have trigonometric functions with period 2π since π cannot be exactly represented in the 8087. However, the functions will be exactly periodic with
period 2π* (where π* is the machine approximation to π) and thus will obey the identities that do not explicitly involve π.

The other instructions provided for special functions are TANGENT, ARCTANGENT, EXPONENTIAL and LOGARITHM.

The tangent assumes the top of stack, X, is between zero and π/4 and returns two results as shown:

[Figure: TAN takes X on the top of stack and returns two results on the stack.]

The arctangent works in reverse by using two arguments and returning one:

[Figure: ATAN takes Y and X from the stack and returns arctan(Y/X) on the top of stack.]

The exponential instruction, which calculates 2^X - 1, assumes that 0 ≤ X ≤ 1/2 and overwrites the argument on the top of the stack with the result. The logarithm function, which computes Y * log2(X), uses two arguments and returns a single result as shown

[Figure: the logarithm instruction takes Y and X > 0 from the stack and returns Y * log2(X) on the top of stack.]

The error bound for all these functions is about 2 units in the last place thus allowing for Long Real arguments to be computed to Long Real accuracy. The provision of the described special functions supports the goal of increased capability.

3.3 Ease of Use

As stated above, ease of use, along with support of the standard and extended capability, is a major 8087 goal. We have made the 8087 easy and convenient for programmers and automatic code generators by providing software emulation, a deep (8 levels) internal stack of very wide precision (64 bits) and large range (10^±4900), optimized symmetric mixed mode arithmetic and on chip default exception handling.

The interface between the 8086 (8088) and 8087 allows for software emulation of the 8087 permitting software for the 8087 to be developed, debugged and executed on a system containing only an 8086 (8088). In order to run the developed software on an 8087 it is not necessary to recompile but only relink. To understand how one can delay the resolution of either 8087 or emulator until the link stage, it is necessary to explain the 8086-8087 interface.

The 8086 (8088) has a set of ESCAPE instructions that, in memory addressing mode, cause the 8086 to calculate the address and read the contents of that address. The 8086 ignores the word it reads and then proceeds to execute subsequent instructions. The 8087 is monitoring the same instruction stream and when it detects an ESCAPE it knows that it is being instructed to do something. It latches the opcode and if there was an address calculated the 8087 captures both the address and the datum read by the 8086. By decoding the instruction the 8087 knows how many more words it needs from memory and it increments the address and fetches data until all required data is read. The 8087 then releases the bus and begins calculating while the 8086 continues executing the instruction stream. Because of the overlapped coprocessing of the 8086-8087 it is necessary to precede 8087 instructions (ESCAPE) with a WAIT instruction in order to synchronize the two processors. In place of the WAIT, when the software emulator is to be invoked, an INTERRUPT instruction is inserted. There are some other differences between the hardware and software interfaces but they are the same length and use the same addressing mechanism. This permits a compiler to output an external reference instead of the WAIT-ESCAPE and let the LINKER fill in with either WAIT-ESCAPE or INTERRUPT depending on whether the user has an 8087 or desires to use the emulator.

In addition to software emulation to aid software development, the 8087 has an eight level stack of registers that supports the Temporary Real (80 bit) format and makes the 8087 far easier to use than other floating-point processors. All calculations are done in this extended format and as long as intermediates are kept in the stack or its equivalent memory format (if eight is not enough) then the threat of roundoff damage and risk of overflow or underflow is greatly reduced. Roundoff error is reduced because Temporary Real intermediates are more precise than Long Real data or final results by eleven guard bits. Most overflows and underflows occur on intermediate calculations and the extended range of Temporary over Long Real (10^±4900 vs. 10^±308) ensures that on intermediates these exceptions need seldom, if ever, occur.

The symmetric mixed mode instruction set also contributes to ease of use. The CORE instructions, which include LOAD, STORE & POP, STORE, ADD, SUBTRACT, SUBTRACT REVERSE, MULTIPLY, DIVIDE, DIVIDE REVERSE, COMPARE, and COMPARE & POP, take one operand from the top of stack and a second operand from either memory or a stack element. There are thus two forms of CORE instructions: memory addressed and stack addressed. The memory addressed form supports four memory formats in all 8086 addressing modes:

Integer Word (16 bit 2's complement)
Integer (32 bit 2's complement)
Real (32 bit)
Long Real (64 bit)
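The ordered-pair definition of INTERVAL given in Section 3.2 translates directly into a membership test. The sketch below is a minimal illustration in Python with a hypothetical function name; it treats every interval as closed and does not model the open or closed endpoints that the signs on zero and infinity select.

```python
import math

def in_interval(x, a, b):
    """Membership test for the ordered-pair interval I = [a, b]."""
    if a <= b:
        # Ordinary case: all numbers between a and b.
        return a <= x <= b
    # a > b: the interval runs from a through infinity and back up to b.
    return x >= a or x <= b


print(in_interval(5.0, 1.0, 10.0))      # True
print(in_interval(0.0, 2.0, -2.0))      # False
print(in_interval(math.inf, 2.0, -2.0)) # True
```

The second form is what makes the reciprocal of an interval containing zero representable: the reciprocal of [-0.5, 0.5] is the exterior interval [2, -2].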

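The software loop around the remainder primitive, described in Section 3.2, can be sketched as follows. This is an illustration under stated assumptions, not the 8087 microcode: `STEP_BITS`, `partial_remainder` and `remainder` are hypothetical names, the number of quotient bits per step is chosen arbitrarily, and Python's `math.fmod` (which is exact for doubles) stands in for the exact hardware operation.

```python
import math

STEP_BITS = 16  # hypothetical limit on quotient bits resolved per step

def partial_remainder(top, divisor):
    """One primitive step: exact, but may leave |result| >= |divisor|."""
    d = math.frexp(top)[1] - math.frexp(divisor)[1]
    if d < STEP_BITS:
        return math.fmod(top, divisor)  # this step yields the final remainder
    # Reduce against a scaled-up divisor so that only a bounded number of
    # quotient bits is consumed; the result is a partial remainder.
    scaled = math.ldexp(divisor, d - STEP_BITS + 1)
    return math.fmod(top, scaled)

def remainder(top, divisor):
    """Loop until |TOP| is less than |divisor|, as the text describes."""
    while abs(top) >= abs(divisor):
        top = partial_remainder(top, divisor)
    return top


quarter_pi = math.pi / 4  # machine approximation, not the true value
print(remainder(1e22, quarter_pi) == math.fmod(1e22, quarter_pi))  # True
```

Every step subtracts an exact integer multiple of the divisor, so the loop commits no rounding error however many steps it takes; the result is exact with respect to the machine value of π/4, which is the sense in which the functions are exactly periodic.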

The LOAD Integer instruction converts an integer to Temporary Real format and pushes it on the stack; the ADD Long Real instruction converts a Long Real operand to Temporary Real and adds it to the top of the stack; and the STORE Integer Word instruction converts the top of stack to a 16 bit integer and stores it in memory (without altering the contents of the stack).

The stack addressed form of the CORE instructions obtains the second operand from one of the stack elements instead of memory. The reference is always relative to the top of stack; thus stack element i, where i = 0,...,7, refers to the ith element of the stack under the top of stack. The stack addressed form has two options for the destination of the result. The result can either overwrite the top of stack or replace the contents of the ith stack element depending on the setting of the "direction" (D) bit in the instruction. If the destination is the ith stack element then depending on the setting of another bit (the "pop" (P) bit) the stack is popped or left unaltered.

The EXTENDED instruction set consists of two memory addressed types of instructions, LOAD and STORE & POP, that support three additional memory formats:

Long Integer (64 bit 2's complement)
Temporary Real (80 bit)
Packed BCD (80 bit)

The Temporary Real format is supported for extending the 8087 stack to memory when necessary; the Packed BCD format, which is a signed 18 digit integer as shown,

[Figure: Packed BCD format: a sign and 18 decimal digits.]

is used to aid binary-decimal conversion and COBOL type calculations; and the Long Integer format is supported for applications requiring very wide precision exact computation. Again it is important to note that conversion of these formats to Temporary Real is done with no rounding error.

Another instruction, included to make the 8087 easy to use, is in neither the CORE nor the EXTENDED set but its value is obvious. That instruction is EXCHANGE top of stack with the ith stack element. This instruction has no memory form and ignores the D and P bits.

A further user convenience in the 8087 is its on-chip default exception handling. Though it is possible to handle exceptions with software, it is often an onerous task to write, debug and maintain exception handlers. The default 8087 response to an exception is invoked by masking that exception in the CONTROL WORD. The 8087's response to masked exceptions balances safety with the utility of continued calculation. Listed below are the default responses to masked exceptions:

1. Invalid Operation - if either operand is NAN, the 8087 propagates the lexicographically larger (ignoring the sign); otherwise it generates a special NAN called INDEFINITE as the result.

2. Denormalized Operand - the operand is converted to an equivalent unnormalized representation preserving the same number of leading zeros.

3. Zero Divisor - since the dividend is nonzero the result is ±∞ with the sign set in the usual way (XOR of the signs of the operands).

4. Overflow - the result is ∞ with the sign of the overflowed result.

5. Underflow - the result is denormalized to fit the destination's format ("gradual underflow" [4]).

6. Inexact Result - the correctly rounded result is returned.

All of the features discussed above: software emulation, deep Temporary Real stack, symmetric and powerful instruction set and default exception handling, make the 8087 easy and convenient to use; but to be useful it must also be efficient.

3.4 Efficiency

Efficiency was a major goal in the design of the 8087. An extensive treatment of the internal hardware and algorithms will be given elsewhere, but a brief description will illustrate our concern for performance. The 8087's main ALU is more than 64 bits wide. This is to handle efficiently 64 bit operands with guard, round and sticky bits [6] and at least one overflow bit. Its shifter can shift right or left from 0 to 63 places in one clock cycle. This is useful for formatting, normalizing and denormalizing and for the transcendental functions. For normalizing there is hardware for detecting the position of the most significant one. Finally, there is special hardware to permit multiply, divide, remainder and square root to be calculated rapidly. Approximate speeds of the basic operations for stack operands are summarized below:

                        Microseconds (5 MHz)
COMPARE                  5
ADD (MAGNITUDE)         10
SUBTRACT (MAGNITUDE)    16
MULTIPLY                16, 24*
DIVIDE                  38
SQUARE ROOT             38

* shorter time if either operand was originally Real (32 bit)

The above timings apply for Real, Long Real or Temporary Real operands and results. The previously described overlapped instruction execution by the 8086 and 8087 also increases throughput. However, more important than absolute execution speeds is the stack with its internal addressing
t h a t minimizes memory referencing. There is an i n - 5. Add TOP (XT) to Sx and POP

s t r u c t i o n f o r scaling t h a t is much f a s t e r than mul-

tiply. For rapid context s w i t c h i n g , the 8087 has 6. LOAD Yi

SAVE and RESTORE i n s t r u c t i o n s . The i n s t r u c t i o n set

and the hardware to execute i t r a p i d l y give the 7. Add TOP (Yj) to My

8087 very high performance w i t h o u t s a c r i f i c i n g

quality. 8. M u l t i p l y TOP (Yi) to Xi

extensive set of programs would be very u s e f u l . We

w i l l here give two examples t h a t should r e i n f o r c e 11. Add TOP ( X i Y i ) to Cxy and POP

many of the points made e a r l i e r . The f i r s t example

is to c a l c u l a t e the length of a vector. The task 12. Loop to Step I

is conceptually simple but a r e l i a b l e , robust pro-

gram for the typical floating-point system is hard to produce. With the 8087 it is easy, almost automatic, to produce such a program.

    Temporary Real : SUM
    Long Real : X(I), L

    SUM := 0
    For I = 1 to N Do
        SUM := SUM + X(I)**2
    L := SQRT(SUM)

This program is free from intermediate overflow or underflow problems, and unless N is very large its only significant rounding error is in the last instruction, where it is unavoidable but easy to analyze.

The second example demonstrates how several accumulations can be calculated efficiently in the 8087. If we have two sets of data, Xi and Yi, that we want to analyze, we very likely will want means, standard deviations and correlation coefficients. We thus want to calculate:

    $M_x = \sum x_i$    $M_y = \sum y_i$    $S_x = \sum x_i^2$    $S_y = \sum y_i^2$

    $C_{xy} = \sum x_i y_i$

In an ordinary stack machine, the five values listed above would probably be calculated in five separate passes through the data, requiring that each datum be read three times. With the 8087 they can be calculated in one pass through the data, requiring that each datum be read only once. The architectural feature that permits this increase in efficiency is the ability to do arithmetic with operands from any stack element. The algorithm is described below.

    STEP  ACTION

    0.    Clear five stack elements (push zero five times): Mx, My, Sx, Sy, Cxy
    1.    LOAD Xi
    2.    Add TOP (Xi) to Mx
    3.    Duplicate TOP of stack
    4.    Square TOP

The inner loop of this program has only eleven 8087 instructions and has the same properties of reliability and robustness as the first example. It is also efficient since the minimum computation and memory addressing is done.

The Intel 8087 Numeric Data Processor, along with its design goals of meeting Intel's REALMATH standard, and providing increased capability, ease of use and performance, has been described. We have attempted to balance safety and utility and have provided an unprecedented level of capability, accuracy and reliability in a math processor.

5.0 ACKNOWLEDGEMENTS

There are a great number of people who deserve recognition for their contribution to the 8087. The initial architectural design was the joint work of the author and Bruce Ravenel, relying heavily on the advice of Professor W. Kahan. Robert Koehler made significant contributions to the systems aspects of the 8087 and Janis Baron was responsible for designing the assembly language and implementing the emulator. A great deal of credit must go to Rafi Nave and his team in Intel Israel for implementing the 8087 and to Dai-Sun Tsien for carefully reviewing and checking the implementation. Perhaps most significant of all, we acknowledge the management of Intel for being willing to commit significant resources to both implementation and promotion of a standard for reliable numeric data processing.

1. Moore, R.E. (1979), "Methods and Applications of Interval Analysis," SIAM Studies in Applied Mathematics, SIAM, Philadelphia.

2. Palmer, J. (1977), "The Intel Standard for Floating-Point Arithmetic," Proc. COMPSAC, 107-112.

3. Coonen, J., W. Kahan, J. Palmer, T. Pittman and D. Stevenson (1979), "A Proposed Standard for Floating-Point Arithmetic," SIGNUM Newsletter, October, 1979.

4. Kahan, W. and J. Palmer (1979), "On a Proposed Floating-Point Standard," SIGNUM Newsletter, October, 1979.

5. Kahan, W. (1972), "A Survey of Error Analysis," Information Processing 71, North Holland Publishing Company, 1214-1239.

Arithmetic," IEEE Trans. Computers, Vol. C-22, No. 6, 577-586.
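The first example in the text (accumulate SUM in temporary real, then take one SQRT) can be illustrated with a rough sketch. Python has no 80-bit format, so this sketch assumes `decimal.Decimal` with about 20 digits as a stand-in for temporary real. It is an analogy for "wider range and precision in the accumulator", not a bit-exact emulation of the 8087. The function names `euclidean_length` and `naive_length` are hypothetical.

```python
import math
from decimal import Decimal, localcontext


def euclidean_length(xs, digits=20):
    """Return sqrt(sum(x*x)) using a wide intermediate accumulator.

    Decimal is only a stand-in for the temporary real format
    (an assumption of this sketch, not an emulation of the 8087).
    """
    with localcontext() as ctx:
        ctx.prec = digits          # roughly the precision of a 64-bit significand
        total = Decimal(0)         # SUM := 0
        for x in xs:               # For I = 1 to N Do
            d = Decimal(x)         # exact conversion of the double operand
            total += d * d         # SUM := SUM + X(I)**2
        root = total.sqrt()        # L := SQRT(SUM)
    return float(root)             # store the result back as a double


def naive_length(xs):
    """Same computation carried out entirely in double precision."""
    total = 0.0
    for x in xs:
        total += x * x             # can overflow to inf or underflow to 0.0
    return math.sqrt(total)
```

With operands near 1e200 or 1e-200 the all-double version loses the result to intermediate overflow or underflow, while the version with the wider accumulator returns a finite, nonzero length.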

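The one-pass accumulation of Mx, My, Sx, Sy and Cxy from the second example can be sketched as follows. Five local variables stand in for the five stack elements. Only steps 0 to 4 are listed in the text, so the updates for Sx, My, Sy and Cxy below are assumptions filled in from the definitions of the five sums. The function name `one_pass_sums` is hypothetical.

```python
def one_pass_sums(xs, ys):
    """Accumulate Mx, My, Sx, Sy and Cxy in a single pass over the data."""
    # Step 0: clear five accumulators (stand-ins for five stack elements).
    mx = my = sx = sy = cxy = 0.0
    for x, y in zip(xs, ys):
        mx += x        # steps 1-2: load Xi and add it to Mx
        sx += x * x    # steps 3-4: duplicate and square; adding to Sx is assumed
        my += y        # remaining updates are assumed from the definitions
        sy += y * y
        cxy += x * y
    return mx, my, sx, sy, cxy
```

Each datum is read once per iteration, in contrast to the five-pass approach in which each datum is read three times.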
