Professional Documents
Culture Documents
The
8086 Family
Useiis Manual
Numerics Supplement
July 1980
https://archive.org/details/8086familyusersm00inte
inter
The
8086 Family
User^ Manual
Numerics Supplement
July 1980
Additional copies of this manual or other Intel literature may be obtained from:
Literature Deparimeni
Intel Corporation
3065 Bowers Avenue
Santa Clara, CA9505I
IntelCorporation makes no warranty of any kind with regard to this material, including, but not limited
to, the implied warranties of merchantabilityand fitness for a particular purpose. Intel Corporation
assumes no responsibility for any errors that may appear in this document. Intel Corporation makes no
commitment to update nor to keep current the information contained in this document.
Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in
Intel software products are copyrighted by and shall remain the property of Intel Corporation. Use,
duplication or disclosure is subject to restrictions stated in Intel's software license, or as defined in ASPR
7-104. 9(a)(9).
No part of this document may be copied or reproduced in any form or by any means without the prior
written consent of Intel Corporation.
The following are trademarks of Intel Corporation and may be used only to identify Intel products:
and the combination of ICE, ICS, iSBC, iSBX, MCS, or RMX and a numerical suffix.
ii
PREFACE
This supplement provides detailed information on With additional information, configura-
suffix
the 8087 Numeric Coprocessor extension to the tion options within each iAPX system can be
iAPX 86/10 and 88/10 CPUs. These processors identified, for example:
and their support circuits are described in detail in
the 8086 Family User's Manual, and discussions iAPX 86/10 CPU Alone (8086)
in this supplement frequently reference this iAPX 86/ 1 1 CPU + lOP (8086 + 8089)
manual. Below is a brief overview of architectural iAPX 88/20 CPU + Math Extension
concepts used throughout the family and the (8088+ 8087)
system nomenclature used to describe particular iAPX 88/21 CPU + Math Extension +
system configurations built around the four lOP (8088 + 8087 + 8089)
iAPX 86, 88 family processors.
This nomenclature is intended as an addition to
rather than a replacement for, Intel's current part
Microsystem 80 Nomenclature numbers. These new series level descriptions are
used to describe the functional capabilities pro-
Over the last several years, the increase in vided by specific configurations of the processors
microcomputer system and software complexity in the 8086 Family. The hardware used to imple-
has given birth to a new family of microprocessor ment each functional configuration is still
products oriented towards solving these increas- described by referring to the parts involved (as is
ingly complex problems. This new generation of the case for the majority of the 8087 information
microprocessors is both powerful and flexible and described in this supplement.)
includes many processor enhancements, such as
numeric point extensions, I/O pro-
floating This improved nomenclature provides a more
cessors, and operating system functionality in meaningful view of system capability and
silicon. performance within the evolving Microsystem 80
architecture.
iSBC Single Board Computers experience. Finally, the modular structure of the
iSBX MULTIMODULE Boards family architecture provides an orderly way for
systems to grow and change.
Concentrating on the iAPX Series, two Processor
The iAPX 86, 88 product line architecture is
Families are defined:
characterized by three major principles:
iAPX 868086 CPU based system 1. System functions are distributed among
iAPX 888088 CPU based system specialized components.
m
2. Multiprocessing capabilities are inherent in processor, allowing each to be compatible
the hardware. with a common set of bipolar bus support
components.
3. A hierarchical bus organization provides for
the complex data flows required by high-
performance systems without burdening Multiprocessing
simpler systems w ith unneeded capabilities.
iv
ADDRESS
datatype, and instruction resources directly to the being updated. Each of the processors has an
PRIVATE
MEMORY
PUBLIC
I MEMORY
BUS BUS
> UJ INTERFACE PROCESSOR INTERFACE
GROUP GROUP
IS
(/)
I
PRIVATE
I/O
LOCAL BUS
1 PROCESSOR 1 PROCESSOR
PROCESSING j j
MODULE PROCESSING
MODULE
instruction that examines and updates a memory usually reside on a system bus. The iAPX 86, 88
byte with the bus locked. This instruction can be bus interface components link a local bus to a
used to implement a semaphore mechanism for system bus.
controlling the access of multiple processors to
shared resources. (A semaphore is a variable that
indicates whether a resource, such as a buffer or a Local Bus
pointer, is "available" or "in use".)
The local busoptimized for use by the iAPX 86,
is
vi
bus are said to be local to each other; processors Processing Modules and Bus Topology
on different local buses are said to be remote to
each other, or configured remotely. Both
independent processors and coprocessors may The processor(s) and bus interface group(s) that
share a local bus; on-chip arbitration logic deter- are connected by a local bus constitute a process-
mines which processor drives the bus. Because the ing module. A
simple processing module could
processors on the local bus share the same bus
consist of a single CPU and one bus interface
interface components, the local configuration
group. A more complex module would contain
multiple processors, such as two lOPs, a CPU
of multiple processors provides a compact and
inexpensive multiprocessing system.
and one or two lOPs, or a CPU with a
coprocessor with/ without lOP. One bus interface
group typically links the processors in the module
System Bus to a public system bus. If there are multiple pro-
cessing modules in the system, all memory or I/O
A full implementation of an iAPX 86, 88 system
connected to the public bus is accessible to all
bus consists of the following five sets of signals:
processing modules on the public bus. 8289 bus
address bus, data bus, control lines, interrupt
arbiters in each processing module control the
lines and arbitration lines. A group of bus inter-
access of the modules to the public bus and hence
face components transforms the signals of a local
to the public memory and I/O.
bus into a system bus. The number of bus inter-
face components required to generate a system
bus depends on the size and complexity of the A second bus interface group may be connected
system; reduced application needs translate to a processing module's local bus, generating a
directly into reduced component counts. These demultiplexed bus. This bus can provide the pro-
main variables determine the configuration of a cessing module with a private address space that is
bus interface group: address space size (number not accessible to other processing modules.
of latches), data bus width (number of Distributing memory and I/O resources in this
transceivers), and arbitration needs (presence of a manner can improve system reliability by
bus arbiter). isolating the effects of failures. It can also
increase system throughput dramatically. If pro-
The iAPX system bus is functionally and
86, 88 cessor programs and data are placed in
local
electrically compatible with the MULTIBUS private memory, contention for use of the public
multimaster system bus used in Intel's iSBC line system bus can be held to a minimum to ensure
of single board computing products. This com- that shared resources are quickly available when
patiblity gives system designers access to a wide they are needed. In addition, processors in
variety of computer, memory, communications separate modules can simultaneously fetch
and other modules that may be incorporated into instructions from private memory spaces to allow
products, used for evaluation or for test vehicles. multiple system tasks to proceed in parallel.
vii
3
1
Table of Contents
ix
Tables Illustrations
S-1 8087/Emulator Speed Comparison S-3 S-1 8087 Numeric Data Processor Pin
S-2 Data Types S-6 Diagram S-2
S-3 Principal Instructions S-6 S-2 8087 Evolution and Relative
S-4 RealNumber Notation S-15 Performance S-2
S-5 Rounding Modes S-17 S-3 NDP Interconnect S-7
S-6 Exception and Response Summary S-20 S-4 8087 Block Diagram S-8
S-7 Processor State Following S-5 Register Structure S-9
Initialization S-26 S-6 StatusWord Format S-10
S-8 Bus Cycle Status Signals S-28 S-7 Control Word Format S-11
S-9 Data Transfer Instructions S-30 S-8 Tag Word Format S-12
S-10 Arithmetic Instructions S-32 S-9 Exception Pointers Format S-12
S-1 1 Basic Arithmetic Instructions S-10 8087 Number System S-13
and Operands S-33 S-11 Data Formats S-14
S-12 Comparison Instructions S-36 S-12 Projective Versus Affine Closure S-18
S-13 FXAM Condition Code Settings S-37 S-13 Storage of Integer Data Types S-21
S-1 4 Transcendental Instructions S-37 S-14 Storage of Real Data Types S-21
S-15 Constant Instructions S-38 S-15 Synchronizing Execution With
S-16 Processor Control Instructions S-39 WAIT S-24
S-17 Key to Operand Types S-42 S-16 Interrupt Request Logic S-27
S-18 Execution Penalties S-43 S-17 Interrupt Request Path S-29
S-19 Instruction Set Reference Data S-44 S-18 FSAVE/FRSTOR Memory Layout S-41
S-20 PL/M-86 Built-in Procedures S-59 S-19 FSTENV/FLDENV Memory Layout S-41
S-21 Storage Allocation Directives S-60 S-20 Sample 8087 Constants S-43
S-22 Addressing Mode Examples S-62 S-21 Status Word RECORD Definition S-62
S-23 Denormalization Process S-68 S-22 Structure Definition S-62
S-24 Exceptions Due to Denormal S-23 Sample PL/M-86 Program S-64
Operands S-69 S-24 Sample ASM-86 Program S-65
S-25 Unnormal Operands and Results S-70 S-25 Instructions and Register Stack S-68
S-26 Zero Operands and Results S-71 S-26 Conditional Branching for Compares S-82
S-27 Infinity Operands and Results S-72 S-27 Conditional Branching for FXAM S-83
S-28 Binary Integer Encodings S-75 S-28 Full State Exception Handler S-86
S-29 Packed Decimal Encodings S-76 S-29 Latency Exception Handler S-87
S-30 Real and Long Real Encodings S-76 S-30 Reentrant Exception Handler S-87
S-31 Temporary Real Encodings S-77
S-32 Exception Conditions and Masked
Responses S-79
S-33 Masked Overflow Response for
Directed Rounding S-81
A-1. Encoding
Instruction A-1
A-2. Machine Instruction Decoding
Guide A-2
X
8087 Numeric
Data Processor
.
S-1
8087 NUMERIC DATA PROCESSOR
vss [_ 40 1 vcc
1
n uH
A 1 1/ U
M ji /
1
1
1
39 1 A15/D15
A11
A1 J//n
U 11 J1 1
1
36 1 A16/S3
A12/D12 ^ 4 37 ]^ A17/S4
A 1 1 / n11 1 36 ^ A18/S5
A10/D10 6 35 ^ A19/S6
A9/D9 7 34 ^ BHE/S7
A8/D6 ^ 8 33 ^ RQ/GT1
A7/D7 9 32 n
J
INT
A5/D5 \ 1 NDP 30 1 NC
Ml / H 1 12 29 1 NC
A3/D3 ^ 13 28 1 S2
A2/D2 ^ 14 27
"Isi
A1/D1 15 26 ^ SO
AO/DO C 16 25
17 24
NC 18 23 ^ BUSY
CLK {2 19 22 ^ READY I I I I
Figure S-1 . 8087 Numeric Data Processor Pin Figure S-2. 8087 Evolution and Relative
Diagram Performance
made the commitment to expand the computa- In 1979, a working committee of the Institute for
tional of microprocessors from
capabilities Electrical and Electronic Engineers (IEEE) pro-
addition and subtraction of integers to an array of posed an industry standard for minicomputer and
widely useful operations on real numbers. (Real microcomputer floating point arithmetic*. The
numbers encompass integers, fractions, and intent of the standard is to promote portability of
irrational numbers such as n and sfl.) In 1977, numeric programs between computers and to pro-
the corporation adopted a standard for repre- vide a uniform programming environment that
senting real numbers in a "floating point" encourages the development of accurate, reliable
format. Intel's Floating Point Arithmetic Library software. The proposed standard specifies
(FPAL) was the first product to utilize this stan- requirements and options for number formats as
dard format. FPAL is a set of subroutines for the well as the results of computations on these
8080/8085 microprocessors. These routines per- numbers. The floating point number formats are
form arithmetic and limited standard functions identical to those previously adopted by Intel and
on single precision (32-bit) real numbers; an used in the products described in this section.
FPAL multiply executes in about 1.5 ms (1.6
MHz 8080A CPU). The next product, the iSBC
310 High Speed Math Unit, essentially The 8087 Numeric Data Processor is the most
implements FPAL iSBC^" card, in a single advanced development in Intel's continuing effort
reducing a single-precision multiply to about 100 to provide improved tools for numerically-
piS. The Intel 8232 is a single-chip arithmetic pro-
" oriented microprocessor applications. It is a
cessor for the 8080/8085 family. The 8232 accepts single-chip hardware implementation of the
double precision (64-bit) operands as well as proposed IEEE standard, including all its options
single precision numbers. It performs a single for single and double precision numbers. As such,
precision multiply in about 100 fis and multiplies it is compatible with previous Intel numerics
double precision numbers in about 875 f/s (2 MHz products; programs written for the 8087 will be
version). transportable to future products that conform to
* J. Coonen, W. Kahan, J. Palmer, T. Pittman, D. Stevenson, "A Proposed Standard for Binary Floating
Point Anlhmelk,'' ACM SIGNUM Newsletter, October 1979.
S-2
8087 NUMERIC DATA PROCESSOR
the proposed IEEE standard. The NDP also pro- The 8087's unique coprocessor interface to the
vides many additional functions that are CPU can yield an additional performance incre-
extensions to the proposed standard. ment beyond that of simple instruction speed. No
overhead is incurred in setting up the device for a
Performance computation; the 8087 decodes its own instruc-
tions automatically in parallel with the CPU.
As figure S-2 indicates, the 8087 provides about Moreover, built-in coordination facilities allow
10 times the instruction speed of the 8232 and a the CPU to proceed with other instructions while
100-fold improvement over FPAL. The 8087 the 8087 is simultaneously executing its numeric
multiplies 32-bit and 64-bit real numbers in about instruction. Programs can exploit this processor
19 ^sand 27 pis,, respectively. Of course, the actual parallelism to increase total system throughput.
performance of the NDP in a given system
depends on numerous application-specific Usability
factors.
Viewed strictly from the standpoint of raw speed,
Table S-1 compares the execution times of several the 8087 enables serious computation-intensive
8087 instructions with the equivalent operations tasks to be performed by microprocessors for the
executed in software on a 5 MHz 8086. The soft- first time. offers more than just high
The 8087
ware equivalents are highly optimized assembly performance, however. By synthesizing advances
language procedures from the 8087 emulator, an made by numerical analysts in the past several
NDP development tool discussed later in this years, the NDP provides a level of usability that
section. surpasses existing minicomputer and mainframe
arithmetic units. In fact, the charter of the 8087
The performance figures quoted in this section design team was first to achieve exceptional func-
are foroperations on real (floating point) tionality and then to obtain high performance.
numbers. The 8087 also has instructions that
enable it to utilize fixed point binary and decimal The 8087 is explicitly designed to deliver stable,
integers of up to 64 bits and 18 digits, respec- accurate results when programmed using
tively. Using an 8087, rather than multiple preci- straightforward "pencil and paper" algorithms.
sion software algorithms for integer operations, While this statement may seem trivial, experi-
can provide speed improvements of 10-100 times. enced users of "floating point processors" will
Add 17 1,600
Compare 9 1,300
Tangent 90 13,000
S-3
8087 NUMERIC DATA PROCESSOR
recognize its fundamental importance. For that exhibit any of the following characteristics
example, most computers can overflow when two can benefit by implementing numeric processing
single precision floating point numbers are on the 8087:
multiplied together and then divided by a third, Numeric data vary over a wide range of
even if the final result is a perfectly valid 32-bit
values, or include non-integral values;
number. The 8087 delivers the correctly rounded
result. Other typical examples of undesirable
Algorithms produce very large or very small
machine behavior in straightforward calculations intermediate results;
occur when solving for the roots of a quadratic Computations must be very precise, i.e., a
equation: large number of significant digits must be
maintained;
S-4
8087 NUMERIC DATA PROCESSOR
Robotics Coupling small size and modest Table S-2 lists the seven 8087 data types. Inter-
power requirements with powerful computa- nally, the8087 holds all numbers in the temporary
tional abilities, the NDP is ideal for on-board real format; the extended range and precision of
six-axis positioning. this format are key contributors to the NDP's
Navigation Very
small, light weight, and
ability to consistently deliver stable, expected
results. The 8087's load and store instructions
accurate inertial guidance systems can be
convert operands between the other formats and
implemented with the 8087. Its built-in
temporary real. The fact that these conversions
trigonometric functions can speed and
are made, and that calculations may be per-
simplify the calculation of position from
formed on converted numbers, is transparent to
bearing data.
the programmer. Integer operands, whether
Graphics terminals
The 8087 can be used in binary or decimal, yield correct integer results,
graphics terminals to locally perform many just as realoperands yield correct real results.
functions which normally demand the atten- Moreover, a rounding error does not occur when
tion of a main computer; these include rota- a number in an external format is converted to
tion, scaling, and interpolation. By also temporary real.
including an 8089 Input/Output Processor to
perform high speed data transfers, very Computations in the 8087 center on the pro-
powerful and highly self-sufficient terminals cessor's register stack. These eight 80-bit registers
can be built from a relatively small number provide the equivalent capacity of 40 of the 16-bit
of 8086 family parts. registers found in typical CPUs. This generous
space allows more constants and
Data acquisition The 8087 can be used to
register
intermediate results to be held in registers during
scan, scale, and reduce large quantities of
calculations, reducing memory access and conse-
data as it is collected, thereby lowering
quently improving execution speed as well as bus
storage requirements as well as the time
availability. The 8087 register set is unique in that
required to process the data for analysis.
it can be accessed both as a stack, with instruc-
S-5
8087 NUMERIC DATA PROCESSOR
Significant
Data Type Bits Approximate Range (Decimal)
Digits (Decimal)
yx u 1
*The short and long real data types correspond to the single and double precision data types
defined in other Intel numerics products.
Class Instructions
Data Transfer Load (all data types), Store (all data types), Exchange
that "address" the NDP as an "I/O device", or executes entirely on an 8086 or 8088 CPU. The
to incur the overhead of setting up a opera- DMA emulator allows 8087 routines to be developed
tion to perform data transfers. Second, the NDP and checked out on an 8086/8088 execution
automatically detects exception conditions that vehicle before prototype 8087 hardware is opera-
can potentially damage a calculation at run-time. tional. At the source code level, there is no
On-chip exception handlers are automatically difference between a routine that will ultimately
invoked by default to field these exceptions so run on an 8087 or on a CPU
emulation of an
that a reasonable result is produced and execution 8087. At link time, the decision is made whether
may proceed without program intervention. to use the NDP or the software emulator; no re-
Alternatively, the 8087 can interrupt the CPU and compilation or re-assembly is necessary. Source
thus trap to a user procedure when an exception is programs are independent of the numeric execu-
detected. tion vehicle: except for timing, the operation of
the emulated NDP is the same as for "real hard-
Besides the assembler and compiler, Intel ware". The emulator also makes it simple for a
provides a software emulator for the 8087. The product to offer the NDP as a "plug-in"
8087 emulator (E8087) is a software package that performance option without the necessity of
provides the functional equivalent of an 8087; it maintaining two sets of source code.
S-6
8087 NUMERIC DATA PROCESSOR
Hardware Interface when the 8087 not in control of the local bus.
is
The NDP uses one of its host CPU's 8.2 Processor Architecture
request/grant lines to obtain control of the local
bus for data transfers (loads and stores). The As shown in figure S-4, the NDP is internally
other CPU request/grant line is available for divided into two processing elements, the control
general system use, for example, by a local 8089 unit (CU) and the numeric execution unit (NEU).
Input/Output Processor. A loc al 80 89 may also In essence, the NEU executes all numeric instruc-
be connected to the 8087's RQ/GTl line. In this tions, while the CU fetches instructions, reads
configuration, the 8087 passes the request/grant and writes memory operands, and executes the
handshake signals between the CPU and the lOP processor control class of instructions. The two
INTR
8259A
CLK 8086/8088
CPU
I I
L .J.""- J RQ/GT1
QSO QSl TEST 8086
FAMILY MULTIMASTER
BUS SYSTEM
INTERFACE BUS
QSO QSl BUSY COMPONENTS
8284
CLOCK RQ/GTO
GENERATOR
8087
CLK CLK NDP
1^
INT
RQ/GTl :^
A 8
_y _ I H
w
RQ/GT
IS
8089
MCLK lOP
S-7
8087 NUMERIC DATA PROCESSOR
elements are able to operate independently of The two processors execute the instruction stream
one another, allowing the CU to maintain differently, however. The first five bits of all 8087
synchronization with the CPU while the NEU machine instructions are identical; these bits
executes numeric instructions. designate the coprocessor escape (ESC) class of
instructions. The control unit ignores all instruc-
tions that do not match these bits, since these
Control Unit instructions are directed to the CPU only. When
the CU decodes an instruction containing the
The CU keeps the 8087 operating in synchroniza- escape code, it either executes the instruction
tion with its host CPU. 8087 instructions are itself, or passes it to the NEU, depending on the
CONTROL WORD
STATUS WORD
EXPONENT
MODULE f4 Z PROGRAMMABLE
SHIFTER //
INTERFACE
/
OPERANDS
QUEUE
64.
TEMPORARY
REGISTERS
(7)
(6)
T
(5)
A
G
REGISTER STACK (1)
W (3)
0
STATUS ADDRESSING & R (2
BUSTRACKING D
(1)
ADDRESS EXCEPTION
POINTERS (0)
L 80 BITS
J
Figure S-4. 8087 Block Diagram
S-8
8087 NUMERIC DATA PROCESSOR
bus cycles. In a store operation, the CU captures square root; this instruction takes no operands
and saves the operand address as in a load, and because the top-of-stack register is implied as the
ignores the data word that follows in the "dummy operand. Other instructions allow the program-
read" cycle. When the 8087 is ready to perform mer to explicitly specify the register that is to be
the store, the CU obtains the bus from the CPU used. Explicit register addressing is "top-
and writes the operand at the saved address using relative" where the ASM-86 expression ST
as many consecutive bus cycles as are necessary to denotes the current stack top and ST(/) refers
store the operand. to the ith register from ST in the stack (0< / <7).
For example, if ST contains OllB (register 3 is the
top of the stack), the following instruction would
Numeric Execution Unit add registers 3 and 5:
The NEU executes all instructions that involve FADD ST, ST(2)
the register stack; these include arithmetic, In typical use, the programmer may conceptually
comparison, transcendental, constant, and data "divide" the registers into a fixed group and an
transfer instructions. The data path in the NEU is adjustable group. The fixed registers are used like
68 bits wide and allows internal operand transfers the conventional registers in a CPU, to hold con-
to be performed at very high speeds. stants, accumulations, etc. The adjustable group
is used like a stack, with operands pushed on and
S-9
8087 NUMERIC DATA PROCESSOR
Status Word Bit 7 is the interrupt request field. The NDP sets
this field to record a pending interrupt to the
The status word reflects the overall condition of CPU.
the 8087; it may be examined by storing it into
memory with an NDP instruction and then Bits 5-0 are set to indicate that the NEU has
inspecting it with CPU code. The status word is detected an exception while executing an instruc-
divided into the fields shown in figure S-6. The tion. Section S.3 explains these exceptions.
busy field (bit 15) indicates whether the NDP is
15 7 0
B 03 ST 02 01 CO IR PE UE OE ZE DE
1 1
I
DENORMALIZED OPERAND
I
ZERODIVIDE
'
OVERFLOW
I
UNDERFLOW
I
PRECISION
'
(RESERVED)
'
INTERRUPT REQUEST
CONDITION CODEd)
STACK TOP P0INTER12>
BUSY
(1) See descriptions of compare, test, examine and remainder instructions in section S.7 for
condition code interpretation.
<2) ST values:
000 = register 0 Is stack top
001 = register 1 is stack top
S-10
8087 NUMERIC DATA PROCESSOR
15
IC RC PC lEM PM UM OM ZM DM IM
1 , , 1
DENORMALIZED OPERAND
ZERODIVIDE
OVERFLOW
UNDERFLOW
PRECISION
(RESERVED)
INTERRUPT-ENABLE MASKd)
PRECISION C0NTR0L<2)
ROUNDING C0NTR0L(3)
INFINITY CONTROLO)
(RESERVED)
C) Interrupt-Enable Mask:
0 = Interrupts Enabled
1 = Interrupts Disabled (Masked)
also. An exception handler can store these tions which may arise during execution of NDP
pointers in memory and thus obtain information instructions are also described along with the
concerning the instruction that caused the options that are available for responding to these
exception. exceptions.
S-11
8087 NUMERIC DATA PROCESSOR
15
Tag values:
00 = Valid (Normal or Unnormal)
01 = Zero (True)
10 = Special (Not-A-Number, 0, or Denormal)
11 = Empty
OPERAND ADDRESS(^)
I
INSTRUCTION OPCODEtZ)
INSTRUCTION ADDRESSd)
10
(1)
20-bit physical address
(2)
1 1 least significant bits of opcode; 5 most significant bits are always 8087 hook (11011 B)
pencil and paper calculations is conceptually indicate the subset of real numbers the 8087 can
infinite and continuous. There is no upper or represent as data and final results of calculations.
lower limit to the magnitude of the numbers one The 8087's range is approximately 4.19x10'^^
can employ in a calculation, or to the precision to 1.67x10^^. Applications that are required to
(number of significant digits) that the numbers deal with data and final results outside this range
can represent. When considering any real are rare. By comparison, the range of the IBM
number, there are always an infinity of numbers 370 is about 0.54x10-^8 +o.72xlO''6
both larger and smaller. There is also an infinity
of numbers between (i.e., with more significant The spacing in figure S-10 illustrates that
finite
digits than) any two real numbers. For example, the NDP can represent a great many, but not all,
between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, of the real numbers in its range. There is always a
etc. "gap" between two "adjacent" 8087 numbers,
and it is possible for the result of a calculation to
fall in this space. When this occurs, the NDP
While ideally itwould be desirable for a computer rounds the true result to a number that it can
to be able to operate on the entire real number represent. Thus, a real number that requires more
system, in practice this is not possible. Com- digits than the 8087 can accommodate (e.g., a 20
puters, no matter how large, ultimately have digit number) is represented with some loss of
fixed-size registers and memories that limit the accuracy. Notice also that the 8087's represent-
system of numbers that can be accommodated. able numbers are not distributed evenly along the
These limitations proscribe both the range and the real number line. There are, in fact, an equal
precision of numbers. The result is a set of number of representable numbers between suc-
numbers that is finite and discrete, rather than cessive powers of 2 (i.e., there are as many
infinite and continuous. This sequence is a subset representablenumbers between 2 and 4 as
of the real numbers which is designed to form a between 65,536 and 131,072). Therefore, the
usqM approximation of the real number system. "gaps" between representable numbers are
S-12
8087 NUMERIC DATA PROCESSOR
"larger" as the numbers increase in magnitude. Conversely, and equally important, the 8087 does
All integers in the range 2^"*, however, are perform exact arithmetic on its integer subset of
exactly representable. the reals. That is, an operation on two integers
returns an exact integral result, provided that the
In its internal operations, the 8087 actually true result is an integer and is in range. For exam-
employs a number system that is a substantial ple, 4-r2 yields an exact integer, 1-^3 does not, and
superset of that shown in figure S-10. The internal 240 X 230 + 1 ^Qgg j^Qj^ because the result requires
format (called temporary real) extends the 8087's greater than 64 bits of precision.
range to about 3.4x10-^^32 i.2xlO''^3^ and
its precision to about 19 (equivalent decimal)
digits. This format is designed to provide extra
range and precision for constants and Data Types and Formats
intermediate results, and is not normally intended
for data or final results. The 8087 recognizes seven numeric data types,
divided into three classes: binary integers, packed
From a practical standpoint, the 8087's set of real decimal integers, and binary reals. Section S.4
numbers is sufficiently "large" and "dense" so describes how these formats are stored in memory
as not to limit the vast majority of microprocessor (the sign is always located in the highest- address-
applications. Compared to most computers, ed byte). Figure S-11 summarizes the format of
including mainframes, the NDP
provides a very each data type. In the figure, the most significant
good approximation of the real number system. It digits of all numbers (and fields within numbers)
is important to remember, however, that it is not are the leftmost digits. Table S-2 provides the
an exact representation, and that arithmetic on range and number of significant (decimal) digits
real numbers is inherently approximate. that each format can accommodate.
S-13
8087 NUMERIC DATA PROCESSOR
INCREASING SIGNIFICANCE
QWHRT
onuni IMTPfiPR
inicucn c
o MAGNITUDE TWOS
COMPLEMENT)
31 0
MAGNITUDE
PACKED DECIMAL S X
79 72
BIASED
SHORT REAL EXPONENT SIGNIFICAND
23V^
31
I 1
BIASED
LONG REAL EXPONENT SIGNIFICAND
52y
63
^ I k
BIASED
TEMPORARY REAL EXPONENT SIGNIFICAND
79 64 63'
NOTES:
S = Sign bit (0 = positive. 1 = negative)
''n = Decimal digit (two per byte)
X = Bits have no significance: 8087 ignores when loading, zeros when storing.
i = Position of implicit binary point
I Integer bit of signlficand: stored in temporary real, implicit in short and long real
Exponent Bias (normalized values):
Short Real: 127 (7FH)
Long Real: 1023 (3FFH)
Temporary Real: 16383 (3FFFH)
Binary Integers are 0). The 8087 word integer format is identical
to the 16-bit signed integer data type of the 8086
The three binary integer
formats are identical and 8088.
except for length, which governs the range that
can be accommo(date(j in each format. The left-
most bit is interpreted as the number's Decimal Integers
sign: O=positive and l=negative. Negative
numbers are represented in standard two's com- Decimal integers are stored in packed decimal
plement notation (the binary integers are the only notation, with two decimal digits "packed" into
8087 format to use two's complement). The quan- each byte, e.xcepi the leftmost byte, which carries
tity zero is represented w ith a positive sign (all bits the sign bit (0 = positive, 1 = negative). Negative
S-14
8087 NUMERIC DATA PROCESSOR
numbers are not stored in two's complement form expressed exactly in decimal). When a translator
and are distinguished from positive numbers only encounters such a value, it produces a rounded
by the sign bit. The most significant digit of the binary approximation of the decimal value.
number is the leftmost digit. All digits must be in
the range 0H-9H. The NDP usually carries the digits of the signifi-
cand in normalized form. This means that, except
for the value zero, the significand is an integer
and a fraction as follows:
Real Numbers
l^fff.-.ff
The 8087 stores real numbers in a three-field
binary format that resembles scientific, or where A indicates an assumed binary point. The
exponential, notation. The number's significant number of fraction bits varies according to the
digits are held in the significand field, the real format: 23 for short, 52 for long and 63 for
exponent field locates the binary point within the temporary real. By normalizing real numbers so
significant digits (and therefore determines the that their integer bit is always a 1, the 8087
number's magnitude), and ihe sign field indicates eliminates leading zeros in small values (ixj < 1).
whether the number is positive or negative. (The This technique maximizes the number of signifi-
exponent and significand are analogous to the cant digits that can be accommodated in a
terms "characteristic" and "mantissa" used to significand of a given width. Note that in the
describe floating point numbers on some com- short and long real formats the integer bit is
puters.) Negative numbers differ from positive implicit and is not actually stored; the integer bit
numbers only in their sign bits. is physically present in the temporary real format
only.
Table S-4 shows how the real number 178.125
(decimal) is stored in the 8087 short real format.
The table lists a progression of equivalent nota- Ifone were to examine only the significand with
tions that express the same value to show how a itsassumed binary point, all normalized real
number can be converted from one form to numbers would have values between 1 and 2. The
another. The ASM-86 and PL/M-86 language exponent field locates the actual binary point in
translators perform a similar process when they the significant digits. Just as in decimal scientific
encounter programmer-defined real number con- notation, a positive exponent has the effect of
stants. Note that not every decimal fraction has moving the binary point to the right and a
an exact binary equivalent. The decimal num.ber negative exponent effectively moves the binary
1/10, for example, cannot be expressed exactly point to the left, inserting leading zeros as
in binary (just as the number 1/3 cannot be necessary. An unbiased exponent of zero
Notation Value
Scientific Binary
1^0110010001 El 00001 10
(Biased Exponent)
Biased
Sign Significand
Exponent
8087 Short Real
(Normalized) 0 10000110 01100100010000000000000
t_ 1^ (implicit)
S-15
8087 NUMERIC DATA PROCESSOR
tions that are constrained by memory, but it coded indefinite value will propagate through the
should be recognized that this format provides a calculation and thus flag the erroneous computa-
smaller margin of safety. It is also useful for tion when it is eventually delivered as the result.
debugging algorithms because roundoff problems Each format has an encoding that represents the
will manifest themselves more quickly in this for- special value indefinite
mat. The temporary real format should normally
be reserved for holding intermediate results, loop In the real formats, a whole range of special
accumulations, and constants. Its extra length is values, both positive and negative, is designated
designed to shield final results from the effects to represent a class of values called NAN (Not-A-
of rounding and overflow/underflow in inter- Number). The
special value indefinite is a
mediate calculations. When the temporary real reserved NAN
encoding, but all other encodings
format is used to hold data or to deliver final are made available to be defined in any way by
results, the safety features built into the 8087 are application software. Using a NAN
as an operand
compromised. Furthermore, the range and preci- raises the invalid operation exception, and can
sion of the long real form are adequate for most trap to a user-written routine to process the NAN.
microcomputer applications. Alternatively, the 8087's built-in exception
S-16
8087 NUMERIC DATA PROCESSOR
10 Round up (toward +) c
As mentioned earlier, the 8087 stores non-zero The NDP has four rounding modes, selectable by
real numbers in "normalized floating point" the RC field in the control word (see figure S-7).
form. It also provides for storing and operating Given a true result b that cannot be represented
on reals that are not normalized, i.e., whose by the target data type, the 8087 determines the
significands contain one or more leading zeros. two representable numbers a and c that most
Nonnormals arise when
the result of a calculation closely bracket b in value (a < b < c). The pro-
yields a value that too small to be represented in
is cessor then rounds (changes) to a or to c
normal form. The leading zeros of nonnormals according to the mode selected by the RC field as
permit smaller numbers to be represented, at the shown in table S-5. Rounding introduces an error
cost of some lost precision (the number of signifi- in a result that is less than one unit in the last
cant digits is reduced by the leading zeros). In place to which the result is rounded. "Round to
typical algorithms, extremely small values are nearest" is the default mode and is suitable for
most likely to be generated as intermediate, rather most applications; it provides the most accurate
than final results. By using the NDP's temporary and statistically unbiased estimate of the true
real format for holding intermediates, values as result. The "chop" mode is provided for integer
small as 3.4xlO~'*^^2 can be represented; this arithmetic applications.
makes the occurrence of nonnormal numbers a
rare phenomenon in 8087 applications. Never- "Round up" and "round down" are termed
theless, the NDP can load, store and operate on directed rounding and can be used to implement
nonnormalized real numbers. interval arithmetic. Interval arithmetic generates
a certifiable result independent of the occurrence
of rounding and other errors. The upper and
Rounding Control lower bounds of an interval may be computed by
executing an algorithm twice, rounding up in one
Internally, the 8087 employs three extra bits pass and down in the other.
(guard, round and sticky bits) which enable it to
represent the infinitely precise true result of a
computation; these bits are not accessible to Precision Control
programmers. Whenever the destination can
represent the infinitely precise true result, the The 8087 allows be calculated with 64,
results to
8087 delivers it. Rounding occurs in arithmetic 53, or 24 bits of precision as selected by the PC
and store operations when the format of the field of the control word. The default setting, and
S-17
8087 NUMERIC DATA PROCESSOR
the one that is best-suited for most applications, is with affinemode reserved for local computations
the full 64 bits. The other settings are required by where the programmer can take advantage of the
the proposed IEEE standard, and are provided to sign and knows for certain that the nature of the
obtain compatibility with the specifications of computation will not produce a misleading result.
certain existing programming languages. Specify-
ing less precision nullifies the advantages of the
Exceptions
temporary real format's extended fraction length,
and does not improve execution speed. When
During the execution of most instructions,
reduced precision is specified, the rounding of the
the 8087 checks for six classes of exception
fraction zeros the unused bits on the right.
conditions.
The 8087's system of real numbers may be closed An attempt to load a register that is not
by either of two models of infinity. These two empty, (e.g., stack overflow),
means of closing the number system, projective
and affine closure, are illustrated schematically in
An attempt to pop an operand from an
figure S-12. The setting of the IC field in the con- empty register (e.g., stack underflow),
trolword selects one model or the other. The An operand is a NAN,
means of closure is projective, and this is
default
The operands cause the operation to be
recommended for most computations. When pro-
indeterminate (0/0, square root of a negative
jective closure is selected, the NDP treats the
number, etc.).
special values + and - as a single unsigned
infinity (similar to its treatment of signed zeros).
In the affine mode the NDP respects the signs of An invalid operation generally indicates a pro-
+ and -00. gram error.
While affine mode may provide more informa- If the exponent of the true result too large for
is
tion than projective, there are occasions when the the destination real format, the 8087 signals
sign may misinformation. For
in fact represent overflow. Conversely, a true exponent that is too
example, consider an algorithm that yields an small to be represented results in the underflow
intermediate result x of +0 and -0 (the same exception. If either of these occur, the result of
numeric value) in different executions. If 1/x the operation is outside the range of the destina-
were then computed in affine mode, two entirely tion real format.
different values {+ and -o) would result from
numerically identical of x. Projective
values Typical algorithms are most likely to produce
mode, on the other hand, provides less informa- extremely large and small numbers in the calcula-
tion but never returns misinformation. In general, tion of intermediate, rather than final, results.
then, projective mode should be used globally, Because of the great range of the temporary real
format (recommended as the destination format
for intermediates), overflow and underflow are
oo
relatively rare events in most 8087 applications.
S-18
8087 NUMERIC DATA PROCESSOR
detected, the register stackand memory have not By writing different values into the exception
yet been updated, and appear as if the offending masks of the control word, the user can accept
instruction has not been executed. When an responsibility for handling exceptions, or delegate
"after" exception is detected, the register stack this to the NDP. Exception handling software is
and memory appear as if the instruction has run often difficult to write, and the 8087's masked
to completion, i.e., they may be updated. responses have been tailored to deliver the most
(However, in a store or store and pop operation, "reasonable" result for each condition. The
unmasked over/underflow is handled like a majority of applications will find that masking all
"before" exception; memory is not updated and exceptions other than invalid operation will yield
the stack is not popped.) In cases where multiple satisfactory results with the least programming
exceptions arise simultaneously, one exception is investment. An invalid operation exception
signalled according to the following precedence normally indicates a fatal error in a program that
sequence: must be corrected; this exception should not
normally be masked.
Denormalized (if unmasked),
Invalid operation, The exception flags are "sticky" and can be
that exception. In the second case, the exception handler will be invoked via an interrupt from the
is unmasked, and the processor performs its 8087. The 8087 sets the IR (interrupt request) bit
unmasked response. The masked response always in the status word, but this, in itself, does not
produces a standard result and then proceeds with guarantee an immediate CPU interrupt. The
the instruction. The unmasked response always interrupt request may be blocked by the lEM
traps to user software by interrupting the CPU (interrupt-enaWe mask) in the 8087 control word,
S-19
8087 NUMERIC DATA PROCESSOR
*0n overflow, 24,576 decimal is subtracted from the true result's exponent; this forces the exponent back
into range and permits a user exception handler to ascertain the true result from the adjusted result that
is returned. On underflow, the same constant is added to the true result's exponent.
status and control words in the saved Storing a diagnostic value (a NAN) in the
environment; result and continuing with the computation.
S-20
8087 NUMERIC DATA PROCESSOR
Notice that an exception may or may not con- on the CPU to generate the addresses of memory
an error depending on the application. For
stitute operands, the NDP can take advantage of the
example, an invalid operation caused by a stack CPU's memory addressing modes and its ability
overflow could signal an ambitious exception to relocate code and data during execution.
handler to extend the register stack to memory
and continue running. Data Storage
S.4 Memory Figures S-13 and S-14 show how the 8087 data
types are stored in memory. The sign bit is always
The 8087 can access any location in its host located in the highest-addressed byte. The least
CPU's megabyte memory space. Because it relies significant binary or decimal digits in a number
+ 3 sisl
|B|
+2
+ 1 sisi +1 +3 SjSI
|B|
l'm'
ri
II 1l
+0 IS +0 IS +2 sisl
IB 1 E,F,
7 0 7 ()
'l
1 +0 IS
+9 SI (X) +9 SIS I
|E|
1 7 0
SHORT REAL
+8 MSD 1
+8
'Ml
+ 7 Sis I +7 + 7 Sisl +7 IjSI
|E|
'lX
+6 lA
UJ
+6 +6 islsl +6
OC |E|F,
Q
O
+5 < +5 +5 +5
(/)
ui
</>
05
+4 = +4 +4 111 +4
cc
o
a
+ 3 +3 +3 < +3
tr.
UJ
X
o
+2 -t-2 +2 5 +2
+1 + 1 +1 + 1
II
+0 +0 1 LSD +0 IS +0
,F
7 0
LONG INTEGER PACKED DECIMAL LONG REAL TEMPORARY REAL
Figure S-13. Storage of Integer Data Types Figure S-14. Storage of Real Data Types
S-21
8087 NUMERIC DATA PROCESSOR
(or in a field in the case of reals) are those with the As described in section S.6, the 8087 automati-
lowest addresses. The word integer format is cally determines the identity of its host CPU.
stored exactly like an 8086/8088 16-bit signed When the NDP is wired to an 8088, it transfers
integer, and is directly usable by instructions one byte per bus cycle in the same manner as the
executed on either the CPU or the NDP. CPU. When used with an 8086, the NDP again
operates like the CPU, accessing odd-addressed
A few special instructions access memory to load words in two bus cycles and even-addressed words
or store formatted processor control and state in one bus cycle. If the 8087 is reading or writing
data. The formats of these memory operands are more than one word of an odd-addressed operand
provided with the discussions of the instructions in 8086 memory, it optimizes the transfer by
in section S.7. accessing a byte on the first transfer, forcing the
address to even, and then transferring words up
Storage Access to the last byte of the operand.
The host CPU always generates the address of the To minimize operand and 8087 use
transfer time
first (lowest-addressed) byte of a memory of the system bus, advantageous to align 8087
it is
Any 8086/8088 memory addressing mode transferred to an 8086. The ASM-86 EVEN direc-
direct, register indirect, based, indexed or based tive can be used to force word alignment.
indexed can be used to access an 8087 operand
in memory. This makes the NDP
easy to use with
data structures such as arrays, structures, and
Dynamic Relocation
lists.
Since the host CPU takes care of both instruction
fetching and memory operand addressing, the
When the CPU emits the 20-bit physical address
of the memory operand, the 8087 captures the
NDP may be utilized in systems that alter pro-
gram addresses during execution. The only
address and saves it. If the instruction loads
restriction on the CPU is that it should not change
information into the NDP, the 8087 captures the
the address of an 8087 operand while the 8087 is
lowest-addressed word when it becomes available
executing an instruction which stores a result to
on the bus as a result of the CPU's "dummy
that address. If thisdone, the 8087 will store to
is
read." (The "dummy read" may require either
the operand's old address (the one it picked up
one or two bus cycles depending on the CPU type
during the "dummy read").
and the alignment of the operand.) If the operand
is longer than one word (all 8087 operands are an
S-22
8087 NUMERIC DATA PROCESSOR
S-23
8087 NUMERIC DATA PROCESSOR
TEST:
BUSY-TEST
/
f
W"
N ' V7 V.
CPU: [wAItJ ESC [wAITj ESC I I
CMP I I
JG I
[
MOV I
[WAIT^ ESC |
|
V\/A1T
|
|
MOV |
NOTES:
^WAI^ = Assembler-generated instruction.
Instruction execution times are not drawn to scale.
a preceding WAIT, and the effect of the WAIT The 8087 CU can execute most processor control
when the NDP
is, and is not, busy. Since the NDP instructions by itself regardless of what the NEU
is not busy when the first WAIT is encountered, is doing: thus the 8087 can, in these cases, poten-
the CPU executes it and immediately proceeds to tially execute two instructions at once. The
the next instruction; the NDP ignores the WAIT. ASM-86 assembler provides separate "wait" and
The next instruction is decoded simultaneously by "no wait" mnemonics for these instructions. For
both processors. The NDP starts the multiplica- example, the instruction that sets the 8087 inter-
tion and raises its BUSY line. The CPU ex ecutes rupt enable mask, and thus disables interrupts,
the ESC and then the second WAIT. Since TEST can be coded as FDISI or FNDISI. The assembler
is active (it is tied to BUSY), the CPU effectively does not generate a preceding WAIT if the second
stretches execution of this WAIT until the NDP form is coded, so that interrupts can be disabled
signals completion of the multiply by lowering while the NEU is busy executing a previous
BUSY. The next instruction is interpreted as a instruction. The no-wait forms are principally
square root by the NDP and another escape by used in exception handlers and operating systems.
the CPU. The CPU finishes the ESC well before
the NDP completes the FSQRT. This time, in- Local Bus Arbitration
stead of waiting, the CPU executes three instruc-
tions (compare, jump if greater, and move) while Whenever an NDP instruction writes data to
the 8087 is working on the FSQRT. The 8087 memory, or reads more than one word from
ignores these CPU-only instructions. The CPU memory, the NDP forces the CPU to relinquish
then encounters the third WAIT, generated by the the local bus. It does this by means of the
assembler immediately preceding the FIST (store request/grant facility built into all 8086 family
stack top into integer word). When the NDP processors. For memory reads, the NDP requests
finishes the FSQRT, both processors proceed to the bus immediately upon the CPU's completion
the next instruction, FIST to the NDP and ESC to of its "dummy read" cycle; it follows from this
the CPU. The CPU completes the escape quickly that the CPU may "immediately" update a
and then executes an explicit programmer-coded variable read by the NDP in the previous instruc-
FWAIT to ensure that the 8087 has updated tion with the assurance that the NDP will have
BETA before it moves BETA'S new value to obtained the old value before the CPU has altered
register AX. it. For memory writes, the NDP performs as
S-25
8087 NUMERIC DATA PROCESSOR
8.6 Processor Control and The FINIT (intialize) and FSAVE (save state)
instructions also initialize the processor. Unlike a
Monitoring RESET pulse, software initialization does not
affect the 8087's tracking of the CPU.
all activities. When RESET subsequently goes is accessing an even-addressed word or an odd-
Control Word
Infinity Control 0 Projective
Rounding Control 00 Round to nearest
Precision Control 11 64 bits
Interrupt-enable Mask 1 Interrupts disabled
Exception Masks 111111 All exceptions masked
Status Word
Busy 0 Not busy
Condition Code ???? (Indeterminate)
Stack Top 000 Empty stack
Interrupt Request 0 No interrupt
Exception Flags 000000 No exceptions
Tag Word
Tags 11 Empty
Registers N.C. Not changed
Exception Pointers
Instruction Code N.C. Not changed
Address
Instruction N.C. Not changed
Operand Address N.C. Not changed
S-26
8087 NUMERIC DATA PROCESSOR
S-27
8087 NUMERIC DATA PROCESSOR
event when several occur simultaneously. Many should be done by executing the FNSTCW and
systems allow higher-priority interrupts to FNDISI instructions before enabling CPU inter-
preempt lower-priority interrupt handlers. The rupts. Before returning, the interrupt handler
8259A supports several priority-resolving techni- should restore the original control word in the
ques; a system will normally select one of these by 8087 by executing FLDCW.
programming the 8259A at initialization time.
Users should consult "Using the 8259A Program-
Application tasks execute only when no external mable Interrupt Controller'', Intel Application
event needs service, i.e., when no interrupt Note No. AP-59, for a description of the 8259A's
handler is running. Application tasks are invoked various modes of operation.
by software, rather than hardware; typically a
scheduling or dispatching algorithm is used to Endless Wait
select one task for execution. In effect, any inter-
rupt handler has higher priority than any applica- The 8087 and its host CPU can enter an endless
tion task, since the recognition of an interrupt will wait condition when the CPU
executing a
is
invoke the interrupt handler, preempting the WAIT instruction and a pending interrupt request
application task that was running. from the 8087 is prevented from being recognized
by the CPU. Thus, the CPU will wait for the 8087
There are two important questions to consider to lower its BUSY line, while the NDP will wait
when assigning a priority to the 8087's interrupt for the CPU to invoke the exception handler
request: interrupt procedure, and the task which has
generated the exception will be blocked from
Who can cause 8087 exceptions only further execution.
application tasks, or interrupt handlers as
well? Figure S-17 shows the typical path of an interrupt
request from the 8087 to the interrupt procedure
Who should be preempted by NDP which is designated to field NDP exceptions. The
exceptions only applications tasks, or inter- interrupt request can be potentially blocked at
rupt handlers as well? three points along the path, creating an endless
wait if the CPU is executing a WAIT instruction.
Given these considerations, the 8087 should The first block can occur at the 8087's interrupt-
normally be assigned the lowest priority of any enable mask (lEM). If this mask is set, the inter-
interrupting device in the system. This allows the rupt request is blocked except that the 8087 will
interrupt handler the(i.e., NDP
exception override the mask if the CPU is waiting (the 8087
handler) to preempt any application task that decodes the WAIT instruction simultaneously
generates an 8087 exception, and at the same time with the CPU). Thus, the 8087 detects and
prevents exception NDP handler
the from prevents one of the endless wait conditions.
interfering with other interrupt handlers.
A given interrupt request, IRn, can be masked on
Ifan interrupt handler uses the 8087 and requires the 8259A by setting the corresponding bit in the
the service of the exception handler, it can effec- PlC's interrupt mask register (IMR). This will
tively "raise" the priority of the exception prevent a request from the 8087 from being
handler by disabling all interrupts lower than passed to the CPU. (The 8259A's normal priority-
itself and higher than the 8087. Then, any un- resolving activity can also block an interrupt
masked exception caused by the interrupt handler request.) Finally, the CPU can exclude all
will be fielded without interference from lower- interrupts tied to INTR by clearing its interrupt-
priority interrupts. enable flag (IF). In these two cases, the CPU can
"escape" the endless wait only if another inter-
If, for some reason, the 8087 must be given higher rupt is recognized (if IF is cleared, the interrupt
priority than another interrupt source, the inter- must arrive on NMI, the CPU's non-maskable
rupt handler that services the lower-priority interrupt line). Following execution of the inter-
device may want to prevent interrupts from the rupt procedure and resumption of the WAIT, the
8087 (which may originate in a long instruction endless wait will be entered again, unless, as part
still running on the 8087 when the interrupt of its response to the interrupt it recognizes, the
handler is invoked) from preempting it. This CPU clears the interrupt path from the 8087.
S-28
8087 NUMERIC DATA PROCESSOR
A user-written exception handler can itself cause speed, bus transfers, and exceptions, as well
an unending wait. When the exception handler as a coding example for each combination of
starts to run, the 8087 is suspended with its BUSY operands accepted by the instruction. This
line active, waiting for the exception to be information is concentrated in a table, organized
cleared, and interrupts on the CPU are disabled. alphabetically by instruction mnemonic, for easy
If, in this condition, the exception handler issues reference.
any 8087 instruction, other than a no-wait form,
the result will be an unending wait. To prevent Throughout this section, the instruction set is
this, the exception handler should clear the excep- described as it appears to the ASM-86 program-
tion on the 8087 and enable interrupts on the mer who coding a program. Appendix A covers
is
CPU before executing any instruction that is the actual machine instruction encodings, which
preceded by a WAIT. are principally of use to those reading unfor-
matted memory dumps, monitoring instruction
More generally, an instruction that is preceded by fetches on the bus, or writing exception handlers.
a WAIT (or an FWAIT instruction) should never
be executed when CPU interrupts are disabled The instruction descriptions in this section con-
and there is any possibility that the 8087's BUSY centrate on describing the normal function of
line is active. each operation. Table S-19 lists the exceptions
that can occur for each instruction and table S-32
Status Lines details the causes of exceptions as well as the
8087's masked responses.
When the 8087 has control of the local bus, it
emits signals on status lines S2-S0 to identify The typical NDP instruction accepts one or twc
the type of bus cycle it is running. The 8087 operands as "inputs", operates on these, am
generates the restricted (compared to a CPU) set produces a result as an "output". Operands are
of encodings shown These lines
in table S-8.
correspond exactly to the signals output by the
8086 and 8088 CPU's, and are normally decoded
by an 8288 Bus Controller.
S-29
8087 NUMERIC DATA PROCESSOR
most often (the contents of) register or memory Data Transfer Instructions
locations. The operands of some instructions are
predefined; for example, FSQRT always takes the These instructions (summarized in table S-9)
square root of the number in the top stack ele- move operands among elements of the register
ment. Others allow, or require, the programmer stack, and between the stack top and memory.
to explicitly code the operand(s) along with the Any of the seven data types can be converted to
instruction mnemonic. Still others accept one temporary real and loaded (pushed) onto the
explicit operand and one implicit operand, which stack in a single operation; they can be stored to
is usually the top stack element. memory in the same manner. The data transfer
instructions automatically update the 8087 tag
Whether supplied by the programmer or utilized word to reflect the register contents following the
automatically, there are two basic types of instruction.
operands, sources and destinations A source .
from a source operand, however, because its con- (ST(i)) or any of the real data types in memory.
tent may be altered when it receives the result Short and long real source operands are converted
produced by the operation; that is, the destination to temporary real automatically. Coding FLD
Many instructions allow their operands to be cod- Table S-9. Data Transfer Instructions
ed in more than one way. For example, FADD
(add real) may be written without operands, with Real Transfers
only a source or with a destination and a source.
The instruction descriptions in this section FLD Load real
employ the simple convention of separating alter- FST Store real
native operand forms with slashes; the slashes, FSTP Store real and pop
however, are not coded. Consecutive slashes in- FXCH Exchange registers
dicate an option of no explicit operands. The
operands for FADD are thus described as: Integer Transfers
Mnemonics Intel1980
S-30
8087 NUMERIC DATA PROCESSOR
FSTP destination
FBLD source
FSTP (store real and pop) operates identically to
FST except that the stack is popped following the FBLD (packed decimal (BCD) load) converts the
transfer. This is done by tagging the top stack content of the source operand from packed
element empty and then incrementing ST. FSTP decimal to temporary real and loads (pushes) the
permits storing to a temporary real memory result onto the stack. The sign of the source is
variable while FST does not. Coding FSTP ST(0) preserved, including the case where the value is
is equivalent to popping the stack with no data negative zero. FBLD is an exact operation; the
transfer. source is loaded with no rounding error.
FILD (integer load) converts the source memory The 8087's arithmetic instruction set (table S-10)
operand from binary integer format (word,
its provides a wealth of variations on the basic add,
short, or long) to temporary real and loads subtract, multiply, and divide operations, and a
(pushes) the result onto the stack. The (new) stack number of other useful functions. These range
top is tagged zero if all bits in the source were from a simple absolute value to a square root
zero, and is tagged valid otherwise. instruction that executes faster than ordinary divi-
are designed to encourage the development of tion: the top is popped and the result is left at the
very efficient algorithms. In particular, they allow new top.
NOTES: Braces { > surround implicit operands; these are not coded, and are shown
here for information only.
ing mode may be used to define these operands, tract the source operand from the destination and
so they may be elements in arrays, structures or return the difference to the destination.
other data organizations, as well as simple
scalars.
Reversed Subtraction
The six basic operations are discussed further in FSU BR //source/destination, source
the next paragraphs, and descriptions of the FSUBRP destination, source
remaining seven arithmetic operations follow. FISUBR source
FMUL ST,ST(0) squares the content of the to load the scale factor from a word integer to
stack top. ensure correct operation.
FPREIVI
Normal Division FPREM (partial remainder) performs modulo
FDIV //source/destination, source division of the top stack element by the next stack
FDIVP destination, source element, i.e., ST(1) is the modulus. FPREM pro-
FID IV source duces an exact result; the precision exception does
not occur. The sign of the remainder is the same
The normal division instructions (divide real, as the sign of the original dividend.
divide real and pop, integer divide) divide the
destination by the source and return the quotient FPREM operates by performing successive scaled
to the destination. subtractions; obtaining the exact remainder when
the operands differ greatly in magnitude can con-
sume large amounts of execution time. Since the
Reversed Division 8087 can only be preempted between instructions,
FDIVR //source/destination, source the remainder function could seriously increase
FDIVRP destination, source interrupt latency in these cases. Accordingly, the
FIDIVR source instruction is designed to be executed iteratively in
a software-controlled loop.
The reversed division instructions (divide real
reversed, divide real reversed and pop, integer FPREM can reduce a magnitude difference of up
divide reversed) divide the source operand by to 2^^ in one execution. If FPREM produces a
the destination and return the quotient to the remainder that is less than the modulus, the func-
destination. tion is complete and bit C2 of the status word
condition code is cleared. If the function is
FSQRT incomplete, C2 is set to 1; the result in ST is then
called the partial remainder. Software can inspect
FSQRT (square root) replaces the content of the C2 by storing the status word following execution
top stack element with its square root. (Note: the of FPREM and re-execute the instruction (using
square root of -0 is defined to be -0.) the partial remainder in ST as the dividend), until
C2 is cleared. Alternatively, a program can deter-
FPREM also provides the least-significant three sign, true exponent of and its significand will
2)
bits of the quotient generated by FPREM (in C3, be 1^1 IOO...OOB. In other words the value in
C,, Cq). This is also important for transcendental ST(1) will be -1.11 x 2^ = -7.0. In both cases,
argument reduction since it locates the original following FXTRACT, ST's sign and significand
angle in the correct one of eight ti/4 segments of fields will be the same as the original operand's,
the unit circle. and its exponent field will contain 3FFFH,
(0 true).
FRNDINT (round to integer) rounds the top debugging since it allows the exponent and signifi-
stack element to an integer. For example, assume cand parts of a real number to be examined
that ST
contains the 8087 real number encoding separately.
of the decimal value 155.625. FRNDINT will
change the value to 155 if the RC field of the con-
trol word is set to down or chop, or to 156 if it is FABS
set to up or nearest.
FABS (absolute value) changes the top stack ele-
ment to its absolute value by making its sign
positive.
FXTRACT
codes as follows:
C3 CO Order
0 0 ST > source
C3 CO Result
0 1 ST < source
"1
0 ST = source
0 0 ST is positive and
1 1 ST? source
nonzero
0 1 ST is negative and
NANs and (projective) cannot be compared
nonzero
and return C3=C0=1 as shown above.
1 0 ST is zero (+ or-)
1 1 ST is not com-
Table S-12. Comparison Instructions parable (i.e., it is a
NAN or projective
FCOM Compare real CO)
FCOMPP
FCOMPP (compare real and pop twice) operates Transcendental Instructions
like FCOM and additionally pops the stack twice,
discarding both operands. The comparison is of The instructions in this group (table S-14) per-
the stack top to ST(1); no operands may be form the time-consuming core calculations for all
explicitly coded. common trigonometric, inverse trigonometric,
hyperbolic, inverse hyperbolic, logarithmic and
exponential functions. Prologue and epilogue
FICOM source software may be used to reduce arguments to the
range accepted by the instructions and to adjust
FICOM (integer compare) converts the source the result to correspond to the original arguments
operand, which may reference a word or short if necessary. The transcendentals operate on the
binary integer variable, to temporary real and top one or two stack elements and they return
compares the stack top to it. their results to the stack also.
Table S-13. FXAM Condition Code Settings ensure that operands are valid and in-range
before executing a transcendental. For periodic
Condition Code functions, FPREM may be used to bring a valid
interpretation operand into range.
C3 C2 CI CO
0 0 0 0 + Unnormal
0 0 0 1 + NAN FPTAN
0 0 1 0 - Unnormal
FPTAN (partial tangent) computes the function
0 0 1 1 - NAN Y/X = TAN (0). 0 is taken from the top stack
L) 1 0 0 + Normal element; it must lie in the range 0 < 0 < n/4. The
0 1 0 1
result of the operation is a ratio; Y replaces 0 in
4 A
the stack and X is pushed, becoming the new
U
1
A A
The ratio result of FPTAN and the ratio argu-
u u u + 0
ment of FPATAN are designed to optimize the
0 0 1 Empty calculation of the other trigonometric functions,
-
including SIN, COS, ARCSIN and ARCCOS.
0 1 0 0
These can be derived from TAN and ARCTAN
0 1 1 Empty via standard trigonometric identities.
1 0 0 + Denormal
1 0 1 Empty
1 1 0 - Denormal FPATAN
1 1 1 Empty
FPATAN (partial arctangent) computes the func-
tion 0 = ARCTAN (Y/X). X is taken from the
The transcendental instructions assume that their This instruction is designed to produce a very
operands are valid and in-range. The instruction accurate result even when X is close to zero. To
descriptions in this section provide the range of obtain Y=2^, add 1 to the result delivered by
each operation. To be considered valid, an F2XM1.
operand to a transcendental must be normalized;
denormals, unnormals, infinities and NANs are The following formulas show how values other
considered invalid. (Zero operands are accepted than 2 may be raised to a power of X:
by some functions and are considered out-of-
range by others.) If a transcendental operand is 10X = 2><*'-OG2iO
Mnemonics ^ Intel1980
S-37
8087 NUMERIC DATA PROCESSOR
As shown in the next section, the 8087 has built-in Table S-15. Constant Instructions
instructions for loading the constants LOGjlO
and L0G2e, and the FYL2X instruction may be FLDZ
I L
L_ Load -t- 0 0
FLDPI Load 71
Y from ST(1). The operands must be in the ranges FLDLN2 Load logg2
0 < X < oo and - > < Y < + w. The instruction
pops the stack and returns Z at the (new) stack
top, replacing the Y operand.
FLD1
FYL2XP1
FLDl (load one) loads (pushes) +1.0 onto the
FYL2XP1 (Y log base 2 of (X + 1)) calculates stack.
Constant Instructions
FLDLG2
Each of these instructions (table S-15) loads
(pushes) a commonly-used constant onto the FLDLG2 (load log base 10 of 2) loads (pushes)
stack. The values have full temporary real preci- the value L0G,q2 onto the stack.
sion (64 bits) and are accurate to approximately
19 decimal digits. Since a temporary real constant
occupies 10 memory bytes, the constant instruc- FLDLN2
tions,which are only two bytes long, save storage
and improve execution speed, in addition to FLDLN2 (load log base e of 2) loads (pushes) the
simplifying programming. value LOGg2 onto the stack.
Most of these instructions (table S-16) are not FDISI/FNDISI (disable interrupts) sets the inter-
used in computations; they are provided prin- rupt enable mask in the control word and
cipally for system-level activities. These include prevents the NDP from issuing an interrupt
initialization, exception handling and task request.
switching.
Table S-16. Processor Control Instructions
FNSTENV is decoded while another instruction is manner as FNSAVE. (There is no risk of the
executing concurrently in the NEU, the 8087 following FWAIT causing an endless wait,
queues the FNSTENV and does not store the because FNSTENV masks all exceptions, thereby
environment until the other instruction has com- preventing an interrupt request from the 8087.)
pleted. Thus, the data saved by the instruction
INCREASING ADDRESSES
reflects the 8087 after any previously decoded
instruction has been executed. 15
CONTROL WORD +0
INCREASING ADDRESSES
STATUS WORD +2
15 TAG WORD +4
SIGNIFICAND 15-0 + 14
FLDENV source
SIGNIFICAND 31-16 + 16
TOP STACK SIGNIFICAND 47-32 + 18
FLDENV (load environment) reloads the 8087
ELEMENT;ST
environment from the memory area defined by
SIGNIFICAND 63-48 + 20
the source operand. This data should have been
EXPONENT 14-0 + 22 written by a previous FSTENV/FNSTENV
SIGNIFICAND 15-0 + 24
instruction. CPU do not
instructions (that
reference the environment image) may
SIGNIFICAND 31-16 + 26
immediately follow FLDENV, but no subsequent
NEXT STACK
ELEMENT:ST(1) SIGNIFICAND 47-32 + 28 NDP instruction should be executed without an
SIGNIFICAND 63-48 + 30 intervening FWAIT or assembler-generated
EXPONENT 14-0 + 32
WAIT.
EXPONENT 14-0 + 92
FINCSTP
NOTES: FINCSTP (increment stack pointer) adds 1 to the
S = Sign stack top pointer (ST) in the status word. does It
Bit 0 of each field is rigtitmost, least significant bit of corresponding
register field. not alter tags or register contents, nor does it
63 of significand integer (assumed binary point is immediately
Bit is bit
transfer data. It is not equivalent to popping the
to the right).
stack since it does not set the tag of the previous
Figure S- 1 8 . FS A VE/FRSTOR Memory stack top to empty. Incrementing the stack
Layout pointer when ST=7 produces ST=0.
FWAIT is not actually an 8087 instruction, but an that may be detected during the operation.
alternate mnemonic for the CPU WAIT instruc-
tion described in section 2.8. The FWAIT There is one entry for each combination of
mnemonic should be coded whenever the pro- operand types that can be coded with the
grammer wants to synchronize the CPU to the mnemonic. Table S-17 explains the operand iden-
NDP, that is, to suspend further instruction tifiers allowed in table S-19. Following this entry
decoding until the NDP has completed the current are columns that provide execution time in clocks,
instruction. A CPU instruction should not the number of bus transfers run during the opera-
attempt to access a memory operand that has tion, the length of the instruction in bytes, and an
been read or written by a previous 8087 instruc- ASM-86 coding sample.
Identifier Explanation
memory. This activity is performed during spare the operand's effective address (EA) should be
bus cycles and proceeds in parallel with the execu- added. Effective address calculation time varies
tion of instructions from the queue. Because of according to addressing mode; table 2-20 supplies
their complexity, 8087 instructions typically take
the figures.
S-43
8087 NUMERIC DATA PROCESSOR
Instruction Length Note that the lengths quoted in table S-19 do not
include the one byte CPU WAIT instruction that
Instructions that do not reference memory are the assembler automatically inserts in front of all
two bytes long. Memory reference instructions NDP instructions (except those coded with a "no-
vary between two and four bytes. The third and wait" mnemonic).
fourth bytes are used for 8- or 16-bit displacement
values; the assembler generates the short displace-
ment whenever possible. No displacements are
required in memory references that use only CPU
register contents to calculate an operand's effec-
tive address.
Absolute value
Change sign
Mnemonics ^ Intel1980
S-45
8087 NUMERIC DATA PROCESSOR
Integer load
FISTP ,44^
FISTP destination
Integer store and pop
Exceptions: 1, P
Load log,Q2
Load loQg 2
Load lOQj e
Load lOQjlO
Load Tt
Load +0.0
Load +1.0
occurs when one or both operands is "short" it has 40 trailing zeros in its fraction (e.g., it was loaded from a short-real
memory operand).
occurs when one or both operands is "short' it has 40 trailing zeros in its fraction (e.g., it was loaded from a short-real
memory operand).
1,
^
D, P
Square root
*n = number of times CPU examines TEST line before 8087 lowers BUSY.
Exchange registers
Reference Manual, Order No. 9800640, and expressions that contain REAL data types,
MCS-86 Assembler Operating Instructions for whether variables or constants or both. This
ISIS-II Users, Order No. 9800641. PL/M-86 and
means that addition, subtraction, multiplication,
the partial emulator are documented in PL/M-86 division, comparison, and assignment of REALs
will be performed by the NDP. INTEGER expres-
Programming Manual, Order No. 9800466 and
sions, on the other hand, are evaluated on the
ISIS-II PL/M-86 Compiler Operator's Manual,
Order No. 9800478. These publications may be CPU.
ordered from Intel's Literature Department.
Five built-inprocedures (table S-20) give the
Readers should be familiar with section 2.9 of the PL/M-86 programmer access to 8087 functions
8086 Family User's Manual in order to benefit manipulated by the processor control instruc-
from the material in this section. tions. Prior to any arithmetic operations, a
typical PL/M-86 program will setup the NDP
PL/M-86 after power up using the INIT$REAL$MATH
$UNIT procedure and then issue
High levellanguage programmers can access a SET$REAL$MODE to configure the NDP.
useful subset of the 8087's (real or emulated) SETSREALSMODE loads the 8087 control word,
capabilities. The PL/M-86 REAL data type and its 16-bit parameter has the format shown in
corresponds to the NDP's short real (32-bit) for- figure S-7. The recommended value of this
mat. This data type provides a range of about parameter is 033EH (projective closure, round to
8.43*10'" < |x| < 3.38*1038, with about seven nearest, 64-bit precision, interrupts enabled, all
significant decimal digits. This representation is exceptions masked except invalid operation).
adequate for the data manipulated by many Other settings may be used at the programmer's
microcomputer applications. discretion.
If any exceptions are unmasked, an exception NDP can request an interrupt and that interrupt is
handler must be provided in the form of an inter- blocked (this may result in the endless wait
rupt procedure that is designated to be invoked by condition described in section S.6.)
CPU interrupt pointer (vector) number 16. The
exception handler can use the GET$REAL
SERROR procedure to obtain the low-order byte
of the 8087 status word and to then clear the
exception flags. The byte returned by ASM-86
GETSREALSERROR contains the exception
flags; these can be examined to determine the The ASM-86 assembly language provides a single
source of the exception. uniform set of facilities for all combinations of
the 8086/8088/8087 processors. Assembly
language programs can be written to be com-
The SAVESREALSSTATUS and RESTORE pletely independent of the processor set on which
SREALSSTATUS procedures are provided for
they are destined to execute. This means that a
multi-tasking environments where a running task
program written originally for an 8088 alone will
that uses the 8087 may be preempted by another
execute on an 8086/8087 combination without
task that also uses the 8087. It is the responsibility
re-assembling. The programmer's view of the
of the preempting task to issue
hardware is a single machine with these resources:
SAVESREALSSTATUS before executes any
it
8087 instruction with a CPU WAIT. Therefore, rather than the 8087, the WAlTs are auto-
programmers should not code PL/M-86 matically removed at link time (since there is no
statements that generate 8087 instructions if the NDP for which to wait).
S-60
8087 NUMERIC DATA PROCESSOR
;EVEN ;FORCEWORDALIGNMENT
WORO_INTEGER DW 1 000 0 0 B
1 1BIT STRING
1 1 1 1 1 1 1 ;
; RESERVE SPACE FOR STATUS WORD data aggregates ranging from simple to complex
STATUS_WORD OW ->
; LAY OUT STATUS WORD FIELDS according to the needs of the application. The
STATUS RECORD
addressing modes, and the ASM-86 notation used
BUSY : 1 ,
t
POLL STATUS WORD UNTIL 8087 IS NOT BUSY
POLL: FNSTSW STATUS_WORD Intel offers two software products that provide
TEST STATUS_WORD, MASK BUSY
JNZ POLL the functional equivalent of an 8087,
implemented in 8086/8088 software. The full
emulator (E8087) emulates all 8087 instructions.
Figure S-21 . Status Word RECORD Definition
The partial emulator (PE8087) is a smaller version
that implements only the instructions needed to
support PL/M-86 programs. The full emulator
SAMPLE STRUG
adds about 16k bytes to a program, while the
N_OBS DD ' ;SHORT INTEGER partial emulator executes in about 8k. Any
MEAN DQ ' ;LONG REAL
MODE DW ' ;WORO INTEGER emulated program will deliver the same results
STO_DEV DQ ' ;LONG REAL (except for timing) if it is executed on 8087
ARRAY OF OBSERVATIONS --
; WORD INTEGER
TEST_SCORES DW 1000 DUP C) hardware.
SAMPLE ENDS
Coding Interpretation
direct).
direct).
based).
Since the decision to produce real or emulated the FWAIT mnemonic should always be used
8087 instructions is made at link time, a program when the external processor that the CPU is to
may be switched from one mode to the other wait for is an 8087.
without retranslating the source code. When the
PL/M-86 compiler or ASM-86 assembler places
an 8087 machine instruction into an object In order to be compatible with E8087, ASM-86
module, it also inserts a special external reference. programs should observe the following
This reference is satisfied by linking the object conventions:
module to one of two Intel-supplied libraries: the
Their stack segment and class should be
real library, or the emulator library. If the real
library LINK-86 simply deletes the
named STACK.
is specified,
external references, leaving the original 8087 Interrupt pointer should be
(vector) 16
machine instructions. designated for the user's exception handler
interrupt procedure.
ASM-86 code for a simple 8087 program, called Either version of the program will run on an
ARRSUM. The program references an array actual or an emulated 8087 without altering the
(XSARRAY), which contains 0-100 short real code shown.
values; the integer variable N$OF$X indicates the
number of array elements the program is to
consider. ARRSUM steps through XSARRAY
accumulating three sums: The PL/M-86 version of ARRSUM (figure S-23)
SUMSX, the sum of the array values; is very straightforward and illustrates how easily
the 8087 can be used in this language. After
SUMSINDEXES, the sum of each array
declaring variables the program calls built-in
value times its index, where the index of the
procedures to initialize the processor (or its
first element is 1 , the second is 2, etc.;
emulator) and to load the control word. The pro-
SUMSSQUARES, the sum of each array gram clears the sum variables and then steps
element squared. through XSARRAY with a DO-loop. The loop
control takes into account PL/M-86's practice of
(A true program, of course, would go beyond considering the index of the first element of an
these steps to store and use the results of these array to be 0. In the computation of
calculations.) The control word is set with the SUMSINDEXES, the built-in procedure FLOAT
recommended values: projective closure, round to converts I+l from integer to real because the
nearest, 64-bit precision, interrupts enabled, and language does not support "mixed mode"
all exceptions masked except invalid operation. It arithmetic. One of the strengths of the NDP, of
*
ARRAYSUM. MOD *
*
* **#** **#**/
1 ARRAY$SUM: DO;
/ CLEAR SUMS /
8 1 SUMSX, SUMSINDEXES, SUMSSQUARES = 0.0;
/ ETC. . ./
14 1 END ARRAYSSUM;
S-63
8087 NUMERIC DATA PROCESSOR
CrtOSS-REPSRENCE LISTING
4 019EH 2 I INTEGER 9
INITREALMATHUNIT ,
BUILTIN 6
4 019CH 2 NOPX , INTEGER 9
3ETHEALM0DE. . . . BUILTIN 7
2 0004 4 SUMINDEXES . . . , REAL 8 1
MODULE INFORMATION:
course, is that it does support arithmetic on mixed ment registers and stack pointer, the program
data types, and assembly language programmers calls INIT87 and loads the control word. The
can take advantage of this facility. computation begins with the next three instruc-
tions, which clear three registers by loading
The ASM-86 version (figure S-24) defines the (pushing) zeros onto the stack. As shown in figure
external procedure INIT87, which makes the dif- S-25, these registers remain at the bottom of the
ferent initialization requirements of the processor stack throughout the computation while tem-
and its emulator transparent to the source code. porary values are pushed on and popped off the
After defining the data, and setting up the seg- stack above them.
17
18 ; LABEL INITIAL TOP OP STACK
0190 19 3TACK_T0P LABEL WORD
20 STACK ENDS
21
22 CODE SEGMENT PUBLIC 'CODE'
23 ASSU."1E CS CODE DS DAT A SS STACK , ES : NOTHING
: , : , :
24
0000 25 START:
0000 B8 R 26 MOV AX, DAT
0003 BEDS 27 MOV DS.AX
0005 B8 B 28 MOV AX, STACK
0008 8ED0 29 MOV S3, AX
OOOA BC9001 R 30 MOV SP,OFPSET STACK_TOP
51
32 ASSUME X_ARRAY 4 N OP_X ARE INITIALIZED.
33 NOTE: PROGRiM ZEROS N_OP_X
34 PREPARE THE 8087 OR ITS EMULATOR.
35
OOOD 9A0000 36 CALL INIT87
001 2 9BD92EOOO0 37 PLDCW CONTROL 87
38
39 ; CLEAR 3 REGISTERS TO HOLD RUNNING SUMS.
40
0017 9BD9EE 41 PLDZ
001 A 9BD9ES 42 PLDZ
001 D 9BD9EE 43 PLDZ
44
45 ; SETUP CX AS LOOP COUNTER * SI AS INDEX TO X ARRAY.
46
0020 8B020200 47 MOV CX , N_OP_X
0024 E329 43 JCXZ POP RESULTS ;EXIT EARLY IP X ARRAY EMPTY
0026 B80400 49 MOV AX, Type x_array
0029 P7E9 50 IMUL CX
002 B 8BP0 51 MOV 31, AX
52
53 ;SI NOW CONTAINS INDEX OF LAST ELEMENT + 1 ,
77
78 CODE ENDS
79
0000 80 END START
The program uses the CPU LOOP instruction to 8.9 Special Topics
control its iteration through X ARRAY; register
CX, which LOOP automatically decrements, is This section describes features of the 8087 which
loaded with N OF X, the number of array will be of interest to groups of users who have
elements to be summed. Register SI is used to special requirements. Most users will not need to
select (index) the array elements. The program understand this material in detail in order to
steps through X ARRAY from "back to utilize the NDP successfully. Most readers, then,
can either browse this section, or skip it altogether
front", so SI is initialized to point at the element
in favor of the programming examples in section
just beyond the first element to be processed. The
S.IO.
ASM-86 TYPE operator is used to determine the
number of bytes in each array element. This per-
The four topics in this section cover the
first
mits changing X ARRAY to a long real array by 8087's generation and handling of nonnormalized
simply changing its definition (DD to DQ) and real values, zeros, infinities and NANs. In the
re-assembling. great majority of applications, these special
values will either not appear at all, or in the case
of zeros, will function according to the normal
rules of arithmetic. Next the bit encodings of each
data type are summarized in table form, including
special values. This information may be of use to
Figure S-25 shows the effect of the instructions in programmers who are sorting these data types or
the program loop on the NDP register stack. The are decoding unformatted memory dumps or data
S-67
8087 NUMERIC DATA PROCESSOR
stop and request an interrupt if the destination is accomplished by denormalizing the result until it
a memory operand. If the destination is a register, is just within the exponent range of the destina-
the processor adds the constant 24,576 (decimal) tion. Denormalizing means incrementing the true
to the true result's exponent, returns the result, result's exponent and inserting a corresponding
and then requests an interrupt. The constant leading zero in the significand, shifting the rest of
forces the exponent into the range of the tem- the significand one place to the right. Table S-23
porary real format, and an exception handler can illustrates how a result might be denormalized to
subtract out the constant to ascertain the true fit a short real destination.
exponent. Thus, execution always stops when
there is an unmasked underflow. Denormalization produces a denormal or a zero.
Denormals are readily identified by their
The intent of the masked response
to underflow is exponents, which are always the minimum for
to allow computation to continue without pro- their formats; in biased form, this is always the
gram intervention, while introducing an error that bit string: 00... 00. This same exponent value is
carries about the same risk of contaminating the also assigned to the zeros, but a denormal has a
final result as roundoff error. Roundoff (preci- nonzero significand. A denormal in a register is
sion) errors occur frequently in real number tagged special.
calculations; sometimes they spoil the result of
computation, but often they do not. Recognizing The denormalization process may cause the loss
that roundoff errors are often non-fatal, com- of low-order significand bits as they are shifted
putation usually proceeds and the programmer off the right. In a severe case, all the significand
inspects the final result to see if these errors have bits of the true result are shifted out and replaced
had a significant effect. The 8087's masked by the leading zeros. In this case, the result of
underflow response allows programmers to treat denormalization is and if the value is
a true zero,
underflows in a similar manner; the computation in a register, it is tagged as such. However, this is
continues and the programmer can examine the a comparatively rare occurrence, and in any case
final result to determine if an underflow has had is no worse than "abrupt" underflow.
Notes:
'^'expressed as unbiased, decimal number
*2>Before storing, significand is rounded to 24 bits, integer bit is dropped, and exponent is
S-68
8087 NUMERIC DATA PROCESSOR
Operation Result
unnormal.
ni\/i^inn hinnnrmal riix/irlpnH nnl\/^ Rp<?iilt 1"? iinnnrmal
rovj<ri 1
^innal in\/;^li(H nnpr^tinn
COT pCTD /chnrt/lnnn rpal II value loCLUUVCUCoUIIClllVJII oUIIUv^lll^W
destination) boundary, then signal invalid operation; else
signal underflow.
Zeros and Pseudo-Zeros operands and also shows how a true zero may be
created from nonzero operands. (Nonzero
As discussed in section S.3, the real and packed operands are denoted "X" or "Y" in the table.)
decimal data types support signed zeros, while the
binary integers represent a single zero, signed Only the temporary real format may contain a
positive. The signed zeros behave, however, as special class of values called pseudo-zeros. A
though they are a single unsigned quantity. If pseudo-zero is an unnormal whose significand is
necessary, the FXAM
instruction may be used to all zeros, but whose (biased) exponent is nonzero
-0 +0
+x w +0 FSQRT
-X w +0 -0 -0
+0 +0
Addition
+0 plus +0 +0 Compare
-0 plus -0 -0 0:+X A< B
+0 plus -0, -0 plus +0 0: 0 A= B
-X plus +X,+X plus -X 0 -X: A> B
0 plus X, X plus 0 tX <6)
FTST
Subtraction 0 Zero
+0 minus -0 +0 FCHS
-0 minus +0 -0 +0 -0
+0 minus +0, -0 minus -0 *0<5) -0 +0
+X minus +X, -X minus -X *0(5) FABS
0 minus X, X minus 0 tX (6)
0 +0
F2XM1
Multiplication +0 +0
+0 +0, -0 -0 +0 -0 -0
+0 -0, -0 +0 -0 FRNDINT
+o+x,+x+o +0 +0 +0
+0-X, -X+0 -0 -0 -0
-0 +X, +X -0 -0 FXTRACT
-0 -X, -X -0 +0 +0 Both +0
+X+Y,-X-Y +0, underflow (7) -0 Both -0
+X-Y,-X+Y -0, underflow
Notes:
C) Arithmetic and compare operations with real memory operands interpret the memory operand signs in
the same way.
<2>
Arithmetic and compare operations with binary integers interpret the integer sign in the same manner.
'3>
Severe underflows in storing to short or long real may generate zeros.
Small values (|x| < 1) stored into integers may round to zero.
Very small values of X and Y may yield zeros, after rounding of true result. NDP signals underflow to
warn that zero has been yielded by nonzero operands.
Very small X and very large Y may yield zero, after rounding of true result. NDP signals underflow to
warn that zero has been yielded from nonzero operands.
When Y divides into X exactly.
zero or a pseudo-zero (the divisor is a created by the NDP as its masked response to an
pseudo-zero). overflow or a zerodivide exception. Note that
when rounding is up or down,- the masked
In addition and subtraction of a pseudo-zero and response may create the largest valid value
a true zero or another pseudo-zero, the pseudo- representable in the destination rather than infin-
zero(s) behave like unnormals, except for the ity. See table S-33 for details. As operands,
determination of the result's sign. The sign is infinities behave somewhat differently depending
determined as shown in table S-26 for two true on how the infinity control field in the control
zero operands. word is set (see table S-27). When the projective
model of infinity is selected, the infinities behave
as a single unsigned representation; because of
Infinities
this, infinity cannot be compared with any value
The real formats support signed representations except infinity. In affine mode, the signs of the
of infinities. These values are encoded with a infinities are observed, and comparisons are
biased exponent of all ones and a significand of possible.
Addition
-1-00 plus +00 Invalid operation +00
-oo plus -00 Invalid operation oo
+00 plus -oo Invalid operation Invalid operation
-00 plus +00 Invalid operation Invalid operation
oo plus X oo *00
Subtraction
+0O minus -oo Invalid operation + 00
-00 minus +oo Invalid operation 00
+00 minus +oo Invalid operation Invalid operation
-oo minus -oo Invalid operation Invalid operation
00 minus X *00 *00
Multiplication
+ 00 +00 oo 0OO
00 +Y Qoo 0OO
0 oo, 00 * +0 Invalid operation Invalid operation
S-72
8087 NUMERIC DATA PROCESSOR
Division
-t-00 +00 IllVallU upclciUUII IllVallU UpcldllUII
+00 - +y Q 00
+y _:_ _i_oo
FSQRT
00 Invalid operation Invalid operation
+00 Invpiliri nnpraton +00
FPREM
-i-oo rom --00 IllVdllU W|JCICIUL/II InvaliH nnPratiAn
U^JCIdUL/ll
IIIVClllU
4-oorpm -f-X Invalid nnpratinn Invalid nnpratinn
-f-Y
_L 1
rom
1C H-oo 1 1 1
*Y *Y
V
FRNDINT
+ 00 * 00 *oo
rbUALt
+00 <5ra|pfi hv +00 Invalid onpration Invalid oneration
00 scaieu Dy A * CO *oo
Xu oodicu uy *n
-t-Y
1
qp^iIpH hv
X oociicvj uy +00
x*-^ Invalirl nnpratinn Invalirl nnpratinn
FXTRACT
+00 Invalid operation Invalid operation
Compare
+00 +00 ' A LJ
r\ R 00 <^ -f"Oo
00 : Y A ? B (and) invalid
operation -00 <Y< +00
0 : 0 A ? B (and) invalid
operation -00 < 0 < +
FTST A ? B (and) invalid
operation *oo
could point to its associated diagnostic area in indefinite cannot be loaded from a packed
memory. The program would then continue. decimal or binary integer.
S-74
. )
Negativ
When rounding is directed (the RC field of the
controlword is set to "up" or "down"), the 8087
( Largestl Indefinite * 1 00. ..00 handles a masked overflow differently than it
does for the "nearest" or "chop" rounding
Word: -*-15bits-^ modes. Table S-33 shows the NDP's masked
Short: -*-31 bits-^
response when the true result is too large to be
Long: -^63 bits-^ represented in it's destination real format. For a
normalized result, the essence of this response is
to deliver or the largest valid number represen-
table in the destination format, as dictated by the
If this encoding is used as a source operand rounding mode and the sign of the true result.
Thus, when RC=down, a positive overflow is
(as in an integer load or integer arithmetic
instruction), the as the
8087 interprets it
rounded down to the largest positive number.
largest negative number representable in the
Conversely, when RC=up, a negative overflow is
2) as the response to a masked invalid In all masked overflow responses for directed
operation exception, in which case it rounding, the overflow flag is not set, but the
represents the special value integer precision exception is raised to signal that the
indefinite exact true result has not been returned.
S-75
8087 NUMERIC DATA PROCESSOR
Magnitude
Class Sign
digit digit digit digit . . . digit
(Largest) 0 0000000
1001100110011001. ..1001
Positives
(Smallest) 0 UUUUUuU UUUUUUUUUUUUUUUU...UUU1
Zero 0 0000000 0000000000000000. ..0000
Zero 0000000 0000000000000000. ..0000
Negatives
(Smallest) 0000000
0000000000000000. ..0001
Biased Significand*
Class Sign
Exponent Aff...ff
NANS
(A
0 11. ..10 11. ..11
0)
>
Normals
Posit
Denormals
S-76
^
8087 NUMERIC DATA PROCESSOR
Biased Significand*
Class Sign
Exponent Aff...ff
Denormals
<n
CO
0) 00
\J\J . . 00
.\J\J 11
1 1 ... 11
1 1
QC
f\f\ r\f\
UU...U1 UU.. .UU
Normals
latives
z
< Indefinite 11. ..11 10. ..00
Biased Significand
Class Sign
Exponent lAff..ff
NANs
Positives
S-77
8087 NUMERIC DATA PROCESSOR
Biased Significand
Class Sign
Exponent lAff...ff
100. ..00
Unnormals
iitives 011. ..11
in
O
a.
Denormals
0 00... 00 Oil. ..11
Denormals
1 00... 00 000. ..01
Normals
100. ..00
s-78
8087 NUMERIC DATA PROCESSOR
Biased Significand
Class Sign
txponeni 1 ft ti
Negatives
NANs Indefinite 11. ..11 110. ..00
One or both operands is a NAN. Return NAN with larger absolute value
(ignore signs).
(Compare and test operations only): Set condition codes "not comparable".
one or both operands is a NAN.
(Addition operations only): closure is Return real indefinite
affineand operands are opposite-signed
closure is projective and both
infinities; or
operands are (signs immaterial).
(Subtraction operations only): closure is Return real indefinite.
affine and operands are like-signed
infinities; or closure is projective and both
operands are < (signs immaterial).
(FPREM modulus
instruction only): Return rea\ indefinite set condition code
,
Invalid Operation
(FXCH instruction only): one or both Change empty register(s) to real indefinite
registers is tagged empty. and then perform exchange.
Denormalized Operand
(Compare and test operations only): one Convert (in a work area) any denormal to
operands is denormal or unnormal
or both the equivalent unnormal; normalize as
(other than pseudo-zero). much as possible, and proceed with
operation.
Zerodivide
(Division operations only): divisor = 0. Return > signed with "exclusive or" of
operand signs.
Overflow
(FST, FSTP instructions only): rounding is Return properly signed and signal
nearest or chop, and exponent of true precision exception.
result > +127 (short real destination)
or > +1023 (long real destination).
Underflow
(FST, FSTP instructions only): destination Denormalize until exponent rises to -126
is short real and exponent of true result (true), round significand to 24 bits, store
< \Zo (true). true 0 if denormalized rounded significand
= 0; else, store denormal (biased expo-
nent =0).
(FST, FSTP instructions only): destination Denormalize until exponent rises to -1022
is long real and exponent of true result (true), round significand to 53 bits, store
<-1022 (true). true 0 if rounded denormalized significand
= 0; else, store denormal (biased expo-
nent = 0).
Precision
Normal + Up +00
Normal Down 00
Unnormal -1-
Up +00
Unnormal -1-
Up Largest exponent, result's significand*^*
Unnormal Down oo
The significand retains its identity as an unnormal; the true result is rounded as usual
(effectively chopped toward 0 in this case). The exponent is encoded 11...10B.
8.10 Programming Examples pond to the CPU's zero and carry flags (ZF and
CF), the byte is written into the flags (see
if
As discussed in section S.7, the comparison jump instruction tests the CPU flags.
instructions post their results to the condition
code of the 8087 status word. Although there
bits
The FXAM instruction updates all four condition
code bits. Figure S-27 shows how a jump table
are many ways to implement conditional branch-
can be used to determine the characteristics of the
ing following a comparison, the basic approach is
value examined. The jump table (FXAM TBL)
as follows:
is initialized to contain the 16-bit displacement of
execute the comparison, 16 labels, one for each possible condition code
store the status word, setting.Note that four of the table entries contain
the same value, since there are four condition
inspect the condition code bits,
code settings that correspond to "empty."
jump on the result.
The program fragment performs the FXAM and
Figure S-26 is a code fragment that illustrates how stores the status word. It then manipulates the
two memory-resident long real numbers might be condition code bits to finally produce a number in
compared (similar code could be used with the register BX that equals the condition code times
FTST instruction). The numbers are called A and 2. This involves zeroing the unused bits in the byte
B, and the comparison is A to B. The comparison that contains the code, shifting C3 to the right so
itself simply requires loading A onto the top of that it is adjacent to C2, and then shifting the
the 8087 register stack and then comparing it to B code to multiply it by 2. The resulting value is
and popping the stack in the same instruction. used as an index which selects one of the
The status word is written to memory and the displacements from FXAM TBL (the
code waits for completion of the store before multiplication of the condition code is required
attempting to use the result. because of the 2-byte length of each value in
FXAM TBL). The unconditional JMP instruc-
There are four possible orderings of A and B, and tion effectively vectors through the jump table to
bits C3 and CO of the condition code indicate the labelled routine that contains code (not shown
which ordering holds. These bits are positioned in in the example) to process each possible result of
the upper byte of the status word so as to corres- the FXAM instruction.
A DQ ?
B DQ ?
STAT 87 DW ?
A_GREATER :
A_LESS_OR_UNORDERED
; CF (CO) = 1 , TEST ZF (C3)
JNE A_LESS
A_B_UNORDERED :
; CF (CO) = 1 , ZF (C3) = 1
A_LESS
; CF (CO) = 1 , ZF (C3) = 0
JMP FXAM_TBL[BX]
POS NAN:
NEG UNNORM
NEG NAN
POS NORM
POS INFINITY:
NEG NORM
NEG INFINITY
POS_ZERO
EMPTY
NEG_ZERO
POS_DENORM:
NEG_DENORM:
interruption by higher-priority sources. Typically tions to save and restore the 8087. The tradeoff
this will involve saving CPU registers and here isbetween the increased diagnostic infor-
transferring diagnostic information from the 8087 mation provided by FNSAVE and the faster
to memory. When the critical processing has been execution of FNSTENV. For applications that are
completed, the prologue may enable CPU inter- sensitive to interrupt latency, or do not need to
rupts to allow higher-priority interrupt handlers examine register contents, FNSTENV reduces the
to preempt the exception handler. duration of the "critical region," during which
the CPU will not recognize another interrupt
The exception handler body examines the request (unless it is a non-maskable interrupt).
diagnostic information and makes a response that
is necessarily application-dependent. This After the exception handler body, the epilogues
response may range from halting execution, to prepare the CPU and the NDP to resume execu-
displaying a message, to attempting to repair the tion from the point of interruption (i.e., the
problem and proceed w^ith normal execution. instruction following the one that generated the
unmasked exception). Notice that the exception
The epilogue essentially reverses the actions of the flags in the memory image that is loaded into the
CPU and the NDP so that
prologue, restoring the 8087 are cleared to zero prior to reloading (in
normal execution can be resumed. The epilogue fact, in these examples, the entire status word
image is cleared). The prologue also provides for the general approach shown in figure S-30 can be
indicating to the interrupt controller hardware employed. The basic technique is to save the full
(e.g., 8259A) that the interrupt has been pro- 8087 state and then to load a new control word in
cessed. The actual processing done here is the prologue. Note that considerable care should
application-dependent, but might typically be taken when designing an exception handler of
involve writing an "end of interrupt" command this type to prevent the handler from being
to the interrupt controller. reentered endlessly.
MOV BP SP ,
SUB SP 94 ,
POP BP
SAVE_ENVIRONMENT PROC
MOV BP,SP
SUB SP 1 ,
POP BP
REENTRANT PROC
MOV BP SP ,
SUB SP,94
;SAVE STATE, LOAD NEW CONTROL WORD, WAIT
;FOR COMPLETION, ENABLE CPU INTERRUPTS
FNSAVE CBP-94]
FLDCW LOCAL_CONTROL
FWAIT
STI
;CODE TO SEND ''END OF INTERRUPT'' COMMAND TO
;8259A GOES HERE
POP BP
;RETURN TO POINT OF INTERRUPTION
IRET
REENTRANT ENDP
S ^
OP
OP
*^*Memory transfers, including applicable processor control instructions; 0, 1, or 2 displacement bytes may
follow.
OP, OP-A, OP-B: Instruction opcode, possibly split into two fields.
A-1
MACHINE INSTRUCTION ENCODING AND DECODING
Table A-2 lists all 8087 machine instructions in the data bus. Users writing exception handlers
binary sequence. This table may be used to may also find this information useful to identify
"disassemble" instructions in unformatted the offending instruction.
memory dumps or instructions monitored from
1st Byte
ASM-86 Instruction
2nd Byte Bytes 3,4
Hex Binary Format
A-3
Mnemonics Intel 1980
MACHINE INSTRUCTION ENCODING AND DECODING
-4-1-1
DD 11 01 1 101 111- reserved
DE 1101 1110 MULJOO AD
OH/ KA
M /
(disp- o),{disp-hi) rIADD word-integer
i D KA CI K
DE MUUOO 1 H/ M rlMUL
1 II
1101 1110 (disp- 0),(dlSp-nl) Jl 1
word-integer
DE -1n-i
1101 -i
1110 MUU01 r\D
On/ KA M 1
(disp- o),(disp-ni) rlUUM word-integer
DE 1101 1110 MUUOl 41 On/ M
/ ft J
(disp- o),(disp-ni) rICUMP word-integer
DE 1101 1110 MUU10 AD
On/ KA M 1
(disp- o),(disp-hi) CIO ID 1
word-integer
1 D K CIO DD
DE 1101 1110 MUU10 1 n/ M / Jl
(disp- o),(disp-hi) rlbUtJR 1 1
word-integer
DE 1 1 01 1110 MUUl 1 AD
On M /
/ (disp- o),{disp-hi) rlUIV word-integer
DE 1 1 01 1110 MUUl 1 11 Dn/ KX
M /
(disp- o),(disp-hi) CI ni V/ D
rlUIVn word-integer
DE nui niu 11 UU Unho rAUUr ox/:\ OT
DE nui 1 1 1 1 1 UU 1
1
D Cr*
nbla rMULrD
CKyi 1 1 1 0"r/;\ ox
A
DE 1 1
1 1
ni
Ul 1 1 1
1 n un 1 1 U 1 U (a)
DE 1 1
1 1
fM
Ul 1 nu nu 1
1
1
nr>A
UUU reserveu
DE M Ul 1 Mu n Ul 1
1
nni
UU1 rUUM DD
C^r>K/l
rr
DE n Ul 1 1 1 1 1 Ul 1
1
M
Ul - reserveu
DE 1 1 Ul M 1 U nu 1 11 reserved
DE 1 1 Ul 11 1 m Unto u
CO
roU D
DrD
1 1 CT/i\
b 1
CT
(l),b 1
DE 1 1 Ul m n
i 1 1
u niu 1 nto roU DOD
CCI tSnr 1
b (l),b
1
CT
1
DE nui 1 1 n
lllU -1
n 11
1111 Unbo rUI Vr CT/i\
b 1(l),b 1
MULJOO AOn/M / K
(disp- lo),(disp-ni)
1
word-integer
DF 1101
1
1111
-4 H '4
* The marked encodings are not generated by the language translators. If, however, the 8087
encounters one one these encodings in the instruction streann, it will execute it as follows:
intel
8087
80-BIT HMOS
NUMERIC DATA PROCESSOR
Full Internal 80-Bit Architecture for All iAPX 86/10, 88/10 Addressing
High Performance Modes Available
The Intel 8087 Is a high performance coprocessor that extends the IAPX 86/10, 88/10 architecture by providing all the
required numerics support for iAPX 86/20, 88/20 systems. The 8087 is implemented in N-channel, depletion load,
silicon gate technology (HMOS) and packaged in a 40-pin ceramic package. The IAPX 86/20 and 88/20 fully conform to
the proposed IEEE Floating Point Standard. Using its coprocessor architecture, the 8087 adds over fifty opcodes
iAPX 86/10 instruction set, making the IAPX 86/20, 88/20 a complete solution to high performance
directly into the
numeric processing.
7
3S
34 n BHE/S7
NEU INSTRUCTION MICROCODE ARITHMETIC
AD e 33 ^ B0/GT1
CONTROL MODULE AD7 [2 9 32 ^ INT
UNIT
DATA AOS 10 8087 31 ^h6/gto
BUFFER
AOS ^ 1 NOP 30 nc
OPERANDS
QUEUE AD4 [2 12 n NC
TEMPORARY
REGISTERS ADS Q 13 28 S2
AD2 Q M 27 S.
A01 ^ 15 26 so
ADO Q 16 25 oso
osi
Nc n 17 24
STATUS ADDRESSING t
BUS TRACKING
CLK Q 19 22 READ*
GND 1^ 20 21 RESET
EXCEPTION
ADDRESS POINTERS
L J NC = NO CONNECT
iAPX 86/20, 88/20 systems. It also executes numerous and an extension to the iAPX 86/10 or 88/10, providing
built-in transcendental functions (e.g., tangent and log register, datatype, control, and instruction capabilities
functions). The 8087 executes instructions as a co- at the hardware level that can be used as a single unified
processor to a maximum mode 8086 or 8088. It effec- system, the iAPX 86/20, 88/20, at the programming level.
NDP STATUS
FLAGS
cs
DS
ES
SS
8259A
CLK 8086/8088
CPU
I I
I J RQ/GT1
TEST 8086
QSO 081
FAMILY MULTIMASTER
TT-T
QSO QS1 BUSY
BUS
INTERFACE
COMPONENTS
SYSTEM
BUS
8284
CLOCK RQ/GTO
GENERATOR
8087
CLK CLK NDP
RQ/GT1 <
o
o
RO/GT <
M CLK 8089
iOP
ready to execute subsequent instructions. The NDP can generator and system bus interface components (bus
interrupt the CPU when It detects an error or exception. controller, latches, transceivers and bus arbiter).
The NDP's interrupt request line is typically routed to
the CPU through an 8259A Programmable Interrupt
Controller. (See Figure 2 for 8087 pinout information.) Bus Operation
The NDP uses one of the request/gra nt ines of the l The 8087 bus structure, operation and timing are iden-
iAPX86, 88 architecture (typically RQ/GT1) to obtain other processors in the iAPX86, 88 series
tical to all
control of the local bus for data transfers. The other (maximum mode configuration). The address is time
request/grant line is available for general system use multiplexed with the data on the first 16/8 lines of the
(for instance by an I/O processor in LOCAL mode). A bus address/data bus. A16 through A19 are time multiplexed
19 Two's
Long Integer 10 64 Bits '63
Complement
18
Packed BCD 10 IBDIgits s
Short Real 38 E7 Eo
10 24 Bits 23 Fq Implicit
308 ^0 Implicit
Long Real 10 53 Bits ^10 52
4932
Temporary Real 10 64 Bits Ei4 h 63
Integer: I
Real: (-1)^^2^-^'*=)(Fq.F,...)
with four status lines S3-S6. S3, S4 and S6 are always PROCESSOR ARCHITECTURE
one (high) for 8087 driven bus cycles while S5 Is always
zero (low). When monitoring CPU bus cycles
the 8087 is
As shown in Figure 1, the NDP is internally divided into
S2 S1 so Control Unit
0 X X Unused The CU keeps the 8087 operating in synchronization
1 0 0 Unused with its host CPU. 8087 instructions are intermixed with
1 0 1 Memory Data Read CPU instructions in a single instruction stream. The
1 1 0 Memory Data Write CPU fetches all instructions from memory; by monitor-
1 1 1 Passive (no bus cycle) ing the status signals (S?T-S2, 86) emitted by the CPU,
the NDPcontrol unit determines when an 8086 instruc-
Programming Interface tion being fetched. The CU taps the bus in parallel
is
Table 1 lists the seven data types the 8087 supports and with the CPU and obtains that portion of the data
presents the format for each type. Internally, the 8087 stream.
holds all numbers in the temporary real format. Load
The CU maintains an instruction queue that is Identical
and store instructions automatically convert operands
to the queue in the host CPU. The CU automatically
represented in memory as 16-, 32-, or 64-bit integers, 32-
determines if the CPU is an 8086 o r an 8088 immediately
or 64-bit floating point numbers or 18-diglt packed BCD after reset (by monitoring the BHE/S7 line) and matches
numbers into temporary real format and vice versa.
its queue length accordingly. By monitoring the CPU's
Computations in the 8087 use the processor's register queue status lines (QSO, QS1), the CU obtains and
stack. These eight 80-bit registers provide the equiva- decodes instructions from the queue In synchronization
lent capacity of 40 16-bit registers. The 8087 register set with the CPU.
can be accessed as a stack, with instructions operating
After decoding the instruction, the 8086 executes all
on the top one or two stack elements, or as a fixed
opcodes but ESCAPE (ESC), while the 8087 executes
register set, with instructions operating on explicitly
only the ESCAPE class instructions. (The first five bits
designated registers.
of ESCAPE instructions are identical.) The CPU does
all
Table 5 lists the 8087's instructions by class. Assembly provide addressing for ESC instructions, however.
language programs are written in ASM-86, the iAPX 86,
The CPU distinguishes between ESC instructions that
88 assembly language. Table 2 gives the execution
reference memory and those that do not. If the instruc-
times of some typical numeric instructions and their
tion refers to a memory operand, the CPU calculates the
equivalent time on a 5 MHz 8086.
operand's address using any one of its available addres-
sing modes, and then performs a "dummy read" of the
word at that location. (Any location within the 1M byte
Table 2. Execution Time for Selected 8087 Actual
and Emulated Instructions address space Is allowed.) This is a normal read cycle
except that the CPU ignores the data it receives. If the
Approximate Execution ESC instruction does not contain a memory reference
Time (>iS) (e.g., an 8087 stack operation), the CPU simply proceeds
Floating Point Instruction to the next instruction.
8087 8086
(5 MHz Clock) Emulation An 8087 Instruction either will not reference memory,
loading one or more operands from memory
will require
Add/Subtract Magnitude 14/18 1,600
into the 8087, or will require storing one or more
Multiply (single precision) 19 1,600
operands from the 8087 into memory. In the first case a
Multiply (extended precision) 27 2,100
non-memory reference escape is used to start 8087
Divide 39 3,200
operation. In the last two cases, the CU makes use of
Compare 9 1,300
the "dummy read" cycle initiated by the CPU. The CU
Load (double precision) 10 1,700
captures and saves the address which the CPU places
Store (double precision) 21 1,200
on the bus. If the instruction is a load, the CU addition-
Square Root 36 19,600
allycaptures the data word when it becomes available
Tangent 90 13,000
on the local data bus. If data required is longer than one
Exponentiation 100 17,100
word, the CU immediately obtains the bus from the CPU
using the request/grant protocol and reads the rest of
15
B c. TOP c, Co IR X PE UE OE ZE DE IE
INVALID OPERATION
DENORMALIZED OPERAND
ZERO DIVIDE
OVERFLOV/
UNDERFLOW
PRECISION
(RESERVED)
INTERRUPT REQUEST''!
CONDITION CODE"'
TOP OF STACK POINTER'"
NEU BUSY
"'IR IS set if any unmasked exception bit is set. cleared otherwise.
"See Table 3 for condition code interpretation.
"Top Values
000 = Register 0 is Top of Stack
001 = Register 1 is Top of Stack
The four numeric condition code bits (C0-C3) are similar TAG VALUES:
to the flags in a CPU: various instructions update these 00 = VALID
01 = ZERO
bits to reflect the outcome of NOP operations. The 10 = SPECIAL
11 = EMPTY
effect of these Instructions on the condition code bits is
summarized In Table 3.
Figure 7. 8087 Tag Word
Bit 7 is the interrupt request bit. This bit is set if any provided for user-written error handlers. Whenever the
unmasked exception bit is set and cleared otherwise. 8087 executes an NEU instruction, the CU saves the
Instruction address, the operand address (if present)
and the instruction opcode. 8087 instructions can store
Bits 5-0 are set to Indicate that the NEU has detected
this data into memory.
an exception while executing an Instruction.
15 0
INSTRUCTION POINTER
Tag Word (19-16)
0 INSTRUCTION OPCODE (10-0)
The tag word marks the content of each register as DATA POINTER (15-0)
Instruction C3 C2 Ci Co Interpretation
the unbiased round to nearest mode specified in the will generate an encoding for infinity if this exception
proposed IEEE standard. Control over closure of the is masked.
number space at infinity is also provided (either affine
4. UNDERFLOW: The result is non-zero but too small in
closure, <, or projective closure, , is treated as
magnitude to fit in the specified format. If this excep-
unsigned, may be specified).
tionis masked the 8087 will denormalize (shift right)
15 0
XXX 1 c R C P C M X PM UM OM ZM DM IM
1
I
INVALID OPERATION
I
DENORMALIZED OPERAND
I
ZERO DIVIDE
'
OVERFLOW
'
UNDERFLOW
I
PRECISION
(RESERVED)
I
'
INTERRUPT MASK (1 = INTERRUPTS ARE MASKED)
l '
PRECISION CONTROL'"
I '
ROUNDING CONTROL"'
I
INFINITY CONTROL (0 --
PROJECTIVE, 1 = AFFINE)
I I
(RESERVED)
AFN 01525A
B-7
intel 8087 [p[^l(LD[i!^lQ[f!il/^^V
S2, SI, SO I/O For 8087 driven bus cycles, these status 1. A pulse 1 CLK wide from another local
lines areencoded as follows: bus master indicates a local bus re-
quest to the 8087 (pulse 1).
S2 SI so
0 (LOW) X X Unused 2. During the NDP's next T4 or T| a pulse 1
1 (HIGH) 0 0 Unused CLK wide from the 8087 to the request-
Read Memory ing master (pulse 2) indicates that the
1 0 1
QS1, QSO 1 QS1 and QSO provide the 8087 with status Ready 1 READY is the acknowledgement from the
to allow tracking of the CPU instruction addressed memory device that it will
queue. complete the data transfer. The RDY
signal from memory is synchronized by
QS1 QSO
the 8284A Clock Generator to form
0 (LOW) 0 No Operation
READY. This signal is active HIGH.
0 1 First Byte of Op Code from
Queue
1 (HIGH) 0 Empty the Queue Reset RESET causes
1 the processor to
1 1 Subsequent Byte from Queue immediately terminate its present
INT 0 This line is used to indicate that an activity. The signal must be active HIGH
unmasked exception has occurred during for at least four clock cycles. RESET is
Storage Temperature -65C to + 150C stress rating only and functional operation of the device at these
or any other conditions above those indicated in the operational
Voltage on Any Pin with
sections of this specification is not implied. Exposure toabsolute
Respect to Ground -1.0to+7V maximum rating conditions for extended periods may affect
Power Dissipation 3.0 Watt device reliability.
8087 [p[^i[LD()^DM/?\[^V
Timing Responses
Notes:
1. Signal at 8284A or 8288 shown for reference only.
2. Applies only to T3 and wait states.
3. Applies only to T2 state (8 ns into T3).
WAVEFORMS
CLK
TOVCL TCLQX
OSQS
TSACH
X X X
TSNCL
s,.s,.s
1 TOVCL TCLDX TOVCL TCLDX
7
BHE/S,,AVS.-A/S3 -A,.)
^ BHE, A,
s,.s,.s.
BHE/SA/S.-A/S3
READ CYCLE
AD-AD<,
DT/R
WRITE CYCLE
AD-AD,
8288 OUTPUTS .
MWTC
NOTES:
1 ALL SIGNALS SWITCH BETWEEN V^^ AND V^^ UNLESS OTHERWISE SPECIFIED
2 RDYISSAMPLEDNEARTHEENDOFT, Tj AND TO DETERMINE IF T^^,
MACHINE STATES ARE TO BE INSERTED
3 THE LOCAL BUS FLOATS ONLY IF THE 8087 IS RETURNING CONTROL TO THE
8086/8088
4 ALE RISES AT LATER OF (TSVLH, TCLLH)
5 STATUS INACTIVE IN STATE JUST PRIOR TO T,
6 READY SHOULD REMAIN ACTIVE UNTIL S_, BECOME INACTIVE
7 SIGNALS AT 8284A OR 8288 ARE SHOWN FOR REFERENCE ONLY
8 THE IS SUANCE OF 8286 COMMAND AND CONTROL SIGNALS MRDC, MWTC.
AMWC AND DEN) LAGS THE ACTIVE HIGH 8288 CEN
9 ALL TIMING MEASUREMENTS ARE MADE AT 5V UNLESS OTHERWISE NOTED,
1
1^ >50 mSCC ^
f
vcc =20 CLK CYCLES
=8 CLK CYCLES
CLK
TCLDX-
TDVCL-H }*
\
\
RESET
JT' >4
\
CLK CYCLES
8087 TRACKS 8087 READY TO
EXECUTE INSTRUCTIONS
CPU ACTIVITY
CLK
TCLGL TGVCH
TCLGH
BQ/GTO 8087
RO
AD-AD
A,>^S,-A/S,
CPU
S^,.S
BHE/S7
>1 CLK
"CYCLE"
CLK
TCLGL
^^^^ ^ TCLGH
TGVCH-^
TCHGX
RQ/GT1
\ / ^ \ B087 GT
'
TCLAZ
A^S.-A/S,
BHEyST
XT ALTERNATE MASTER
(SEE NOTE)
)(fE:
NOTE: ALTERNATE MASTER MAY NOT DRIVE THE BUSES OUTSIDE OF THE REGION
SHOWN WITHOUT RISKING BUS CONTENTION
CLK
TCHBV I
76543210765432107654321076543210
Data Transfer
FLO = LOAD
Integer/Real Memory to ST(0) ESCAPE MF 1
I
MOD OOP R/M I"
(DISP-LO) (DISP-HI)
FST = STORE
ST(0) to Integer/Real Memory ESCAPE MF MOD 0 1 R/M (DISP-LO) (DISP-HI)
ST(0) to ST(i) 1
ESCAPE 1 0 1
1
110 11 ST(i)
1
1
ESCAPE 0 0 1
1
110 0 1 ST(i)
1
Comparison
FCOM = Compare
Arithmetic 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7654321076543210
FADD = Addition
FSUB = Subtraction
FMUL - Multiplication
FDIV = Division
FXTHACT
0( ST(0)
- Extract Components
ESCAPE 0 0 1 1 1 1
0 1 0 0
1
Transcendental
FPTAN = Partial Tangent ol ST(0)
|
ESCAPE 0 0 0 0 1 0
1
F2XM1 = 2 ST(0)_i
1
ESCAPE 0 0 0 0 0 0
1
Constants
FLDZ = LOAD 0 0 into ST(0) ESCAPE 0 0 1 1 0 1 1 1 0
7 6 5 4 3 2 1 0 5 4 3 2 1 0 7 5432107 6543210
Processor Control
FINIT : Initialize NOP ESCAPE 0 0 0 1 1
FOOTNOTES:
if mod = 00 then DISP = 0*, disp-low and disp-high are absent ST(0) = Current stack top
if mod = 01 then DISP= disp-low sign-extended to 16-bits, ST(i) = I"' register below stack top
disp-high is absent
if mod= 10 then DISP= disp-high; disp-low d= Destination
if mod =11 then r/m is treated as an ST(i) field 0 Destination is ST(0)
1 Destination is ST(i)
P= Pop
if r/m = 000 then EA = (BX) MSI) ^ DISP 0 No pop
if r/m = 001 then EA = (BX) MDI) * DISP 1 Pop ST(0)
if r/m = 010 then EA = (BP) ^(Sl) + DISP
r/m = 011 then EA = (BP) + (DI) + DISP R= Reverse
If
Holiday Otiice Center Lowry & Associates. Inc Chagrin-Brainard Bidg . No 210
FLORIDA w
3322 Memonal Pkwy S W. .
135 Norih Street 28001 Chagrin Blvd
Huntsville 35801 Intel Corp Suite 4 Cleveland 44122
Tel 1205)881-9298 1001 N w e2nd Street. Suite 406 Brighton 481 16 Tel (216)464-2736
FtLauderdale 33309 Tel 1313)227 7067 TWX 810-4279298
ARIZONA Tel(305) 771-0600
Lowry & Associates. Inc.
Intel Corp
TWX 510-956-9407 3902 Costa NE OREQON
10210 N 25tn Avenue, Suite 11 Intel Corp Grand Rapids 49505 Intel Corp
Pnoeni 85021 5151 Adanson Street. Suite 203 Tel (616) 363-9839 10700 SW Seavenon
Tel (G02I 997 9695 Orlando 32804 Hillsdale Highway
BFA
Tel (3051628-2393 MINNESOTA Suite 324
4426 Nonn SaOdle Bag Trail
TWX 81^853-9219
IntelCorp. Beavenon 97005
Scottsdale 85251 Pen-Tech Associates Inc 7401 Metro Blvd. Tel (503)641-8086
Tel (6021994-5400 201 S E 15lh Terrace. Suite K Suite 355 TWX 910467^741
Deertield Beach 33441 Edina 55435
CALIFORNIA Tel 1305) 421 4969 Tel (612)835*722 PENNSYLVANIA
Intel Corp Pen-Tech Associates. Inc
TWX 910-576-2e67 Intel Corp
7670 Opportunily Rd- 111 So Maiiiand Ave.. Suite 202 275 Commerce Or
PC Bo 1475 MISSOURI 200 Office Center
Suite 135
Mailland 32751 Corp Suite 300
San Diego 92111 Intel
Tel (714) 268-3563 Tel (305)645-3444 502 Earth City Plaza Fon Washington 19034
Suite 121
Tel (2151542-9444
Intel Corp TWX 51&<61 2077
GEORGIA Earth City 63045
2000 East 4tn street
Suite 100 Pen Tech Associates, inc. Tel (314)291 1990 QED Electronics
Santa Ana 92705 Cherokee Center, Suite 21 Technical Representatives, Inc 300 N York Road
627 Cherokee Street Haieoro 19040
Tel 1714)835-9642 320 Brookes Drive, Suite 104
Tel (215)674-9600
TWX 910-595-1114 Marietta 3O06O Hazeiwood 63042
Tel (404) 424-1931 Te) (314) 731-5200
Intel Corp TEXAS
15335 Morrison
TWX 910-762-0618
ILLINOIS Intel Corp
"
Suite 345
Sherman Oaks 91403 Intel Corp ' NEW JERSEY 2925 L B J Freeway
Suite 175
Tel (213)986-9510 2550 Goll Road Suite 815 Intel Corp
Rolling Meadows 60008 Dallas 75234
TWX 91^495 2045 Raritan Plaza
Tel (214)241-9521
* Tel (312)961 7200 2nd Floor
Intel Corp TWX 910-860-5617
3375 Scott Blvd
TWX 910^51 5881 Raritan Center
'
Technical Representatives Edison 08817 Intel Corp
Santa Clara 95051
Tel (408) 987-8086 1502 North Lee Street
Tel (201) 2253000 6420 Richmond Ave
TWX 910-339 9279 Bloomington 61701 TWX 710-480-6238 Suite 280
TWX 510-227-6236
Beiievue 98005
Overland Park 66210 Tel (206) 453-8066
Tel 1408)946-8885 TWX 910-443-3002
Tel (913)642-8080 IntelCorp
Mac I
Technical Representatives, tnc 80 Washington St
P O Bo 6763 Poughkeepsie i260i WISCONSIN
8245 Nieman Road Suite 100
Fountain Valley 92708 Tei (914) 473-2303 Intel Corp
Lenexa 66214
Tel 1714)839-3341
Tel(913)888-0212, 3. S 4
TWX 510-2480060 150 S Sunnyslope Rd.
Mac- TWX 910-749-6412 Brooklield 53005
Corp
Intel
1321 Centineia Avenue Tel (414) 784 9060
2255 Lyell Avenue
Technica( Representatives, inc
Suite 1 Lower Floor East Suite
360 N Rock Road CANADA
Santa Monica 90404 Rochester 14606
Suite 4
Tel (213)829-4797 Tel (716)254-6120 Intel Semiconductor Corp *
Wichita 67206
Mac-I Tel (316)681-0242
TWX 510-253-7391 Suite 233. Bell Mews
20121 Ventura Blvd Suite 240e Measurement Technology, Inc 39 Highway 7, Bells Corners
Woodland Hills 91364 MARYLAND 159 Northern Boulevard Ottawa. Ontario K2H 8R2
1213)347 5900 Tel (613)829-9714
Tel Great Neck 1 1021
lnte( Corp TELEX 053-4115
Tel (516)482 3500
7257 Paniway Drive
COLORADO Hanover 21076 T Squared Intel Semiconductor Corp
Intel Corp
Tel (301) 796-7500 4054 Newcourt Avenue 50 Galaxy Blvd
650 S Cnerry Street TWX 710^2 1944 Unit 12
Syracuse 13206
Suite 720 Tel (315)463-8592
Reidate, Ontario
Mesa Inc M9W 4Y5
Denver 80222
11900 Parklawn Drive
TWX 710-541-0554
Tel (3031321-8066 Tel (4161675-2105
Rockville 20852 T-Squared TELEX 06983574
TWX 910-931 2289
Tel Washington (301) 681^30 2 E Mam *
Westek Data Products, inc Baltimore (301) 792O021 Multiiek, Inc
Victor 14564
25921 Fern Guicn 15 Grenteii Crescent
Tel (716)924 9101
PO Bo 1355 MASSACHUSETTS TWX 51254 542 Ottawa, Ontano K2G 0G3
Evergreen 80439 Tel (613)226-2365
Intel Corp TELEX 053-4585
Tel (3031 674 5255
27 Industrial Ave
NORTH CAROLINA
Westek Data Products, inc Chelmsford 01824 Multiiek, Inc
Intel Corp
1322 Arapanoe Tel (617)667 8126 154 Huflman Mill Rd Toronto
Boulder 80302 TWX 710-343-6333 Tel (416) 245-4622
Burlington 27215
Tel 1303) 449 2620 Tel (9191584 3631 Multiiek, Inc
EMC Corp
Westek Data Products. Inc 381 Elliot Street Pen-Tech Associates, inc Montreal
Tel (514)481 1350
1228 w Hinsdale Or Newton 02164 1202 Eastchester Dr
Littleton 80120 Tel(617)244-4740 Highpoint 27260
Tel 1303) 797^)482 TWX 922531 Tel (9191883-9125