This gem calculates the absolute value of the number you input to the routine.

This gem can be optimized in two ways: size or speed. Both optimizations are
presented.
Optimize for size
This small version of the algorithm is based on the fact that NEG sets the sign
flag to reflect the result of the operation.
To explain the algorithm we must look at the two possible cases: the input
number is positive, or the input number is negative.
If the input number is positive, NEG negates it and sets the sign flag (SF=1),
so the jump is taken and execution returns to the NEG instruction. The value is
now negative; negating it makes it positive again (SF=0), the jump is not taken,
and the output value is positive.
If the input number is negative, NEG negates it and clears the sign flag (SF=0).
The jump is not taken because the sign flag is not set, and the output value is
therefore positive.
;
; return absolute value of input
;
; input:
; ax = value
;
; output:
; ax = abs(value)
;
; destroys:
; flags
;
@locallabel:
        neg     ax
        js      short @locallabel
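As a rough way to check the reasoning above on a modern machine, the loop can be modelled in C. This is only a sketch: the function name abs_neg_js and the max_iter cap are assumptions of the model, not part of the gem. The cap is there because NEG leaves 8000h (-32768) unchanged with SF=1, so for that single input the real loop would never fall through.

```c
#include <stdint.h>

/* Hypothetical C model of the NEG/JS loop on a 16-bit register.
   max_iter is an assumption of this sketch: it stops the model in
   the one case (8000h) where the real loop would spin forever. */
static int abs_neg_js(uint16_t *ax, int max_iter)
{
    int iterations = 0;
    do {
        *ax = (uint16_t)(0u - *ax);   /* neg ax (two's complement) */
        iterations++;
        /* js short @locallabel: repeat while SF (bit 15) is set */
    } while ((*ax & 0x8000u) && iterations < max_iter);
    return iterations;                /* number of NEGs executed */
}
```

In this model a positive input such as 5 needs two NEGs, while a negative input such as FFFBh (-5) needs only one.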
On older CPUs this version is quite fast; on newer ones it is not, mainly
because of their branch prediction units. The jump in the algorithm is hard to
predict, and a mispredicted jump may cost as much as 20 cycles on a Pentium Pro /
Pentium II (about 4 on a Pentium / Pentium MMX). This is, however, the smallest
available version of the gem.
This version of the gem is also easy to implement as a macro, and it handles
memory operands very easily:
;
; abs(value) macro
;
; (the macro name "abs" is assumed from the header comment)
MACRO   abs     invalue
        LOCAL   localloop

localloop:
        neg     invalue
        js      short localloop

ENDM
Just call the macro with any input you wish (memory offset or register).
Optimize for speed
This version of the gem is considerably faster than the previously presented
version on newer CPUs. On older CPUs the two are very likely almost equally
fast. The basic gem looks like this:
;
; return absolute value of input
;
; input:
; ax = value
;
; output:
; ax = abs(value)
;
; destroys:
; dx
; flags
;
        cwd
        xor     ax,dx
        sub     ax,dx
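The gem uses the fact that:
        xor     ax,dx
        sub     ax,dx
is an exact replacement for (if DX = FFFFh)
        neg     ax
The XOR with FFFFh inverts every bit of AX, and subtracting FFFFh (that is, -1)
adds one, which together form the two's complement negation.
This is where the CWD comes in. If the number is negative DX will be set to
FFFFh. If the number is positive or zero DX will be set to zero, and the
following instructions will have no effect on AX.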
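The identity can be checked exhaustively with a small C model of the three instructions. This is a sketch under stated assumptions: the function names abs_cwd and abs_cwd_matches_neg are hypothetical, and the registers are modelled as uint16_t values rather than real AX/DX.

```c
#include <stdint.h>

/* C model of: cwd / xor ax,dx / sub ax,dx (16-bit registers). */
static uint16_t abs_cwd(uint16_t ax)
{
    /* cwd: dx = FFFFh if bit 15 of ax is set, otherwise 0000h */
    uint16_t dx = (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
    ax ^= dx;                     /* xor ax,dx: NOT ax when dx = FFFFh */
    ax = (uint16_t)(ax - dx);     /* sub ax,dx: subtracting FFFFh adds 1 */
    return ax;
}

/* Returns 1 if the model agrees with "NEG when negative" for all
   65536 possible 16-bit inputs, 0 otherwise. */
static int abs_cwd_matches_neg(void)
{
    uint32_t v;
    for (v = 0; v <= 0xFFFFu; v++) {
        uint16_t ax = (uint16_t)v;
        uint16_t expected = (ax & 0x8000u) ? (uint16_t)(0u - ax) : ax;
        if (abs_cwd(ax) != expected)
            return 0;
    }
    return 1;
}
```

Unlike the loop version, this sequence has no branch, so 8000h simply comes back unchanged as 8000h.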
You can implement and use this gem on any CPU from the 8086 to the new
Pentium II. There are, however, problems with it:
It only operates on AX/EAX
It destroys a register (DX/EDX)
It can be done faster on Pentium and newer processors
The problem is the CWD instruction. To replace CWD, you can use this
combination:
        mov     dx,ax
        sar     dx,15
(If 32-bit registers are used, shift with a value of 31 instead.) This has
several advantages:
If paired correctly it is faster than CWD on Pentium and above processors.
It is equally fast on a 486.
You are free to use any of the available registers.
You can use memory operands to some extent.
There are some disadvantages, mainly that it can no longer be used on an 8086,
the problem being the SAR DX,15 instruction: the 8086 cannot shift by an
immediate count greater than 1. On an 8086 the best option is to use the version
optimized for size.
The Pentium+ "optimized" algorithm will look like this:
;
; return absolute value of input
;
; input:
; ax = value
;
; output:
; ax = abs(value)
;
; destroys:
; dx
; flags
;
        mov     dx,ax
        sar     dx,15           ; arithmetic shift: dx = FFFFh if ax < 0, else 0
        xor     ax,dx
        sub     ax,dx
Note: This version is not as easy to implement as a macro, mainly because it
destroys the contents of one register (this can be fixed with a PUSH+POP, but
that would slow down the gem). The version presented above is also not paired at
all: each instruction depends on the result of the previous one, so no pairing
is possible unless other instructions are inserted.
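For the 32-bit form mentioned earlier (shift by 31 instead of 15), the MOV/SAR sequence can be modelled in C as a sketch. The function names abs_sar32 and abs_shr32 are hypothetical. The second function is deliberately wrong: it shows what a logical shift (SHR) would produce in place of the arithmetic shift, which is why the shift must be SAR.

```c
#include <stdint.h>

/* C model of: mov edx,eax / sar edx,31 / xor eax,edx / sub eax,edx */
static uint32_t abs_sar32(uint32_t eax)
{
    /* sar edx,31 copies the sign bit into every bit of edx.
       Modelled with unsigned arithmetic, because right-shifting a
       negative signed value is implementation-defined in C. */
    uint32_t edx = (uint32_t)(0u - (eax >> 31)); /* FFFFFFFFh or 0 */
    eax ^= edx;                                  /* xor eax,edx */
    eax -= edx;                                  /* sub eax,edx */
    return eax;
}

/* Same sequence with a logical shift (shr edx,31): edx becomes 1
   instead of FFFFFFFFh for negative inputs, so the result is wrong. */
static uint32_t abs_shr32(uint32_t eax)
{
    uint32_t edx = eax >> 31;                    /* 1 or 0 */
    eax ^= edx;
    eax -= edx;
    return eax;
}
```

With -5 (FFFFFFFBh) as input, the SAR model returns 5, while the SHR model returns FFFFFFF9h (-7).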
Gem writers: Tylisha C. Andersen (code)
John Eckerdal (text and code)
last updated: 1998-03-15