
1. Beating Floating Point at its Own Game: Posit Arithmetic
A new data type called a posit is designed as a direct drop-in replacement for IEEE Standard 754
floating-point numbers (floats). Unlike earlier forms of universal number (unum) arithmetic,
posits do not require interval arithmetic or variable size operands; like floats, they round if an
answer is inexact. However, they provide compelling advantages over floats, including larger
dynamic range, higher accuracy, better closure, bitwise identical results across systems, simpler
hardware, and simpler exception handling. Posits never overflow to infinity or underflow to zero,
and “Not-a-Number” (NaN) indicates an action instead of a bit pattern. A posit processing unit
takes less circuitry than an IEEE float FPU. With lower power use and smaller silicon footprint,
the posit operations per second (POPS) supported by a chip can be significantly higher than the
FLOPS using similar hardware resources. GPU accelerators and Deep Learning processors, in
particular, can do more per watt and per dollar with posits, yet deliver superior answer quality.
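Since the abstract describes the posit format only at a high level, the following minimal decoder sketch in C shows how a posit(n, es) bit pattern breaks down into sign, regime, exponent, and fraction fields and maps to a real value. It follows the published posit definition, but the function name, the use of double as the output type, and the omission of encoding and rounding logic are choices made here purely for illustration.

    /* Minimal posit(n, es) decoder: a sketch of the encoding (sign, regime,
       exponent, fraction), not production code.  It only decodes; encoding
       and rounding are omitted. */
    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    double posit_decode(uint64_t bits, int n, int es)
    {
        uint64_t mask = (n == 64) ? ~0ULL : ((1ULL << n) - 1);
        bits &= mask;

        if (bits == 0)                 return 0.0;   /* the unique zero         */
        if (bits == (1ULL << (n - 1))) return NAN;   /* the single NaR pattern  */

        int sign = (int)((bits >> (n - 1)) & 1);
        if (sign) bits = (~bits + 1) & mask;         /* negate: two's complement */

        /* Regime: run of identical bits after the sign bit.  A run of m ones
           means k = m - 1; a run of m zeros means k = -m. */
        int pos = n - 2;
        int regime_bit = (int)((bits >> pos) & 1);
        int run = 0;
        while (pos >= 0 && (int)((bits >> pos) & 1) == regime_bit) {
            run++;
            pos--;
        }
        int k = regime_bit ? run - 1 : -run;
        if (pos >= 0) pos--;                         /* skip the terminating bit */

        /* Exponent: up to es bits; bits cut off by the end of the word act as
           trailing zeros, so the ones that are present are left-aligned. */
        int exp = 0;
        for (int i = 0; i < es; i++) {
            exp <<= 1;
            if (pos >= 0) { exp |= (int)((bits >> pos) & 1); pos--; }
        }

        /* Fraction: whatever bits remain, with a hidden leading 1. */
        double frac = 1.0, w = 0.5;
        while (pos >= 0) {
            if ((bits >> pos) & 1) frac += w;
            w *= 0.5;
            pos--;
        }

        /* value = (-1)^sign * useed^k * 2^exp * (1.f), where useed = 2^(2^es). */
        double scale = ldexp(1.0, k * (1 << es) + exp);
        return (sign ? -1.0 : 1.0) * scale * frac;
    }

    int main(void)
    {
        printf("%g\n", posit_decode(0x4000, 16, 2));  /* 1.0                    */
        printf("%g\n", posit_decode(0x7FFF, 16, 2));  /* maxpos = 2^56 ~ 7.2e16 */
        return 0;
    }

For posit(16, 2), the two patterns in main decode to 1.0 and to maxpos = 2^56 ≈ 7.2 × 10^16, the top of the dynamic range discussed in the next abstract.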
2. Posits as an alternative to floats for weather and climate models
Posit numbers, a recently proposed alternative to floating-point numbers, claim to have smaller
arithmetic rounding errors in many applications. By studying weather and climate models of low
and medium complexity (the Lorenz system and a shallow water model), we present the benefits of
posits compared to floats at 16-bit precision. As a standardised posit processor does not exist yet, we
emulate posit arithmetic on a conventional CPU. Using a shallow water model, forecasts based
on 16-bit posits with 1 or 2 exponent bits are clearly more accurate than those based on half-precision
floats. We therefore propose 16 bits with 2 exponent bits as a standard posit format, as its wide dynamic
range of 32 orders of magnitude provides great potential for many weather and climate models.
Although the focus is on geophysical fluid simulations, the results are also meaningful and
promising for reduced precision posit arithmetic in the wider field of computational fluid
dynamics.
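As a back-of-the-envelope check on the quoted dynamic range (a calculation added here, not part of the paper's abstract), the standard posit definitions give, for n = 16 bits and es = 2 exponent bits:

\[
\text{useed} = 2^{2^{es}} = 16, \qquad
\text{maxpos} = \text{useed}^{\,n-2} = 16^{14} = 2^{56} \approx 7.2 \times 10^{16}, \qquad
\text{minpos} = \text{maxpos}^{-1} \approx 1.4 \times 10^{-17},
\]

so representable magnitudes run from roughly 10^{-17} up to roughly 10^{17}, whereas half-precision floats cover only about 6 × 10^{-8} (smallest subnormal) to 6.6 × 10^{4}.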
3. Efficient posit multiply-accumulate unit generator for deep learning applications
The recently proposed posit number system is more accurate and can provide a wider dynamic
range than conventional IEEE 754-2008 floating-point numbers. Its nonuniform data
representation makes it well suited to deep learning applications. Posit adders and posit multipliers
have recently been well developed in the literature. However, the use of posits in fused arithmetic
units has not yet been investigated. To facilitate the use of the posit number format in deep
learning applications, this paper proposes an efficient architecture for a posit multiply-accumulate
(MAC) unit. Unlike IEEE 754-2008, which defines four standard binary formats, the posit format
is more flexible: the total bitwidth and exponent bitwidth can be any number. Therefore, in the
proposed design, the bitwidths of all datapaths are parameterized, and a posit MAC unit generator
written in C is presented. The generator can produce Verilog HDL code for a posit MAC unit for
any given total bitwidth and exponent bitwidth. The generated code is a combinational design;
however, a 5-stage pipelining strategy is also presented and analyzed in this paper. The worst-case
delay, area, and power consumption of the generated MAC unit, synthesized with an STM 28 nm
library for different bitwidth choices, are provided and analyzed.
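The abstract describes a C program that emits Verilog for a chosen pair of total and exponent bitwidths. The toy sketch below only illustrates that emission pattern: the module name, port list, and accumulator sizing are guesses for illustration, not the interface of the generator from the paper, and the actual MAC datapath is left as a comment.

    /* Toy sketch of the generator idea: a C program that prints a Verilog
       module skeleton for a chosen total bitwidth and exponent bitwidth.
       Module name, ports, and accumulator sizing are illustrative guesses. */
    #include <stdio.h>

    static void emit_posit_mac(FILE *out, int nbits, int es)
    {
        int acc_bits = nbits * nbits / 2;   /* quire-style width, illustrative */

        fprintf(out, "module posit_mac_n%d_es%d (\n", nbits, es);
        fprintf(out, "  input  wire           clk,\n");
        fprintf(out, "  input  wire           rst,\n");
        fprintf(out, "  input  wire [%3d:0]   a,\n", nbits - 1);
        fprintf(out, "  input  wire [%3d:0]   b,\n", nbits - 1);
        fprintf(out, "  output reg  [%3d:0]   acc\n", acc_bits - 1);
        fprintf(out, ");\n");
        fprintf(out, "  // decode, multiply, align and accumulate logic emitted here\n");
        fprintf(out, "endmodule\n");
    }

    int main(void)
    {
        emit_posit_mac(stdout, 16, 1);      /* e.g. a posit(16, 1) MAC skeleton */
        return 0;
    }

Running it with (16, 1) prints a skeleton module posit_mac_n16_es1; a real generator would additionally emit the decode, multiply, align, and accumulate logic for that configuration.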
4. Performance-efficiency trade-off of low-precision numerical formats in deep neural networks
Deep neural networks (DNNs) have been demonstrated as effective prognostic models across
various domains, e.g. natural language processing, computer vision, and genomics. However,
modern-day DNNs demand high compute and memory storage for executing any reasonably
complex task. To optimize the inference time and alleviate the power consumption of these
networks, DNN accelerators with low-precision representations of data and DNN parameters are
being actively studied. An interesting research question is how low-precision networks can be
ported to edge devices with performance similar to that of high-precision networks. In this work, we
employ the fixed-point, floating-point, and posit numerical formats at ≤8-bit precision within a
DNN accelerator, Deep Positron, with exact multiply-and-accumulate (EMAC) units for
inference. A unified analysis quantifies the trade-offs between overall network efficiency and
performance across five classification tasks. Our results indicate that posits are a natural fit for
DNN inference, outperforming the other formats at ≤8-bit precision, and can be realized with
resource requirements competitive with those of floating point.
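The exact multiply-and-accumulate (EMAC) idea is that products of low-precision operands are summed in a register wide enough that nothing is rounded until the very end. The fixed-point-only sketch below, with an arbitrarily chosen 8-bit operand format with 6 fractional bits, shows that deferred-rounding pattern in C; it is an illustration of the concept, not the Deep Positron hardware.

    /* Minimal model of the exact-MAC idea for 8-bit fixed-point operands
       (6 fractional bits): products are accumulated in a wide integer
       register and rounded/saturated only once, at the very end. */
    #include <stdint.h>
    #include <stdio.h>

    #define FRAC 6   /* fractional bits of the 8-bit operand format */

    int8_t emac_dot(const int8_t *a, const int8_t *b, int n)
    {
        int64_t acc = 0;                          /* wide accumulator        */
        for (int i = 0; i < n; i++)
            acc += (int32_t)a[i] * (int32_t)b[i]; /* exact integer products  */

        /* Single round-to-nearest and saturation step back to 8 bits. */
        int64_t r = (acc + (1LL << (FRAC - 1))) >> FRAC;
        if (r >  127) r =  127;
        if (r < -128) r = -128;
        return (int8_t)r;
    }

    int main(void)
    {
        int8_t a[4] = {32, 32, 32, 32};    /* four copies of 0.5  (0.5  * 64) */
        int8_t b[4] = {16, 16, 16, 16};    /* four copies of 0.25 (0.25 * 64) */
        printf("%d\n", emac_dot(a, b, 4)); /* 4 * 0.5 * 0.25 = 0.5 -> prints 32 */
        return 0;
    }

The dot product 4 × 0.5 × 0.25 = 0.5 comes out as 32 in the 6-fractional-bit format, and no intermediate product or partial sum is ever rounded.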
