How to Read Floating Point Numbers
Date
1990-06-05
Authors
Clinger, William D.
Publisher
University of Oregon
Abstract
Consider the problem of converting decimal scientific notation
for a number into the best binary floating point approximation
to that number, for some fixed precision. This
problem cannot be solved using arithmetic of any fixed precision.
Hence the IEEE Standard for Binary Floating-Point
Arithmetic does not require the result of such a conversion
to be the best approximation.
This paper presents an efficient algorithm that always finds
the best approximation. The algorithm uses a few extra
bits of precision to compute an IEEE-conforming approximation
while testing an intermediate result to determine
whether the approximation could be other than the best.
If the approximation might not be the best, then the best
approximation is determined by a few simple operations
on multiple-precision integers, where the precision is determined
by the input. When using 64 bits of precision
to compute IEEE double precision results, the algorithm
avoids higher-precision arithmetic over 99% of the time.
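The two-stage structure described above (a cheap fixed-precision attempt, with a fallback to multiple-precision integers) can be illustrated with a minimal Python sketch. This is not the paper's algorithm. As a simplifying assumption, the fast path here is taken only when both the digit string and the power of ten are exactly representable as doubles, which is a much cruder test than the extra-bits test the abstract describes. The function names `read_float` and `exact_best_double` are hypothetical, and the sketch assumes IEEE 754 double arithmetic with round-to-nearest.

```python
from fractions import Fraction

MAX_EXACT_INT = 2**53    # non-negative integers below this are exact doubles
MAX_EXACT_POW10 = 22     # 10**22 is the largest power of ten that is an exact double


def exact_best_double(digits: int, exp10: int) -> float:
    """Slow path: compute digits * 10**exp10 exactly, then round once."""
    value = Fraction(digits) * Fraction(10) ** exp10
    # CPython's int / int true division rounds the exact quotient correctly.
    return value.numerator / value.denominator


def read_float(digits: int, exp10: int) -> float:
    """Return the double nearest to digits * 10**exp10 (hypothetical helper)."""
    if 0 <= digits < MAX_EXACT_INT and abs(exp10) <= MAX_EXACT_POW10:
        # Fast path (assumption of this sketch): both operands are exact,
        # so a single multiply or divide rounds exactly once.
        power = float(10 ** abs(exp10))
        return digits * power if exp10 >= 0 else digits / power
    # Otherwise fall back to arbitrary-precision integer arithmetic.
    return exact_best_double(digits, exp10)
```

For example, `read_float(15, -1)` takes the fast path, while `read_float(9007199254740993, 0)` falls back to exact arithmetic because the digit string exceeds 53 bits.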
The input problem considered by this paper is the inverse of
an output problem considered by Steele and White: Given
a binary floating point number, print a correctly rounded
decimal representation of it using the smallest number of
digits that will allow the number to be read without loss of
accuracy. The Steele and White algorithm assumes that the
input problem is solved; an imperfect solution to the input
problem, as allowed by the IEEE standard and ubiquitous
in current practice, defeats the purpose of their algorithm.
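The dependence of shortest-digit output on correct input can be seen in a short Python sketch. Python is used here only as a stand-in: its `repr` for floats produces a shortest string that reads back to the same value, and its `float` constructor rounds correctly, but neither is claimed to be the Steele and White or Clinger implementation. The helper name `round_trips` is hypothetical.

```python
def round_trips(x: float) -> bool:
    """Check that printing x with the shortest digits and reading it back
    recovers exactly the same double (hypothetical helper)."""
    text = repr(x)            # shortest decimal string that identifies x
    return float(text) == x   # holds only if the reader rounds correctly


# A handful of sample values; each should survive the print/read cycle.
samples = [0.1, 1.0 / 3.0, 2.5e-300, 1e23, 123456.789]
all_ok = all(round_trips(x) for x in samples)
```

If the reader were allowed to return a neighbouring double instead of the nearest one, the shortest string would no longer be guaranteed to identify the original value, and more digits would have to be printed to compensate.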
Description
13 pages
Keywords
IEEE Standard, Binary Floating-Point Arithmetic, Steele and White