Title :
Representation Error for Real Numbers in Binary Computer Arithmetic
Author_Institution :
Dept. of Computer Sci., Stanford University, Stanford, Calif. 94305
Abstract :
Real numbers can be represented in a binary computer in the form $i \cdot B^{e}$, where $i$ is the integer part, $B$ the base, and $e$ the exponent. The accuracy of the representation depends on the number of bits allocated to the integer part and to the exponent part, as well as on the choice of base. If $L(i)$ and $L(e)$ are the numbers of bits allocated to the magnitudes of the integer and exponent parts, and we define $I = 2^{L(i)}$ and $E = 2^{L(e)}$, then the exponent range is given by $B^{\pm E}$, the maximum relative representation error by $B/(2I)$, and the average relative representation error by $(B-1)/(4I \ln B)$. These formulas provide a quantitative comparison of the effectiveness of alternative formats for real number representation.
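The formulas in the abstract can be evaluated directly to compare candidate formats. The following is a minimal sketch, not code from the paper: the function name representation_metrics, its parameter names, and the three sample formats are hypothetical and chosen only for illustration. The exponent range $B^{\pm E}$ is reported as a power of two so that different bases can be compared on one scale.

```python
import math


def representation_metrics(base, int_bits, exp_bits):
    """Evaluate the abstract's formulas for one format (hypothetical helper).

    base     -- B, the base of the representation
    int_bits -- L(i), bits for the magnitude of the integer part
    exp_bits -- L(e), bits for the magnitude of the exponent part
    """
    I = 2 ** int_bits  # I = 2^L(i)
    E = 2 ** exp_bits  # E = 2^L(e)
    return {
        "I": I,
        "E": E,
        # Exponent range is B^(+/-E); expressed here as log2(B^E) = E * log2(B).
        "log2_range": E * math.log2(base),
        # Maximum relative representation error: B / (2I).
        "max_rel_error": base / (2 * I),
        # Average relative representation error: (B - 1) / (4 I ln B).
        "avg_rel_error": (base - 1) / (4 * I * math.log(base)),
    }


# Illustrative formats only (assumed, not taken from the paper): (B, L(i), L(e)).
sample_formats = [(2, 24, 8), (8, 24, 7), (16, 24, 6)]

for base, int_bits, exp_bits in sample_formats:
    m = representation_metrics(base, int_bits, exp_bits)
    print(
        f"B={base:2d}  L(i)={int_bits}  L(e)={exp_bits}  "
        f"range=2^(+/-{m['log2_range']:.0f})  "
        f"max={m['max_rel_error']:.3e}  avg={m['avg_rel_error']:.3e}"
    )
```

Under these assumed formats, base 2 with 8 exponent bits and base 16 with 6 exponent bits cover the same exponent range, so the sketch isolates how the choice of base alone changes the two error measures for a fixed integer-part width.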
Keywords :
Cellular networks; Computer errors; Counting circuits; Digital arithmetic; Distributed computing; Large scale integration; Modular construction; Shift registers; Average error; computer arithmetic; distribution of real numbers; floating-point error; floating-point format; representation error
Journal_Title :
Electronic Computers, IEEE Transactions on
DOI :
10.1109/PGEC.1967.264781