Double-Precision Floating-Point Format

A Double-Precision Floating-Point Format is a computer number format that uses 64 bits to represent a wide dynamic range of numeric values by utilizing a floating radix point.

Context:
- It can represent real numbers (floating-point values) using 64 bits within computer memory, providing higher precision compared to single-precision floating-point format.
- It can (typically) encode a number in three parts: the sign bit, the exponent, and the fraction (or mantissa), enabling it to handle extremely large or small numbers with greater precision.
- It can be particularly useful in scientific calculations, financial modeling, and other domains where the precision of calculations is critical.
- It can (often) occupy twice the memory space of single-precision formats, making it a less favorable choice for applications with strict memory limitations.
- It can provide an exact representation for any integer value up to 2⁵³, beyond which the representation becomes approximate.
- ...
Example(s):
- in IEEE 754 standard.
- In languages like C, Java, and Python, the `double` keyword typically denotes a double-precision floating-point variable.
- ...
Counter-Example(s):
- Floating-Point Format (bfloat16).
- Single-Precision Floating-Point Format, which uses 32 bits.
- Fixed-Point Arithmetic format, where the number of digits after the decimal point is fixed.
See: 64-Bit MBF, Floating-Point Arithmetic, Computer Number Format, [[Bit
See: 64-Bit MBF, Floating-Point Arithmetic, Computer Number Format, Bit, Dynamic Range, Radix Point, Single-Precision Floating-Point Format, IEEE 754-2008, Standardization, IEEE 754-1985, Decimal Floating Point, Programming Language.

References

2024

(Wikipedia, 2024) ⇒ https://en.wikipedia.org/wiki/Double-precision_floating-point_format Retrieved:2024-3-27.
- Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
  Double precision may be chosen when the range or precision of single precision would be insufficient.
  In the IEEE 754-2008 standard, the 64-bit base-2 format is officially referred to as binary64; it was called double in IEEE 754-1985. IEEE 754 specifies additional floating-point formats, including 32-bit base-2 single precision and, more recently, base-10 representations (decimal floating point).
  One of the first programming languages to provide floating-point data types was Fortran.Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the computer manufacturer and computer model, and upon decisions made by programming-language implementers. E.g., GW-BASIC's double-precision data type was the 64-bit MBF floating-point format.

Double-Precision Floating-Point Format

References

2024

Navigation menu

Search