Skip to content Skip to navigation Skip to collection information

OpenStax-CNX

You are here: Home » Content » High Performance Computing » History of IEEE Floating-Point Format

Navigation

Table of Contents

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Endorsed by Endorsed (What does "Endorsed by" mean?)

This content has been endorsed by the organizations listed. Click each link for a list of all content endorsed by the organization.
  • HPC Open Edu Cup

    This collection is included inLens: High Performance Computing Open Education Cup 2008-2009
    By: Ken Kennedy Institute for Information Technology

    Click the "HPC Open Edu Cup" link to see all content they endorse.

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • NSF Partnership display tagshide tags

    This collection is included inLens: NSF Partnership in Signal Processing
    By: Sidney Burrus

    Click the "NSF Partnership" link to see all content affiliated with them.

    Click the tag icon tag icon to display tags associated with this content.

  • Featured Content

    This collection is included inLens: Connexions Featured Content
    By: Connexions

    Comments:

    "The purpose of Chuck Severence's book, High Performance Computing has always been to teach new programmers and scientists about the basics of High Performance Computing. This book is for learners […]"

    Click the "Featured Content" link to see all content affiliated with them.

Also in these lenses

  • UniqU content

    This collection is included inLens: UniqU's lens
    By: UniqU, LLC

    Click the "UniqU content" link to see all content selected in this lens.

  • Lens for Engineering

    This module and collection are included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

  • eScience, eResearch and Computational Problem Solving

    This collection is included inLens: eScience, eResearch and Computational Problem Solving
    By: Jan E. Odegard

    Click the "eScience, eResearch and Computational Problem Solving" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.

Tags

(What is a tag?)

These tags come from the endorsement, affiliation, and other lenses that include this content.
 

History of IEEE Floating-Point Format

Module by: Charles Severance, Kevin Dowd. E-mail the authors

History of IEEE Floating-Point Format

Prior to the RISC microprocessor revolution, each vendor had their own floating- point formats based on their designers’ views of the relative importance of range versus accuracy and speed versus accuracy. It was not uncommon for one vendor to carefully analyze the limitations of another vendor’s floating-point format and use this information to convince users that theirs was the only “accurate” floating- point implementation. In reality none of the formats was perfect. The formats were simply imperfect in different ways.

During the 1980s the Institute for Electrical and Electronics Engineers (IEEE) produced a standard for the floating-point format. The title of the standard is “IEEE 754-1985 Standard for Binary Floating-Point Arithmetic.” This standard provided the precise definition of a floating-point format and described the operations on floating-point values.

Because IEEE 754 was developed after a variety of floating-point formats had been in use for quite some time, the IEEE 754 working group had the benefit of examining the existing floating-point designs and taking the strong points, and avoiding the mistakes in existing designs. The IEEE 754 specification had its beginnings in the design of the Intel i8087 floating-point coprocessor. The i8087 floating-point format improved on the DEC VAX floating-point format by adding a number of significant features.

The near universal adoption of IEEE 754 floating-point format has occurred over a 10-year time period. The high performance computing vendors of the mid 1980s (Cray IBM, DEC, and Control Data) had their own proprietary floating-point formats that they had to continue supporting because of their installed user base. They really had no choice but to continue to support their existing formats. In the mid to late 1980s the primary systems that supported the IEEE format were RISC workstations and some coprocessors for microprocessors. Because the designers of these systems had no need to protect a proprietary floating-point format, they readily adopted the IEEE format. As RISC processors moved from general-purpose integer computing to high performance floating-point computing, the CPU designers found ways to make IEEE floating-point operations operate very quickly. In 10 years, the IEEE 754 has gone from a standard for floating-point coprocessors to the dominant floating-point standard for all computers. Because of this standard, we, the users, are the beneficiaries of a portable floating-point environment.

IEEE Floating-Point Standard

The IEEE 754 standard specified a number of different details of floating-point operations, including:

  • Storage formats
  • Precise specifications of the results of operations
  • Special values
  • Specified runtime behavior on illegal operations

Specifying the floating-point format to this level of detail insures that when a computer system is compliant with the standard, users can expect repeatable execution from one hardware platform to another when operations are executed in the same order.

IEEE Storage Format

The two most common IEEE floating-point formats in use are 32- and 64-bit numbers. Table 1 gives the general parameters of these data types.

Table 1: Parameters of IEEE 32- and 64-Bit Formats
IEEE75 FORTRAN C Bits Exponent Bits Mantissa Bits
Single REAL*4 float 32 8 24
Double REAL*8 double 64 11 53
Double-Extended REAL*10 long double >=80 >=15 >=64

In FORTRAN, the 32-bit format is usually called REAL, and the 64-bit format is usually called DOUBLE. However, some FORTRAN compilers double the sizes for these data types. For that reason, it is safest to declare your FORTRAN variables as REAL*4 or REAL*8. The double-extended format is not as well supported in compilers and hardware as the single- and double-precision formats. The bit arrangement for the single and double formats are shown in Figure 1.

Based on the storage layouts in Table 1, we can derive the ranges and accuracy of these formats, as shown in Table 2.

Figure 1
IEEE754 floating-point formats
This figure shows two diagrams, the first labeled single precision, and the second labeled double precision. Both diagrams are thin segmented blocks, with the first section on the left labeled s, the section to the right of it labeled exp, and the rightmost section labeled mantissa. The widths of these sections are labeled below each diagram. For single precision, the entire width is measured as 32 bits. The s-section is labeled with a black block, exp is labeled with a black block, and the width of mantissa is labeled 23. For double precision, the entire width is measured as 64 bits. The s-section is labeled with a black block, exp is labeled 11, and mantissa is labeled 52.

Table 2: Range and Accuracy of IEEE 32- and 64-Bit Formats
IEEE754 Minimum Normalized Number Largest Finite Number Base-10 Accuracy
Single 1.2E-38 3.4 E+38 6-9 digits
Double 2.2E-308 1.8 E+308 15-17 digits
Extended Double 3.4E-4932 1.2 E+4932 18-21 digits

Converting from Base-10 to IEEE Internal Format

We now examine how a 32-bit floating-point number is stored. The high-order bit is the sign of the number. Numbers are stored in a sign-magnitude format (i.e., not 2’s - complement). The exponent is stored in the 8-bit field biased by adding 127 to the exponent. This results in an exponent ranging from -126 through +127.

The mantissa is converted into base-2 and normalized so that there is one nonzero digit to the left of the binary place, adjusting the exponent as necessary. The digits to the right of the binary point are then stored in the low-order 23 bits of the word. Because all numbers are normalized, there is no need to store the leading 1.

This gives a free extra bit of precision. Because this bit is dropped, it’s no longer proper to refer to the stored value as the mantissa. In IEEE parlance, this mantissa minus its leading digit is called the significand.

Figure 2 shows an example conversion from base-10 to IEEE 32-bit format.

Figure 2
Converting from base-10 to IEEE 32-bit format
This figure consists of a boxed set of numbers and a long string of binary below the box. Inside the box are three rows of text. The first row reads 172.625, Base 10. The second reads 10101100.101 x 2 ** 0, Base 2. The third reads 1.0101100101 x 2 ** 7, Base 2 Normalized. Below this diagram is the caption, add 127 for bias = 134. From the 7 in the third row is an arrow that points to the beginning of a string of binary, that reads, 0 space 10000110 space 01011001010000000000000. From the beginning of the third row in the diagram is an arrow that points at the third segment of the binary. Below the binary is a line of text that reads, 1. Assumed bit and binary point.

The 64-bit format is similar, except the exponent is 11 bits long, biased by adding 1023 to the exponent, and the significand is 54 bits long.

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks