# OpenStax_CNX

You are here: Home » Content » Computer Arithmetic

### Lenses

What is a lens?

#### Definition of a lens

##### Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

##### What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

##### Who can create a lens?

Any individual member, a community, or a respected organization.

##### What are tags?

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

#### Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
• VOCW

This module is included inLens: Vietnam OpenCourseWare's Lens
By: Vietnam OpenCourseWare

Click the "VOCW" link to see all content affiliated with them.

### Recently Viewed

This feature requires Javascript to be enabled.

# Computer Arithmetic

Module by: Nguyen Thi Hoang Lan. E-mail the author

## 1. The Arithmetic and Logic Unit

The ALU is that part of the computer that actually performs arithmetic and logical operations on data. All of the other elements of the computer system—control unit, registers, memory, I/O—are there mainly to bring data into the ALU for it to process and then to take the results back out. We have, in a sense, reached the core or essence of a computer when we consider the ALU. An ALU and, indeed, all electronic components in the computer arc based on the use of simple digital logic devices that can store binary digits and perform simple Boolean logic operations.

Figure 3.1 indicates, in general terms, how the ALU is interconnected with the rest of the processor. Data are presented to the ALU in registers, and the results of an operation are stored in registers. These registers are temporary storage locations within the processor that are connected by signal paths to the ALU. The ALU may also set flags as the result of an operation. For example, an over­flow flag is set to 1 if the result of a computation exceeds the length of the register into which it is to be stored.

Figure 3.1 ALU Input and Output

## 2. Integer Representation

In the binary number system, arbitrary numbers can be represented with just the digits zero and one, the minus sign, and the period, or radix point.

For purposes of computer storage and processing, however, we do not have the ben­efit of minus signs and periods. Only binary digits (0 and 1) may be used lo represent numbers.

### 2.1 Unsigned Integer

If we are limited to nonnegative integers, the representation is straightforward.

In general, if an n-bit sequence of binary digits an1an2...a1a0an1an2...a1a0 size 12{a rSub { size 8{n - 1} } a rSub { size 8{n - 2} } "." "." "." a rSub { size 8{1} } a rSub { size 8{0} } } {} is interpreted as an unsigned integer A, its value is A=an12n1+an22n2+...+a121+a020A=an12n1+an22n2+...+a121+a020 size 12{A=a rSub { size 8{n - 1} } 2 rSup { size 8{n - 1} } +a rSub { size 8{n - 2} } 2 rSup { size 8{n - 2} } + "." "." "." +a rSub { size 8{1} } 2 rSup { size 8{1} } +a rSub { size 8{0} } 2 rSup { size 8{0} } } {}

A = i = 0 n 1 a i 2 i A = i = 0 n 1 a i 2 i size 12{A= Sum cSub { size 8{i=0} } cSup { size 8{n - 1} } {a rSub { size 8{i} } 2 rSup { size 8{i} } } } {}

### 2.2 Signed Integer

#### Sign-magnitude representation

There are several alternative conventions used to represent negative as well as positive integers, all of which involve treating the most significant (leftmost) bit in the word as a sign bit. If the sign bit is 0, the number is positive: if the sign bit is 1, the number is negative.

The simplest form of representation that employs a sign bit is the sign-magnitude representation. In an n-bit word, the rightmost n - 1 bits hold the mag­nitude of the integer.

In general, if an n-bit sequence of binary digits an1an2...a1a0an1an2...a1a0 size 12{a rSub { size 8{n - 1} } a rSub { size 8{n - 2} } "." "." "." a rSub { size 8{1} } a rSub { size 8{0} } } {} represented A, its value is

A=i=0n2ai2iA=i=0n2ai2i size 12{A= Sum cSub { size 8{i=0} } cSup { size 8{n - 2} } {a rSub { size 8{i} } 2 rSup { size 8{i} } } } {} If an1=0an1=0 size 12{a rSub { size 8{n - 1} } =0} {}

A=i=0n2ai2iA=i=0n2ai2i size 12{A= - Sum cSub { size 8{i=0} } cSup { size 8{n - 2} } {a rSub { size 8{i} } 2 rSup { size 8{i} } } } {} If an1=1an1=1 size 12{a rSub { size 8{n - 1} } =1} {}

For example: +18= 0 0010010

- 18=1 0010010

There are several drawbacks to sign-magnitude representation. One is that addition and subtraction require a consideration of both the signs of the numbers and their relative magnitudes to carry out the required operation. Another drawback is that there are two representations of 0 (e.g 0000 0000 and 1000 00000).

This is inconvenient, because it is slightly more difficult to test for 0 (an operation performed frequently on computers) than if there were a single representation.

Because of these drawbacks, sign-magnitude representation is rarely used in implementing the integer portion of the ALU. Instead, the most common scheme is twos complement representation.

#### Twos Complement Representation

• Ones Complement

The ones complement of a number is represented by flipping the number's bits one at a time. For example, the value 01001001 becomes 10110110.

Suppose you are working with B-bit numbers. Then the ones complement for the number N is 2B2B size 12{2 rSup { size 8{B} } } {}-1 -N

• Twos complement

The twos complement of a number N in a B-bit system is 2B2B size 12{2 rSup { size 8{B} } } {}-N.

There is a simple way to calculate a twos complement value: invert the number's bits and then add 1.

For a concrete example, consider the value 17 which is 00010001 in binary. Inverting gives 11101110. Adding one makes this 11101111.

• Twos Complement Representation

Like sign magnitude, twos complement representation uses the most significant bit as a sign bit, making it easy to test whether an integer is positive or negative. It differs from the use of the sign-magnitude representation in the way that the other bits are interpreted.

Consider an n-bit integer. A, in twos complement representation. If A is positive, then the sign bit an-1 is zero. The remaining, bits represent the magnitude of the number in the same fashion as for sign magnitude:

A=i=0n2ai2iA=i=0n2ai2i size 12{A= Sum cSub { size 8{i=0} } cSup { size 8{n - 2} } {a rSub { size 8{i} } 2 rSup { size 8{i} } } } {} If an1=0an1=0 size 12{a rSub { size 8{n - 1} } =0} {}

The number zero is identified as positive and therefore has a 0 sign bit and a mag­nitude of all 0s. We can see that the range of positive integers that may he repre­sented is from 0 (all of the magnitude bits are 0) through - 1 (all of the magnitude bits are 1). Any larger number would require more bits.

For a negative number A (A < 0), the sign bit, is one. The remain­ing n-1 bits can take on any one of an12n1an12n1 size 12{a rSub { size 8{n - 1} } 2 rSup { size 8{n - 1} } } {} values. Therefore, the range of negative integers that can be represented is from -1 to - 2n12n1 size 12{2 rSup { size 8{n - 1} } } {}

This is the convention used in twos complement representation, yielding the following expression for negative and positive numbers:

A = 2 n 1 a n 1 + i = 0 n 2 a i 2 i A = 2 n 1 a n 1 + i = 0 n 2 a i 2 i size 12{A= - 2 rSup { size 8{n - 1} } a rSub { size 8{n - 1} } + Sum cSub { size 8{i=0} } cSup { size 8{n - 2} } {a rSub { size 8{i} } 2 rSup { size 8{i} } } } {}

The range of A is from - 2n12n1 size 12{2 rSup { size 8{n - 1} } } {} to 2n12n1 size 12{2 rSup { size 8{n - 1} } } {} -1.

Example: Using 8 bit to represent

+50= 0011 0010

-70=1011 1010

#### Converting between Different Bit Lengths

The rule for twos complement integers is to move the sign hit to the new leftmost position and fill in with copies of the sign bit. For positive numbers, fill in with zeros, and for negative numbers, till in with ones.

For example:

+18 = 00010010

+18 = 00000000 00010010

-18 = 10010010

-18 = 11111111 10010010

## 3. Integer Arithmetic

### 3.1 Negation

In sign-magnitude representation, the rule for forming the negation of an integer is simple: Invert the sign bit.

In twos complement representation, the negation of an integer can he formed with the following rules:

1. Take the Boolean complement of each bit of the integer (including the sign bit). That is. set each 1 to 0 and each 0 to 1.

2. Treating the result as an unsigned binary integer, add 1.

This two-step process is referred to as the twos complement operation, or the taking of the twos complement of an integer

Example:

Negation Special Case 1:

0 = 00000000

Bitwise not 11111111

Result 1 00000000

Overflow is ignored, so: - 0 = 0

Negation Special Case 2:

-128 = 10000000

bitwise not 01111111

Result 10000000

So: -(-128) = -128.

Because 128 is out of the range of signed 8bits numbers.

Addition and Subtraction is done using following steps:

• Monitor sign bit for overflow
• Take twos compliment of subtrahend and add to minuend ,i.e. a - b = a + (-b)

### 3.3 Multiplying positive numbers:

The multiplying is done using following steps:

• Work out partial product for each digit
• Take care with place value (column)

### 3.4 Multiplying Negative Numbers

#### Solution 1:

• Convert to positive if required
• Multiply as above
• If signs were different, negate answer

#### Solution 2:

• Booth’s algorithm:

Example of Booth’s Algorithm:

### 3.5 Division:

• More complex than multiplication
• Negative numbers are really bad!
• Based on long division
• (for more detail, reference to Computer Organization and Architecture, William Stalling)

## 4. Floating-Point Representation

### 4.1 Principles

We can represent a real number in the form

± S × B ± E ± S × B ± E size 12{ +- S times B rSup { size 8{ +- E} } } {}

This number can be stored in a binary word with three fields:

• Sign: plus or minus
• Significant: S
• Exponent: E.

(A fixed value, called the bias, is subtracted from the biased exponent field to get the true exponent value (E). Typically, the bias equal 2k112k11 size 12{2 rSup { size 8{k - 1} } - 1} {}, where k is the number of bits in the binary exponent)

• The base B is implicit and need not be stored because it is the same for all numbers.

### 4.2 IEEE Standard for Binary Floating-Point Representation

The most important floating-point representation is defined in IEEE Standard 754 [EEE8]. This standard was developed to facilitate the portability of programs from one processor to another and to encourage the development of sophisticated, numerically oriented programs. The standard has been widely adopted and is used on virtually all contemporary processors and arithmetic coprocessors.

The IEEE standard defines both a 32-bit (Single-precision) and a 64-bit (Double-precision) double format with 8-bit and 11-bit exponents, respectively. Binary floating-point numbers are stored in a form where the MSB is the sign bit, exponent is the biased exponent, and "fraction" is the significand. The implied base (B) is 2.

Not all bit patterns in the IEEE formats are interpreted in die usual way; instead, some bit patterns are used to represent special values. Three special cases arise:

1. if exponent is 0 and fraction is 0, the number is ±0 (depending on the sign bit)
2. if exponent = 2e2e size 12{2 rSup { size 8{e} } } {}-1 and fraction is 0, the number is ±infinity (again depending on the sign bit), and
3. if exponent = 2e2e size 12{2 rSup { size 8{e} } } {}-1 and fraction is not 0, the number being represented is not a number (NaN).

This can be summarized as:

Single-precision 32 bit

A single-precision binary floating-point number is stored in 32 bits.

The number has value v:

v = s × 2e2e size 12{2 rSup { size 8{e} } } {}× m

Where

s = +1 (positive numbers) when the sign bit is 0

s = −1 (negative numbers) when the sign bit is 1

e = Exp − 127 (in other words the exponent is stored with 127 added to it, also called "biased with 127")

m = 1.fraction in binary (that is, the significand is the binary number 1 followed by the radix point followed by the binary bits of the fraction). Therefore, 1 ≤ m < 2.

In the example shown above:

S=1

E= 011111100(2) -127 = -3

M=1.01 (in binary, which is 1.25 in decimal).

The represented number is: +1.25 × 2−3 = +0.15625.

## 5. Floating-Point Arithmetic

The basic operations for floating-point X1=M1RE1X1=M1RE1 size 12{X1=M1*R rSup { size 8{E1} } } {} and X2=M2RE2X2=M2RE2 size 12{X2=M2*R rSup { size 8{E2} } } {}

• X1±X2=(M1RE1E2)RE2X1±X2=(M1RE1E2)RE2 size 12{X1 +- X2= $$M1*R rSup { size 8{E1 - E2} }$$ R rSup { size 8{E2} } } {} (assume E1 size 12{ <= {}} {}E2)
• X1 X2 = ( M1 M2 ) R E1 + E2 X1 X2 = ( M1 M2 ) R E1 + E2 size 12{X1*X2= $$M1*M2$$ R rSup { size 8{E1+E2} } } {}
• X1 / X2 = ( M1 / M2 ) R E1 E2 X1 / X2 = ( M1 / M2 ) R E1 E2 size 12{X1/X2= $$M1/M2$$ R rSup { size 8{E1 - E2} } } {}

For addi­tion and subtraction, it is necessary lo ensure that both operands have the same exponent value. I his may require shifting the radix point on one of the operands to achieve alignment. Multiplication and division are more straightforward.

A floating-point operation may produce one of these conditions:

• Exponent overflow: A positive exponent exceeds the maximum possible expo­nent value. In some systems, this may be designated as
• Exponent underflow: A negative exponent is less than the minimum possible exponent value (e.g.. -200 is less than -127). This means that the number is too small to be represented, and it may be reported as 0.
• Significand underflow: In the process of aligning significands, digits may flow off the right end of the significand. Some form of rounding is required.
• Significand overflow: The addition of two significands of the same sign may result in a carry out of the most significant bit. This can be fixed by realign­ment.

## Content actions

PDF | EPUB (?)

### What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

#### Definition of a lens

##### Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

##### What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

##### Who can create a lens?

Any individual member, a community, or a respected organization.

##### What are tags?

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks