Questions or comments concerning this laboratory should be directed to Prof. Charles A. Bouman, School of Electrical and Computer Engineering, Purdue University, West Lafayette IN 47907; (765) 494-0340; bouman@ecn.purdue.edu
This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory. Numbers may be either in fixed point or floating point format. Integers are often represented with fixed point format. Decimals, and numbers that may take on a very large range of values would use floating point. The second issue of numeric storage is quantization. All analog signals that are processed on the computer must first be quantized. We will examine the errors that arise from this operation, and determine how different levels of quantization affect a signal's quality. We will also look at two types of quantizers. The uniform quantizer is the simpler of the two. The Max quantizer, is optimal in that it minimizes the mean square error between the original and quantized signals.
There are two types of numbers that a computer can represent: integers and decimals. These two numbers are stored quite differently in memory. Integers (e.g. 27, 0, -986) are usually stored in fixed point format, while decimals (e.g. 12.34, -0.98) most often use floating point format. Most integer representations use four bytes of memory; floating point values usually require eight.
There are different conventions for encoding fixed point binary numbers because of the different ways of representing negative numbers. Three types of fixed point formats that accommodate negative integers are sign-magnitude, one's-complement, and two's-complement. In all three of these "signed" formats, the first bit denotes the sign of the number: 0 for positive, and 1 for negative. For positive numbers, the magnitude simply follows the first bit. Negative numbers are handled differently for each format.
Of course, there is also an unsigned data type which can be used when the numbers are known to be non-negative. This allows a greater range of possible numbers since a bit isn't wasted on the negative sign.
Sign-magnitude notation is the simplest way to represent negative numbers. The magnitude of the negative number follows the first bit. If an integer was stored as one byte, the range of possible numbers would be -127 to 127.
The value +27 would be represented as
0 0 0 1 1 0 1 1 .
The number -27 would represented as
1 0 0 1 1 0 1 1 .
To represent a negative number, the complement of the bits for the positive number with the same magnitude are computed. The positive number 27 in one's-complement form would be written as
0 0 0 1 1 0 1 1 ,
but the value -27 would be represented as
1 1 1 0 0 1 0 0 .
The problem with each of the above notations is that two different values represent zero. Two's-complement notation is a revision to one's-complement that solves this problem. To form negative numbers, the positive number is subtracted from a certain binary number. This number has a one in the most significant bit (MSB), followed by as many zeros as there are bits in the representation. If 27 was represented by an eight-bit integer, -27 would be represented as:
1 0 0 0 0 0 0 0 0
- 0 0 0 1 1 0 1 1
= 1 1 1 0 0 1 0 1
Notice that this result is one plus the one's-complement representation for -27 (modulo-2 addition). What about the second value of 0? That representation is
1 0 0 0 0 0 0 0 .
This value equals -128 in two's-complement notation!
1 0 0 0 0 0 0 0 0
- 1 0 0 0 0 0 0 0
= 1 0 0 0 0 0 0 0
The value represented here is -128; we know it is negative, because the result has a 1 in the MSB. Two's-complement is used because it can represent one extra negative value. More importantly, if the sum of a series of two's-complement numbers is within the range, overflows that occur during the summation will not affect the final answer! The range of an 8-bit two's complement integer is [-128,127].
Floating point notation is used to represent a much wider range of numbers. The tradeoff is that the resolution is variable: it decreases as the magnitude of the number increases. In the fixed point examples above, the resolution was fixed at 1. It is possible to represent decimals with fixed point notation, but for a fixed word length any increase in resolution is matched by a decrease in the range of possible values.
A floating point number, F, has two parts: a mantissa, M, and an exponent, E.
The mantissa is a signed fraction, which has a power of two
in the denominator.
The exponent is a signed integer, which represents
the power of two that the mantissa must be multiplied by.
These signed numbers may be represented with any of the
three fixed-point number formats.
The IEEE has a standard for floating point numbers (IEEE 754). For a
32-bit number, the first bit is the mantissa's sign. The exponent
takes up the next 8 bits (1 for the sign, 7 for the quantity), and the mantissa
is contained in the remaining 23 bits. The range of values for this number
is (
To add two floating point numbers, the exponents must be the same. If the
exponents are different, the mantissa is adjusted until the exponents match.
If a very small number is added to a large one, the result may be the same
as the large number! For instance, if
Quantization is the act of rounding off the value of a signal or quantity to certain discrete levels. For example, digital scales may round off weight to the nearest gram. Analog voltage signals in a control system may be rounded off to the nearest volt before they enter a digital controller. Generally, all numbers need to be quantized before they can be represented in a computer.
Digital images are also quantized. The gray levels in a black and white photograph must be quantized in order to store an image in a computer. The “brightness" of the photo at each pixel is assigned an integer value between 0 and 255 (typically), where 0 corresponds to black, and 255 to white. Since an 8-bit number can represent 256 different values, such an image is called an “8-bit grayscale" image. An image which is quantized to just 1 bit per pixel (in other words only black and white pixels) is called a halftone image. Many printers work by placing, or not placing, a spot of colorant on the paper at each point. To accommodate this, an image must be halftoned before it is printed.
Quantization can be thought of as a functional mapping
![]() |
Quantization is sometimes used for compression. As an example, suppose we have a digital image which is represented by 8 different gray levels: [0 31 63 95 159 191 223 255]. To directly store each of the image values, we need at least 8-bits for each pixel since the values range from 0 to 255. However, since the image only takes on 8 different values, we can assign a different 3-bit integer (a code) to represent each pixel: [000 001 ... 111]. Then, instead of storing the actual gray levels, we can store the 3-bit code for each pixel. A look-up table, possibly stored at the beginning of the file, would be used to decode the image. This lowers the cost of an image considerably: less hard drive space is needed, and less bandwidth is required to transmit the image (i.e. it downloads quicker). In practice, there are much more sophisticated methods of compressing images which rely on quantization.
Download the file fountainbw.tif for the following section.
The image in fountainbw.tif is an 8-bit grayscale image.
We will investigate what happens when we quantize it to fewer bits
per pixel (b/pel).
Load it into Matlab and display it using the following sequence of
commands:
y = imread('fountainbw.tif');
image(y);
colormap(gray(256));
axis('image');
The image array will initially be of type uint8, so you
will need to convert the image matrix to type
double before performing any computation.
Use the command z=double(y) for this.
There is an easy way to uniformly quantize a signal. Let
where X is the
signal to be quantized, and N is the number of quantization levels.
To force the data to have a uniform quantization step of
Write a Matlab function Y = Uquant(X,N)
which will
uniformly quantize an input array
Download the files speech.au and music.au for the following section.
If an audio signal is to be coded, either for compression or for digital transmission, it must undergo some form of quantization. Most often, a general technique known as vector quantization is employed for this task, but this technique must be tailored to the specific application so it will not be addressed here. In this exercise, we will observe the effect of uniformly quantizing the samples of two audio signals.
Down load the audio files
speech.au
and
music.au
.
Use your Uquant function to quantize each of these signals to
7, 4, 2 and 1 bits/sample.
Listen to the original and quantized signals and answer the following
questions:
Use subplot to plot in the same figure,
the four quantized speech signals over the
index range 7201:7400.
Generate a similar figure for the music signal, using the same indices.
Make sure to use orient tall before printing these out.
As we have clearly observed, quantization produces errors in a signal. The most effective methods for analysis of the error turn out to be probabilistic. In order to apply these methods, however, one needs to have a clear understanding of the error signal's statistical properties. For example, can we assume that the error signal is white noise? Can we assume that it is uncorrelated with the quantized signal? As you will see in this exercise, both of these are good assumptions if the quantization intervals are small compared with sample-to-sample variations in the signal.
If the original signal is
Compute the error signal for the quantized speech for 7, 4, 2 and 1 b/sample.
When the spacing, hist(E,20)
to generate 20-bin histograms
for each of the four error signals.
Use subplot to place the four histograms in the same figure.
Next we will examine correlation properties of the error signal. First compute and plot an estimate of the autocorrelation function for each of the four error signals using the following commands:
[r,lags] = xcorr(E,200,'unbiased');
plot(lags,r)
Now compute and plot an estimate of the cross-correlation function
between the quantized speech Y and each error signal E using
[c,lags] = xcorr(E,Y,200,'unbiased');
plot(lags,c)
One way to measure the quality of a quantized signal is by the Power Signal-to-Noise Ratio (PSNR). This is defined by the ratio of the power in the quantized speech to power in the noise.
In this expression, the noise is the error signal E.
Generally, this means that a higher PSNR implies a less noisy signal.
From previous labs we know the power of a sampled signal,
where
In evaluating quantization (or compression) algorithms,
a graph called a “rate-distortion curve" is often used.
This curve plots signal distortion vs. bit rate.
Here, we can measure the distortion by
Assuming that the speech is sampled at 8kHz, plot the
rate distortion curve using
In this section, we will investigate a different type of quantizer which produces less noise for a fixed number of quantization levels. As an example, let's assume the input range for our signal is [-1,1], but most of the input signal takes on values between [-0.2, 0.2]. If we place more of the quantization levels closer to zero, we can lower the average error due to quantization.
A common measure of quantization error is mean squared error (noise power). The Max quantizer is designed to minimize the mean squared error for a given set of training data. We will study how the Max quantizer works, and compare its performance to that of the uniform quantizer which was used in the previous sections.
The Max quantizer determines quantization levels based on a data set's
probability density function,
where
We still need the quantization boundaries,
This means that each non-infinite boundary is exactly halfway in between the two adjacent quantization levels, and that each quantization level is at the centroid of its region. Figure 2 shows a five-level quantizer for a Gaussian distributed signal. Note that the levels are closer together in areas of higher probability.
![]() |
Download the file speech.au for the following section.
In this section we will use Matlab to compute an optimal quantizer, and compare its performance to the uniform quantizer. Since we almost never know the actual probability density function of the data that the quantizer will be applied to, we cannot use equation Equation 7 to compute the optimal quantization levels. Therefore, a numerical optimization procedure is used on a training set of data to compute the quantization levels and boundaries which yield the smallest possible error for that set.
Matlab has a built-in function called lloyds
which performs
this optimization.
It's syntax is...
[partition, codebook] = lloyds(training_set, initial_codebook)
;
This function requires two inputs. The first is the training data set, from which it will estimate the probability density function. The second is a vector containing an initial guess of the optimal quantization levels. It returns the computed optimal boundaries (the “partition”) and quantization levels (the “codebook”).
Since this algorithm minimizes the error with respect to the quantization levels, it is necessary to provide a decent initial guess of the codebook to help ensure a valid result. If the initial codebook is significantly “far" away from the optimal solution, it's possible that the optimization will get trapped in a local minimum, and the resultant codebook may perform quite poorly. In order to make a good guess, we may first estimate the shape of the probability density function of the training set using a histogram. The idea is to divide the histogram into equal “areas" and choose quantization levels as the centers of each of these segments.
First plot a 40-bin histogram of this speech signal using
hist(speech,40), and make an initial guess of
the four optimal quantization levels.
Print out the histogram.
Then use the lloyds function to compute an optimal 4-level
codebook using
speech.au
as the training set.
Once the optimal codebook is obtained, use the codebook and partition vectors to quantize the speech signal. This may be done with a for loop and if statements. Then compute the error signal and PSNR. On the histogram plot, mark where the optimal quantization levels fall along the x-axis.