The Fourier transforms (FT, DTFT, DFT,
etc.) do not clearly indicate how the
frequency content of a signal changes over time.
That information is hidden in the phase - it is not
revealed by the plot of the magnitude of the spectrum.
To see how the frequency content of a signal changes over
time, we can cut the signal into blocks and compute the
spectrum of each block.
To improve the result,
- blocks are overlapping
- each block is multiplied by a window that is tapered
at its endpoints.
Several parameters must be chosen:
- Block length, $R$.
- The type of window.
- Amount of overlap between blocks. (Figure 1)
- Amount of zero padding, if any.
The short-time Fourier transform is defined as
$$X(\omega, m) = \mathrm{STFT}\bigl(x(n)\bigr) := \mathrm{DTFT}\bigl(x(n-m)\,w(n)\bigr) = \sum_{n=-\infty}^{\infty} x(n-m)\,w(n)\,e^{-i\omega n} = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,e^{-i\omega n} \tag{1}$$
where $w(n)$ is the window function of length $R$.
- The STFT of a signal $x(n)$ is a function of two variables: time and
frequency.
- The block length is determined by the support of the
window function $w(n)$.
- A graphical display of the magnitude of the STFT,
$|X(\omega, m)|$, is called the spectrogram of the
signal. It is often used in speech processing.
- The STFT of a signal is invertible.
- One can choose the block length. A long block length will
provide higher frequency resolution (because the main lobe
of the window function will be narrow). A short block
length will provide higher time resolution because less
averaging across samples is performed for each STFT value.
- A narrow-band spectrogram is one computed
using a relatively long block length
$R$ (long window function).
- A wide-band spectrogram is one computed using
a relatively short block length
$R$ (short window function).
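The resolution trade-off described in the list above can be checked numerically. The following is a minimal sketch, not part of the original notes: it measures the width of the main lobe of a Hamming window for one short and one long block length. The block lengths 64 and 256, the dense grid size `Nfft`, and the use of the half-amplitude point as the width measure are all assumptions chosen for illustration.

```matlab
% Main-lobe width of a Hamming window for a short and a long block length.
Nfft = 2^16;                     % dense frequency grid (assumed value)
Rs = [64 256];                   % short and long block lengths (assumed values)
halfwidth = zeros(size(Rs));     % half-amplitude half-width, radians per sample
for j = 1:numel(Rs)
    R = Rs(j);
    w = 0.54 - 0.46*cos(2*pi*(0:R-1)'/(R-1));   % Hamming window of length R
    W = abs(fft(w, Nfft));       % magnitude of the window spectrum; W(1) is the peak
    k = find(W < W(1)/2, 1);     % first grid point below half of the peak
    halfwidth(j) = 2*pi*(k-1)/Nfft;
    fprintf('R = %4d: half-amplitude half-width = %.4f rad/sample\n', R, halfwidth(j));
end
ratio = halfwidth(1)/halfwidth(2);   % about 4: width scales roughly as 1/R
```

Quadrupling the block length narrows the main lobe by roughly a factor of four, at the cost of each STFT value averaging over four times as many samples.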
To numerically evaluate the STFT, we sample the frequency
axis $\omega$ in $N$ equally spaced samples from
$\omega = 0$ to $\omega = 2\pi$.
$$\omega_k = \frac{2\pi}{N}\,k, \qquad 0 \le k \le N-1 \tag{2}$$
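As a small illustration of the frequency grid, the sketch below lists the samples $\omega_k$ and the corresponding frequencies in Hz. The values `N = 8` and `fs = 8000` are hypothetical and were picked only to keep the printed table short. Note that the last sample is $2\pi(N-1)/N$, so the grid stops one step short of $2\pi$.

```matlab
% Frequency samples omega_k = 2*pi*k/N and their values in Hz.
N = 8;                   % number of frequency samples (assumed, small for display)
k = 0:N-1;               % frequency index
omega = 2*pi*k/N;        % radians per sample
fs = 8000;               % hypothetical sampling frequency in Hz
f = k*fs/N;              % the same grid expressed in Hz
disp([k' omega' f']);    % columns: k, omega_k, f_k
```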
We then have the discrete STFT,
$$X^d(k, m) := X\!\left(\tfrac{2\pi}{N}k,\, m\right) = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,e^{-i\omega_k n} = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,W_N^{-kn} = \mathrm{DFT}_N\Bigl(\bigl.x(n-m)\,w(n)\bigr|_{n=0}^{R-1},\ 0, \dots, 0\Bigr) \tag{3}$$
where $W_N = e^{i 2\pi/N}$ and $0, \dots, 0$ denotes
$N - R$ zeros appended to the windowed block.
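The last equality, that the discrete STFT of one block is an $N$-point DFT of the zero-padded windowed block, can be verified numerically. This is a sketch under assumed values: the block contents `xb`, the block length `R = 8`, and `N = 16` are hypothetical, and the Hamming window is written out explicitly so that no toolbox is needed.

```matlab
% Compare the direct sum with a zero-padded FFT for a single block.
R = 8;                                 % block length (assumed, small)
N = 16;                                % number of frequency samples, N >= R
n = (0:R-1)';                          % time index within the block
xb = cos(0.3*pi*n);                    % hypothetical block of signal samples
w = 0.54 - 0.46*cos(2*pi*n/(R-1));     % Hamming window of length R

% Direct evaluation of sum_{n=0}^{R-1} xb(n) w(n) exp(-i*2*pi*k*n/N)
k = 0:N-1;
Xdirect = exp(-1i*2*pi*(k')*(n')/N) * (xb .* w);   % N-by-1 vector

% fft(v, N) appends N-R zeros to v before transforming
Xfft = fft(xb .* w, N);

max_err = max(abs(Xdirect - Xfft));
fprintf('largest difference: %g\n', max_err);
```

The two results agree to rounding error, which is why the FFT with zero padding is the usual way to evaluate each column of the STFT.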
In this definition, the overlap between adjacent blocks is
$R - 1$. The signal is shifted along the window one sample
at a time. That generates more points than are usually
needed, so we also sample the STFT along the time
direction. That means we usually evaluate
$X^d(k, Lm)$, where $L$ is the
time-skip. The relation between the time-skip, the number of
overlapping samples, and the block length is
$$\text{Overlap} = R - L$$
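Putting the pieces together, a discrete STFT with time-skip can be sketched as a loop over blocks. This is an illustrative sketch, not the routine used for the figures: the test signal and the values of `R`, `L`, and `N` are hypothetical, and blocks are taken forward in time (block `m` starts at sample `(m-1)*L`), which is an assumption about indexing rather than a literal transcription of the $x(n-m)$ notation.

```matlab
% Discrete STFT with block length R, time-skip L and FFT length N.
x = cos(2*pi*0.1*(0:999)');            % hypothetical test signal, 1000 samples
R = 64;                                % block length (assumed)
L = 16;                                % time-skip between block starts (assumed)
N = 128;                               % FFT length, blocks are zero-padded to N
w = 0.54 - 0.46*cos(2*pi*(0:R-1)'/(R-1));   % Hamming window of length R

overlap = R - L;                       % samples shared by adjacent blocks
M = floor((length(x) - R)/L) + 1;      % number of complete blocks
X = zeros(N, M);                       % one column per block
for m = 1:M
    start = (m-1)*L;                   % 0-based index of the block's first sample
    block = x(start + (1:R));          % R consecutive samples
    X(:, m) = fft(block .* w, N);      % windowed, zero-padded DFT
end
fprintf('%d blocks, overlap = %d samples\n', M, overlap);
```

Displaying `abs(X(1:N/2+1, :))` as an image would give a spectrogram of the test signal; trailing samples that do not fill a complete block are simply ignored in this sketch.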
Match each signal to its spectrogram in Figure 2.
The following MATLAB program produces the figures above (Figure 3 and Figure 4).
% LOAD DATA
load mtlb;
x = mtlb;
figure(1), clf
plot(0:4000,x)
xlabel('n')
ylabel('x(n)')
% SET PARAMETERS
R = 256; % R: block length
window = hamming(R); % window function of length R
N = 512; % N: frequency discretization
L = 35; % L: time lapse between blocks
fs = 7418; % fs: sampling frequency
overlap = R - L;
% COMPUTE SPECTROGRAM
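% spectrogram(x, window, noverlap, nfft, fs) supersedes the older
% call specgram(x, nfft, fs, window, noverlap)
[B,f,t] = spectrogram(x,window,overlap,N,fs);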
% MAKE PLOT
figure(2), clf
imagesc(t,f,log10(abs(B)));
colormap('jet')
axis xy
xlabel('time')
ylabel('frequency')
title('SPECTROGRAM, R = 256')
Here is another example to illustrate the frequency/time
resolution trade-off (see Figure 5, Figure 6, and Figure 7).
A spectrogram is computed with different parameters:
$$L \in \{1, 10\}, \qquad N \in \{32, 256\}$$
- $L$ = time lapse between blocks.
- $N$ = FFT length. (Each block is zero-padded to length $N$.)
In each case, the block length is 30 samples.
For each of the four spectrograms in Figure 8, can you tell what
$L$ and $N$ are?
$L$ and $N$ do not affect the time
resolution or the frequency resolution. They only affect the
'pixelation'.
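One way to see this is to count the points in the spectrogram grid for each parameter pair. The sketch below assumes a hypothetical signal length of 300 samples and counts only the frequency samples from $0$ to $\pi$, as is usual for a real signal; the block length of 30 samples is taken from the example.

```matlab
% Size of the spectrogram grid for each (L, N) pair; R is the same throughout.
R = 30;                  % block length used in this example
len = 300;               % hypothetical signal length in samples
for L = [1 10]
    for N = [32 256]
        cols = floor((len - R)/L) + 1;   % number of complete blocks (time axis)
        rows = N/2 + 1;                  % frequency samples from 0 to pi
        fprintf('L = %2d, N = %3d: %3d x %3d grid\n', L, N, rows, cols);
    end
end
```

For these assumed values the grid grows from 17 x 28 points (L = 10, N = 32) to 129 x 271 points (L = 1, N = 256), while the block length, and with it the resolution, stays fixed at 30 samples.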
Shown below are four spectrograms of the same signal. Each
spectrogram is computed using a different set of parameters.
$$R \in \{120, 256, 1024\}, \qquad L \in \{35, 250\}$$
where
- $R$ = block length
- $L$ = time lapse between blocks.
For each of the four spectrograms in Figure 9, match the above values
of $L$ and $R$.
If you like, you may listen to this signal with the
soundsc command; the data is in the
file: stft_data.m. Here is a figure
of the signal.