Short Time Fourier Transform
The Fourier transforms (FT, DTFT, DFT,
etc.) do not clearly indicate how the
frequency content of a signal changes over time.
That information is hidden in the phase; it is not
revealed by a plot of the magnitude of the spectrum.
Note: To see how the frequency content of a signal changes over
time, we can cut the signal into blocks and compute the
spectrum of each block.
To improve the result,
- blocks are overlapping
- each block is multiplied by a window that is tapered at its endpoints.
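As a rough illustration of why a tapered window helps, the sketch below compares the spectrum of one block of a sinusoid cut out with no tapering (rectangular window) against the same block multiplied by a Hamming window. The tone frequency, the block length, the FFT length, and the band used to measure leakage are all assumed values chosen for illustration; the Hamming window is written out explicitly so that no toolbox is needed.

```matlab
R = 64;  N = 1024;                         % block length and FFT length (assumed values)
n = (0:R-1)';
block = cos(2*pi*0.1037*n);                % one block of a sinusoid at 0.1037 cycles/sample
wRect = ones(R, 1);                        % plain cut, no tapering
wHamm = 0.54 - 0.46*cos(2*pi*n/(R-1));     % Hamming window, tapered at its endpoints

Srect = abs(fft(block .* wRect, N));       % magnitude spectrum of the untapered block
Shamm = abs(fft(block .* wHamm, N));       % magnitude spectrum of the tapered block

% Leakage: largest magnitude far from the tone (0.3 to 0.5 cycles/sample),
% measured relative to the peak of each spectrum, in dB
far = (round(0.3*N):N/2) + 1;
leakRect = 20*log10(max(Srect(far)) / max(Srect));
leakHamm = 20*log10(max(Shamm(far)) / max(Shamm));
fprintf('leakage: rectangular %.1f dB, Hamming %.1f dB\n', leakRect, leakHamm);
```

The tapered block should show less energy far from the tone, at the cost of a wider peak around it.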
Several parameters must be chosen:
- Block length, $R$.
- The type of window.
- Amount of overlap between blocks. (Figure 1)
- Amount of zero padding, if any.
The short-time Fourier transform is defined as
$$X(\omega, m) = \mathrm{STFT}\{x(n)\} := \mathrm{DTFT}\{x(n-m)\,w(n)\} = \sum_{n=-\infty}^{\infty} x(n-m)\,w(n)\,e^{-i\omega n} = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,e^{-i\omega n} \tag{1}$$
where $w(n)$ is the window function of length $R$.
- The STFT of a signal $x(n)$ is a function of two variables: time and frequency.
- The block length is determined by the support of the window function $w(n)$.
- A graphical display of the magnitude of the STFT, $|X(\omega, m)|$, is called the spectrogram of the signal. It is often used in speech processing.
- The STFT of a signal is invertible.
- One can choose the block length. A long block length will provide higher frequency resolution (because the main-lobe of the window function will be narrow). A short block length will provide higher time resolution because less averaging across samples is performed for each STFT value.
- A narrow-band spectrogram is one computed using a relatively long block length $R$ (long window function).
- A wide-band spectrogram is one computed using a relatively short block length $R$ (short window function).
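The link between block length and main-lobe width can be checked numerically. The following is a minimal sketch, assuming a Hamming window and two block lengths (64 and 256) picked only for illustration: it evaluates the window's spectrum on a dense grid and locates the first minimum of the magnitude, which marks the edge of the main lobe.

```matlab
N = 8192;                                  % dense frequency grid, used only for measuring
Rvals = [64 256];                          % a short and a long block length (assumed)
halfWidth = zeros(size(Rvals));            % main-lobe half-width in rad/sample
for i = 1:numel(Rvals)
    R = Rvals(i);
    w = 0.54 - 0.46*cos(2*pi*(0:R-1)'/(R-1));  % Hamming window of length R
    W = abs(fft(w, N));                        % magnitude of the window's spectrum
    k = find(diff(W(1:N/2)) > 0, 1);           % first bin where |W| stops decreasing
    halfWidth(i) = 2*pi*(k-1)/N;               % frequency of that first minimum
end
disp(halfWidth)                            % the longer window gives the smaller value
```

The longer window has the narrower main lobe, which is why it separates nearby frequencies better while averaging over a longer stretch of time.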
Sampled STFT
To numerically evaluate the STFT, we sample the frequency
axis $\omega$ in $N$ equally spaced samples from
$\omega = 0$ to $\omega = 2\pi$.
$$\omega_k = \frac{2\pi}{N}\,k, \qquad 0 \le k \le N-1 \tag{2}$$
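We then have the discrete STFT,
$$X^d(k, m) := X\!\left(\tfrac{2\pi}{N}k,\, m\right) = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,e^{-i\omega_k n} = \sum_{n=0}^{R-1} x(n-m)\,w(n)\,W_N^{-kn} = \mathrm{DFT}_N\!\left\{ \left[x(n-m)\,w(n)\right]_{n=0}^{R-1},\, 0, \dots, 0 \right\} \tag{3}$$
where $W_N = e^{i 2\pi / N}$ and $0, \dots, 0$ denotes the $N-R$ zeros appended to the windowed block.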
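The last equality in equation (3) can be confirmed numerically for one block. This is a small sketch with assumed sizes ($R = 8$, $N = 16$), an assumed Hamming window, and an arbitrary test block standing in for $x(n-m)$ at one fixed $m$; it compares the sum evaluated directly at the sampled frequencies with a length-$N$ FFT of the zero-padded windowed block.

```matlab
R = 8;  N = 16;                          % small sizes so the direct sum is cheap
n = (0:R-1)';
w = 0.54 - 0.46*cos(2*pi*n/(R-1));       % Hamming window of length R (assumed choice)
xblk = cos(0.9*n) + 0.5*sin(2.1*n);      % stands in for x(n-m), n = 0..R-1, at one fixed m

% Direct evaluation of sum_n x(n-m) w(n) exp(-i*omega_k*n) at omega_k = 2*pi*k/N
k = 0:N-1;
Xdirect = exp(-1i*2*pi*(k')*(n')/N) * (xblk .* w);

% The same values from a length-N DFT of the windowed block followed by N-R zeros
Xfft = fft([xblk .* w; zeros(N-R, 1)]);

maxErr = max(abs(Xdirect - Xfft));       % should be at rounding-error level
```

In practice `fft(xblk .* w, N)` performs the same zero padding implicitly.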
In this definition, the overlap between adjacent blocks is
$R-1$. The signal is shifted along the window one sample
at a time. That generates more points than is usually
needed, so we also sample the STFT along the time
direction. That means we usually evaluate
$X^d(k, Lm)$, where $L$ is the
time-skip. The relation between the time-skip, the number of
overlapping samples, and the block length is
$$\mathrm{Overlap} = R - L$$
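To make the time sampling concrete, here is a sketch that computes the discrete STFT of a toy signal twice, once with a time-skip of one sample and once with a time-skip of $L$, and checks that the second is just every $L$-th column of the first. The signal, the sizes, and the Hamming window are assumptions for illustration, and blocks are indexed forward in time (x(n+m) rather than the x(n-m) of the definition) purely to keep the indexing simple.

```matlab
n = (0:199)';
x = cos(0.2*n + 0.002*n.^2);               % toy signal (assumed for illustration)
R = 32;  N = 64;  L = 8;                   % block length, FFT length, time-skip
w = 0.54 - 0.46*cos(2*pi*(0:R-1)'/(R-1));  % Hamming window of length R

% STFT with a time-skip of one sample (overlap R-1)
starts1 = 0:(length(x) - R);               % zero-based start index of each block
X1 = zeros(N, numel(starts1));
for j = 1:numel(starts1)
    X1(:, j) = fft(x(starts1(j)+1 : starts1(j)+R) .* w, N);
end

% STFT with a time-skip of L samples (overlap R-L)
startsL = 0:L:(length(x) - R);
XL = zeros(N, numel(startsL));
for j = 1:numel(startsL)
    XL(:, j) = fft(x(startsL(j)+1 : startsL(j)+R) .* w, N);
end

overlap = R - L;                           % samples shared by adjacent blocks
sameValues = isequal(XL, X1(:, 1:L:end));  % every L-th column of X1 reproduces XL
```

A larger time-skip therefore discards columns without changing the values that remain.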
Problem 1
Match each signal to its spectrogram in
Figure 2.
Spectrogram Example
% LOAD DATA
load mtlb;
x = mtlb;
figure(1), clf
plot(0:4000,x)
xlabel('n')
ylabel('x(n)')
% SET PARAMETERS
R = 256; % R: block length
window = hamming(R); % window function of length R
N = 512; % N: frequency discretization
L = 35; % L: time lapse between blocks
fs = 7418; % fs: sampling frequency
overlap = R - L;
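% COMPUTE SPECTROGRAM
% (specgram is obsolete in newer releases of the Signal Processing Toolbox;
% the equivalent call there is [B,f,t] = spectrogram(x,window,overlap,N,fs);)
[B,f,t] = specgram(x,N,fs,window,overlap);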
% MAKE PLOT
figure(2), clf
imagesc(t,f,log10(abs(B)));
colormap('jet')
axis xy
xlabel('time')
ylabel('frequency')
title('SPECTROGRAM, R = 256')
Effect of window length R
Here is another example to illustrate the frequency/time
resolution trade-off (see Figure 5, Figure 6, and Figure 7).
Effect of L and N
A spectrogram is computed with different parameters:
$$L \in \{1, 10\}$$
$$N \in \{32, 256\}$$
- $L$ = time lapse between blocks.
- $N$ = FFT length (each block is zero-padded to length $N$).
In each case, the block length is 30 samples.
Problem 2
For each of the four spectrograms in
Figure 8, can you tell what $L$ and $N$ are?
Solution 2
$L$ and $N$ do not affect the time
resolution or the frequency resolution. They only affect the
'pixelation'.
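For the frequency axis, this can be seen directly: zero padding only samples the same underlying spectrum on a finer grid. The sketch below uses the block length of 30 from this section together with an assumed Hamming window and an assumed test signal, and compares the $N = 32$ and $N = 256$ transforms of one windowed block.

```matlab
R = 30;                                    % block length used in this section
n = (0:R-1)';
w = 0.54 - 0.46*cos(2*pi*n/(R-1));         % Hamming window (an assumed choice)
block = cos(0.7*n) .* w;                   % one windowed block of a toy signal

X32  = fft(block, 32);                     % N = 32: coarse frequency grid
X256 = fft(block, 256);                    % N = 256: fine frequency grid

% Every 8th sample of the fine grid lands on the coarse grid (256/32 = 8)
maxDiff = max(abs(X256(1:8:end) - X32));   % should be at rounding-error level
```

The fine grid adds interpolated points between the coarse ones but reveals no detail that the 30-sample window did not already capture.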
Effect of R and L
Shown below are four spectrograms of the same signal. Each
spectrogram is computed using a different set of parameters.
$$R \in \{120, 256, 1024\}$$
$$L \in \{35, 250\}$$
where
- $R$ = block length
- $L$ = time lapse between blocks.
Problem 3
For each of the four spectrograms in
Figure 9, match the above values
of $L$ and $R$.
If you like, you may listen to this signal with the
soundsc command; the data is in the file stft_data.m.
Here is a figure of the signal.