In a sonar system, one is often searching for targets, e.g. submarines, mines, or fish. The sonar system gathers sounds from its acoustic sensors, searching for either echoes from the target or sounds emitted by it. These received sounds are the observations and, for active sonar, are collected into sets related to the time of transmission of the sonar waveform. The set of sounds related to a single transmission is called a ping history. Pings occur sequentially in time, so one naturally has a sequence of observations (sound recordings) indexed by the time at which each ping was transmitted.
The sonar system decision space includes hypotheses about the target’s presence, location, velocity and classification. The observations $Y_k$ at ping $k$ contain information about the sonar decision space, but are also influenced, and often dominated, by other sounds such as noise and reverberation. Echoes, noise and reverberation are all significantly influenced by the propagation properties of the ocean. These environmental effects are important when making useful inferences about the target echoes that may or may not be present in the sonar ping history.
In most sonar systems today, environmental interference effects, such as noise and reverberation, are treated as random variables. The sonar processing designer develops algorithms that make detections and estimate target states by assuming a statistical model of the echo and interference, choosing environmental interference model parameters (amplitude, covariance, autocorrelation, etc.), and then computing a detection decision or state estimate.
The environmental effects are usually estimated as part of the target state decision process, or the processing algorithm is constructed to be invariant to the environmental effects.
The primary detection processing method for current active sonars is to process the ping history with a bank of matched filters. Each filter computes the cross-correlation between the transmitted waveform and the pre-whitened ping history.
A monostatic sonar has the source and receiver in the same location, and hence the receiver cannot realistically capture the ping history until the waveform transmission is complete. To simplify the problem, we will look for targets that are far away from the sonar, so that the echo reception occurs when the reverberation level has fallen below the background noise level. In this way, we are dealing with target echoes that are essentially embedded in background noise only.
We use the symbol $\varphi$ to designate the target-absent hypothesis. The other hypotheses concern the location of a single target. We then use as a decision space the composite space $D = \varphi \cup H^{\tau}$, where $H^{\tau}$ is the space constructed from all target-present hypotheses. The target hypothesis space consists of those locations around the sonar that generate echoes embedded in noise only. These location hypotheses $h$ are part of the decision space and are specified by the set of ordered pairs $h = (x, y)$ such that $h = (x, y) \in H^{\tau}$ if
$$R_{\min} < \sqrt{(x - x_r)^2 + (y - y_r)^2} < R_{\max}$$
where we are assuming that the depth of the target is small compared to its $(x, y)$ coordinates and that the receiver is located at $(x_r, y_r)$. $R_{\min}$ is the range beyond which the echo is noise limited rather than reverberation limited, and $R_{\max}$ is the farthest range of interest. For this problem, $h$ is an index into the target range from the sonar.
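The membership condition for the target hypothesis space can be sketched in a few lines of Python. This is only an illustration: the function name `in_target_space` and the numerical ranges are hypothetical and do not come from the text.

```python
import math

def in_target_space(x, y, xr, yr, r_min, r_max):
    """Return True if (x, y) satisfies R_min < range to receiver < R_max."""
    rng = math.hypot(x - xr, y - yr)   # horizontal range, depth neglected
    return r_min < rng < r_max

# Hypothetical numbers in metres, for illustration only.
print(in_target_space(3000.0, 4000.0, 0.0, 0.0, 2000.0, 10000.0))  # range is 5000 m
print(in_target_space(300.0, 400.0, 0.0, 0.0, 2000.0, 10000.0))    # range is 500 m
```

The inequalities are strict, as in the definition above, so a point exactly at $R_{\min}$ or $R_{\max}$ is excluded.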
The sonar transmits the waveform $m(t)$ for each ping. In most sonar transmitters, the transmitted waveform is narrow-band, that is, the waveform bandwidth is much smaller than its center frequency, $f$. This is true because efficient sonar transmitters use resonant mechanical and electrical components to provide maximum electrical-to-sound power transfer. An approximation, therefore, is to model the transmitted waveform as an amplitude-modulated carrier:
$$m(t) = \sin(2\pi f t)\, w(t), \quad t \in (0, T)$$
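As a rough illustration of the amplitude-modulated carrier model, the sketch below samples $m(t)$ with NumPy. The Hann envelope, the function name `transmit_waveform`, and the carrier, duration and sampling values are all assumptions made for the example; the text does not specify $w(t)$.

```python
import numpy as np

def transmit_waveform(f, T, fs):
    """Sample m(t) = sin(2*pi*f*t) * w(t) on 0 <= t < T at sampling rate fs (Hz)."""
    n = int(round(T * fs))
    t = np.arange(n) / fs
    w = np.hanning(n)                    # assumed envelope; w(t) is not given in the text
    m = np.sin(2.0 * np.pi * f * t) * w  # amplitude-modulated carrier
    return t, m, w

# Hypothetical values: 3.5 kHz carrier, 50 ms pulse, 48 kHz sampling.
t, m, w = transmit_waveform(f=3500.0, T=0.05, fs=48000.0)
print(m.shape)
```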
We will assume that the target is motionless, so that Doppler effects can be ignored. We will assume that the sonar receiver is a single sensor, with no directionality characteristics. For each target location hypothesis
$h = (x, y)$ we know approximately the received echo time series:
$$g(t \mid h) = B\, m(t - 2R/c)$$
The amplitude $B$ is related to the propagation loss out to the target hypothesis location and to the reflection characteristics of the target. The time delay $2R/c$ corresponds to the time it takes for the transmitted waveform to reach the target and return to the sonar. $R$ is the range to the target and $c$ is the effective speed of sound, when including refraction and boundary reflections.
The received echo is band-limited to approximately the same frequency band as the transmission. The receiver bandwidth may be greater than the transmitted bandwidth due to Doppler frequency shifts, but for the present, we are assuming that the target is not moving. Sonar receivers use heterodyne techniques to reduce the data storage of the ping history. The sonar receiver multiplies the ping history by a carrier signal $e^{-j 2\pi f t}$ to shift the positive-frequency part of the received echo closer to DC. The resulting signal is then low-pass filtered to eliminate the shifted negative-frequency part of the ping history. Since the original ping history was real, the negative-frequency part of the signal spectrum carries no additional information. The result is a complex signal that has a lower bandwidth but retains all of the echo-related information of the original ping history. This heterodyne process can be done in the analog or digital domain.
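The heterodyne step described above (mix with $e^{-j2\pi f t}$, then low-pass filter) can be sketched digitally as follows. This is a minimal sketch, not a receiver design: the function name `heterodyne`, the windowed-sinc FIR filter, the Hann envelope and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def heterodyne(x, f, fs, cutoff, n_taps=101):
    """Shift a real signal down by f (Hz) and low-pass filter the result."""
    t = np.arange(x.size) / fs
    shifted = x * np.exp(-2j * np.pi * f * t)        # positive band moves to DC
    # Windowed-sinc low-pass FIR (Hamming window), normalised to unit DC gain.
    n = np.arange(n_taps) - (n_taps - 1) / 2.0
    h = np.sinc(2.0 * cutoff / fs * n) * np.hamming(n_taps)
    h /= h.sum()
    # mode="same" keeps the output aligned with the input samples.
    return np.convolve(shifted, h, mode="same")

# Hypothetical test signal: 1 kHz carrier with a slow Hann envelope w(t).
fs, f = 16000.0, 1000.0
n = 800
t = np.arange(n) / fs
w = np.hanning(n)
x = np.sin(2.0 * np.pi * f * t) * w
baseband = heterodyne(x, f, fs, cutoff=500.0)
```

For this input, $\sin(2\pi f t)\,e^{-j2\pi f t} = -\tfrac{j}{2}\left(1 - e^{-j4\pi f t}\right)$, so after the filter removes the term at $-2f$ the baseband signal is approximately $-\tfrac{j}{2} w(t)$: the envelope survives, with a fixed phase factor.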
A target echo passing through the heterodyne part of the sonar receiver becomes:
$$r(t \mid h) = A e^{j\theta}\, w(t - 2R/c)$$
The phase shift $\theta$ accounts for the phase shift due to the heterodyne operation, the uncertainty in propagation conditions, the summation of multi-path arrivals with almost the same time delay, and similar effects.
We will assume that the target echo amplitude, $A e^{j\theta}$, is a complex Gaussian random variable with zero mean and variance $\sigma^2(h)$. We are modeling the echo as having the same waveform as the transmission, but with an uncertain phase and amplitude. This amounts to assuming that the target echo amplitude obeys Swerling type I target statistics with unknown phase.
While receiving an echo, we will also receive ambient noise, $q(t)$, which we will assume to be complex Gaussian noise with constant power spectral density over the receiver’s bandwidth. The noise power spectral density over the receiver’s bandwidth $BW$ is assumed to be $N_0$ Pascals^2/Hertz.
We receive $L$ complex-valued samples, $\mathbf{y}$, as the ping history after heterodyning. For hypothesis $h$, the observation in discrete time is:
$$y(k\Delta t) = q(k\Delta t) + r(k\Delta t - 2R/c \mid h), \quad k = 1, \ldots, L$$
where $\Delta t$ is the digital sample interval after heterodyning, and $q$ is a sample of the noise and reverberation. Note that $r(k\Delta t - 2R(h)/c \mid h) = 0$ when $2R(h)/c > k\Delta t$, because the echo is delayed. The delay for hypothesis $h$, in samples, is given by
$$D(h) = \left[ \frac{2R}{c\,\Delta t} \right]$$
where $[x]$ is the nearest integer to $x$. We choose the sample interval $\Delta t$ to be small enough to satisfy the Nyquist sampling criterion for the received echo. We will assume that the non-zero part of the echo is $N$ samples long.
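The delay formula is simple enough to check numerically. The sketch below is illustrative only; the function name `delay_in_samples` and the range, sound speed and sample interval are hypothetical values.

```python
def delay_in_samples(R, c, dt):
    """Two-way travel time 2R/c expressed in samples of length dt, rounded to an integer."""
    # Python's round() resolves exact .5 ties to the even integer; for delays
    # of thousands of samples the choice of tie rule is immaterial.
    return int(round(2.0 * R / (c * dt)))

# Hypothetical values: 7.5 km range, 1500 m/s sound speed, 1 ms sample interval.
print(delay_in_samples(R=7500.0, c=1500.0, dt=1.0e-3))
```

With these numbers the two-way travel time is 10 s, which corresponds to 10000 samples.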
We will represent the sampled echo response as a partitioned vector:
$$\mathbf{r}(h) = \begin{bmatrix} \mathbf{0}_{D(h) \times 1} \\ A\mathbf{w} \\ \mathbf{0}_{L - D(h) - N} \end{bmatrix},$$
where
$$\mathbf{w} = \begin{bmatrix} w(\Delta t) \\ \vdots \\ w(N\Delta t) \end{bmatrix}$$
and the sampled noise and interference as a vector
$$\mathbf{q} = \begin{bmatrix} q_1 \\ \vdots \\ q_L \end{bmatrix},$$
so that the sampled ping history becomes
$$\mathbf{y} = \mathbf{q} + \mathbf{r}(h)$$
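The partitioned structure of the echo vector and the resulting ping history can be sketched with NumPy as follows. The helper name `echo_vector`, the flat unit-energy pulse used for $\mathbf{w}$, and all numerical values are assumptions for illustration; the noise here is white over the whole record, which ignores the reverberation-dominated early samples.

```python
import numpy as np

def echo_vector(w, A, D, L):
    """Build the partitioned vector [0_D ; A*w ; 0_(L-D-N)] of length L."""
    N = w.size
    if D < 0 or D + N > L:
        raise ValueError("echo does not fit inside the ping history")
    r = np.zeros(L, dtype=complex)
    r[D:D + N] = A * w          # echo occupies samples D .. D+N-1
    return r

rng = np.random.default_rng(0)
L, N, D = 64, 8, 20                    # hypothetical sizes
w = np.ones(N) / np.sqrt(N)            # placeholder pulse with w^H w = 1
A = 2.0 * np.exp(1j * 0.3)             # one realisation of the complex amplitude
N0 = 0.1                               # assumed noise level
q = np.sqrt(N0 / 2.0) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
r = echo_vector(w, A, D, L)
y = q + r                              # sampled ping history under hypothesis h
```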
The echo is modeled as a known signal $\mathbf{w}$ with a Gaussian random complex amplitude $A$ that has zero mean and variance $\sigma_{A(h)}^2$. We will assume that $\mathbf{w}^H \mathbf{w} = 1$, and that $|A(h)|^2$ is the energy of the echo, with units of Pascals^2-seconds. Since $\sigma_{A(h)}^2$ is $E|A(h)|^2$, it has units of Pascals^2-seconds as well. The amplitude of the echo is a function of the target location hypothesis $h$. The location of $\mathbf{w}$ in $\mathbf{r}(h)$ depends on the location of the target through the time delay $D(h)$.
Since each element of the random vector $A\mathbf{w}$ is the same complex Gaussian scalar $A$ multiplied by a known constant, the random vector $A\mathbf{w}$ has a complex Gaussian distribution. The probability density of $A\mathbf{w}$ is Gaussian with zero mean and covariance matrix $\sigma_{A(h)}^2 \mathbf{w}\mathbf{w}^H$. To see this, consider that
$$E(A\mathbf{w}) = E(A)\,\mathbf{w} = \mathbf{0}_N$$
The covariance of $A\mathbf{w}$ is given by:
$$E\left[(A\mathbf{w})(A\mathbf{w})^H\right] = E(AA^H)\,\mathbf{w}\mathbf{w}^H = \sigma_{A(h)}^2\,\mathbf{w}\mathbf{w}^H$$
hence $\mathbf{r}(h)$ is zero mean complex Gaussian with covariance matrix $\sigma_{A(h)}^2\,\mathbf{r}\mathbf{r}^H$, where $\mathbf{r}$ in this expression denotes the zero-padded waveform with unit amplitude, that is, $\mathbf{r}(h)$ with $A = 1$.
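The mean and covariance of $A\mathbf{w}$ can be sanity-checked by simulation. The sketch below draws many realisations of a zero-mean complex Gaussian amplitude and compares the sample statistics with $\sigma_{A(h)}^2 \mathbf{w}\mathbf{w}^H$. The waveform, variance, seed and number of trials are hypothetical choices, and the agreement is only statistical.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 2.5                                   # assumed variance of A
N = 4
w = np.exp(1j * np.linspace(0.0, 1.0, N))      # arbitrary complex waveform
w /= np.linalg.norm(w)                         # enforce w^H w = 1

n_trials = 200_000
# Zero-mean complex Gaussian A with E|A|^2 = sigma2.
A = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(n_trials)
                             + 1j * rng.standard_normal(n_trials))
samples = A[:, None] * w[None, :]              # each row is one draw of A*w

sample_mean = samples.mean(axis=0)                   # estimate of E(Aw)
sample_cov = samples.T @ samples.conj() / n_trials   # estimate of E[(Aw)(Aw)^H]
theory_cov = sigma2 * np.outer(w, w.conj())          # sigma^2 w w^H

print(np.max(np.abs(sample_mean)))
print(np.max(np.abs(sample_cov - theory_cov)))
```

Both printed values should be small compared with $\sigma_{A(h)}^2$ and should shrink as the number of trials grows. Note that the covariance has rank one, since every realisation is a scalar multiple of the same vector.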
For the clutter-only hypothesis $\varphi$, $\mathbf{y} = \mathbf{q}$.
We have sampled, heterodyned and possibly re-sampled the noise process $q(t)$ to form $\mathbf{q}$.
During the period where $\mathbf{r}$ is non-zero, $\mathbf{q}$ is a sampled version of the ambient noise, represented as an $N$ by 1 complex Gaussian noise random vector with zero mean and covariance matrix $N_0 I_N$. This is true because $BW\,\Delta t \approx 1$ for complex Nyquist sampling of a band-limited signal.
Overall, the noise and reverberation $\mathbf{q}$ is assumed to be complex Gaussian with zero mean and $L$ by $L$ covariance matrix $C$.
Because we are assuming that the reverberation dies away before the echoes from the target search region arrive, $C$ has the following partition:
$$C = \begin{bmatrix} R & 0 \\ 0 & N_0 I \end{bmatrix}$$
Matrix $R$ (the reverberation covariance block, not the target range) has dimensions $D_{\min} \times D_{\min}$, where $D_{\min}$ is the minimum delay at which the echo interference is dominated by ambient noise.
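The block structure of $C$ can be assembled directly. In this sketch the function name `interference_covariance` and the exponentially correlated reverberation block are hypothetical; the text does not specify the form of $R$.

```python
import numpy as np

def interference_covariance(R_rev, N0, L):
    """Assemble C = [[R, 0], [0, N0*I]] as an L-by-L matrix."""
    D_min = R_rev.shape[0]
    if R_rev.shape != (D_min, D_min) or D_min > L:
        raise ValueError("R_rev must be square and no larger than L-by-L")
    C = np.zeros((L, L), dtype=complex)
    C[:D_min, :D_min] = R_rev                     # reverberation-dominated samples
    C[D_min:, D_min:] = N0 * np.eye(L - D_min)    # noise-limited samples
    return C

# Hypothetical example: a 3x3 exponentially correlated reverberation block.
D_min, L, N0 = 3, 8, 0.1
idx = np.arange(D_min)
R_rev = 2.0 * 0.5 ** np.abs(idx[:, None] - idx[None, :])
C = interference_covariance(R_rev, N0, L)
```

The zero off-diagonal blocks encode the assumption that the reverberation is uncorrelated with the ambient noise in the later, noise-limited part of the ping history.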
Under target hypothesis $h$, $\mathbf{y}$ is Gaussian with zero mean and covariance matrix $C + \sigma_A^2\,\mathbf{r}\mathbf{r}^H$.
The probability density of $\mathbf{y}$ under $h$ becomes:
$$p(\mathbf{y} \mid h) = \frac{1}{\pi^N \det(C_r + C)} \exp\left( -\mathbf{y}^H (C_r + C)^{-1} \mathbf{y} \right),$$
where $C_r = \sigma_A^2\,\mathbf{r}\mathbf{r}^H$.
Under the clutter hypothesis $\varphi$, $\mathbf{y}$ has zero mean and covariance matrix $C$. The probability density of $\mathbf{y}$ under $\varphi$ becomes:
$$p(\mathbf{y} \mid \varphi) = \frac{1}{\pi^N \det(C)} \exp\left( -\mathbf{y}^H C^{-1} \mathbf{y} \right)$$
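One way to use the two densities is to compare them through their log ratio, in which the common factor of $\pi$ cancels. The sketch below evaluates $\ln p(\mathbf{y} \mid h) - \ln p(\mathbf{y} \mid \varphi)$ numerically. It is only an illustration under stated assumptions: the function name `log_likelihood_ratio` is hypothetical, `r` is taken to be the unit-amplitude zero-padded waveform, and the example uses white noise ($C = N_0 I$) so that the result can be compared with a closed form.

```python
import numpy as np

def log_likelihood_ratio(y, C, r, sigma2_A):
    """ln p(y|h) - ln p(y|phi) for zero-mean complex Gaussian densities."""
    C_h = C + sigma2_A * np.outer(r, r.conj())     # C_r + C
    _, logdet_h = np.linalg.slogdet(C_h)
    _, logdet_0 = np.linalg.slogdet(C)
    quad_h = np.vdot(y, np.linalg.solve(C_h, y)).real   # y^H (C_r + C)^-1 y
    quad_0 = np.vdot(y, np.linalg.solve(C, y)).real     # y^H C^-1 y
    return (logdet_0 - logdet_h) + (quad_0 - quad_h)

# Hypothetical example with white noise only.
L, D, N = 16, 5, 4
N0, sigma2_A = 0.5, 3.0
C = N0 * np.eye(L, dtype=complex)
r = np.zeros(L, dtype=complex)
r[D:D + N] = 1.0 / np.sqrt(N)                      # unit-norm placeholder waveform
rng = np.random.default_rng(2)
y = rng.standard_normal(L) + 1j * rng.standard_normal(L)
llr = log_likelihood_ratio(y, C, r, sigma2_A)
```

In this white-noise special case the rank-one structure of $C_r$ reduces the ratio to $-\ln(1 + \sigma_A^2/N_0) + \dfrac{\sigma_A^2}{N_0 (N_0 + \sigma_A^2)}\,|\mathbf{r}^H \mathbf{y}|^2$, so the data enter only through the magnitude of the correlation between the ping history and the delayed waveform, which is consistent with the matched-filter processing mentioned earlier.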