Skip to content Skip to navigation

OpenStax_CNX

You are here: Home » Content » User Manual (Analysis of Speech Signal Spectrums Using the L2 Norm)

Navigation

Recently Viewed

This feature requires Javascript to be enabled.
 

User Manual (Analysis of Speech Signal Spectrums Using the L2 Norm)

Module by: Nicholas. E-mail the author

Summary: User Manual for Speech Analysis Project

III – User Manual

The software for this project was written in Matlab. The main codes is voiceRecognition.m. The prototype for the voiceRecognition functions is as follows:

trueMatch = voiceRecognition( username, pin, thresh,

candidateName )

It is assumed that each user has a username. In the files packaged with this report, the two users present in the database are ‘Nicholas’ and ‘Andrew’1. The username parameter for the voiceRecognition is a 1D character array with characters equal to the username.

The pin parameter represents the user’s PIN. It is a four element 1D array containing integer values 0 – 9.

thresh is an optional parameter. There is a threshold associated with the final matching algorithm (discussed later). By default, this threshold is set to 0.48; but it can be changed by passing a double value between 0 and 1 into this parameter.

candidateName is an optional parameter. The candidate is the person whose recording will be compared to the username’s database. In practice, the candidate name and the username would always be the same. However, for testing purposes, it was convenient to be able to specify a different candidate – e.g. we would specify the username as ‘Nicholas’ and the candidate as ‘Andrew’ to verify that the software did indeed detect an impostor.

(3.1) shows an example of a voice recognition function call.

Table 1
voiceRecognition( 'Nicholas' , [0,1,2,3] ); (3.1)

Directory Structure

The voiceRecognition software assumes a specific directory structure. There is a directory called ‘recordings’ in the same location as the Matlab working directory. Within the recordings directory are two directories: ‘current’ and ‘person’. The person directory contains the database of users who have previously entered their data into the system – i.e. it is the database of valid users. The current directory contains recordings of candidates.

Within the person directory, each user is granted his/her own directory. For example, there is a directory present named ‘Nicholas’. Within ‘Nicholas’ there should be seven wav file recordings of each pin number. For example, if the pin were 1-8-1-9, then there should be seven recordings of “one”, “eight”, and “nine”. The recordings must be labeled num1.wav through num7.wav.

Within the current directory, there again should be a directory for each candidate that would be presented to the software. In our case, we present no other candidates than those included in the user database. Within this directory, there must be recordings for all PIN numbers. In the above example, there should be present the files one.wav, eight.wav, and nine.wav.

Assumptions

There can be a significant amount of variability in the frequencies that a person uses to generate a word. The assumption made in this software is that the person generates the word attempting to use the same pitch and speed as previously conducted when submitting a phrase for comparison.

We now go on to describe the algorithms implemented in the software.

Footnotes

  1. Andrew is Nicholas’ roommate and was kind enough to allow himself to be recorded for the purposes of this project.

Content actions

Download module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks