<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5//EN" "http://cnx.rice.edu/technology/cnxml/schema/dtd/0.5/cnxml_plain.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:bib="http://bibtexml.sf.net/" id="id8935725">
<name>The Final Step: Identifying the Speaker</name>
<metadata>
  <md:version>1.2</md:version>
  <md:created>2006/12/18 20:51:16 US/Central</md:created>
  <md:revised>2007/01/08 11:19:45.081 US/Central</md:revised>
  <md:authorlist>
      <md:author id="cpasich">
      <md:firstname>Chris</md:firstname>
      
      <md:surname>Pasich</md:surname>
      <md:email>cpasich@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="cpasich">
      <md:firstname>Chris</md:firstname>
      
      <md:surname>Pasich</md:surname>
      <md:email>cpasich@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  

  <md:abstract>Explains how formants are compared, and how a speaker is identified.</md:abstract>
</metadata>
<content>
<section id="id5289470">
<name>Formant Comparisons and Identifying the Speaker</name>
<para id="id9326389">Once the input has been broken down, all that
is left for the system to do is the easy part: a simple comparison
between the input formants and the formants stored in the database.
The first step is to determine which vowel is actually being spoken
by examining the locations of the first two formant peaks. If both
peaks fall within the ranges stored in the database for a specific
vowel’s first two formants, the sound is taken to be that vowel.
These ranges are well defined for each individual vowel and are
adjusted to the members of the group. For example, the range for a
vowel’s first formant extends from just below the lowest-frequency
first formant in the group to just above the highest-frequency
first formant. If the peaks do not fall within a vowel’s ranges,
that vowel is rejected and the system tries the next vowel. It
repeats this process until it either finds a matching vowel or has
gone through every vowel sound in the database. If the formants do
not fall within any vowel’s ranges, the vowel sound is
ignored.</para>
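<para id="vowel-range-sketch-intro">The range check described above can
be sketched in a few lines of Python. This is only a minimal
illustration, not the system's actual code: the function names
<code>range_from_group</code> and <code>identify_vowel</code>, the
margin parameter, the vowel labels, and every frequency value in the
table are assumptions chosen for the example.</para>
<code type="block" id="vowel-range-sketch-code"><![CDATA[
```python
from typing import Dict, Optional, Sequence, Tuple

Range = Tuple[float, float]  # (low, high) in Hz


def range_from_group(formants: Sequence[float], margin_hz: float = 50.0) -> Range:
    """Build a range reaching just below the lowest and just above the
    highest formant measured in the group. The margin is an assumed value."""
    return (min(formants) - margin_hz, max(formants) + margin_hz)


# Hypothetical database: vowel -> (first-formant range, second-formant range).
# The numbers are illustrative only, not measured data.
VOWEL_RANGES: Dict[str, Tuple[Range, Range]] = {
    "ee": ((200.0, 400.0), (2000.0, 2600.0)),
    "ah": ((600.0, 900.0), (1000.0, 1400.0)),
    "oo": ((250.0, 450.0), (700.0, 1100.0)),
}


def identify_vowel(
    f1: float,
    f2: float,
    ranges: Dict[str, Tuple[Range, Range]] = VOWEL_RANGES,
) -> Optional[str]:
    """Try each vowel in turn; return the first one whose ranges contain
    both formant peaks, or None so the sound can be ignored."""
    for vowel, ((f1_lo, f1_hi), (f2_lo, f2_hi)) in ranges.items():
        if f1_lo <= f1 <= f1_hi and f2_lo <= f2 <= f2_hi:
            return vowel
    return None
```
]]></code>
<para id="vowel-range-sketch-usage">With this table, a call such as
<code>identify_vowel(300.0, 2300.0)</code> selects the first vowel whose
two ranges both contain the peaks, and a pair of peaks that fits no
vowel yields <code>None</code>, which corresponds to ignoring the
sound.</para>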
<para id="id7683214">The second step is the actual comparison. The
frequency response of the input vowel sound is multiplied in a dot
product with each member’s previously stored frequency response for
the vowel that was determined in the first step. The dot products
form a score matrix holding one score for each member. Each score
ranges from 0 to 1, with 1 being a perfect match and 0 being an
entirely incorrect match.</para>
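<para id="dot-product-sketch-intro">One way to obtain a dot-product
score that stays between 0 and 1 is to normalize both frequency
responses to unit length before multiplying them. The sketch below
assumes that normalization and assumes the responses are non-negative
magnitude values sampled at the same frequencies; the module does not
state either detail, and the names <code>match_score</code> and
<code>score_members</code> are hypothetical.</para>
<code type="block" id="dot-product-sketch-code"><![CDATA[
```python
import math
from typing import Dict, Sequence


def match_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized dot product of two magnitude responses.

    Assumption: both inputs are non-negative and the same length, so the
    result lies between 0 (no overlap) and 1 (same shape)."""
    if len(a) != len(b):
        raise ValueError("responses must have the same number of samples")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0  # an all-zero response cannot match anything
    return min(1.0, dot / norm)  # clamp away floating-point overshoot


def score_members(
    input_response: Sequence[float],
    stored: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Score the input against every member's stored response for the
    vowel chosen in the first step."""
    return {name: match_score(input_response, resp) for name, resp in stored.items()}
```
]]></code>
<para id="dot-product-sketch-usage">Because of the normalization, two
responses with the same shape but different loudness still score close
to 1, while responses with no energy at common frequencies score
0.</para>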
<para id="id11300057">This process is repeated for each vowel sound
in the word. The score matrices are then added together, and the
system identifies the speaker as the individual with the highest
total score. If, however, that individual’s score does not pass a
threshold value, the system determines that there is no
match.</para>
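<para id="speaker-decision-sketch-intro">The final decision can be
sketched as follows. The sketch assumes the threshold is applied to the
summed score rather than to an average, which the module does not
specify; the function name <code>identify_speaker</code> and the idea
of passing the per-vowel scores as a list of dictionaries are also
assumptions made for illustration.</para>
<code type="block" id="speaker-decision-sketch-code"><![CDATA[
```python
from typing import Dict, List, Optional


def identify_speaker(
    per_vowel_scores: List[Dict[str, float]],
    threshold: float,
) -> Optional[str]:
    """Add the per-vowel scores for each member and return the member with
    the highest total, or None when that total is below the threshold."""
    totals: Dict[str, float] = {}
    for scores in per_vowel_scores:
        for member, score in scores.items():
            totals[member] = totals.get(member, 0.0) + score
    if not totals:
        return None  # no vowel in the word could be scored
    best = max(totals, key=lambda member: totals[member])
    if totals[best] >= threshold:
        return best
    return None
```
]]></code>
<para id="speaker-decision-sketch-usage">Since the totals grow with the
number of vowels scored, a threshold on the sum would in practice need
to scale with the word length; dividing each total by the number of
scored vowels before comparing is one alternative.</para>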
</section>
</content>
</document>
