<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5//EN" "http://cnx.rice.edu/technology/cnxml/schema/dtd/0.5/cnxml_plain.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:bib="http://bibtexml.sf.net/" id="id12827086">
<name>The Autoregressive Model and Formant Analysis</name>
<metadata>
  <md:version>1.2</md:version>
  <md:created>2006/12/18 20:07:19 US/Central</md:created>
  <md:revised>2006/12/20 11:54:26.044 US/Central</md:revised>
  <md:authorlist>
      <md:author id="cpasich">
      <md:firstname>Chris</md:firstname>
      
      <md:surname>Pasich</md:surname>
      <md:email>cpasich@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="cpasich">
      <md:firstname>Chris</md:firstname>
      
      <md:surname>Pasich</md:surname>
      <md:email>cpasich@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  

  <md:abstract>An explanation of how individual syllables are analyzed and broken down into vowel sounds and formants.</md:abstract>
</metadata>
<content>
<section id="id12815787">
<name>The Autoregressive Model</name>
<para id="id11417462">Interpreting the signal begins with
finding an equation that describes it. A common way to do
that is to fit an autoregressive model. An autoregressive model
estimates the current value of a signal from a weighted sum of
the signal's own previous values. The equation for the
model is as follows:</para>
<figure id="id10157962"><name>The Autoregressive Model</name>
<media type="image/png" src="Graphic1.png"/>
<caption>Wikipedia 2006</caption></figure>
<para id="id12762509">The model consists of three parts: a constant
term, an error or noise term, and the autoregressive summation. The
summation expresses the fact that the current value of the
signal depends only on its previous values. The variable p
is the order of the model, that is, the number of previous values
used. In general, the higher the order, the more closely the model
can follow the signal, so as the order grows the model approaches
an exact representation of the signal.</para>
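<para id="ar-fit-sketch">As a rough illustration of the model described above, the following Python sketch estimates the coefficients of the autoregressive summation from a sampled signal, using the autocorrelation method with the Levinson-Durbin recursion. This is one common fitting technique, assumed here for illustration; the original analysis may have used a different routine. The names <code>autocorrelation</code>, <code>fit_ar</code>, and <code>synthetic_ar2</code> are hypothetical, the test signal is synthetic, and the constant term is ignored (the signal is assumed to have zero mean). Only the Python standard library is used so that the sketch stays self-contained.
<code type="block">
```python
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def fit_ar(x, order):
    """Estimate AR coefficients with the Levinson-Durbin recursion.

    Returns (phi, err): x[t] is approximated by
    phi[0]*x[t-1] + ... + phi[order-1]*x[t-order], and err is the
    remaining prediction-error power.
    """
    r = autocorrelation(x, order)
    phi, err = [], r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for the step from order m-1 to order m.
        k = (r[m] - sum(phi[j] * r[m - 1 - j] for j in range(m - 1))) / err
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - k * phi[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return phi, err


def synthetic_ar2(n, seed=0):
    """Hypothetical test signal: x[t] = 1.5*x[t-1] - 0.9*x[t-2] + noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(1.5 * x[-1] - 0.9 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


signal = synthetic_ar2(4000)
phi, err = fit_ar(signal, 2)
print("estimated coefficients:", phi)   # expected to land near [1.5, -0.9]
```
</code>
With this fitting method, the prediction error returned by <code>fit_ar</code> can only shrink or stay the same as the order increases, which is consistent with the remark that a higher order follows the signal more closely.</para>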
<para id="id12512223">This equation is a linear difference
equation, the discrete-time counterpart of a differential equation.
As such, it can be used to find the transfer function for the
signal.</para>
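<para id="ar-transfer-function-sketch">The step from the model to a transfer function can be written out as follows. This is a sketch that assumes the figure above shows the standard autoregressive form of order p; the symbols (c for the constant, the coefficients written as phi, and the noise term written as epsilon) are assumptions, since the original equation is only available as an image. The constant is set to zero, which amounts to assuming a zero-mean signal.
<code type="block">
```latex
% Autoregressive model of order p (assumed standard form)
\[ X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t \]
% With c = 0, taking the z-transform of both sides gives
\[ X(z) = \sum_{i=1}^{p} \varphi_i z^{-i} X(z) + E(z) \]
% so the transfer function from the noise input to the signal is
\[ H(z) = \frac{X(z)}{E(z)} = \frac{1}{1 - \sum_{i=1}^{p} \varphi_i z^{-i}} \]
```
</code>
</para>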
</section>
<section id="id12898556">
<name>Finding the Formants</name>
<para id="id12196670">Once you have the transfer function, you
need only take your enveloped syllables and pass them through
it. Taking the frequency response of the
transfer function then yields a clean, readable plot
(see the sample frequency response below).</para>
<figure id="id12096873"><media type="image/png" src="Graphic2.png">
  <param name="height" value="400"/>
  <param name="width" value="600"/>
</media>

<caption>A sample frequency response.  The formants are the green points at the peaks.</caption></figure>
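<para id="ar-frequency-response-sketch">A curve of the kind shown above can be sketched in code. The Python example below evaluates the magnitude, in decibels, of an all-pole transfer function of the form 1 / (1 - sum of phi_i z^-i) at evenly spaced frequencies on the unit circle. The function name <code>ar_frequency_response</code> is hypothetical, and the coefficients, the 8000 Hz sample rate, and the 700 Hz resonance are made-up illustration values, not taken from the recordings used in this module.
<code type="block">
```python
import cmath
import math


def ar_frequency_response(phi, sample_rate, n_points=512):
    """Evaluate the all-pole filter 1 / (1 - sum_i phi[i-1] * z^-i).

    Returns (freqs_hz, mags_db) at n_points frequencies from 0 Hz up to,
    but not including, half the sample rate.
    """
    freqs_hz, mags_db = [], []
    for n in range(n_points):
        omega = math.pi * n / n_points        # radians per sample
        z_inv = cmath.exp(-1j * omega)        # z^-1 on the unit circle
        denom = 1.0 - sum(p * z_inv ** (i + 1) for i, p in enumerate(phi))
        freqs_hz.append(omega * sample_rate / (2.0 * math.pi))
        mags_db.append(-20.0 * math.log10(abs(denom)))
    return freqs_hz, mags_db


# Illustrative two-pole resonance near 700 Hz (assumed values).
sample_rate = 8000.0
radius, centre_hz = 0.95, 700.0
theta = 2.0 * math.pi * centre_hz / sample_rate
phi = [2.0 * radius * math.cos(theta), -radius ** 2]

freqs_hz, mags_db = ar_frequency_response(phi, sample_rate)
peak_hz = freqs_hz[mags_db.index(max(mags_db))]
print("largest peak near", round(peak_hz), "Hz")
```
</code>
A pole pair placed close to the unit circle produces a sharp peak near the pole angle, which is why the peaks of such a response stand out so clearly.</para>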
<para id="id12896073">The frequency response gives us something we can actually
interpret. Specifically, you can clearly see the formants of the
vowel, that is, the peaks of the frequency
response. These peaks are what differentiate vowel sounds from one
another. For instance, the following vowel sounds, all from the
same person, differ clearly in
appearance (see Sample Formants).</para>
<para id="id12821819">
<figure id="id12821822"><name>Sample Formants</name>
  <subfigure id="subfig1">
    <media type="image/png" src="Graphic3.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>The "a" vowel sound.</caption>
  </subfigure>
  <subfigure id="subfig2">
    <media type="image/png" src="Graphic4.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>The "ah" vowel sound.</caption>
  </subfigure>

</figure>

</para>
<para id="id12604003">
<figure id="id12604005"><name>Sample Formants</name>
<subfigure id="subfig4">
    <media type="image/png" src="Graphic5.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>The "ee" vowel sound.</caption>
  </subfigure>
  <subfigure id="subfig5">
    <media type="image/png" src="Graphic6.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>The "ah" vowel sound.</caption>
  </subfigure>

</figure>

</para>
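<para id="formant-peak-sketch">Reading the formants off plots like these amounts to locating the local maxima of the frequency response. The following Python sketch shows one simple way to do that; the function name <code>find_formants</code> and the sample values are hypothetical and are not taken from the recordings. It returns the first peaks in order of increasing frequency, which matches the usual numbering of formants.
<code type="block">
```python
def find_formants(freqs_hz, mags_db, max_formants=2):
    """Return up to max_formants (frequency, magnitude) pairs taken at the
    local maxima of mags_db, in order of increasing frequency."""
    peaks = []
    for i in range(1, len(mags_db) - 1):
        # A sample is a peak if it is larger than its left neighbour and
        # at least as large as its right neighbour.
        if mags_db[i] > mags_db[i - 1] and mags_db[i] >= mags_db[i + 1]:
            peaks.append((freqs_hz[i], mags_db[i]))
    return peaks[:max_formants]


# Made-up frequency response samples, for illustration only.
freqs_hz = [0, 250, 500, 750, 1000, 1250, 1500, 1750, 2000]
mags_db = [0, 3, 10, 4, 2, 8, 12, 5, 1]

print(find_formants(freqs_hz, mags_db))   # [(500, 10), (1500, 12)]
```
</code>
Each returned pair carries both quantities used in the comparisons that follow: the frequency at which a formant occurs and the height it attains.</para>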
<para id="id9642841">The first two formants differ clearly, both in
the frequencies at which they occur and in their magnitudes, from
one vowel sound to the next. These peak values also differ from
person to person, even for the same vowel. For instance, compare
the sound ‘a’ (as in cat) for each member of the group (see Speaker Vowel Comparisons).</para>
<para id="id12766970"><figure id="id12766973"><name>Speaker Vowel Comparisons</name>
<subfigure id="subfig6">
    <media type="image/png" src="Graphic7.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>Damen Hattori's "a" sound.</caption>
  </subfigure>
  <subfigure id="subfig7">
    <media type="image/png" src="Graphic8.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>Chris Pasich's "a" sound.</caption>
  </subfigure></figure>
</para>
<para id="id10449303"><figure id="id10449306"><name>Speaker Vowel Comparisons</name>
<subfigure id="subfig8">
    <media type="image/png" src="Graphic9.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>Matt McDonell's "a" sound.</caption>
  </subfigure>
  <subfigure id="subfig9">
    <media type="image/png" src="Graphic10.png">
        <param name="height" value="250"/>
        <param name="width" value="400"/>
    </media>
    <caption>Josh Long's "a" sound.</caption>
  </subfigure>
</figure>
</para>
<para id="id12729140">Even though the overall structure of the frequency
responses is similar, each speaker's vowel sound has slightly
different formants, both in the frequency at which they occur and
in the height that they attain. We now have a way to
analyze our signal. All that remains is the final step: comparing
these formants to the formants of the whole group.</para>
</section>
</content>
</document>
