<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="None">
  <name>Perfect Pitch:  Using Software to Alter Your Voice</name>
  <metadata>
  <md:version>1.6</md:version>
  <md:created>2004/12/15 02:22:17 US/Central</md:created>
  <md:revised>2006/09/27 11:38:29.381 GMT-5</md:revised>
  <md:authorlist>
      <md:author id="ahlfing">
      <md:firstname>Robert</md:firstname>
      
      <md:surname>Ahlfinger</md:surname>
      <md:email>ahlfing@rice.edu</md:email>
    </md:author>
      <md:author id="bcheese">
      <md:firstname>Brenton</md:firstname>
      
      <md:surname>Cheeseman</md:surname>
      <md:email>bcheese@rice.edu</md:email>
    </md:author>
      <md:author id="pdoody">
      <md:firstname>Patrick</md:firstname>
      
      <md:surname>Doody</md:surname>
      <md:email>pdoody@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="ahlfing">
      <md:firstname>Robert</md:firstname>
      
      <md:surname>Ahlfinger</md:surname>
      <md:email>ahlfing@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="bcheese">
      <md:firstname>Brenton</md:firstname>
      
      <md:surname>Cheeseman</md:surname>
      <md:email>bcheese@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="pdoody">
      <md:firstname>Patrick</md:firstname>
      
      <md:surname>Doody</md:surname>
      <md:email>pdoody@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>Elec 301</md:keyword>
    <md:keyword>pitch</md:keyword>
    <md:keyword>wyld stallyns</md:keyword>
  </md:keywordlist>

  <md:abstract>This is an overview of three different speech/sound synthesis tools.</md:abstract>
</metadata>

  <content>

<section id="sec1">

<name>Introduction to Speech Synthesis</name>
<section id="sec12">
   <name>Overview</name> 
<para id="overviewp1">

The ability to manipulate a person’s voice has always been useful.  Before the advent of digital signal processing, however, this task was extremely difficult.  In those days, the most sophisticated manipulations could be found in rock and roll synthesizers that used analog devices to distort the sound, giving the voice a pseudo-random feel.  Other naïve approaches could be taken to alter somebody’s voice, such as changing the playback speed of the clip or modulating the signal.  However, these techniques only made the person speak with a lower pitch that sounded slurred or a higher pitch that resembled Alvin the Chipmunk.  
</para>
</section>

<section id="sec13">
<name>Goals</name>
<para id="overviewp2">
The goal of this project was to develop a more sophisticated set of voice manipulation tools in Matlab using digital signal processing.  The first and most complicated tool raises or lowers the pitch of a recorded voice without changing the length of the clip or any other characteristic of the voice.  The second changes the length of the clip without altering the pitch of the voice.  The output of both tools should sound as if the speaker had recorded the clip anew after being asked to speak more slowly, more quickly, or at a higher or lower pitch.  Finally, the third tool randomizes the voice in order to mask the identity of the speaker while preserving his or her ability to communicate.
</para>
</section>

<section id="sec14">
<name>Applications and Examples</name>
<para id="overviewp3">

There are several potential applications for this software.  An out-of-tune singer can go back after a recording and adjust his or her voice to match the correct pitch precisely, whether the problem persists for the entire song or for only a fraction of a second.  If a newscaster’s segment runs over or under its time allotment by a few seconds, the speech can be shortened or lengthened by exactly the necessary amount.

	<table frame="all" id="examples">
<name>Pitch-Shifted and Randomized Speech Examples</name>
<tgroup cols="2" align="left" colsep="1" rowsep="1"><colspec colnum="1" colname="c1"/>
	    <colspec colnum="2" colname="c2"/>
	    <tbody valign="top">
	      <row>
	        <entry>Unaltered voice</entry>
	        <entry><link src="fudge.wav">Original</link></entry>
	      </row>
	      <row>
	        <entry>Pitch-shifted voice</entry>
	        <entry>
<link src="fudge100.wav">Up</link> 
<link src="fudge-50.wav">Down</link></entry>
	      </row>
              <row>
	        <entry>Randomized voice</entry>
	        <entry><link src="randfudge.wav">Random</link></entry>
	      </row>
	    </tbody>
	  


</tgroup>
</table>


<table frame="all" id="examples2">
<name>Length-Changed Speech Examples</name>
<tgroup cols="2" align="left" colsep="1" rowsep="1"><colspec colnum="1" colname="c1"/>
	    <colspec colnum="2" colname="c2"/>
	    <tbody valign="top">
	      <row>
	        <entry>Unaltered voice</entry>
	        <entry><link src="male.wav">Original</link></entry>
	      </row>
	      <row>
	        <entry>Length-changed voice</entry>
	        <entry>
<link src="male2.wav">Slower</link>
<link src="male3.wav">Faster</link>
</entry>
	      </row>
              	    </tbody>
	  
</tgroup>
</table>


</para>
</section>
</section>
  </content>
  
</document>
