<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="Module.2003-12-08.0101">
  <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Music Classification by Genre: Power Spectral Density</name>
  <metadata xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
  <md:version xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">1.2</md:version>
  <md:created xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/12/08 15:01:02 US/Central</md:created>
  <md:revised xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/12/08 15:12:51.348 US/Central</md:revised>
  <md:authorlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:author xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="mchu">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Melodie</md:firstname>
      <md:othername xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">M</md:othername>
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Chu</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">mchu@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:maintainer xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="mitali">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Mitali</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Banerjee</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">mitali@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">power</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">spectral</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">density</md:keyword>
  </md:keywordlist>

  <md:abstract xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">The power spectral density is a measure of how the power in a signal changes over frequency.  </md:abstract>
</metadata>

  <content xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="pwrspec">
Our program essentially breaks the time-domain signal into windows and computes the norm squared of the FFT of each window.  It then averages the magnitude squared of the FFT coefficients of each window, then represents it in decibels.  We then have a vector approximately length 100 that represents the power in the frequency domain.  This is a measure of exactly what frequencies are present and at what magnitude.  Rather than using a single number to characterize the whole signal, our power spectral density program returns a vector representing more subtle changes in the spectrum.  The decibel scale helps distinguish and differentiate between genres even further, fanning out the differences between genres.  
    </para> 

<figure xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="psd">
  <media xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" type="image/jpeg" src="psd.gif"/>
</figure>
  
<section xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="results">
  <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Results</name>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="pwrspecresults">
The power spectral density was great at showing patterns between genres.  Rap has the most distinct pattern, with a sudden downward slope (red).  Classical also had a distinctive pattern, with the smallest power at all frequencies.  Jazz, punk, and country are all near each other, but at higher frequencies, begin to fan out.  Looking closely at the envelopes, techno spans the largest area, encapsulating almost all of jazz, punk, and country.  This is one reason why techno could not be distinguished very well from those genres.  
    </para> 

<figure xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="powerspec">
  <media xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" type="image/jpeg" src="powerspec.gif"/>
</figure>

</section>


  </content>
  
</document>
