<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="Module.2003-12-08.0101">
  <name>Music Classification by Genre: Power Spectral Density</name>
  <metadata>
  <md:version>1.2</md:version>
  <md:created>2003/12/08 15:01:02 US/Central</md:created>
  <md:revised>2003/12/08 15:12:51.348 US/Central</md:revised>
  <md:authorlist>
    <md:author id="mchu">
      <md:firstname>Melodie</md:firstname>
      <md:othername>M</md:othername>
      <md:surname>Chu</md:surname>
      <md:email>mchu@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="mitali">
      <md:firstname>Mitali</md:firstname>
      
      <md:surname>Banerjee</md:surname>
      <md:email>mitali@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>power</md:keyword>
    <md:keyword>spectral</md:keyword>
    <md:keyword>density</md:keyword>
  </md:keywordlist>

  <md:abstract>The power spectral density describes how the power in a signal is distributed over frequency.</md:abstract>
</metadata>

  <content>
    <para id="pwrspec">
Our program breaks the time-domain signal into windows and computes the FFT of each window.  It then averages the squared magnitude of the FFT coefficients across the windows and expresses the result in decibels.  The result is a vector of approximately 100 elements that represents the signal's power as a function of frequency, indicating which frequencies are present and how strongly.  Rather than characterizing the whole signal with a single number, our power spectral density program returns a vector that captures more subtle changes in the spectrum.  The decibel scale, being logarithmic, also spreads out the differences between genres, making them easier to tell apart.
    </para> 

<figure id="psd">
  <media type="image/jpeg" src="psd.gif"/>
</figure>
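
<para id="pwrspecsketch">
The windowing and averaging steps described above can be sketched as follows.  This is a minimal illustration under stated assumptions, not our actual program: the function names, the window length of 256 samples, the non-overlapping rectangular windows, and the small floor added before taking the logarithm are all assumed for the example.  With 256-sample windows the result has 129 frequency bins rather than the roughly 100 mentioned above.
</para>

<code type="block" id="pwrspeccode"><![CDATA[
```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectral_density_db(signal, window_len=256):
    """Average |FFT|^2 over non-overlapping windows, returned in decibels.

    window_len is an assumed value; the module does not state the one used.
    """
    n_windows = len(signal) // window_len
    if n_windows == 0:
        raise ValueError("signal is shorter than one window")
    n_bins = window_len // 2 + 1  # keep the non-negative frequencies only
    avg_power = [0.0] * n_bins
    for w in range(n_windows):
        frame = signal[w * window_len:(w + 1) * window_len]
        spectrum = fft(frame)
        for k in range(n_bins):
            # Accumulate the mean of the squared magnitudes across windows.
            avg_power[k] += abs(spectrum[k]) ** 2 / n_windows
    eps = 1e-12  # assumed floor so that log10 never receives zero
    return [10.0 * math.log10(p + eps) for p in avg_power]


# Usage: a pure tone that falls exactly on bin 32 of a 256-sample window.
tone = [math.sin(2 * math.pi * 32 * n / 256) for n in range(1024)]
psd_db = power_spectral_density_db(tone)
peak_bin = max(range(len(psd_db)), key=lambda k: psd_db[k])
```
]]></code>

<para id="pwrspecsketchnote">
For the test tone in the usage lines, the largest value of the returned vector falls in the bin that matches the tone's frequency.  A real recording would instead produce a vector whose overall shape reflects how its power is spread across frequency.
</para>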
  
<section id="results">
  <name>Results</name>
    <para id="pwrspecresults">
The power spectral density showed clear patterns that differ between genres.  Rap had the most distinct pattern, with a sudden downward slope (red).  Classical also had a distinctive pattern, with the smallest power at all frequencies.  Jazz, punk, and country lie close to one another but begin to fan out at higher frequencies.  Looking closely at the envelopes, techno spans the largest area, encompassing almost all of jazz, punk, and country.  This is one reason why techno could not be distinguished very well from those genres.
    </para> 

<figure id="powerspec">
  <media type="image/jpeg" src="powerspec.gif"/>
</figure>
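
<para id="envelopesketch">
One way to make the envelope comparison concrete is sketched below.  It assumes that a genre's envelope is the per-frequency minimum and maximum over that genre's power spectral density vectors, which may differ from how the plotted envelopes were produced.  The function names and the numbers are hypothetical toy values, not measurements from our data.
</para>

<code type="block" id="envelopecode"><![CDATA[
```python
def envelope(psd_vectors):
    """Per-bin (lower, upper) bounds over one genre's PSD vectors (in dB)."""
    lower = [min(column) for column in zip(*psd_vectors)]
    upper = [max(column) for column in zip(*psd_vectors)]
    return lower, upper


def fraction_inside(inner, outer):
    """Fraction of frequency bins where envelope `inner` lies within `outer`."""
    in_lo, in_hi = inner
    out_lo, out_hi = outer
    inside = [
        in_lo[k] >= out_lo[k] and out_hi[k] >= in_hi[k]
        for k in range(len(in_lo))
    ]
    return sum(inside) / len(inside)


# Toy PSD vectors (3 frequency bins, values in dB) for two made-up genres.
wide_genre = [[-10.0, -20.0, -30.0], [10.0, 0.0, -5.0]]
narrow_genre = [[-5.0, -10.0, -20.0], [5.0, -2.0, -35.0]]

overlap = fraction_inside(envelope(narrow_genre), envelope(wide_genre))
```
]]></code>

<para id="envelopesketchnote">
A fraction close to one means that the first genre's envelope sits almost entirely inside the second's, which is the situation described above for techno and the jazz, punk, and country envelopes.
</para>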

</section>


  </content>
  
</document>
