Conclusion

Module by: Oleg Pesok

Summary: This module contains a conclusion and summary of work done on a real-time laugh track removal system. It is part of a larger series discussing the implementation of this system.

Summary

We have looked at various ways to detect a laugh track and at how to implement a laugh track detection algorithm as a DirectShow filter. The most effective way of detecting a laugh track appears to be examining the waveform in the time domain rather than in the frequency domain. The envelope detection algorithm can be swapped out and made progressively more sophisticated as we try to handle some of the more difficult cases correctly.

Detection Techniques

There are multiple ways to generate the envelope we use for detecting a laugh track. The simplest method is to square the signal and then low-pass filter it, which gives quite good results. A better algorithm uses a Hilbert transform to generate the envelope, but this was too computationally expensive to implement in our DirectShow filter.
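The square-and-filter method can be sketched in a few lines of Python. This is a minimal illustration, not the code from our DirectShow filter: the function name `square_lowpass_envelope`, the one-pole smoother used as the low-pass filter, and the 10 Hz default cutoff are all assumptions chosen for the example.

```python
import math

def square_lowpass_envelope(samples, sample_rate, cutoff_hz=10.0):
    """Estimate a power envelope by squaring, then low-pass filtering.

    The one-pole smoother and the 10 Hz cutoff are illustrative
    assumptions, not values taken from the original filter.
    """
    # Smoothing coefficient of a one-pole low-pass filter.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    envelope = []
    state = 0.0
    for x in samples:
        # Squaring rectifies the signal and gives instantaneous power;
        # the recursive update then smooths out the fast oscillations.
        state += alpha * (x * x - state)
        envelope.append(state)
    return envelope
```

For a steady sine wave of amplitude 1, the output settles near 0.5, the mean of the squared sine. A lower cutoff gives a smoother envelope at the cost of a slower response to the onset of a laugh.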

We have also briefly discussed some ways to pick out laughs once the envelope has been generated. At the simpler end of the spectrum, we can apply width and height thresholds to the laugh portion of the signal's envelope. At the more sophisticated end, we could fit laugh tracks to polynomial curves and then use a Support Vector Machine to detect laugh tracks based on a database of positive and negative matches.
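The threshold approach might look like the following sketch, which keeps only the stretches where the envelope stays above a height threshold for at least a minimum width. The function name `find_wide_peaks` and both threshold parameters are hypothetical; suitable values would have to be tuned on real recordings.

```python
def find_wide_peaks(envelope, height_threshold, min_width):
    """Return (start, end) index pairs where the envelope stays above
    height_threshold for at least min_width consecutive samples.

    end is exclusive. Both thresholds are hypothetical tuning parameters.
    """
    regions = []
    start = None
    for i, value in enumerate(envelope):
        if value > height_threshold:
            if start is None:
                start = i  # a new above-threshold run begins here
        elif start is not None:
            # The run just ended; keep it only if it is wide enough.
            if i - start >= min_width:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the envelope.
    if start is not None and len(envelope) - start >= min_width:
        regions.append((start, len(envelope)))
    return regions
```

The width test is what separates a sustained burst of laughter from a short, loud event such as a door slam, which can clear the height threshold but does not last long enough.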

In the end, we found that detecting and removing laugh tracks from an audio signal is much more complicated than it appears on the surface. A human can easily spot a laugh in the signal, but building a system that does this automatically is considerably harder. Removal is another interesting aspect of the system: it is complicated because people often talk over the laugh tracks in our signals. Removing only the laugh track while leaving the human voice intact needs further exploration.
