Skip to content Skip to navigation

Connexions

You are here: Home » Content » Voice Randomization

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the authors

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual Connexions member, a community, or a respected organization.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • Rice University ELEC 301 Projects

    This module is included inLens: Rice University ELEC 301 Project Lens
    By: Rice University ELEC 301As a part of collection:"ELEC 301 Projects Fall 2004"

    Click the "Rice University ELEC 301 Projects" link to see all content affiliated with them.

Recently Viewed

This feature requires Javascript to be enabled.

Voice Randomization

Module by: Robert Ahlfinger, Brenton Cheeseman, Patrick Doody

Summary: This is an overview of the techniques used to develope a voice randomization program.

Initial Approach

When two people speak with the same pitch, there is still no mistaking one for the other; the uniqueness of a voice goes beyond its tone. The placement of harmonics, then, clearly does not make a voice distinguishable since two people with identical pitch have harmonics at exactly the same locations. Rather, the ability to identify a voice comes from the relative height of each harmonic to the next, just like the heights of each harmonic on a clarinet and a guitar make these instruments sound different even as they play the same note.

Figure 1: DFT of one 512 sample chunck of a speech signal after it has had it pitches randomly altered.
DFT of Randomized Signal
DFT of Randomized Signal (random.bmp)

With this in mind, our first algorithm tackled the problem by first using the harmonic detection described earlier to pinpoint the location of each harmonic. Using this information, the height of each harmonic was randomly lowered or raised by a slight amount. Usually, though, the resulting voice sounded just like the original with some noise added in on top of it. After fooling around with this concept for some time to no avail, we reached the conclusion that the idea is solid, but that to make up a new voice requires much more finesse than simply making the magnitude of each harmonic higher or lower. Without perfectly adapting the phases and making sure that the envelope of the magnitudes is a shape that can be comprehended by a human ear as real speech, the only result is linearly adding a new signal to our old one. The DFT of the new signal is equal to the additions we made to the harmonics of the voice.

Simplification of Process Using the Speech Synthesizer

The second attempt at a voice randomizer directly utilizes our pitch shifting algorithm and works much better. First, the signal is matricized just like before. But instead of processing each chunk in the same way, our algorithm asks the pitch shifter to shift each chunk separately, specifying a different and random shift every time. The result is a voice with a pitch that changes wildly and extremely quickly, making it impossible to tell who it is with your raw hearing. The main drawback with this technique is that there is no true security or identity masking. The NSA could easily break the signal into the same 512 sample long chunks and analyze them individually along with a normal sample of the voice to determine a potential match. However, for certain purposes this randomizer performs superbly.

Randomized Speech Examples
Unaltered voice Original
Randomized Voice Random

Comments, questions, feedback, criticisms?

Send feedback