Skip to content Skip to navigation

OpenStax_CNX

You are here: Home » Content » Voice Randomization

Navigation

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • Rice University ELEC 301 Projects

    This module is included inLens: Rice University ELEC 301 Project Lens
    By: Rice University ELEC 301As a part of collection: "ELEC 301 Projects Fall 2004"

    Click the "Rice University ELEC 301 Projects" link to see all content affiliated with them.

  • Rice Digital Scholarship

    This module is included in aLens by: Digital Scholarship at Rice UniversityAs a part of collection: "ELEC 301 Projects Fall 2004"

    Click the "Rice Digital Scholarship" link to see all content affiliated with them.

Also in these lenses

  • Lens for Engineering

    This module is included inLens: Lens for Engineering
    By: Sidney BurrusAs a part of collection: "ELEC 301 Projects Fall 2004"

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.
 

Voice Randomization

Module by: Robert Ahlfinger, Brenton Cheeseman, Patrick Doody. E-mail the authors

Summary: This is an overview of the techniques used to develope a voice randomization program.

Initial Approach

When two people speak with the same pitch, there is still no mistaking one for the other; the uniqueness of a voice goes beyond its tone. The placement of harmonics, then, clearly does not make a voice distinguishable since two people with identical pitch have harmonics at exactly the same locations. Rather, the ability to identify a voice comes from the relative height of each harmonic to the next, just like the heights of each harmonic on a clarinet and a guitar make these instruments sound different even as they play the same note.

Figure 1: DFT of one 512 sample chunck of a speech signal after it has had it pitches randomly altered.
DFT of Randomized Signal
DFT of Randomized Signal (random.bmp)

With this in mind, our first algorithm tackled the problem by first using the harmonic detection described earlier to pinpoint the location of each harmonic. Using this information, the height of each harmonic was randomly lowered or raised by a slight amount. Usually, though, the resulting voice sounded just like the original with some noise added in on top of it. After fooling around with this concept for some time to no avail, we reached the conclusion that the idea is solid, but that to make up a new voice requires much more finesse than simply making the magnitude of each harmonic higher or lower. Without perfectly adapting the phases and making sure that the envelope of the magnitudes is a shape that can be comprehended by a human ear as real speech, the only result is linearly adding a new signal to our old one. The DFT of the new signal is equal to the additions we made to the harmonics of the voice.

Simplification of Process Using the Speech Synthesizer

The second attempt at a voice randomizer directly utilizes our pitch shifting algorithm and works much better. First, the signal is matricized just like before. But instead of processing each chunk in the same way, our algorithm asks the pitch shifter to shift each chunk separately, specifying a different and random shift every time. The result is a voice with a pitch that changes wildly and extremely quickly, making it impossible to tell who it is with your raw hearing. The main drawback with this technique is that there is no true security or identity masking. The NSA could easily break the signal into the same 512 sample long chunks and analyze them individually along with a normal sample of the voice to determine a potential match. However, for certain purposes this randomizer performs superbly.

Table 1: Randomized Speech Examples
Unaltered voice Original
Randomized Voice Random

Content actions

Download module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks