The key to PSOLA is determining and using pitch markers in the original signal. Ideally these markers are equally spaced throughout the signal, at intervals equal to the detected fundamental period, and each one sits on a peak (a local maximum) of the signal. These two constraints often conflict, especially since our assumption that the fundamental period is constant over the entire window is only approximately true. As a result, following the highest peak from period to period may require relaxing the requirement that the markers be exactly equally spaced. On the other hand, if we only follow the maximum peak without regard for the fundamental period, the markers no longer reflect the pitch of the window and are not useful.
To strike this compromise, we built a matrix in which each column contains two periods of the signal. The center row of the first column corresponds to sample 0, and the center advances by one period with each column. We then used a dynamic path-finding algorithm (created by Vladimir Goncharoff and Patrick Gries of the University of Illinois at Chicago) to find a path that passes through the maximum peak as often as possible but never exceeds a given slope as it crosses the matrix. Since a slope of 0 (a horizontal line) means the markers are equally spaced, the slope limit is the parameter we adjust to trade off following peaks against maintaining periodicity. Empirically, we found a suitable value for this slope limit to be around 4. In the diagram below, these pitch marks are labeled $m_{i-1}$, $m_i$, and $m_{i+1}$.
| Pitch Markers across windows |
|---|
| ![]() |
The matrix is shown graphically in the top plot of the figure above (cyan is zero, dark blue is negative, red is positive). Below it is a matrix showing the two periods around each pitch marker found by this path, with the pitch marker itself at the center of each column. As you can see, the peaks run across this second matrix in a straight line, meaning that when we overlap and add these segments, the peaks are added on top of one another. This reduces phase problems from constructive and destructive interference between the peaks, which is why the algorithm is called pitch-synchronous.
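The matrix construction and slope-limited path search can be illustrated with a minimal sketch. This is not the Goncharoff and Gries algorithm itself, whose details are not given here; it is a generic dynamic-programming search that maximizes the summed signal value along a path. The function names (`build_period_matrix`, `best_path`, `find_pitch_markers`) are hypothetical, and `max_slope` is assumed to mean the largest allowed row change between adjacent columns. Each column holds `2 * period + 1` samples so that it has an exact center row.

```python
def build_period_matrix(signal, period):
    """One column per period; each column spans two periods around its center.

    Column k is centered on sample k * period (so the first center is sample 0).
    Samples that fall outside the signal are treated as zero.
    """
    columns = []
    for center in range(0, len(signal), period):
        column = [signal[i] if 0 <= i < len(signal) else 0.0
                  for i in range(center - period, center + period + 1)]
        columns.append(column)
    return columns


def best_path(columns, max_slope):
    """Return one row index per column, maximizing the sum of values visited.

    Assumption: the row index may change by at most max_slope between
    adjacent columns. max_slope = 0 forces a horizontal path.
    """
    n_rows = len(columns[0])
    score = list(columns[0])      # best total ending at each row so far
    back_pointers = []            # per column: best predecessor row for each row
    for column in columns[1:]:
        new_score, pointers = [], []
        for row in range(n_rows):
            lo = max(0, row - max_slope)
            hi = min(n_rows - 1, row + max_slope)
            prev = max(range(lo, hi + 1), key=score.__getitem__)
            new_score.append(score[prev] + column[row])
            pointers.append(prev)
        back_pointers.append(pointers)
        score = new_score
    # Trace back from the best final row to recover the whole path.
    row = max(range(n_rows), key=score.__getitem__)
    path = [row]
    for pointers in reversed(back_pointers):
        row = pointers[row]
        path.append(row)
    path.reverse()
    return path


def find_pitch_markers(signal, period, max_slope):
    """Convert the path's row indices into sample positions in the signal."""
    columns = build_period_matrix(signal, period)
    path = best_path(columns, max_slope)
    markers = [k * period + row - period for k, row in enumerate(path)]
    # Drop any marker that landed in the zero padding outside the signal.
    return [m for m in markers if 0 <= m < len(signal)]
```

With `max_slope = 0` the path is horizontal and the markers are exactly one period apart. Larger values let the path drift toward the peaks as the true period wanders.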
Having marked the regions to extract from the original signal, we next define where they will land in the output signal. A vector of new pitch markers is created: it begins at the first old pitch marker found above (which sets the phase offset) and continues at equally spaced intervals equal to the desired fundamental period. For each new marker, the closest marker in the original signal is found, and the two periods centered on that marker are Hann-windowed and overlap-added into the output signal, centered on the new marker. Depending on whether the pitch is being raised or lowered, some pitch markers in the original signal may be used more than once or not at all. The result is a signal whose waveform retains the shape of the original but has a shorter or longer period, depending on the direction and amount of the shift. Hence, the pitch is shifted without altering the qualities of the voice that produced the sound.
| Original signal modified using PSOLA algorithm |
|---|
| ![]() |
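The resynthesis step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the names `hann` and `psola_resynthesize` are hypothetical, a single detected period (`old_period`) is assumed for the whole window, each segment spans `2 * old_period + 1` samples so that it has an exact center, and no gain normalization is applied because the text does not describe one.

```python
import math


def hann(length):
    """Symmetric Hann window: zero at both ends, one at the center."""
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def psola_resynthesize(signal, markers, old_period, new_period):
    """Overlap-add windowed two-period segments at newly spaced markers.

    markers:     pitch-marker sample positions found in the original signal
    old_period:  detected fundamental period, in samples
    new_period:  desired fundamental period, in samples
    """
    output = [0.0] * len(signal)
    window = hann(2 * old_period + 1)
    new_marker = markers[0]               # the first old marker sets the phase
    while new_marker < len(signal):
        # Closest original marker; it may be reused or skipped entirely.
        nearest = min(markers, key=lambda m: abs(m - new_marker))
        for offset in range(-old_period, old_period + 1):
            src = nearest + offset
            dst = new_marker + offset
            if 0 <= src < len(signal) and 0 <= dst < len(output):
                output[dst] += window[offset + old_period] * signal[src]
        new_marker += new_period
    return output
```

Choosing `new_period` smaller than `old_period` raises the pitch and reuses some original markers. Choosing it larger lowers the pitch and skips some of them.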







Results