Fundamental frequency (F0 or "pitch") is the physical
variable that reflects what we perceive as intonation or tone of voice.
It corresponds to the frequency of the large, regular oscillations observed
in the voiced segments of speech. These oscillations originate in the vibrations
of the vocal folds in the larynx. Intonation can distinguish a question from a response,
express exclamation, or emphasize certain words in a sentence. Intonation
is used differently from one language or dialect to another. Certain languages
such as Chinese or Vietnamese use tone to distinguish one word from another.
The voice's fundamental frequency is not the only physical variable associated
with intonation. In English, syllables perceived as "high" also tend to
have greater amplitude and longer duration than those perceived
as low.
In sum, the analysis of temporal structure attempts to identify the
major highs and lows visible in voiced segments of the speech signal.
The temporal distance between two successive peaks or two successive
valleys is one period, and its reciprocal gives the fundamental frequency,
in other words, how many such peaks or valleys are produced per second. (Taken
from an explanation by E. Keller in the Signalyze Users Manual, 1992.)
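
The peak-distance idea described above can be illustrated with a minimal
sketch. It is not the method used by Signalyze. It assumes a clean, fully
voiced signal (here a synthetic 120 Hz sine wave), and the function name
estimate_f0 and the "half of the maximum" threshold for picking major peaks
are hypothetical choices made only for this illustration. Real speech would
also need voicing detection and more robust peak selection.

```python
import math


def estimate_f0(samples, sample_rate):
    """Estimate F0 in Hz from the mean spacing between major peaks.

    Illustrative only: assumes one clear major peak per period.
    """
    # Assumed cutoff: a "major" peak must exceed half of the largest sample.
    threshold = 0.5 * max(samples)

    # Collect indices of local maxima that lie above the threshold.
    peaks = [
        i
        for i in range(1, len(samples) - 1)
        if samples[i] > threshold
        and samples[i - 1] < samples[i] >= samples[i + 1]
    ]
    if len(peaks) < 2:
        return None  # fewer than two peaks: no period can be measured

    # Mean distance between successive peaks, converted to seconds.
    mean_period = (peaks[-1] - peaks[0]) / (len(peaks) - 1) / sample_rate

    # Frequency is the number of periods per second.
    return 1.0 / mean_period


# Synthetic test signal: 0.25 s of a 120 Hz sine sampled at 8000 Hz.
sample_rate = 8000
f0_true = 120.0
signal = [
    math.sin(2 * math.pi * f0_true * n / sample_rate)
    for n in range(sample_rate // 4)
]

f0 = estimate_f0(signal, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

On this synthetic input the estimate comes out close to 120 Hz. The small
error is due to peaks being located only to the nearest sample.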