Parameterization of vocal tract area functions by empirical orthogonal modes

Brad H Story, Ingo R. Titze

Research output: Contribution to journal (Article)

60 Citations (Scopus)

Abstract

A set of ten vowel area functions, based on MRI measurements, has been parameterized by an "empirical orthogonal mode decomposition" which accurately represents each area function as the sum of the mean area function and proportional amounts of a series of orthogonal basis functions. The mean area function was found to possess a formant structure similar to that of a uniform tube (i.e., nearly equally spaced formants) suggesting that empirical orthogonal modes are perturbations on the mean (∼ neutral) vowel shape much like past vocal tract analyses have considered perturbations on a uniform tube. The acoustic characteristics of the two most significant empirical orthogonal modes were examined, showing that both modes tend to increase the first formant as the modal amplitude coefficients are both increased from negative to positive values. However, the second formant was found to decrease in frequency for increasing values of the first modal coefficient and to increase for increasing values of the second mode coefficient. Next, a mapping between F1-F2 formant pairs and vocal tract area functions is proposed which is largely one-to-one but was initially limited by a constant vocal tract length. A possible method to include variable vocal tract length and higher ordered orthogonal modes in the mapping is given. The mode-to-formant mapping suggested the possibility of an inverse mapping to determine physiologically realistic area functions from a speech waveform and a simple example is presented. Finally, empirical orthogonal modes for a collection of ten vowels and eight consonants were derived and showed many similarities to those for the vowel-only case.
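As an illustration of the decomposition the abstract describes (each area function written as the mean area function plus weighted orthogonal basis functions), here is a minimal sketch using a singular value decomposition. It is not the authors' code or data: the area values are synthetic random numbers, the choice of 44 sections per area function is an arbitrary assumption, and the names `eom_decompose` and `reconstruct` are hypothetical. The published method may also differ in detail, for example in which quantity is decomposed.

```python
import numpy as np


def eom_decompose(areas):
    """Split area functions into a mean plus orthogonal modes.

    areas: array of shape (n_shapes, n_sections), one area function per row.
    Returns (mean, modes, coeffs) where the rows of `modes` are orthonormal
    basis functions and `coeffs[i, k]` is the amount of mode k in shape i.
    """
    mean = areas.mean(axis=0)
    deviations = areas - mean
    # SVD of the mean-removed data; rows of vt are the orthonormal modes,
    # ordered from most to least significant.
    u, s, vt = np.linalg.svd(deviations, full_matrices=False)
    coeffs = u * s
    return mean, vt, coeffs


def reconstruct(mean, modes, coeffs, n_modes):
    """Rebuild area functions from the mean and the first n_modes modes."""
    return mean + coeffs[:, :n_modes] @ modes[:n_modes]


# Synthetic stand-in for ten vowel area functions (NOT measured data).
rng = np.random.default_rng(0)
areas = rng.uniform(0.2, 6.0, size=(10, 44))

mean, modes, coeffs = eom_decompose(areas)
full = reconstruct(mean, modes, coeffs, n_modes=modes.shape[0])
two_mode = reconstruct(mean, modes, coeffs, n_modes=2)

err_full = np.abs(full - areas).max()
err_two = np.abs(two_mode - areas).max()
```

With all modes retained the reconstruction matches the input to numerical precision, while keeping only the two most significant modes gives an approximation. How good that approximation is depends on the data; random numbers like these have no low-dimensional structure, so they say nothing about the accuracy reported for real vowel shapes.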

Original language: English (US)
Pages (from-to): 223-260
Number of pages: 38
Journal: Journal of Phonetics
Volume: 26
Issue number: 3
State: Published - Jul 1998
Externally published: Yes

Fingerprint

Acoustics
Values
Vocal Tract
Formants

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Parameterization of vocal tract area functions by empirical orthogonal modes. / Story, Brad H; Titze, Ingo R.

In: Journal of Phonetics, Vol. 26, No. 3, 07.1998, p. 223-260.

@article{01ee2cbc4e6b42ffbe3c805ffd77ba5e,
title = "Parameterization of vocal tract area functions by empirical orthogonal modes",
abstract = "A set of ten vowel area functions, based on MRI measurements, has been parameterized by an {"}empirical orthogonal mode decomposition{"} which accurately represents each area function as the sum of the mean area function and proportional amounts of a series of orthogonal basis functions. The mean area function was found to possess a formant structure similar to that of a uniform tube (i.e., nearly equally spaced formants) suggesting that empirical orthogonal modes are perturbations on the mean (∼ neutral) vowel shape much like past vocal tract analyses have considered perturbations on a uniform tube. The acoustic characteristics of the two most significant empirical orthogonal modes were examined, showing that both modes tend to increase the first formant as the modal amplitude coefficients are both increased from negative to positive values. However, the second formant was found to decrease in frequency for increasing values of the first modal coefficient and to increase for increasing values of the second mode coefficient. Next, a mapping between F1-F2 formant pairs and vocal tract area functions is proposed which is largely one-to-one but was initially limited by a constant vocal tract length. A possible method to include variable vocal tract length and higher ordered orthogonal modes in the mapping is given. The mode-to-formant mapping suggested the possibility of an inverse mapping to determine physiologically realistic area functions from a speech waveform and a simple example is presented. Finally, empirical orthogonal modes for a collection of ten vowels and eight consonants were derived and showed many similarities to those for the vowel-only case.",
author = "Story, {Brad H} and Titze, {Ingo R.}",
year = "1998",
month = "7",
language = "English (US)",
volume = "26",
pages = "223--260",
journal = "Journal of Phonetics",
issn = "0095-4470",
publisher = "Academic Press Inc.",
number = "3",
}

TY - JOUR

T1 - Parameterization of vocal tract area functions by empirical orthogonal modes

AU - Story, Brad H

AU - Titze, Ingo R.

PY - 1998/7

Y1 - 1998/7

N2 - A set of ten vowel area functions, based on MRI measurements, has been parameterized by an "empirical orthogonal mode decomposition" which accurately represents each area function as the sum of the mean area function and proportional amounts of a series of orthogonal basis functions. The mean area function was found to possess a formant structure similar to that of a uniform tube (i.e., nearly equally spaced formants) suggesting that empirical orthogonal modes are perturbations on the mean (∼ neutral) vowel shape much like past vocal tract analyses have considered perturbations on a uniform tube. The acoustic characteristics of the two most significant empirical orthogonal modes were examined, showing that both modes tend to increase the first formant as the modal amplitude coefficients are both increased from negative to positive values. However, the second formant was found to decrease in frequency for increasing values of the first modal coefficient and to increase for increasing values of the second mode coefficient. Next, a mapping between F1-F2 formant pairs and vocal tract area functions is proposed which is largely one-to-one but was initially limited by a constant vocal tract length. A possible method to include variable vocal tract length and higher ordered orthogonal modes in the mapping is given. The mode-to-formant mapping suggested the possibility of an inverse mapping to determine physiologically realistic area functions from a speech waveform and a simple example is presented. Finally, empirical orthogonal modes for a collection of ten vowels and eight consonants were derived and showed many similarities to those for the vowel-only case.

UR - http://www.scopus.com/inward/record.url?scp=0032116042&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032116042&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0032116042

VL - 26

SP - 223

EP - 260

JO - Journal of Phonetics

JF - Journal of Phonetics

SN - 0095-4470

IS - 3

ER -