<Journal Article>
A method for estimating vocal-tract shape from a target speech spectrum

Creator
Language
Publisher
Date issued
Source title
Start page
End page
Publication type
Access rights
Rights
Related DOI
Related URI
Related HDL
Abstract We present a method for simultaneously estimating the cross-sectional area and the length of the vocal tract from a speech spectrum. An iterative procedure determines the vocal-tract shape by gradually optimizing the parameter values so that they reproduce the target speech spectrum. In each iteration, the vocal-tract shape is updated using a sensitivity function that represents the change in formant frequency caused by a slight perturbation of the vocal-tract shape. The method optimizes the vocal-tract shape effectively when combined with the perturbation relationship between the speech spectrum parameters (i.e., cepstral parameters) and the formants. The estimation accuracy is examined using area function data for 10 English vowels (Story and Titze, J. Phon., 26, 223–260, 1998). The resulting average errors are 0.36 cm² for the cross-sectional area and 0.21 cm for the vocal-tract length, corresponding to errors of 17.6% and 1.24%, respectively. The formant frequencies recovered from the estimated vocal-tract shape have an error of less than 4% for each of the first four formants. We also find that the fundamental frequency of the target speech spectrum influences the estimation accuracy.
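The iterative, sensitivity-based update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a lossless concatenated-tube model with a closed glottis and an ideal open lip end, a speed of sound of 35000 cm/s, a fixed vocal-tract length, sensitivities obtained by finite differences, and a pseudo-inverse update that matches formants only (no cepstral parameters). All function names and parameter values are hypothetical.

```python
import numpy as np

C = 35000.0  # assumed speed of sound in cm/s


def lip_term(freq, areas, length):
    """Chain-matrix element D of a lossless tube (closed glottis, ideal open lips).
    Its zeros in frequency are the resonances; freq may be a scalar or an array."""
    kl = 2.0 * np.pi * np.asarray(freq, dtype=float) / C * (length / len(areas))
    cs, sn = np.cos(kl), np.sin(kl)
    # Every partial product has the form [[a, j*b], [j*c, d]]; track a, b, c, d as reals.
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for area in areas:  # sections ordered from glottis to lips
        a, b, c, d = (a * cs - b * area * sn, a * sn / area + b * cs,
                      c * cs + d * area * sn, d * cs - c * sn / area)
    return d


def formants(areas, length, n_formants=4, f_max=6000.0, step=10.0):
    """First resonances, found as sign changes of lip_term and refined by bisection."""
    grid = np.arange(step / 2, f_max, step)  # offset grid avoids landing exactly on a zero
    vals = lip_term(grid, areas, length)
    idx = np.where(vals[:-1] * vals[1:] < 0.0)[0][:n_formants]
    lo, hi, v_lo = grid[idx], grid[idx + 1], vals[idx]
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        same = lip_term(mid, areas, length) * v_lo > 0.0  # root lies above mid
        lo, hi = np.where(same, mid, lo), np.where(same, hi, mid)
    return 0.5 * (lo + hi)


def sensitivity(areas, length, eps=0.01, n_formants=4):
    """S[n, i]: relative change of formant n per relative change of area i
    (central finite difference, used here in place of an analytic sensitivity function)."""
    areas = np.asarray(areas, dtype=float)
    base = formants(areas, length, n_formants)
    sens = np.zeros((n_formants, len(areas)))
    for i in range(len(areas)):
        up, down = areas.copy(), areas.copy()
        up[i] *= 1.0 + eps
        down[i] *= 1.0 - eps
        diff = formants(up, length, n_formants) - formants(down, length, n_formants)
        sens[:, i] = diff / (2.0 * eps * base)
    return sens


def estimate_areas(target, length, n_sections=10, n_iter=8, gain=0.7):
    """Start from a uniform tube and repeatedly nudge the areas toward the target formants."""
    areas = np.full(n_sections, 3.0)
    for _ in range(n_iter):
        current = formants(areas, length, len(target))
        rel_err = (target - current) / current
        S = sensitivity(areas, length, n_formants=len(target))
        delta = np.linalg.pinv(S) @ rel_err  # minimum-norm relative area change
        areas = areas * np.exp(gain * np.clip(delta, -0.5, 0.5))
    return areas


# Illustrative target: formants of a made-up smooth area function, 17.5 cm long.
x = (np.arange(10) + 0.5) / 10
true_areas = 3.0 * (1.0 + 0.3 * np.cos(np.pi * x))
target = formants(true_areas, 17.5)
est = estimate_areas(target, 17.5)
print("target formants   :", np.round(target, 1))
print("recovered formants:", np.round(formants(est, 17.5), 1))
```

Because four formants cannot uniquely determine ten section areas, the pseudo-inverse picks the smallest change at each step. The recovered shape therefore reproduces the target formants without necessarily matching the shape that generated them. This is one reason a method such as the one in the abstract draws on additional spectral information.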

Full-text file

pdf 7178814 pdf 908 KB 17  

Details

PISSN
EISSN
NCID
Record ID
Subject
Type
Funding information
Date registered 2024.05.28
Date updated 2024.05.29