Throat to Acoustic Speech Mapping for Spectral Parameter Correction using Artificial Neural Network Approach - Kyushu University Collection | Kyushu University Library

<Conference Paper>
Throat to Acoustic Speech Mapping for Spectral Parameter Correction using Artificial Neural Network Approach

Creator	Creator name: Subrata Kumer Paul; Affiliation: Department of Computer Science and Engineering, University of Rajshahi
	Creator name: Rakhi Rani Paul; Affiliation: Department of Computer Science and Engineering, University of Rajshahi
	Author identifier: 60740363; Creator name: Nishimura, Masafumi (西村, 雅史; ニシムラ, マサフミ); Affiliation: Faculty of Informatics, Shizuoka University (静岡大学)
	Creator name: Hamid, Md. Ekramul; Affiliation: Department of Computer Science and Engineering, University of Rajshahi
Language	English
Publisher	Interdisciplinary Graduate School of Engineering Sciences, Kyushu University (九州大学大学院総合理工学府)
Date of issue	2020-10-22
Source title	Proceedings of International Exchange and Innovation Conference on Engineering & Sciences (IEICES)
Volume	6
Start page	238
End page	242
Conference information	Conference name: International Exchange and Innovation Conference on Engineering & Science (IEICES); Edition: 6; Organizer: Interdisciplinary Graduate School Of Engineering Sciences (IGSES) Kyushu University; Dates: 2020-10-22 to 2020-10-23; Venue: Kyushu University (九州大学); Country: Japan
Publication type	Version of Record
Access rights	open access
Crossref DOI	https://doi.org/10.5109/4102497
Abstract	A throat microphone (TM) uses two skin-attached piezoelectric sensors to capture speech signals from tissue vibration. Because of its narrow bandwidth, TM-recorded speech is robust to surrounding noise but suffers from poor intelligibility and naturalness. This study aims to improve the perceptual quality of TM speech through a statistical mapping between the features of TM speech and acoustic microphone (AM) speech, using an Artificial Neural Network approach to correct the vocal tract parameters and the spectral envelope. The target application is natural man-machine communication, especially for people with vocal tract impairments. The paper exploits the nonlinear mapping property of a Multi-Layered Feed Forward Neural Network (MLFFNN) to estimate high-frequency components (4-8 kHz) from the low-frequency band (0-4 kHz) of the TM signal. The proposed algorithm is tested on the ATR503 dataset. The simulation results show noticeable performance for speech communication in adverse environments.
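
The kind of nonlinear feature mapping the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimension (12), the hidden-layer size (32), the tanh activation, the learning rate, and the synthetic data are all assumptions chosen only for illustration. In the paper the inputs and targets would be spectral parameters extracted from TM and AM speech (the record's subject terms mention Linear Prediction coefficients and Mel Frequency Cepstral Coefficients); feature extraction is omitted here.

```python
import numpy as np

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights for a one-hidden-layer feed-forward network."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Return (hidden activations, predicted target features)."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h, h @ params["W2"] + params["b2"]

def mse(params, x, y):
    """Mean squared error between predicted and target features."""
    return float(np.mean((forward(params, x)[1] - y) ** 2))

def train(params, x, y, lr=0.1, epochs=1000):
    """Full-batch gradient descent on the mean squared error."""
    n, d = y.shape
    for _ in range(epochs):
        h, pred = forward(params, x)
        grad_out = 2.0 * (pred - y) / (n * d)              # dLoss/dpred
        grad_h = (grad_out @ params["W2"].T) * (1.0 - h ** 2)  # back through tanh
        params["W2"] -= lr * (h.T @ grad_out)
        params["b2"] -= lr * grad_out.sum(axis=0)
        params["W1"] -= lr * (x.T @ grad_h)
        params["b1"] -= lr * grad_h.sum(axis=0)
    return params

# Synthetic stand-in data (assumption): x plays the role of low-band TM
# features, y the role of the target features, linked by a smooth
# nonlinear function plus a little noise.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 12))
mix = rng.normal(size=(12, 12)) / np.sqrt(12)
y = np.tanh(x @ mix) + 0.01 * rng.normal(size=(256, 12))

params = init_mlp(12, 32, 12, rng)
loss_before = mse(params, x, y)
params = train(params, x, y)
loss_after = mse(params, x, y)
print(f"MSE before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

On real data, `x` and `y` would be replaced by time-aligned feature frames from paired TM and AM recordings, and the error would be measured on held-out utterances rather than on the training set.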

Full-text file

File	File type	Size	Views	Description
p238	pdf	769 KB	445

Details

EISSN	2434-1436
Record ID	4102497
Subject	Multi-Layered Feed Forward Neural Network
	Mel Frequency Cepstral Coefficient
	Speech spectra
	Linear Prediction coefficients
Registration date	2020.10.26
Last updated	2024.01.11