Forensic Applications using Cosine Distance Feature and Cepstral Coefficient for Speaker Recognition - Kyushu University Collection | Kyushu University Library

<Journal Article>
Forensic Applications using Cosine Distance Feature and Cepstral Coefficient for Speaker Recognition

Author	Name: M. Sreenivasa Reddy; Affiliation: Department of Mechanical Engineering, Aditya University
Author	Name: V. Satyanarayana; Affiliation: Department of ECE, Aditya University
Language	English
Publisher	Transdisciplinary Research and Education Center for Green Technologies, Kyushu University
Publisher (Japanese name)	九州大学グリーンテクノロジー研究教育センター
Date of issue	2024-06
Journal	Evergreen
Volume	11
Issue	2
Start page	1320
End page	1325
Publication type	Version of Record
Access rights	open access
Crossref DOI	https://doi.org/10.5109/7183442
Rights	Creative Commons Attribution 4.0 International
Abstract	Speaker recognition (SR) identifies a person from their voice. Because of their high performance and their ability to compensate for session and channel variability, i-vectors have recently gained popularity as input features for speaker recognition systems (SRS). Additional speaker-specific perceptual cues can be derived from behaviors and learned characteristics, such as vocabulary selection, accent, intonation style, and emotional aspects. Humans also use the similarity of a speaker's sound signature to known speakers to improve recognition precision. This motivates a new feature vector representation that compares a target speaker's speech to a set of reference speakers (a codebook/dictionary). The speaker's utterance is encoded as a cosine distance feature (CDF) vector, and an SVM serves as the back-end classifier (CDF-SVM). An SVM classifier with an intersection kernel captures the most acoustic similarities between target and reference speakers. Reference speakers that are acoustically similar to the target matter most for speaker discrimination. A sparse CDF improves discriminative power by keeping only the few large values that correspond to the most similar reference speakers and setting all other elements to 0. On the core shorting condition of NIST's 2008 SRE databases, CDF-SVM outperforms SR systems using i-vectors.
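
The encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of random vectors in place of real i-vectors, the choice of cosine similarity as the "cosine distance" score, the value of k, and the clipping of negative scores to zero before applying the intersection kernel are all assumptions made for illustration.

```python
import numpy as np


def cosine_distance_features(x, refs):
    """Cosine similarity of one utterance vector to each reference-speaker vector.

    x:    shape (d,)   utterance representation (e.g. an i-vector; assumed)
    refs: shape (n, d) one row per reference speaker (the codebook/dictionary)
    """
    x = x / np.linalg.norm(x)
    refs = refs / np.linalg.norm(refs, axis=1, keepdims=True)
    return refs @ x


def sparsify_top_k(cdf, k):
    """Keep the k largest entries of the feature vector and set the rest to 0."""
    out = np.zeros_like(cdf)
    idx = np.argpartition(cdf, -k)[-k:]  # indices of the k largest values
    out[idx] = cdf[idx]
    return out


def intersection_kernel(A, B):
    """Intersection kernel: K[i, j] = sum over d of min(A[i, d], B[j, d])."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)


# Hypothetical data: 8 reference speakers in a 16-dimensional space.
rng = np.random.default_rng(0)
refs = rng.standard_normal((8, 16))
x = refs[3] + 0.1 * rng.standard_normal(16)  # utterance close to reference 3

cdf = cosine_distance_features(x, refs)
# Assumption: negative scores are clipped so the intersection kernel sees
# non-negative inputs.
sparse = sparsify_top_k(np.clip(cdf, 0.0, None), k=3)
K = intersection_kernel(sparse[None, :], sparse[None, :])
```

A kernel matrix computed this way over training utterances could be passed to an SVM that accepts precomputed kernels, such as scikit-learn's `SVC(kernel="precomputed")`; whether the paper's back end is configured that way is not stated in this record.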

Full-Text File

File	File type	Size	Views	Description
p1320-1325	pdf	305 KB	190

Details

PISSN	2189-0420
EISSN	2432-5953
Record ID	7183442
Peer review	Peer-reviewed
Subject	Acoustic coefficients
	CDF
	Feature extraction
	MFCC
	Speaker recognition
Date registered	2024.07.12
Date updated	2024.07.16