<Journal Article>
Forensic Applications using Cosine Distance Feature and Cepstral Coefficient for Speaker Recognition

Creator
Language
Publisher
Date issued
Source title
Start page
End page
Publication type
Access rights
Crossref DOI
Rights
Abstract: Speaker recognition (SR) identifies a person from their voice. Because of their high performance and their ability to compensate for session and channel variability, i-vectors have recently gained popularity as input features for SR systems (SRS). Additional speaker-specific perceptual cues can be derived from behavioral and learned characteristics, such as vocabulary selection, accent, intonation style, and emotional aspects. Humans also use the similarity of a speaker's voice signature to those of known speakers to improve recognition accuracy. This motivates a new feature vector representation that compares a target speaker's speech to a set of reference speakers (a codebook or dictionary). The speaker's utterance is encoded as a cosine distance feature (CDF) vector, and an SVM serves as the back-end classifier (CDF-SVM). An SVM classifier with an intersection kernel captures the strongest acoustic similarities between target and reference speakers. Reference speakers that are acoustically similar to the target matter most for speaker discrimination. Sparsifying the CDF improves discriminative power by keeping only the few large values that correspond to the most similar reference speakers and setting all other elements to 0. On the core shorting condition of the NIST 2008 SRE database, CDF-SVM outperforms SR systems that use i-vectors.
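
The encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each utterance and each reference speaker is already represented by a fixed-length vector (for example an i-vector), that each CDF entry is a cosine similarity score (so larger means more similar), and that negative scores are clipped to 0 so the intersection kernel operates on non-negative values. The function names and the clipping step are hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def cdf_vector(utterance, references):
    """Encode an utterance as one score per reference speaker.

    Assumption (not from the source): negative similarities are
    clipped to 0 so that all features are non-negative.
    """
    return [max(0.0, cosine_similarity(utterance, ref)) for ref in references]


def sparsify_top_k(cdf, k):
    """Keep the k largest entries and set all other elements to 0."""
    if k <= 0:
        return [0.0] * len(cdf)
    order = sorted(range(len(cdf)), key=lambda i: cdf[i], reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(cdf)]


def intersection_kernel(x, y):
    """Intersection kernel: sum of element-wise minima of two vectors."""
    return sum(min(a, b) for a, b in zip(x, y))


# Toy usage with three hypothetical two-dimensional reference speakers.
references = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cdf = sparsify_top_k(cdf_vector([1.0, 0.0], references), k=2)
```

A Gram matrix built from `intersection_kernel` over such sparse vectors could then be passed to any SVM implementation that accepts a precomputed kernel. The value of `k` is a tuning parameter that the abstract does not specify.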

Full-text file

pdf p1320-1325 pdf 305 KB 190  

Details

PISSN
EISSN
Record ID
Peer reviewed
Subject
Date registered 2024.07.12
Date updated 2024.07.16
