<Conference Paper>
Hyper Column Model vs. Fast DCT for Feature Extraction in Visual Arabic Speech Recognition

Creator
Language
Date of issue
Journal title
Start page
End page
Publication type
Access rights
Abstract: Recently, the multimedia signal processing community has shown increasing interest in research on visual speech recognition. In this paper we present a novel visual speech recognition approach based on our hyper column model (HCM). HCM is used for the feature extraction task. The extracted features are modeled by Gaussian distributions within a hidden Markov model (HMM). The proposed system, combining HCM and HMM, can be used for any visual recognition task. Here we use it to build a complete lip-reading system and evaluate its performance on an Arabic data set. To our knowledge, this is the first time that visual speech recognition has been applied to the Arabic language. For a fair evaluation, we compare our accuracy results with those of a fast discrete cosine transform (FDCT) approach, run as a separate experiment on the same data set and under the same conditions as the HCM experiment. The comparison shows that HCM achieves higher recognition accuracy than FDCT for Arabic sentences and words. HCM not only provides higher accuracy but is also capable of shift-invariant recognition, whereas FDCT is not.


AlaaISSPIT05 pdf 395 KB 168  

Details

Record ID
Peer reviewed
Related information
Subject
Type
Date registered 2009.04.22
Last updated 2017.02.28