Journal The Transactions of the Korean Institute of Electrical Engineers
Title Monosyllable Speech Recognition through Facial Movement Analysis
Authors 강동원(Kang, Dong-Won) ; 서정우(Seo, Jeong-Woo) ; 최진승(Choi, Jin-Seung) ; 최재봉(Choi, Jae-Bong) ; 탁계래(Tack, Gye-Rae)
DOI https://doi.org/10.5370/KIEE.2014.63.6.813
Page pp.813-819
ISSN 1975-8359
Keywords 3-D motion capture system ; Facial motion ; Hidden Markov model ; Speech recognition
Abstract This study aimed to extract accurate facial movement parameters with a 3-D motion capture system for use in lip-reading-based speech recognition. Instead of features obtained from conventional camera images, the 3-D motion capture system was used to obtain quantitative data on actual facial movements and to analyze 11 variables that exhibit characteristic patterns during monosyllable vocalization, such as movements of the nose, lips, jaw, and cheeks. Fourteen subjects, all in their twenties, were asked to vocalize 11 Korean vowel monosyllables three times with 36 reflective markers attached to their faces. The recorded facial movement data were reduced to 11 parameters and represented as a pattern for each monosyllable vocalization. These parameter patterns then went through a learning and recognition process for each monosyllable, using a speech recognition algorithm based on a Hidden Markov Model (HMM) and the Viterbi algorithm. The recognition accuracy for the 11 monosyllables was 97.2%, which suggests that speech recognition of Korean through quantitative facial movement analysis is feasible.
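As a rough illustration of the recognition step, the following sketch shows how a Viterbi decoder can score an observation sequence against one HMM per monosyllable and pick the best-scoring label. It is not the authors' implementation. The two models, their probabilities, the labels "a" and "u", and the binary observation symbols are all hypothetical, and the sketch assumes discrete symbols, whereas the study works with patterns of 11 facial movement parameters.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely state sequence.

    obs     : list of observation symbol indices
    start_p : start_p[i] = P(first state is i)
    trans_p : trans_p[i][j] = P(next state is j | current state is i)
    emit_p  : emit_p[i][k] = P(symbol k | state i)
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start_p)
    # score[i] = best log-probability of any path that ends in state i
    score = [log(start_p[i]) + log(emit_p[i][obs[0]]) for i in range(n_states)]
    back = []  # back[t][j] = best predecessor of state j at time t + 1
    for symbol in obs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans_p[i][j]))
            pointers.append(best_i)
            new_score.append(score[best_i] + log(trans_p[best_i][j])
                             + log(emit_p[j][symbol]))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda i: score[i])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


def classify(obs, models):
    """Pick the label whose HMM gives the highest Viterbi score."""
    return max(models, key=lambda label: viterbi(obs, *models[label])[0])


# Hypothetical two-state left-to-right models (NOT taken from the paper).
# Symbol 0 = "mouth nearly closed", symbol 1 = "mouth open" (assumed coding).
models = {
    "a": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.5, 0.5], [0.1, 0.9]]),
    "u": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.5, 0.5], [0.9, 0.1]]),
}

print(classify([0, 1, 1, 1], models))
print(classify([1, 0, 0, 0], models))
```

Working in log space avoids numerical underflow on long sequences. A system closer to the one described would likely replace the discrete emission table with a continuous density over the 11 parameters, but that choice is an assumption here and is not stated in the abstract.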