A support vector machine-based dynamic network for visual speech recognition applications (Q1424532)

Summary: Visual speech recognition is an emerging research field. In this paper, we examine the suitability of support vector machines for visual speech recognition. Each word is modeled as a temporal sequence of visemes corresponding to the different phones realized. One support vector machine is trained to recognize each viseme and its output is converted to a posterior probability through a sigmoidal mapping. To model the temporal character of speech, the support vector machines are integrated as nodes into a Viterbi lattice. We test the performance of the proposed approach on a small visual speech recognition task, namely the recognition of the first four digits in English. The word recognition rate obtained is at the level of the previous best reported rates.
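
The pipeline described in the summary (per-viseme SVM output, sigmoidal mapping to a posterior, Viterbi search over the word's viseme sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the sigmoid parameters `a` and `b`, the equal stay/advance transition weights, the function names, and the viseme labels in the usage note are all assumptions made for illustration, and the raw SVM scores are taken as given rather than computed from images.

```python
import math

NEG_INF = float("-inf")


def sigmoid_posterior(score, a=-1.0, b=0.0):
    """Map a raw SVM decision value to (0, 1) with p = 1 / (1 + exp(a*score + b)).

    The values of a and b are placeholders (assumed); in practice they would
    be fitted on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def viterbi_word_score(frames, visemes,
                       log_stay=math.log(0.5), log_move=math.log(0.5)):
    """Log-score of the best left-to-right path through a word's visemes.

    frames  : list of dicts, one per video frame, mapping viseme label to the
              raw output of that viseme's SVM (assumed to be precomputed).
    visemes : ordered viseme labels that make up the word.
    The path must start in the first viseme and end in the last one; at each
    frame it either stays in the current viseme or advances to the next.
    """
    n = len(visemes)
    best = [NEG_INF] * n
    best[0] = math.log(sigmoid_posterior(frames[0][visemes[0]]))
    for frame in frames[1:]:
        new = [NEG_INF] * n
        for j, label in enumerate(visemes):
            stay = best[j] + log_stay
            move = best[j - 1] + log_move if j > 0 else NEG_INF
            prev = max(stay, move)
            if prev > NEG_INF:  # node reachable at this frame
                new[j] = prev + math.log(sigmoid_posterior(frame[label]))
        best = new
    return best[-1]


def recognize(frames, lexicon):
    """Return the word in the lexicon whose viseme lattice scores highest."""
    return max(lexicon, key=lambda word: viterbi_word_score(frames, lexicon[word]))
```

As a usage example, `recognize(frames, {"one": ["w", "ah", "n"], "two": ["t", "uw"]})` picks the word whose viseme sequence best explains the per-frame SVM scores; the labels here are hypothetical placeholders rather than the viseme inventory used in the paper. Working in log-probabilities keeps the path score numerically stable for longer frame sequences.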

zbMATH Keywords

mouth shape recognition

visemes

support vector machines

Viterbi lattice

describes a project that uses

SVMlight