Audio-Visual Speech Recognition (AVSR) offers important advantages over standard Automatic Speech Recognition (ASR) approaches in several applications, for instance in-car systems and other acoustically noisy environments. AVSR is also a relatively recent topic within ASR, so its potential is far from fully explored. Moreover, AVSR approaches often require particularly ingenious solutions to the essential problem of efficiently combining the multiple feature streams that convey the linguistic information. These are some of the main reasons why AVSR has become my favourite interest in ASR.

Two main goals have been pursued in AVSR: (1) keeping up with proven approaches and applying the acquired know-how to the development of useful systems, based on efficient prototyping tools such as GMTK and CMU-LMtk and libraries such as OpenCV; and (2) trying to create innovative solutions to mitigate some important problems, adapting existing techniques or combining them with new ones, possibly inspired by less closely related knowledge domains.

(last change: 23/11/2023)