Acoustic-Visual Speech Synthesis by Bimodal Unit Concatenation: Toward a true acoustic-visual speech synthesis
This project is funded by the French National Research Agency (Agence Nationale de la Recherche, ANR) within the ANR Jeunes Chercheuses et Jeunes Chercheurs programme.
The project is funded for a period of four years, from January 2009 to December 2012.
The goal of this project is to investigate a new approach to text-to-acoustic-visual speech synthesis that animates a 3D talking head while providing the associated acoustic speech.
In this project, we consider the acoustic and visual channels simultaneously, as one true bimodal signal. As in text-to-(acoustic)-speech synthesis by non-uniform unit selection, the selection and concatenation of units drawn from an acoustic corpus can be extended to units drawn from a bimodal corpus. This approach contributes to two research fields: acoustic-only synthesis, by augmenting the acoustic features with visual information (mainly from the human face) during the selection step, and acoustic-visual synthesis, by preserving the strict relation between the acoustic and visual components of the bimodal signal throughout the entire synthesis process.
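As an illustration only, the following is a minimal sketch of how unit selection could operate on bimodal units: each candidate unit carries both acoustic and visual feature vectors, and a Viterbi-style search minimizes a weighted sum of target and concatenation (join) costs computed jointly over the two channels, so the acoustic and visual components are never separated. The class and function names, feature choices, and weights are hypothetical and not the project's actual implementation.

```python
# Minimal sketch of bimodal (acoustic + visual) unit selection.
# All names, features, and weights are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Unit:
    phone: str               # phonetic label of the candidate unit
    acoustic: np.ndarray     # acoustic features at the unit boundary (e.g. spectral)
    visual: np.ndarray       # visual features at the boundary (e.g. 3D facial parameters)

def join_cost(left: Unit, right: Unit, w_vis: float = 1.0) -> float:
    """Concatenation cost over both channels, keeping them tied together."""
    d_ac = np.linalg.norm(left.acoustic - right.acoustic)
    d_vi = np.linalg.norm(left.visual - right.visual)
    return float(d_ac + w_vis * d_vi)

def target_cost(unit: Unit, spec_ac: np.ndarray, spec_vi: np.ndarray,
                w_vis: float = 1.0) -> float:
    """Distance of the bimodal unit to the features predicted from the text."""
    return float(np.linalg.norm(unit.acoustic - spec_ac)
                 + w_vis * np.linalg.norm(unit.visual - spec_vi))

def select_units(candidates: List[List[Unit]],
                 targets_ac: List[np.ndarray],
                 targets_vi: List[np.ndarray]) -> List[Unit]:
    """Viterbi search over candidate bimodal units minimizing target + join costs."""
    n = len(candidates)
    cost = [[target_cost(u, targets_ac[0], targets_vi[0]) for u in candidates[0]]]
    back = [[-1] * len(candidates[0])]
    for i in range(1, n):
        row_cost, row_back = [], []
        for u in candidates[i]:
            tc = target_cost(u, targets_ac[i], targets_vi[i])
            # best predecessor among the previous target's candidates
            best_k = min(range(len(candidates[i - 1])),
                         key=lambda k: cost[i - 1][k] + join_cost(candidates[i - 1][k], u))
            row_cost.append(cost[i - 1][best_k]
                            + join_cost(candidates[i - 1][best_k], u) + tc)
            row_back.append(best_k)
        cost.append(row_cost)
        back.append(row_back)
    # backtrack from the cheapest final candidate
    j = int(np.argmin(cost[-1]))
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return [candidates[i][path[i]] for i in range(n)]
```

Because the join and target costs are computed on the joint acoustic-visual features, a selected unit sequence is coherent in both channels at once, which is the key difference from augmenting an acoustic-only synthesizer with a separately generated face animation.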
Slim Ouni (Associate Professor) - Principal Investigator
Vincent Colotte (Associate Professor)
Asterios Toutios (Postdoc)
Utpala Musti (PhD Student)
Marie-Odile Berger (Research Director, INRIA)
Brigitte Wrobel-Dautcourt (Associate Professor)