ViSAC

Acoustic-Visual Speech Synthesis by Bimodal Unit Concatenation: Toward a true acoustic-visual speech synthesis
This project is funded by the French National Research Agency (Agence nationale de la recherche, ANR) within the ANR programme Jeunes Chercheuses et Jeunes Chercheurs.
The project is funded for a period of four years, from 01/2009 to 12/2012.

Presentation

The goal of this project is to investigate a new approach to text-to-acoustic-visual speech synthesis that can animate a 3D talking head and provide the associated acoustic speech.

In this project, we treat the acoustic and visual channels simultaneously, as one true bimodal signal. As in text-to-(acoustic)-speech synthesis by non-uniform unit selection, the selection and concatenation of units drawn from an acoustic corpus can be extended to units drawn from a bimodal corpus. This approach should contribute to two research fields: acoustic-only synthesis, by augmenting the acoustic features with visual information (mainly from the human face) during the selection step; and acoustic-visual synthesis, by preserving the strict relation between the acoustic and visual components of the bimodal signal throughout the synthesis process.
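
To make the idea of selecting bimodal units concrete, here is a minimal sketch in Python. It is not the project's implementation: the BimodalUnit class, the join_cost and select_units functions, the feature layout, and the weights are all hypothetical names and assumptions introduced for illustration. The sketch assumes each candidate unit carries acoustic and visual feature vectors at its two boundaries, and it picks the sequence that minimises the sum of target costs and weighted acoustic-plus-visual join costs with a simple dynamic-programming search.

```python
from dataclasses import dataclass
from math import dist  # Euclidean distance, available from Python 3.8


@dataclass(frozen=True)
class BimodalUnit:
    """Hypothetical unit holding acoustic AND visual boundary features."""
    name: str
    acoustic_start: tuple
    acoustic_end: tuple
    visual_start: tuple
    visual_end: tuple
    target_cost: float = 0.0  # assumed to be precomputed for the target slot


def join_cost(left, right, w_acoustic=1.0, w_visual=1.0):
    """Weighted mismatch at the boundary, over both modalities (an assumption)."""
    return (w_acoustic * dist(left.acoustic_end, right.acoustic_start)
            + w_visual * dist(left.visual_end, right.visual_start))


def select_units(candidates, w_acoustic=1.0, w_visual=1.0):
    """Return (total_cost, path) for the cheapest unit sequence.

    candidates is a list with one non-empty list of BimodalUnit per target slot.
    """
    if not candidates or any(not slot for slot in candidates):
        raise ValueError("every target slot needs at least one candidate")

    # best holds, for each candidate of the current slot, the cheapest
    # (cost, path) of a sequence ending in that candidate.
    best = [(unit.target_cost, [unit]) for unit in candidates[0]]
    for slot in candidates[1:]:
        new_best = []
        for unit in slot:
            cost, path = min(
                ((c + join_cost(p[-1], unit, w_acoustic, w_visual), p)
                 for c, p in best),
                key=lambda pair: pair[0],
            )
            new_best.append((cost + unit.target_cost, path + [unit]))
        best = new_best
    return min(best, key=lambda pair: pair[0])
```

Because each unit keeps its acoustic and visual features together, the two components stay aligned by construction; setting w_visual to 0 reduces the sketch to an acoustic-only join cost, which is one way to compare selections made with and without visual information.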

Participants

Speech Group

Slim Ouni (Associate Professor) - Principal Investigator
Vincent Colotte (Associate Professor)
Asterios Toutios (Postdoc)
Utpala Musti (PhD Student)

Magrit Group

Marie-Odile Berger (Research Director, INRIA)
Brigitte Wrobel-Dautcourt (Associate Professor)

Open Position

Computer science engineer (updated: 07/09/2011)