[PhD defence] 24/10/2025 - Lucas Maison: "Robustness of neural models for automatic speech processing" (UPR LIA)
Lucas Maison will publicly defend his thesis, entitled "Robustness of neural models for automatic speech processing", on Friday 24 October 2025.
Date and place
Oral defence scheduled on Friday 24 October 2025 at 1:30 pm
Venue: CERI, University of Avignon, 339 Chemin des Meinajaries, 84000 Avignon
Room: Amphi Blaise
Discipline
Computer Science
Laboratory
UPR 4128 LIA - Avignon Computing Laboratory
Composition of the jury
Mr Yannick ESTEVE | Avignon University | Thesis supervisor |
Mr Benjamin LECOUTEUX | LIG/GETALP | Rapporteur |
Ms Irina ILLINA | LORIA-INRIA | Rapporteur |
Mr François CAPMAN | Thales SIX | Thesis co-supervisor |
Mr Jean-François BONASTRE | Inria Defence & Security | Examiner |
Ms Marcely ZANON BOITO | Naver Labs | Examiner |
Summary
Automatic speech recognition has become a popular tool with many applications; it also serves as an intermediate step for other speech-related tasks, such as spoken language understanding or speech synthesis. In automatic speech recognition, the speech signal is emitted by the speaker and transmitted through the environment before being captured by a recording device and processed by a machine learning model. Each of these stages can be a source of variability and lead to transcription errors, which affects the robustness of the system. In this thesis, we study various factors that influence how machines process speech. More specifically, we focus on models pre-trained on French speech and fine-tuned for speech recognition. We begin by presenting our work on accent robustness. Through a number of experiments, we evaluate the resilience of the model to accent variation and explore different ways of narrowing the performance gaps between accents. In particular, we examine the impact of the proportion of accented voices in the training set. We also present CEREALES, a new dataset for Québécois French. Beyond accents, we are interested in the impact of demographic variables on speech recognition performance. Using the Common Voice corpus, we highlight the model's biases and attempt to reduce them by using deliberately biased training sets. Finally, the last chapter explores acoustic robustness using keyword recognition models: we show how in-distribution (ID) and out-of-distribution (OOD) performance are correlated and study how training data and different pre-processing methods influence robustness.
Keywords: robustness, speech recognition, self-supervised learning, machine learning, speech
Updated on 16 October 2025