
Sahar Ghannay

41
Documents

Overview

Associate professor at LIMSI, CNRS, Université Paris-Saclay
------------------------------------------------------------

Team: [ILES](https://www.limsi.fr/en/research/iles/home)

Email: sahar.ghannay@limsi.fr

Website: <https://saharghannay.github.io>

Short Bio
---------

Sahar Ghannay has been an associate professor at Université Paris-Saclay, in the CNRS [LISN](https://www.lisn.upsaclay.fr) research center, since September 2018. She received a PhD in Computer Science from Le Mans University in September 2017. Her thesis work was part of the ANR [VERA](https://anr.fr/Project-ANR-12-BS02-0006) (AdVanced ERror Analysis for speech recognition) project. During her PhD, she spent a few months as a visiting researcher at Apple within the Siri Speech team. As a postdoctoral researcher at [LIUM](https://lium.univ-lemans.fr/), she worked on neural end-to-end systems for named entity detection and speech understanding, as part of the Chist-Era [M2CR](https://projets-lium.univ-lemans.fr/m2cr/) (Multimodal Multilingual Continuous Representation for Human Language Understanding) project. Her main research interests are continuous representation learning and its application to natural language processing and speech recognition tasks, semantic information extraction from spoken and written language, and dialog systems.

CV
==

Education
---------

- PhD in computer science, LIUM, Le Mans Université, 2017
- MS in computer science, Le Mans Université, 2013
- BS in computer science, Le Mans Université and Université de Sfax, 2011

Work Experience
----------------

- 2018 - now: Associate Professor at LISN, CNRS, Université Paris-Saclay
- 2017-2018: Post-doc at LIUM, Le Mans Université
- 2017 (4 months): Internship at Apple within the Siri Speech team in Cupertino
- April 2013-Sept. 2014: research engineer

Publications


New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark

Nadège Alavoine , Gaëlle Laperriere , Christophe Servan , Sahar Ghannay , Sophie Rosset
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024, Torino, Italy
Conference paper hal-04523286v1

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

Pierre Lepagnol , Thomas Gerald , Sahar Ghannay , Christophe Servan , Sophie Rosset
LREC-COLING 2024, May 2024, Torino, Italy
Conference paper hal-04519930v1

mALBERT: Is a Compact Multilingual BERT Model Still Worth It?

Christophe Servan , Sahar Ghannay , Sophie Rosset
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, May 2024, Torino, Italy
Conference paper hal-04520797v1

Projet Gender Equality Monitor (GEM)

Gilles Adda , François Buet , Sahar Ghannay , Cyril Grouin , Camille Guinaudeau
18e Conférence en Recherche d'Information et Applications, 16e Rencontres Jeunes Chercheurs en RI, 30e Conférence sur le Traitement Automatique des Langues Naturelles, 25e Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues, 2023, Paris, France. pp.21-21
Conference paper hal-04208588v1

Specialized Semantic Enrichment of Speech Representations

G. Laperrière , Ha Nguyen , Sahar Ghannay , Bassam Jabaian , Yannick Estève
2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Jun 2023, Rhodes Island, Greece. pp.1-5, ⟨10.1109/ICASSPW59220.2023.10193452⟩
Conference paper hal-04425710v1

Continual self-supervised domain adaptation for end-to-end speaker diarization

Juan Manuel Coria , Hervé Bredin , Sahar Ghannay , Sophie Rosset
IEEE Spoken Language Technology Workshop (SLT 2022), IEEE Speech and Language Processing Technical Committee, Jan 2023, Doha, Qatar. To appear
Conference paper hal-03824546v1

Analyzing BERT Cross-lingual Transfer Capabilities in Continual Sequence Labeling

Juan Manuel Coria , Mathilde Veron , Sahar Ghannay , Guillaume Bernard , Hervé Bredin
First Workshop on Performance and Interpretability Evaluations of Multimodal, Multipurpose, Massive-Scale Models, Oct 2022, virtual, South Korea
Conference paper hal-03824597v1

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

Oralie Cattan , Sahar Ghannay , Christophe Servan , Sophie Rosset
INTERSPEECH 2022, Sep 2022, Incheon, South Korea
Conference paper hal-03715340v2

Evaluating the carbon footprint of NLP methods: a survey and analysis of existing tools

Nesrine Bannour , Sahar Ghannay , Aurélie Névéol , Anne-Laure Ligozat
EMNLP, Workshop SustaiNLP, Nov 2021, Punta Cana, Dominican Republic
Conference paper hal-03435068v1

OVERLAP-AWARE LOW-LATENCY ONLINE SPEAKER DIARIZATION BASED ON END-TO-END LOCAL SEGMENTATION

Juan Manuel Coria , Hervé Bredin , Sahar Ghannay , Sophie Rosset
IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia
Conference paper hal-03375330v1

A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification

Juan Manuel Coria , Hervé Bredin , Sahar Ghannay , Sophie Rosset
International Conference on Statistical Language and Speech Processing, Oct 2020, Cardiff, United Kingdom. pp.137-148, ⟨10.1007/978-3-030-59430-5_11⟩
Conference paper hal-02989334v1

A Metric Learning Approach to Misogyny Categorization

Juan Manuel Coria , Sahar Ghannay , Sophie Rosset , Hervé Bredin
Workshop on Representation Learning for NLP, Jul 2020, Online, France. pp.89-94, ⟨10.18653/v1/2020.repl4nlp-1.12⟩
Conference paper hal-02989293v1

What is best for Spoken Language Understanding: Small but Task-dependant Embeddings or Huge but Out-of-domain Embeddings?

Sahar Ghannay , Antoine Neuraz , Sophie Rosset
45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020), May 2020, Barcelona, Spain. pp.8114-8118, ⟨10.1109/ICASSP40776.2020.9053278⟩
Conference paper hal-02503694v1

Error analysis applied to end-to-end spoken language understanding

Antoine Caubrière , Sahar Ghannay , Natalia Tomashenko , Renato de Mori , Antoine Laurent
45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020), May 2020, Barcelona, Spain. pp.8514-8518, ⟨10.1109/ICASSP40776.2020.9054455⟩
Conference paper hal-02465899v1

Neural Networks approaches focused on French Spoken Language Understanding: application to the MEDIA Evaluation Task

Sahar Ghannay , Christophe Servan , Sophie Rosset
28th International Conference on Computational Linguistics (COLING 2020), Dec 2020, Barcelona (online), Spain
Conference paper hal-03007482v1

A Cooking Knowledge Graph and Benchmark for Question Answering Evaluation in Lifelong Learning Scenarios

Mathilde Veron , Anselmo Peñas , Guillermo Echegoyen , Somnath Banerjee , Sahar Ghannay
International Conference on Applications of Natural Language to Information Systems, Elisabeth Métais and Farid Meziane and Helmut Horacek and Philipp Cimiano, Jun 2020, Saarbrücken, Germany
Conference paper hal-03006228v1

Experiments from LIMSI at the French Named Entity Recognition Coarse-grained task

Sahar Ghannay , Cyril Grouin , Thomas Lavergne
Conference and Labs of the Evaluation Forum, Sep 2020, Thessaloniki, Greece
Conference paper hal-04395545v1

Lifelong learning and task-oriented dialogue system: what does it mean?

Mathilde Veron , Sahar Ghannay , Anne-Laure Ligozat , Sophie Rosset
International Workshop on Spoken Dialogue Systems Technology, Apr 2019, Siracusa, Italy
Conference paper hal-02301089v1

End-to-end named entity and semantic concept extraction from speech

Sahar Ghannay , Antoine Caubrière , Yannick Estève , Nathalie Camelin , Edwin Simonnet
IEEE Spoken Language Technology Workshop, Dec 2018, Athens, Greece
Conference paper hal-01987740v2

Simulating ASR errors for training SLU systems

Edwin Simonnet , Sahar Ghannay , Nathalie Camelin , Yannick Estève
LREC 2018, May 2018, Miyazaki, Japan
Conference paper hal-01715923v1

Simulation d'erreurs de reconnaissance automatique dans un cadre de compréhension de la parole

Edwin Simonnet , Sahar Ghannay , Nathalie Camelin , Yannick Estève
XXXIIe Journées d'Etudes sur la Parole (JEP 2018), Jun 2018, Aix-en-Provence, France
Conference paper hal-01757770v1

Représentations de phrases dans un espace continu spécifiques à la tâche de détection d'erreurs

Sahar Ghannay , Yannick Estève , Nathalie Camelin
XXXIIe Journées d'Etudes sur la Parole (JEP 2018), Jun 2018, Aix-en-Provence, France
Conference paper hal-01757774v1

Task Specific Sentence Embeddings for ASR Error Detection

Sahar Ghannay , Yannick Estève , Nathalie Camelin
Interspeech 2018, Sep 2018, Hyderabad, India. ⟨10.21437/Interspeech.2018-2211⟩
Conference paper hal-01870864v1

Enriching confusion networks for post-processing

Sahar Ghannay , Yannick Estève , Nathalie Camelin
Statistical Language and Speech Processing 2017, Oct 2017, Le Mans, France
Conference paper hal-01585768v1

ASR error management for improving spoken language understanding

Edwin Simonnet , Sahar Ghannay , Nathalie Camelin , Yannick Estève , Renato de Mori
Interspeech 2017, Aug 2017, Stockholm, Sweden
Conference paper hal-01526298v1

Evaluation of acoustic word embeddings

Sahar Ghannay , Yannick Estève , Nathalie Camelin , Paul Deléglise
RepEval@ACL 2016: The 1st Workshop on Evaluating Vector-Space Representations for NLP, 2016, Berlin, Germany
Conference paper hal-01433181v1

Acoustic word embeddings for ASR error detection

Sahar Ghannay , Yannick Estève , Nathalie Camelin , Paul Deléglise
Interspeech 2016, 2016, San Francisco, CA, USA
Conference paper hal-01433176v1

Recent improvements on error detection for automatic speech recognition

Yannick Estève , Sahar Ghannay , Nathalie Camelin
1st International Workshop on Multimodal Media Data Analytics (MMDA 2016), in Conjunction with the 22nd European Conference on Artificial Intelligence, 2016, The Hague, Netherlands
Conference paper hal-01433168v1

Utilisation des représentations continues des mots et des paramètres prosodiques pour la détection d’erreurs dans les transcriptions automatiques de la parole

Sahar Ghannay , Yannick Estève , Nathalie Camelin , Camille Dutrey , Fabian Santiago
31ème Journées d’Études sur la Parole, 2016, Paris, France
Conference paper hal-01450277v1

Word embedding evaluation and combination

Sahar Ghannay , Benoit Favre , Yannick Estève , Nathalie Camelin
10th edition of the Language Resources and Evaluation Conference (LREC 2016), 2016, Portorož, Slovenia
Conference paper hal-01433185v1

Which ASR errors are hard to detect?

Sahar Ghannay , Nathalie Camelin , Yannick Estève
Workshop Errors by Humans and Machines in multimedia, multimodal and multilingual data processing (ERRARE 2015), 2015, Sinaia, Romania
Conference paper hal-01433201v1

Word embeddings combination and neural networks for robustness in ASR error detection

Sahar Ghannay , Yannick Estève , Nathalie Camelin
2015 European Signal Processing Conference (EUSIPCO 2015), 2015, Nice, France
Conference paper hal-01433210v1

Combining continuous word representation and prosodic features for ASR error prediction

Sahar Ghannay , Yannick Estève , Nathalie Camelin , Camille Dutrey , Fabian Santiago
3rd International Conference on Statistical Language and Speech Processing (SLSP 2015), 2015, Budapest, Hungary
Conference paper hal-01433203v1

Using Hypothesis Selection Based Features for Confusion Network MT System Combination

Sahar Ghannay , Loïc Barrault
Third Workshop on Hybrid Approaches to Translation (HyTra), EACL 2014, 2014, Gothenburg, Sweden
Conference paper hal-01433229v1