A busy week for the members of the TALN team at CORIA-TALN 2023 (June 5 to 10, 2023), the French national conference on natural language processing (NLP) and information retrieval (IR)!

On Monday morning, the team organized the ARTS workshop (Analysis and Research of Scientific Texts), a meeting place for NLP and IR researchers interested in scientific texts. A large number of conference participants came to hear Mathieu Constant (ATILF) talk about building a dataset of scientific publications from ISTEX for NLP and text mining. The workshop also featured 13 scientific articles presented as “booster sessions” and posters. From the team, Léane Jourdan, Oumaima El Khettari and Maël Houbre gave presentations, alongside collaborative work with Yanis Labrak and Mickael Rouvier from LIA (Avignon University). Organized with Florian Boudin, Béatrice Daille, Richard Dufour, Oumaima El Khettari, Maël Houbre, Léane Jourdan and Nihel Kooli. Proceedings and program: https://arts2023.sciencesconf.org

On Monday afternoon, the team co-organized with the LIA (Avignon University) the results workshop for the DEFT campaign (Défi Fouille de Textes), dedicated to approaches for automatically selecting answers to medical multiple-choice questions (MCQs). Six teams took part in the campaign with a variety of approaches; large language models (LLMs), however, came out on top on this type of task. Organized with Adrien Bazoge, Béatrice Daille, Richard Dufour, Yanis Labrak, Emmanuel Morin and Mickael Rouvier. Proceedings, presentation slides and program: https://deft2023.univ-avignon.fr

On Wednesday morning, Thibault Bañeras-Roux presented, in a poster session, his work on HATS, a dataset that incorporates human perception for evaluating speech transcription metrics. Work done with Jane Wottawa, Mickael Rouvier, Richard Dufour and Teva Merlin. The article was also accepted at the international conference TSD 2023. Article: https://hal.science/hal-04111840/

On Thursday morning, Adrien Bazoge and Yanis Labrak presented, in a poster session, DrBERT, a robust pre-trained French language model for the biomedical and clinical domains. Joint work by Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille and Pierre-Antoine Gourraud. The article was also accepted at the ACL 2023 international conference. Article: https://coria-taln-2023.sciencesconf.org/458407/document