Pedro Javier Ortiz Suárez


I am a PhD student in computer science at Sorbonne Université and in the ALMAnaCH research team at Inria.


Conference papers (9 documents)

  • Murielle Fabre, Pedro Javier Ortiz Suárez, Benoît Sagot, Éric Villemonte de la Clergerie. French Contextualized Word-Embeddings with a sip of CaBeRnet: a New French Balanced Reference Corpus. CMLC-8 - 8th Workshop on the Challenges in the Management of Large Corpora, May 2020, Marseille, France. ⟨hal-02678358⟩
  • Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, et al. Les modèles de langue contextuels Camembert pour le français : impact de la taille et de l'hétérogénéité des données d'entrainement. JEP-TALN-RECITAL 2020 - 33ème Journées d’Études sur la Parole, 27ème Conférence sur le Traitement Automatique des Langues Naturelles, 22ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues, Jun 2020, Nancy, France. pp.54-65. ⟨hal-02784755v3⟩
  • Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot. A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages. ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle, United States. ⟨hal-02863875v2⟩
  • Pedro Javier Ortiz Suárez, Yoann Dupont, Benjamin Muller, Laurent Romary, Benoît Sagot. Establishing a New State-of-the-Art for French Named Entity Recognition. LREC 2020 - 12th Language Resources and Evaluation Conference, May 2020, Marseille, France. ⟨hal-02617950v2⟩
  • Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, et al. CamemBERT: a Tasty French Language Model. ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle, United States. ⟨hal-02889805⟩
  • Djamé Seddah, Farah Essaidi, Amal Fethi, Matthieu Futeral, Benjamin Muller, et al. Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell. ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle, United States. ⟨hal-02889804⟩
  • Mohamed Khemakhem, Ioana Galleron, Geoffrey Williams, Laurent Romary, Pedro Javier Ortiz Suárez. How OCR Performance can Impact on the Automatic Extraction of Dictionary Content Structures. 19th annual Conference and Members’ Meeting of the Text Encoding Initiative Consortium (TEI) - What is text, really? TEI and beyond, Sep 2019, Graz, Austria. ⟨hal-02263276⟩
  • Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot. Preparing the Dictionnaire Universel for Automatic Enrichment. 10th International Conference on Historical Lexicography and Lexicology (ICHLL), Jun 2019, Leeuwarden, Netherlands. ⟨hal-02131598⟩
  • Pedro Javier Ortiz Suárez, Benoît Sagot, Laurent Romary. Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures. 7th Workshop on the Challenges in the Management of Large Corpora (CMLC-7), Jul 2019, Cardiff, United Kingdom. ⟨10.14618/IDS-PUB-9021⟩. ⟨hal-02148693⟩

Preprints, working papers (1 document)

  • Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, et al. CamemBERT: a Tasty French Language Model. 2019. ⟨hal-02445946⟩