Analyse automatique du grec ancien par réseau de neurones. Évaluation sur le corpus De Thessalonica Capta

Authors

  • Bastien Kindt
  • Chahan Vidal-Gorène
  • Saulo Delle Donne

DOI:

https://doi.org/10.14428/babelao.vol1011.2022.65073

Keywords:

Natural Language Processing (NLP), Lemmatisation, POS-tagging, Ancient Greek, John Anagnostes, Eustathios of Thessalonike, John Kaminiates

Abstract

The De Thessalonica Capta (DTC) corpus brings together historical texts written in Greek during the Byzantine period. These texts were first analyzed semi-automatically (lemmatization and POS-tagging) using the software tools and linguistic resources of the GREgORI project (UCLouvain, Louvain-la-Neuve, Belgium), which specializes in the NLP of Greek and the languages of the Christian East. A second analysis, performed by a neural network, was carried out in collaboration with the company Calfa (Paris, France), which develops NLP tools for Armenian and implements approaches based on artificial intelligence. This study compares and evaluates the results produced by the two methods and proposes a hybrid approach for processing the languages concerned.
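The kind of comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Analysis` type, the `accuracy` and `hybrid` helpers, the fallback rule (keep the lexicon-based analysis when one exists, otherwise take the neural one), and the three-token toy sample. None of it reproduces the GREgORI or Calfa tools, the hybrid approach actually proposed in the study, or any of its results.

```python
from typing import List, NamedTuple, Optional, Tuple


class Analysis(NamedTuple):
    """Hypothetical container for one token's analysis."""
    lemma: str
    pos: str


def accuracy(predicted: List[Optional[Analysis]],
             gold: List[Analysis]) -> Tuple[float, float]:
    """Return (lemma accuracy, POS accuracy) over aligned token lists.

    A missing analysis (None) counts as an error for both measures.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must be aligned token by token")
    if not gold:
        return 0.0, 0.0
    lemma_hits = sum(p is not None and p.lemma == g.lemma
                     for p, g in zip(predicted, gold))
    pos_hits = sum(p is not None and p.pos == g.pos
                   for p, g in zip(predicted, gold))
    return lemma_hits / len(gold), pos_hits / len(gold)


def hybrid(rule_based: List[Optional[Analysis]],
           neural: List[Optional[Analysis]]) -> List[Optional[Analysis]]:
    """Assumed fallback rule: prefer the lexicon-based analysis, else the neural one."""
    return [r if r is not None else n for r, n in zip(rule_based, neural)]


# Toy sample for the tokens τὴν / πόλιν / εἷλον (invented, not from the corpus).
gold = [Analysis("ὁ", "DET"), Analysis("πόλις", "NOUN"), Analysis("αἱρέω", "VERB")]
# The lexicon-based pass has no entry for the second token.
rule_based = [Analysis("ὁ", "DET"), None, Analysis("αἱρέω", "VERB")]
# The neural pass tags every token but mislemmatizes the verb.
neural = [Analysis("ὁ", "DET"), Analysis("πόλις", "NOUN"), Analysis("εἷλον", "VERB")]

for name, output in [("rule-based", rule_based),
                     ("neural", neural),
                     ("hybrid", hybrid(rule_based, neural))]:
    lemma_acc, pos_acc = accuracy(output, gold)
    print(f"{name}: lemma={lemma_acc:.2f} pos={pos_acc:.2f}")
```

In this toy sample the two passes fail on different tokens, which is the situation in which combining them could help. Whether that holds for the DTC corpus is what the study itself evaluates.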

Published

2022-02-24

How to Cite

[1]
B. Kindt, C. Vidal-Gorène, and S. Delle Donne, “Analyse automatique du grec ancien par réseau de neurones. Évaluation sur le corpus De Thessalonica Capta”, BABELAO, vol. 10-11, pp. 537–562, Feb. 2022.

Section

Miscellanea