Universal Dependencies of Old English: automatic parsing with a computational model of language

  1. Domínguez Barragán, Sara
Supervised by:
  1. Francisco Javier Martín Arista Director
  2. Ana Elvira Ojanguren López Director

Defence university: Universidad de La Rioja

Defence date: 19 February 2024

Committee:
  1. María Luisa Carrió Pastor Chair
  2. Juan Antonio Cutillas Espinosa Secretary
Department:
  1. Filologías Modernas
Doctoral Programme:
  1. Programa de Doctorado en Filología Inglesa por la Universidad de La Rioja

Type: Thesis

Institutional repository: Open access

Abstract

This thesis falls within the fields of Historical Linguistics, Corpus Linguistics and Natural Language Processing. More precisely, it deals with Old English morphology and syntax and aims to assess the accuracy of an automated model for annotating Old English within the framework of Universal Dependencies (de Marneffe et al. 2021). To this end, a gold-standard corpus of 25,000 words has been annotated manually with the tagsets of the Old English-Present-Day English parallel corpus ParCorOEv2 (Martín Arista et al. 2021). The dataset and the tagset of the gold-standard corpus have then been adapted to the requirements of Universal Dependencies with respect to tokenization, morphological annotation, syntactic annotation and the CoNLL-U format. In parallel, the raw dataset has been annotated automatically, and the output of automatic annotation has been compared with the results of manual annotation. This comparison covers both the method and the annotation itself. As for the method, it has been evaluated against a recently published approach, relative to which it shows higher accuracy and better overall performance. As regards the annotation, the main areas of error in the automatic assignment of features, categories and functions have been identified. The main conclusions point to the limitations of the model in the task of automatically annotating Old English with Universal Dependencies, and propose solutions to address these limitations.
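The comparison of automatic and manual annotation described above can be illustrated with a minimal sketch. It assumes that both annotations are available as CoNLL-U text with identical tokenization, and it computes three metrics commonly used for dependency parsing: UPOS accuracy, unlabelled attachment score (UAS) and labelled attachment score (LAS). The function names, the toy sentence and its analyses are hypothetical and are not taken from the thesis, whose actual evaluation procedure may differ.

```python
# Hypothetical sketch: scoring a predicted CoNLL-U annotation against a gold one.
# Assumes identical tokenization in both files; not the thesis's own procedure.

def read_conllu(text):
    """Return (form, upos, head, deprel) for every syntactic word in CoNLL-U text."""
    tokens = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        tokens.append((cols[1], cols[3], cols[6], cols[7]))
    return tokens


def evaluate(gold_text, pred_text):
    """Compute UPOS accuracy, UAS and LAS over aligned tokens."""
    gold, pred = read_conllu(gold_text), read_conllu(pred_text)
    if [g[0] for g in gold] != [p[0] for p in pred]:
        raise ValueError("gold and predicted tokenization differ")
    if not gold:
        raise ValueError("no tokens to evaluate")
    n = len(gold)
    upos = sum(g[1] == p[1] for g, p in zip(gold, pred))
    uas = sum(g[2] == p[2] for g, p in zip(gold, pred))
    las = sum(g[2] == p[2] and g[3] == p[3] for g, p in zip(gold, pred))
    return {"upos": upos / n, "uas": uas / n, "las": las / n}


def row(idx, form, upos, head, deprel):
    """Build one CoNLL-U line, leaving unused columns as underscores."""
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Made-up toy sentence ("se cyning rad"), for illustration only.
GOLD = "\n".join([
    "# text = se cyning rad",
    row(1, "se", "DET", 2, "det"),
    row(2, "cyning", "NOUN", 3, "nsubj"),
    row(3, "rad", "VERB", 0, "root"),
])
PRED = "\n".join([
    "# text = se cyning rad",
    row(1, "se", "PRON", 3, "det"),      # wrong UPOS and wrong head
    row(2, "cyning", "NOUN", 3, "obj"),  # correct head, wrong relation label
    row(3, "rad", "VERB", 0, "root"),
])

scores = evaluate(GOLD, PRED)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In the toy pair, one of the three tokens has a wrong head and another has a wrong relation label, so UAS is 2/3 and LAS is 1/3. Breaking such scores down by feature, category and function is one way to locate the main areas of error of an automatic annotator.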