Automatic Lemmatization of Old English Class III Strong Verbs (L-Y) with ALOEV3

  1. Roberto Torre Alonso
Journal:
Journal of English Studies

ISSN: 1576-6357

Year of publication: 2022

Issue: 20

Pages: 237-266

Type: Article

DOI: 10.18172/JES.5324 (open access, publisher)

Abstract

This article presents ALOEV3, a lemmatizer based on Morphological Generation that allows for the type-based automatic lemmatization of the Old English class III strong verbs beginning with the letters L-Y. The lemmatizer operates on the basis of the inflectional, derivational and morphophonological alternation rules specific to this class. The generated forms are checked against the two reference corpora of Old English, namely the Dictionary of Old English Corpus and the York-Toronto-Helsinki Parsed Corpus of Old English Prose, in order to validate their attestation and assign them the corresponding lemma. The results show that 97% of the generated forms are assigned a single lemma. The remaining generated inflectional forms (38 out of 1,256) show competition between two lemmas. This implies that, despite the lemmatizer's high degree of accuracy, token-based contextual disambiguation is still necessary. Nevertheless, lemma competition is restricted to a limited number of lemma pairs and their derivatives. Although focused on only one class of verbs, this research confirms that exploring automatic lemmatization processes contributes to the field of Old English lexicography, either through the automatic lemmatization of attested forms or through the identification of grey areas that require manual revision.
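
The generate-then-validate workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the ALOEV3 implementation: the ablaut pattern, the ge- rule, the mini-lexicon and the set of "attested" types are all simplified or invented assumptions, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy ablaut pattern (i - a - u - u) and endings for four paradigm slots.
# A deliberate simplification, not the rule set used in the article.
ABLAUT = {"inf": "i", "pret_sg": "a", "pret_pl": "u", "pa_part": "u"}
ENDINGS = {"inf": "an", "pret_sg": "", "pret_pl": "on", "pa_part": "en"}


def generate_forms(prefix, onset, coda):
    """Generate a few candidate inflectional forms for one lemma (hypothetical rules)."""
    forms = set()
    for slot, vowel in ABLAUT.items():
        form = prefix + onset + vowel + coda + ENDINGS[slot]
        forms.add(form)
        # Assumption: unprefixed verbs may also take ge- in the past participle.
        if slot == "pa_part" and not prefix:
            forms.add("ge" + form)
    return forms


def lemmatize(lexicon, attested):
    """Map each generated form that is attested to the set of lemmas producing it."""
    index = defaultdict(set)
    for lemma, (prefix, onset, coda) in lexicon.items():
        for form in generate_forms(prefix, onset, coda):
            if form in attested:
                index[form].add(lemma)
    return dict(index)


# Hypothetical mini-lexicon: lemma -> (prefix, onset, coda).
LEXICON = {
    "singan": ("", "s", "ng"),
    "gesingan": ("ge", "s", "ng"),
    "springan": ("", "spr", "ng"),
}
# Invented stand-in for the word types extracted from a reference corpus.
ATTESTED = {"singan", "sang", "sungon", "gesungen", "sprang", "sprungen", "gesang"}

index = lemmatize(LEXICON, ATTESTED)
# Forms with exactly one candidate lemma are lemmatized automatically.
unique = {form: lemmas for form, lemmas in index.items() if len(lemmas) == 1}
# Forms with competing lemmas are left for token-based disambiguation.
ambiguous = {form: lemmas for form, lemmas in index.items() if len(lemmas) > 1}
```

In this toy run the only ambiguous form is "gesungen", which both "singan" and its ge- derivative generate, mirroring the kind of competition between a lemma and its derivatives that the abstract mentions. With the figures reported in the abstract, 38 ambiguous forms out of 1,256 leave 1,218 forms, about 97%, with a single lemma.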

References

  • (1899). The Holy Bible Translated from the Latin Vulgate (Douay Rheims Version). London: Tan books and publishers.
  • Biber, D., S. Conrad and R. Reppen. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
  • Bosworth, J. and T.N. Toller. 1973 (1898). An Anglo-Saxon Dictionary. Oxford: Oxford University Press.
  • Brunner, K. 1965. Altenglische Grammatik, nach der Angelsächsischen Grammatik von Eduard Sievers. Berlin: Max Niemeyer.
  • Campbell, A. 1987 (1959). Old English Grammar. Oxford: Oxford University Press.
  • Clark Hall, J. R. 1996. A Concise Anglo-Saxon Dictionary. Supplement by Herbert D. Merritt. Toronto: University of Toronto Press.
  • Ferrés, D., A. AbuRa’ed and H. Saggion. 2017. “Spanish morphological generation with wide-coverage lexicons and decision trees”. Procesamiento del Lenguaje Natural 58: 109-116.
  • Forcada, M.L., M. Ginestí-Rosell, J. Nordfalk, J. O’Regan, S. Ortiz-Rojas, J.A. Pérez-Ortiz, F. Sánchez-Martínez, G. Ramírez-Sánchez and F. Tyers. 2011. “Apertium: A free/open-source platform for translation”. Machine Translation 25 (2): 127-144.
  • García Fernández, L. 2020. Lemmatising Old English on a Relational Database. Preterite-Present, Contracted, Anomalous and Strong VII Verbs. Munich: Utzverlag.
  • García García, L. 2019. “The basic valency orientation of Old English and the causative ja- formation: A synchronic and diachronic approach”. English Language and Linguistics 24 (1): 153-177. http://doi.org/10.1017/S1360674318000345
  • García García, L. and E. Ruiz Narbona. 2021. “Lability in Old English verbs: Chronological and textual distribution”. Anglia: Journal of English Philology 139 (2): 283-326. http://doi.org/10.1515/ang-2021-0022
  • Healey, A., ed., with J. Price and X. Xiang. 2009. The Dictionary of Old English Web Corpus. Dictionary of Old English Project, Centre for Medieval Studies, University of Toronto.
  • Healey, A., ed. 2018. The Dictionary of Old English: A to I. Dictionary of Old English Project, Centre for Medieval Studies, University of Toronto.
  • Hogg, R. and R.D. Fulk. 2011. A Grammar of Old English. Oxford: Wiley-Blackwell.
  • Hostetter, A.K. n.d. The Old English Narrative Poetry Project: Beowulf. Accessed September 2. https://oldenglishpoetry.camden.rutgers.edu/beowulf/
  • Kastovsky, D. 1992. “Semantics and vocabulary”. The Cambridge History of the English Language I: The Beginnings to 1066. Ed. R. Hogg. Cambridge: Cambridge University Press. 290-408.
  • Khemakhem, A., B. Gargouri, A. Ben Hamadou and G. Francopoulo. 2015. “ISO standard modeling of a large Arabic dictionary”. Journal of Natural Language Engineering 22 (6): 849-879.
  • Krygier, M. 1994. The Disintegration of the English Strong Verb System. Frankfurt am Main: Peter Lang.
  • Levin, S.R. 1964. “A reclassification of the Old English strong verbs”. Language 40 (2): 156-161.
  • Manjavacas, E., A. Kádár and M. Kestemont. 2019. “Improving lemmatization of non-standard languages with joint”. Proceedings of NAACL-HLT 2019. Eds. J. Burstein, C. Doran and T. Solorio. Minneapolis: ACL. 1493-1503.
  • Martín Arista, J. 2012a. “Lexical database, derivational map and 3D representation”. RESLA-Revista Española de Lingüística Aplicada (Extra 1): 119-144.
  • Martín Arista, J. 2012b. “The Old English prefix ge-: A panchronic reappraisal”. Australian Journal of Linguistics 32 (4): 411-433. http://doi.org/10.1080/07268602.2012.744264.
  • Martín Arista, J. 2013a. “Recursivity, derivational depth and the search for Old English lexical primes”. Studia Neophilologica 85(1): 1-21. http://doi.org/10.1080/00393274.2013.771829.
  • Martín Arista, J. 2013b. Nerthus. Lexical database of Old English: From word-formation to meaning construction. Research Seminar, School of English, University of Sheffield.
  • Martín Arista, J., ed. 2016. NerthusV3. Online Lexical Database of Old English. Nerthus Project, Universidad de La Rioja. www.nerthusproject.com.
  • Martín Arista, J. 2017a. “El paradigma derivativo del inglés antiguo”. Onomázein 37: 144-169.
  • Martín Arista, J. 2017b. The design and implementation of a pilot parallel corpus of Old English. Paper presented at the SHELL Session of the 2017 International Medieval Conference, University of Leeds, July 4.
  • Martín Arista, J. 2018. “The semantic poles of Old English. Toward the 3D representation of complex polysemy”. Digital Scholarship in the Humanities 33 (1): 96-111.
  • Martín Arista, J. 2019. “Another look at Old English zero derivation and alternations”. Atlantis 41 (1): 163-182.
  • Martín Arista, J. 2020a. “Old English rejoice verbs. Derivation, grammatical behaviour and class membership”. Poetica 93: 133-153.
  • Martín Arista, J. 2020b. “Further remarks on the deflexion and grammaticalization of the Old English past participle with habban”. International Journal of English Studies 20 (1): 51-71.
  • Martín Arista, J. 2021a. “The syntax and semantics of the Old English predicative construction”. Language Change and Linguistic Theory in the 21st Century. Eds. N. Lavidas and K. Nikiforidou. Leiden: Brill. Forthcoming.
  • Martín Arista, J. 2021b. “Word alignment in a parallel corpus of Old English prose. From asymmetry to inter-syntactic annotation”. Corpora in Translation Research: Recent Advances and Applications. Eds. J. Lavid-López, C. Maíz-Arévalo and J.R. Zamorano. Amsterdam: John Benjamins. 76-100.
  • Martín Arista, J. and A.E. Ojanguren López. 2018. Doing electronic lexicography of Old English with a knowledge-base. Workshop delivered at the Consolidated Library of Anglo-Saxon Poetry (CLASP) Project, Faculty of English Language and Literature of the University of Oxford.
  • Martín Arista, J., S. Domínguez Barragán, L. García Fernández, E. Ruiz Narbona, R. Torre Alonso, and R. Vea Escarza (Comp.) 2021. ParCorOEv2. An Open Access Annotated Parallel Corpus Old English-English. Nerthus Project, Universidad de La Rioja. www.nerthusproject.com.
  • Mateo Mendaza, R. 2014. “The Old English adjectival affixes ful- and -ful: a text-based account on productivity”. NOWELE-North-Western European Language Evolution 67 (1): 77-94. http://doi.org/10.1075/nowele.67.1
  • Mateo Mendaza, R. 2015. “Matching productivity indexes and diachronic evolution. The Old English affixes ful-, -isc, -cund and -ful”. Canadian Journal of Linguistics 60 (1): 1-24.
  • Mateo Mendaza, R. 2016a. “The search for Old English semantic primes: The case of HAPPEN”. Nordic Journal of English Studies 15: 71-99.
  • Mateo Mendaza, R. 2016b. “The Old English exponent for the semantic prime MOVE”. Australian Journal of Linguistics 34 (4): 542-559. http://doi.org/10.1080/07268602.2016.1169976
  • von Mengden, F. 2011. “Ablaut or transfixation? On the Old English strong verbs”. More than Words: English Lexicography and Lexicology Past and Present. Eds. R. Bauer and U. Krischke. Frankfurt am Main: Peter Lang. 123-139.
  • Metola Rodríguez, D. 2015. Lemmatisation of Old English strong verbs on a lexical database. Unpublished Ph. D. thesis. University of La Rioja, Spain.
  • Metola Rodríguez, D. 2017. “Strong verb lemmas from a corpus of Old English. Advances and issues”. Revista de Lingüística y Lenguas Aplicadas 12: 65-76.
  • Novo Urraca, C. 2015. “Old English deadjectival paradigms. Productivity and recursivity”. NOWELE-North-Western European Language Evolution 68 (1): 61-80.
  • Novo Urraca, C. 2016a. “Old English suffixation. Content and transposition”. English Studies 97: 638-655.
  • Novo Urraca, C. 2016b. “Morphological relatedness and the typology of adjectival formation in Old English”. Studia Neophilologica 88 (1): 43-55.
  • Novo Urraca, C. and A. E. Ojanguren López. 2018. “Lemmatising treebanks. Corpus annotation with knowledge bases”. RAEL: Revista electrónica de Lingüística Aplicada 17 (1): 99-120.
  • Oflazer, K. and M. Saraçlar, eds. 2018. Turkish Natural Language Processing. Cham: Springer. http://doi.org/10.1007/978-3-319-90165-7
  • Ojanguren López, A. E. 2020. “The semantics and syntax of Old English end verbs”. Atlantis 42 (1): 163-188.
  • Pintzuk, S. and L. Plug. 2001. The York-Helsinki Parsed Corpus of Old English Poetry. Department of Language and Linguistic Science, University of York.
  • Reiter, E. and R. Dale. 1997. “Building applied natural language generation systems”. Natural Language Engineering 3 (1): 57-87. http://doi.org/10.1017/S1351324997001502
  • Rissanen, M., M. Kytö, L. Kahlas-Tarkka, M. Kilpiö, S. Nevanlinna, I. Taavitsainen, T. Nevalainen, and H. Raumolin-Brunberg, comp. 1991. The Helsinki Corpus of English Texts. Department of English, University of Helsinki.
  • Riyeff, J. 2017. The Old English rule of Saint Benedict with related Old English texts. Collegeville: Liturgical Press.
  • Taylor, A., A. Warner, S. Pintzuk and F. Beths. 2003. The York-Toronto-Helsinki Parsed Corpus of Old English Prose. York: University of York.
  • Tapsai, C., H. Unger and P. Meesad. 2021. Thai Natural Language Processing. Cham: Springer.
  • Tío Sáenz, M. 2019. The lemmatisation of Old English Weak verbs on a relational database. Unpublished Ph. D. thesis. University of La Rioja, Spain.
  • Vea Escarza, R. 2013. “Old English adjectival affixation. Structure and function”. Studia Anglica Posnaniensia 48 (2-3): 5-25.
  • Vea Escarza, R. 2016a. “Recursivity and inheritance in the formation of Old English nouns and adjectives”. Studia Neophilologica 88: 1-23. http://doi.org/10.1080/00393274.2015.1049830
  • Vea Escarza, R. 2016b. “Old English affixation. A structural-functional analysis”. Nordic Journal of English Studies 15 (1): 101-119.
  • Vea Escarza, R. 2018. “Las funciones y categorías de los nombres y adjetivos afijados del inglés antiguo”. Onomázein 41: 208-226.