The role of metrics to assess the quality of British teenage language translation into Spanish and Italian using machine translation tools
- Canga Alonso, Andrés 1
- Napoletano, Maria Cira 1
1 Universidad de La Rioja
ISSN: 2175-7968, 1414-526X
Year of publication: 2025
Volume: 45
Issue: 1
Type: Article
Journal: Cadernos de tradução
Abstract
The rapid evolution of adolescent language, characterized by slang and idiomatic expressions, presents a significant challenge for machine translation (MT) systems. Existing research has extensively covered the translation of language in general; however, there remains a gap in understanding how these systems perform when faced with adolescent language. This study aims to (i) evaluate and compare the accuracy of translations of colloquial language from English into Spanish and Italian by Bing Translator, DeepL and HelsinkiNLP, (ii) examine the validity and reliability of two metrics (BLEU and METEOR) for assessing the accuracy and quality of MT tools with informal language, and (iii) analyse how specific features of teenage slang influence the ability of online tools to generate precise and comprehensible translations. To this end, 1000-character excerpts from the Linguistic Innovators Corpus were translated into Spanish and Italian using DeepL, Bing Translator, and HelsinkiNLP and assessed using BLEU and METEOR to verify their quality and reliability. Our findings show that teenage slang poses challenges for all tools, particularly with phrasal verbs and idioms. Our results also suggest that METEOR seems to be the more reliable metric for assessing translations of British teenage language into Spanish and Italian.
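As a rough illustration of what the two metrics measure, the sketch below computes a simplified sentence-level BLEU (clipped n-gram precision with a brevity penalty, up to bigrams) and the recall-weighted unigram F-mean that forms the core of METEOR. It is not the authors' evaluation pipeline, which the abstract does not describe: the function names, the `max_n=2` and `alpha=0.9` settings, and the example sentences are all assumptions made for illustration, and full METEOR additionally uses stemming, synonym matching and a fragmentation penalty that are omitted here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) multiplied by the brevity penalty. No smoothing."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_sum += math.log(overlap / total) / max_n
    # Brevity penalty: candidates shorter than the reference are penalised.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_sum)


def unigram_fmean(candidate, reference, alpha=0.9):
    """Exact-match unigram F-mean weighted toward recall.
    With alpha = 0.9 this equals 10PR / (R + 9P)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    matches = sum((Counter(cand) & Counter(ref)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(cand)
    recall = matches / len(ref)
    return precision * recall / (alpha * precision + (1 - alpha) * recall)


# Hypothetical example (not taken from the Linguistic Innovators Corpus):
# a literal MT rendering compared against an idiomatic human reference.
reference = "estoy harto de este sitio"
candidate = "estoy enfermo de este lugar"

print(f"BLEU (simplified): {simple_bleu(candidate, reference):.3f}")
print(f"Unigram F-mean:    {unigram_fmean(candidate, reference):.3f}")
```

Because BLEU multiplies precisions across n-gram orders, a single mistranslated slang word breaks every n-gram that contains it, whereas the unigram F-mean only loses the one word. This difference is one plausible reason the two metrics can rank the same output differently on informal text.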