Perplexity as a tool to estimate proficiency level assignment in foreign-language writing

  1. Mata, Gadea
  2. Rubio, Julio
  3. Agustín Llach, María del Pilar
  4. Heras, Jonathan
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2023

Issue: 71

Pages: 29-38

Type: Article


Institutional repository: Open access (publisher version)

Abstract

Assigning proficiency levels to texts produced by language learners is a highly subjective task, which is why methods that evaluate writing automatically can help both teachers and learners. In this work, we explore two approaches using the CAES corpus, a collection of texts written by learners of Spanish and labelled with CEFR levels (up to C1). The first approach is a deep learning model, called Deep-ELE, that assigns proficiency levels to sentences. The second approach studies the perplexity of sentences written by learners at different levels and then uses it to classify the sentences into levels. Both approaches have been evaluated and can be used successfully to classify sentences by level; in particular, the Deep-ELE model achieves an accuracy of 81.3% and a QWK of 0.83. In conclusion, this work is a step towards understanding how natural language processing tools can help people learning a second language.
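
This record does not describe the architecture of Deep-ELE, but given the tooling cited in the references (Transformers, fastai/blurr), sentence-level classification with a fine-tuned transformer could be set up roughly as in the following sketch; the checkpoint name is hypothetical and is not the authors' released model.

```python
# Minimal sketch of sentence-level CEFR classification with a fine-tuned
# transformer. The architecture of Deep-ELE is not described in this record,
# so this only illustrates the general setup; the checkpoint name
# "deep-ele-checkpoint" is hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LEVELS = ["A1", "A2", "B1", "B2", "C1"]  # CEFR levels covered by CAES

tokenizer = AutoTokenizer.from_pretrained("deep-ele-checkpoint")
model = AutoModelForSequenceClassification.from_pretrained("deep-ele-checkpoint")

def predict_level(sentence: str) -> str:
    """Return the CEFR level with the highest classifier score."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LEVELS[int(logits.argmax(dim=-1))]

print(predict_level("Me gusta mucho la comida de mi país."))
```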
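
For the second approach, one common way to turn perplexity into a classifier is to train one language model per level and assign each sentence to the level whose model finds it most probable. The sketch below assumes KenLM (cited in the references) with one per-level n-gram model trained on CAES texts; it is an illustration, not the authors' exact procedure, and all paths are hypothetical.

```python
# Minimal sketch of perplexity-based level assignment: score a sentence with one
# n-gram language model per CEFR level and pick the level whose model is least
# "surprised". Assumes KenLM models have already been trained (e.g. with lmplz).
import kenlm

LEVEL_MODELS = {
    "A1": "models/a1.arpa",
    "A2": "models/a2.arpa",
    "B1": "models/b1.arpa",
    "B2": "models/b2.arpa",
    "C1": "models/c1.arpa",
}
models = {level: kenlm.Model(path) for level, path in LEVEL_MODELS.items()}

def predict_level_by_perplexity(sentence: str) -> str:
    """Return the CEFR level whose language model gives the lowest perplexity."""
    perplexities = {level: m.perplexity(sentence) for level, m in models.items()}
    return min(perplexities, key=perplexities.get)

print(predict_level_by_perplexity("ayer fui al cine con mis amigos"))
```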
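
The two reported figures, accuracy and Quadratic Weighted Kappa (QWK), can be computed from predictions with scikit-learn, which is not mentioned in the record and is used here only for illustration.

```python
# Sketch of how the two reported metrics can be computed with scikit-learn,
# once gold and predicted CEFR levels are encoded as ordered integers
# (A1=0, ..., C1=4). The labels below are made up for illustration only.
from sklearn.metrics import accuracy_score, cohen_kappa_score

y_true = [0, 1, 2, 3, 4, 2, 1]  # hypothetical gold levels
y_pred = [0, 1, 2, 4, 4, 2, 0]  # hypothetical predictions

print("accuracy:", accuracy_score(y_true, y_pred))
# Quadratic weighting penalises predictions further from the gold level.
print("QWK:", cohen_kappa_score(y_true, y_pred, weights="quadratic"))
```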

References

  • Burstein, J., J. Tetreault, and N. Madnani. 2013. The e-rater automated essay scoring system. In Handbook of Automated Essay Evaluation. Routledge, pages 55–67.
  • CAES. 2022. Corpus de aprendices de español (CAES). https://galvan.usc.es/caes/.
  • COE. 2021. CEFR: Common European Framework of Reference for Languages. Council of Europe. https://www.coe.int/en/web/common-european-framework-reference-languages.
  • Cotos, E. 2014. Genre-based automated writing evaluation for L2 research writing: From design to evaluation and enhancement. Macmillan.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.
  • Ding, H., Q. Zhong, S. Zhang, and L. Yang. 2021. Text difficulty classification by combining machine learning and language features. In The International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, pages 1055–1063. Springer.
  • Foltz, P. W., L. A. Streeter, K. E. Lochbaum, and T. K. Landauer. 2013. Implementation and applications of the Intelligent Essay Assessor. In Handbook of Automated Essay Evaluation. Routledge, pages 68–88.
  • Fu, J. 2020. Automatic Proficiency Evaluation of Spoken English by Japanese Learners for Dialogue-Based Language Learning System Based on Deep Learning. Ph.D. thesis, Tohoku University.
  • Gilliam, W. 2021. Blurr: A library that integrates huggingface transformers with version 2 of the fastai framework. https://github.com/ohmeow/blurr.
  • Hamp-Lyons, L., editor. 1991. Assessing second language writing in academic contexts. Ablex.
  • Hancke, J. and D. Meurers. 2013. Exploring CEFR classification for German based on rich linguistic modeling. Learner Corpus Research, pages 54–56.
  • Hao, T., X. Li, Y. He, F. L. Wang, and Y. Qu. 2022. Recent progress in leveraging deep learning methods for question answering. Neural Computing and Applications, pages 1–19.
  • Heafield, K. 2023. KenLM language model toolkit. https://kheafield.com/code/kenlm/.
  • Howard, J. and S. Gugger. 2020. Fastai: A layered API for deep learning. Information, 11:108.
  • Jacobs, H. L., S. A. Zinkgraf, D. R. Wormuth, V. F. Hartfiel, and J. B. Hughey. 1981. Testing ESL Composition: A Practical Approach. English Composition Program. Newbury House Publishers, Inc.
  • Jarvis, S., R. Alonso, and S. Crossley. 2019. Native language identification by human judges. In Cross-linguistic influence: From empirical evidence to classroom practice. Springer, pages 215–231.
  • Jarvis, S. and M. Paquot. 2015. Native language identification. Cambridge University Press.
  • Jurafsky, D. and J. H. Martin. 2021. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.
  • Kobayashi, A. and I. Wilson. 2020. Using deep learning to classify English native pronunciation level from acoustic information. In SHS Web of Conferences, volume 77, page 02004. EDP Sciences.
  • Kouris, P., G. Alexandridis, and A. Stafylopatis. 2021. Abstractive text summarization: Enhancing sequence-to-sequence models using word sense disambiguation and semantic content generalization. Computational Linguistics, 47(4):813–859.
  • Lab, T. L. A. 2023. English language learning: Evaluating language knowledge of ELL students from grades 8-12. https://www.kaggle.com/competitions/feedback-prize-english-language-learning.
  • Lim, K., J. Song, and J. Park. 2022. Neural automated writing evaluation for Korean L2 writing. Natural Language Engineering, pages 1–23.
  • Liu, Y., M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.
  • Malmasi, S., K. Evanini, A. Cahill, J. Tetreault, R. Pugh, C. Hamill, D. Napolitano, and Y. Qian. 2017. A report on the 2017 native language identification shared task. In 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 62–75. Association for Computational Linguistics.
  • Metallinou, A. and J. Cheng. 2014. Using deep neural networks to improve proficiency assessment for children English language learners. In Fifteenth Annual Conference of the International Speech Communication Association.
  • Minaee, S., N. Kalchbrenner, E. Cambria, N. Nikzad, M. Chenaghlu, and J. Gao. 2021. Deep learning-based text classification: a comprehensive review. ACM Computing Surveys (CSUR), 54(3):1–40.
  • Narayan, S. and C. Gardent. 2020. Deep learning approaches to text production. Synthesis Lectures on Human Language Technologies, 13(1):1–199.
  • Ney, H., U. Essen, and R. Kneser. 1994. On structuring probabilistic dependences in stochastic language modelling. Computer Speech & Language, 8(1):1–38.
  • Paszke, A., S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Köpf, E. Z. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., pages 8024–8035.
  • Polio, C. and H. Yoon. 2020. Exploring multiword combinations as measures of linguistic accuracy in second language writing. In Learner corpora and second language acquisition research. Cambridge University Press, pages 96–121.
  • Santos, R., J. Rodrigues, A. Branco, and R. Vaz. 2021. Neural text categorization with transformers for learning Portuguese as a second language. In EPIA Conference on Artificial Intelligence, pages 715–726. Springer.
  • Santucci, V., L. Forti, F. Santarelli, S. Spina, and A. Milani. 2020. Learning to classify text complexity for the Italian language using support vector machines. In International Conference on Computational Science and Its Applications, pages 367–376. Springer.
  • Shao, C., Y. Feng, J. Zhang, F. Meng, and J. Zhou. 2021. Sequence-level training for non-autoregressive neural machine translation. Computational Linguistics, 47(4):891–925.
  • Sharif Razavian, A., H. Azizpour, J. Sullivan, and S. Carlsson. 2014. CNN features off-the-shelf: An astounding baseline for recognition. In CVPRW'14, pages 512–519.
  • Sung, Y.-T., W.-C. Lin, S. B. Dyson, K.-E. Chang, and Y.-C. Chen. 2015. Leveling L2 texts through readability: Combining multilevel linguistic features with the CEFR. The Modern Language Journal, 99(2):371–391.
  • Takai, K., P. Heracleous, K. Yasuda, and A. Yoneyama. 2020. Deep learning-based automatic pronunciation assessment for second language learners. In International Conference on Human-Computer Interaction, pages 338–342. Springer.
  • Tunstall, L., L. von Werra, and T. Wolf. 2022. Natural language processing with transformers. O'Reilly Media, Inc.
  • Weigle, S. C. 2002. Assessing writing. Cambridge University Press.
  • Wolf, T., L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. Rush. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45. Association for Computational Linguistics.
  • Wolfe-Quintero, K., S. Inagaki, and H.-Y. Kim. 1998. Second language development in writing: Measures of fluency, accuracy, and complexity. University of Hawai'i Press.