Abstract
Since the Covid-19 pandemic, digital technologies have been more present in classrooms than ever. Automatic Speech Recognition (ASR) offers interesting opportunities for foreign language (FL) learners to increase their oral production. ASR is especially well suited to autonomous pronunciation learning when used as a dictation tool that transcribes the learner's speech (MCCROCKLIN, 2016). However, ASR tools are trained with monolingual native speakers in mind, which does not reflect the reality of English speakers on a global scale. The present study therefore examined how well two ASR-based dictation tools understand foreign-accented speech and which features cause intelligibility breakdowns. English speech samples from 15 Brazilian Portuguese speakers and 15 Spanish speakers were obtained from an online database (WEINBERGER, 2015) and submitted to two ASR tools: Microsoft Word and VoiceNotebook. The transcriptions were manually inspected, coded, and categorized. The results show that the speakers' overall intelligibility was high for both tools. However, many normal features of FL speech, such as vowel and consonant modifications, caused the ASR dictation tools to misinterpret the message, leading to communication breakdowns. The results are discussed from a pedagogical perspective.
References
ASHWELL, T.; ELAM, J. R. (2017). How accurately can the Google Web Speech API recognize and transcribe Japanese L2 English learners’ oral production? JALT CALL Journal, v. 13, n. 1, p. 59-76.
BASHORI, M. et al. (2020). Web-based language learning and speaking anxiety. Computer Assisted Language Learning, v. 0, n. 0, p. 1-32.
BASHORI, M. et al. (2021). Effects of ASR-based websites on EFL learners’ vocabulary, speaking anxiety, and language enjoyment. System, v. 99, n. April, p. 102496.
CARLET, A.; KIVISTÖ DE SOUZA, H. (2018). Improving L2 pronunciation inside and outside the classroom: Perception, production and autonomous learning of L2 vowels. Ilha do Desterro, v. 71, n. 3, p. 99-123.
CHEN, H. H. J. (2011). Developing and evaluating an oral skills training website supported by automatic speech recognition technology. ReCALL, v. 23, n. 1, p. 59-78.
CUCCHIARINI, C.; NERI, A.; STRIK, H. (2009). Oral proficiency training in Dutch L2: The contribution of ASR-based corrective feedback. Speech Communication, v. 51, n. 10, p. 853-863.
CUCCHIARINI, C.; STRIK, H. (2018). Automatic Speech Recognition for second language pronunciation training. In: The Routledge handbook of contemporary English pronunciation. Routledge. p. 556-569.
DERWING, T. (2010). Utopian goals for pronunciation teaching. (J. Levis, K. LeVelle, Eds.) In: 1st Pronunciation in Second Language Learning and Teaching Conference. Proceedings... Ames, IA: Iowa State University.
DERWING, T.; MUNRO, M. (1997). Accent, intelligibility, and comprehensibility: Evidence from Four L1s. Studies in Second Language Acquisition, v. 19, n. 1, p. 1-16.
DIZON, G.; TANG, D. (2020). Intelligent personal assistants for autonomous second language learning: An investigation of Alexa. JALT CALL Journal, v. 16, n. 2, p. 107-120.
GASS, S.; VARONIS, E. M. (1984). The effect of familiarity on the comprehensibility of nonnative speech. Language Learning, v. 34, n. 1, p. 65-87.
GOLONKA, E. M. et al. (2014). Technologies for foreign language learning: A review of technology types and their effectiveness. Computer Assisted Language Learning, v. 27, n. 1, p. 70-105.
HENRICHSEN, L. E. (2020). An Illustrated Taxonomy of Online CAPT Resources. RELC Journal, v. 52, n. 1, p. 179-188.
INCEOGLU, S.; LIM, H.; CHEN, W. H. (2020). ASR for EFL pronunciation practice: Segmental development and learners’ beliefs. Journal of Asia TEFL, v. 17, n. 3, p. 824-840.
JENKINS, J. (2002). A sociolinguistically based, empirically researched pronunciation syllabus for English as an international language. Applied Linguistics, v. 23, n. 1, p. 83-103.
JENKINS, J.; COGO, A.; DEWEY, M. (2011). Review of developments in research into English as a lingua franca. Language Teaching, v. 44, n. 3, p. 281-315.
JOHNSON, E.; JUSCZYK, P. (2001). Word segmentation by 8-month-olds: When speech cues count more than statistics. Journal of Memory and Language, v. 44, p. 548-567.
JURAFSKY, D.; MARTIN, J. H. (2021). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (3rd ed., Unpublished Manuscript). Available at: <https://web.stanford.edu/~jurafsky/slp3/ed3book_sep212021.pdf>. Accessed: Nov, 1st 2021.
KENNEDY, S.; TROFIMOVICH, P. (2008). Intelligibility, comprehensibility, and accentedness of L2 speech: The role of listener experience and semantic context. Canadian Modern Language Review, v. 64, n. 3, p. 459-489.
KIM, I. S. (2006). Automatic speech recognition: Reliability and pedagogical implications for teaching pronunciation. Educational Technology and Society, v. 9, n. 1, p. 322-334.
KIVISTÖ DE SOUZA, H.; MORA, J. C. (2012). Speech rate effects on L2 vowel production and perception. In: CELSUL, 2012, Cascavel, Paraná. Anais do X Encontro do CELSUL-Círculo de Estudos Linguísticos do Sul. Cascavel.
KNILL, K. M. et al. (2018). Impact of ASR performance on free speaking language assessment. In: Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings… v. 2018- Septe, p. 1641-1645.
LEVIS, J. M. (2005). Changing contexts and shifting paradigms in pronunciation teaching. TESOL Quarterly, v. 39, n. 3, p. 369-377.
LEVIS, J.; SUVOROV, R. (2013). Automatic Speech Recognition. In: CHAPELLE, C. A. (Ed.). The encyclopedia of applied linguistics. New York: Wiley-Blackwell. p. 316-323.
LIAKIN, D.; CARDOSO, W.; LIAKINA, N. (2015). Learning L2 pronunciation with a mobile speech recognizer: French/y/. CALICO Journal, v. 32, n. 1, p. 1-25.
LIAKIN, D.; CARDOSO, W.; LIAKINA, N. (2017). Mobilizing Instruction in a Second-Language Context: Learners’ Perceptions of Two Speech Technologies. Languages, v. 2, n. 3, p. 11.
LYSTER, R.; SAITO, K. (2010). Oral feedback in classroom SLA: A Meta-Analysis. Studies in Second Language Acquisition, v. 32, n. 2, p. 265-302.
MCCROCKLIN, S. (2019a). Dictation programs for second language pronunciation learning: Perceptions of the transcript, strategy use and improvement. v. 7, n. 2, p. 137-157.
MCCROCKLIN, S.; EDALATISHAMS, I. (2020). Revisiting Popular Speech Recognition Software for ESL Speech. TESOL Quarterly, v. 54, n. 4, p. 1086-1097.
MCCROCKLIN, S. M. (2014). Dictation programs for pronunciation learner empowerment. In: 5th Pronunciation in Second Language Learning and Teaching Conference. Proceedings… n. September, p. 30-39.
MCCROCKLIN, S. M. (2016). Pronunciation learner autonomy: The potential of Automatic Speech Recognition. System, v. 57, n. April 2016, p. 25-42.
MICROSOFT. (2021). Dictate Your Documents in Word. Available at: <https://support.microsoft.com/en-us/office/dictate-your-documents-in-word-3876e05f-3fcc-418f-b8ab-db7ce0d11d3c#Tab=Web>. Accessed: Nov, 1st 2021.
MORA, J. C. (2005). Lexical knowledge effects on the discrimination of non-native phonemic contrasts in words and nonwords by Spanish/Catalan bilingual learners of English. In: ISCA Workshop on Plasticity in Speech Perception.
MROZ, A. (2018). Seeing how people hear you: French learners experiencing intelligibility through automatic speech recognition. Foreign Language Annals, v. 51, n. 3, p. 617-637.
MUNRO, M. (1998). The effects of noise on the intelligibility of foreign-accented speech. Studies in Second Language Acquisition, v. 20, n. 2, p. 139-154.
MUNRO, M. J.; DERWING, T. M. (1995). Foreign accent, comprehensibility, and intelligibility in the speech of second language learners. Language Learning, v. 45, n. 1, p. 73-97.
MUNRO, M. J.; DERWING, T. M. (2015). Intelligibility in research and practice: Teaching priorities. In: REED, M.; LEVIS, J. M. (Eds.). The Handbook of English Pronunciation. Wiley Online Library. p. 375-396.
NAGLE, C. L.; HUENSCH, A. (2020). Expanding the scope of L2 intelligibility research: Intelligibility, comprehensibility, and accentedness in L2 Spanish. Journal of Second Language Pronunciation. 6.
NERI, A.; CUCCHIARINI, C.; STRIK, H. (2006). ASR-based corrective feedback on pronunciation: Does it really work? INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP, v. 4, n. May 2014, p. 1982-1985.
NERI, A.; CUCCHIARINI, C.; STRIK, H. (2008). The effectiveness of computer-based speech corrective feedback for improving segmental quality in L2 Dutch. ReCALL, v. 20, n. 2, p. 225-243.
ROGERSON-REVELL, P. M. (2021). Computer-Assisted Pronunciation Training (CAPT): Current Issues and Future Directions. RELC Journal, v. 52, n. 1, p. 189-205.
VOICENOTEBOOK. (2021). Voice Notebook Homepage. Available at: <https://voicenotebook.com>. Accessed: Nov, 1st 2021.
WEINBERGER, S. (2015). Speech Accent Archive. George Mason University. Available at: <http://accent.gmu.edu>. Accessed: Nov, 1st 2021.
YOSHIDA, M. T. (2018). Choosing technology tools to meet pronunciation teaching and learning goals. The CATESOL Journal, v. 30, n. 1, p. 195-212.
YU, D.; DENG, L. (2015). Automatic Speech Recognition: A Deep Learning Approach. London: Springer.
ZIELINSKI, B. W. (2008). The listener: No longer the silent partner in reduced intelligibility. System, v. 36, n. 1, p. 69-84.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2022 Trabalhos em Linguística Aplicada