THESIS

LEXVET-ESP: DEVELOPING A NEW VOCABULARY TEST AND THE INTEGRATION OF VR FOR SPECIALIZED VOCABULARY ACQUISITION

Submitted by
Paula Izquierdo García
Department of Languages, Literatures and Cultures

In partial fulfillment of the requirements
For the Degree of Master of Arts
Colorado State University
Fort Collins, Colorado
Summer 2025

Master’s Committee:
Advisor: Alyssia Miller De Rutté
Shannon Zeller
Lily Edwards-Callaway

Copyright by Paula Izquierdo García 2025
All Rights Reserved

ABSTRACT

The number of Spanish speakers in the U.S. continues to increase, leading to a growing population of individuals who do not speak English as their primary language and for whom access to various services, including veterinary care, can be difficult. To address this issue, Languages for Specific Purposes (LSP) courses have been developed to train (future) professionals to speak in their clients’ preferred language. Alongside the rising popularity of LSP courses, technological advancements have also been implemented in language education. Virtual reality (VR) platforms and artificial intelligence (AI) are two examples of relevant tools in second language learning, as they allow for the integration of immersive and interactive experiences. This thesis aimed to combine LSP, particularly in Spanish for Veterinary Medicine, and technology to understand the potential effects on language learners’ vocabulary acquisition and retention. The thesis was divided into two projects. The first project focused on developing an assessment tool to test learners’ receptive vocabulary knowledge, defined as words that learners can recognize even if they cannot yet define or use them in different contexts. This assessment builds upon previous studies, which created validated and reliable instruments for measuring vocabulary and proficiency levels.
A specialized vocabulary test for Veterinary Spanish (known as the LexVet-Esp) was developed and validated. The second project implemented a VR/AI platform as part of a Spanish for Veterinary Medicine course and used the LexVet-Esp to assess the impact of the technology on vocabulary acquisition and retention and whether explicit or implicit vocabulary exercises contributed to language development. To examine this, students completed the vocabulary test four times. The first test, administered during week nine of a sixteen-week semester, served as the pre-test, as participants had not yet interacted with the VR/AI platform. Students took the test again in week twelve after completing three weeks of explicit vocabulary exercises on the platform and again in week fifteen after completing implicit vocabulary activities. Finally, a fourth test was conducted three months after the end of the course to determine long-term vocabulary retention. The results did not show significant differences across the tests, which may have been influenced by the short length of the study (six weeks) and the small sample size (n=15). Despite the lack of statistically significant differences across the tests, results indicated varying complexities related to the vocabulary acquisition and retention processes. Pedagogical implications and future research opportunities are discussed.

TABLE OF CONTENTS

ABSTRACT
CHAPTER 1: INTRODUCTION
  1.1 Chapter Overview
CHAPTER 2: LITERATURE REVIEW
  2.1 Language for Specific Purposes
    2.1.1 Spanish for Specific Purposes
      2.1.1.1 Spanish for Doctor of Veterinary Medicine Students
    2.1.2 Language Needs Analysis
    2.1.3 Task-Based Language Teaching
      2.1.3.1 LNAs, TBLT, and LSP
  2.2 Technology in Language Education
    2.2.1 VR and Language Education
      2.2.1.1 Grounded Cognition and Vocabulary Acquisition
    2.2.2 AI and Language Education
    2.2.3 VR and AI in LSP
    2.2.4 VR and AI in the Veterinary Field
  2.3 Vocabulary Acquisition in Language Learning
    2.3.1 Vocabulary Assessments
  2.4 Research Questions
CHAPTER 3: THE DESIGN AND VALIDATION OF THE LEXVET-ESP TEST
  3.1 Introduction to Project 1
  3.2 Methodology
    3.2.1 Materials
    3.2.2 Procedure and Participants
    3.2.3 Data Analysis
  3.3 Results
    3.3.1 Point-biserial Correlation
    3.3.2 Item Response Theory
    3.3.3 Comparisons across proficiency levels
  3.4 Discussion
    3.4.1 Comparison of the LexVet-Esp with LexITA and Lextale-Esp
    3.4.2 Addressing the Limitations of Point-Biserial Correlations with IRT
    3.4.3 Differentiation of Proficiency Levels
    3.4.4 Limitations
CHAPTER 4: MEASURING VOCABULARY ACQUISITION WITH LEXVET-ESP IN A VIRTUAL REALITY ENVIRONMENT
  4.1 Introduction to Project
  4.2 Methodology
    4.2.1 Participants and Context
    4.2.2 Materials
    4.2.3 Procedure
    4.2.4 Data Analysis
  4.3 Results
    4.3.1 Assessing Performance Consistency and Retention Over Time
  4.4 Discussion
    4.4.1 Impact of VR and AI on Vocabulary Acquisition
    4.4.2 Implication for Educational Practice
    4.4.3 Limitations and Future Research
CHAPTER 5: DISCUSSION AND CONCLUSION
  5.1 Validation of LexVet-Esp
  5.2 Considerations for VR, AI, and Vocabulary Retention
  5.3 Future Research
REFERENCES
APPENDICES
  Appendix A: Complete list of 180 terms used to develop the LexVet-Esp
  Appendix B: Part 1, Instructions, and Example Included in the LexVet-Esp

CHAPTER 1: INTRODUCTION

The world is becoming increasingly globalized, leading to a rise in international migration. This phenomenon often results in people relocating to countries where they do not speak the primary language. The United States (U.S.) is no exception, and U.S. immigrants often find employment in a variety of sectors, where language barriers can significantly impact their daily lives and work experiences. An examination of U.S. occupational data shows that 36.2% of immigrants are classified as being in management, business, sciences, and arts roles, while 21.3% are in the service sector (U.S. Census Bureau, 2024c). When considering the educational level of this population, it is important to note that only 14.9% have a graduate or professional degree, in contrast to the 25.6% who do not hold a high school diploma or equivalent (U.S. Census Bureau, 2024c).

Language barriers can be especially pronounced in service-oriented professions, where clear communication is essential for successful interactions. One critical area where communication gaps manifest is in healthcare settings, particularly in patient-doctor interactions. Language differences can lead to inaccurate patient assessments, misunderstandings about treatment plans, and ultimately, poorer health outcomes. Similarly, language-related challenges extend to other professions that require direct engagement with clients, including veterinary medicine.

In recognition of these challenges, there is a growing emphasis on addressing language barriers across professional fields, with an increasing focus in language education on teaching Languages for Specific Purposes (LSP). This field of study aims to equip professionals with the language and intercultural skills necessary for their workplace, which will ensure effective communication with diverse client populations.
In the context of this study, the focus is on teaching Spanish for Veterinary Medicine, in which the goal is to help veterinary students in the U.S. establish better rapport with their Spanish-speaking clients through the Spanish language, as the Hispanic population in the U.S. continues to rise.

To better understand this study’s importance, it is essential to present current statistics surrounding immigration within the U.S. According to data provided by the U.S. Census Bureau (2024b), the number of foreign-born people has increased to over 45.3 million, which equals 13.7% of the total U.S. population. A significant proportion of this demographic is concentrated in California, New Jersey, New York, and Florida. According to the U.S. Census Bureau (2024a), approximately 65.2 million people in the U.S. (19.5% of the population) identify as Hispanic. As shown in the American Community Survey five-year estimates, the U.S. population is projected to be 27% Hispanic/Latino by 2060 (U.S. Census Bureau, 2023). Spanish is the second most spoken language in the U.S., with 22.5% of the population communicating in a language other than English and 13.7% using Spanish as their primary language (U.S. Census Bureau, 2024a). As Spanish continues to grow in usage, the need for Spanish-proficient professionals across industries, including veterinary medicine, is becoming increasingly apparent.

As the Hispanic population has grown, so has pet ownership. Veterinary professionals frequently engage with Hispanic clients, many of whom may prefer to communicate in Spanish when discussing the health and care of their animals. As reported by Larkin (2024), there has been a significant increase in the number of pets within households between 1996 and 2024. The growing number of pet owners reinforces the need for accessible veterinary care that accommodates linguistic diversity. According to Brown (2023), 62% of the U.S.
adult population owns a pet, and 66% of Hispanic individuals report owning at least one pet. In 2024, the canine population alone in the U.S. was 89.7 million, underscoring the broad presence of pets in American households. Even as pet ownership has remained high, the cost associated with veterinary care for dogs has dropped. For example, the average cost for routine checkups or preventive care, which represented 80% of visits in 2024, was $147 per visit compared to $190 in previous years (Larkin, 2024). This trend, combined with the growing Hispanic population, highlights the increasing accessibility of veterinary care. However, as more Spanish-speaking pet owners seek routine services, the need for improved communication between veterinary professionals and their clients becomes even more essential.

Despite this growing demand, only 10% of veterinary professionals report proficiency in Spanish, although it is unknown how proficiency was assessed to arrive at this statistic (Hopkins, 2023). A range of solutions to increase the number of Spanish-speaking veterinary professionals has been proposed, including the development of specialized courses or certificate programs designed to teach Spanish to veterinary students across the U.S. (Hopkins, 2023). These initiatives aim to equip veterinary students with the language skills necessary to communicate effectively with Spanish-speaking clients, ultimately improving the quality of care provided.

In addition to specialized courses and certificate programs, advancements in technology, including virtual reality (VR) and artificial intelligence (AI), are being used to provide language learners with immersive and interactive learning experiences in real-world environments. VR allows learners to practice a foreign language in authentic, real-world scenarios, which enhances engagement and motivation (Chen et al., 2022; Özgün & Sadik, 2023).
Similarly, AI can facilitate training in the language and give students real-time feedback on writing and pronunciation (Macinska & Vinkler, 2021; Wetherbee, 2023). These technological innovations have the potential to help learners improve their linguistic skills (Sun, 2023), which could make language learning more accessible and effective. Therefore, the purpose of this thesis was to integrate a VR and AI platform into a Spanish for Veterinary Medicine course to investigate the effects on second language acquisition, with a focus on vocabulary learning and retention.

1.1 Chapter Overview

The remaining chapters in this thesis present the theoretical framework, methods, and practical applications of the study.

Chapter 2 provides a comprehensive review of previous literature relevant to the study. It explores LSP, with a focus on Spanish for veterinary professionals, alongside methodologies such as the Language Needs Analysis and Task-Based Language Teaching. Additionally, the chapter examines the role of technology in language education, discussing VR, AI, and their applications within the LSP and veterinary fields. The chapter continues with a review of vocabulary retention in language learning and assessment methods used to measure vocabulary knowledge. It then concludes with the research questions that guide the thesis.

Chapter 3 presents Project 1, which focuses on the creation and validation of a vocabulary assessment tool for Spanish for Veterinary Medicine, known as the LexVet-Esp. The methodology section outlines materials, procedures, and data analysis techniques, including point-biserial correlations and Item Response Theory, to ensure the test’s reliability and validity. The results present the discriminative power of test items, comparisons across proficiency levels, and how the LexVet-Esp aligns with existing assessments in other languages and areas.
Chapter 4 details Project 2, which examines vocabulary retention in a VR/AI-enhanced learning environment using the LexVet-Esp as an assessment tool. The methodology describes participant recruitment, study context, materials, and procedures for integrating the VR/AI technology into a Spanish for Veterinary Medicine course. The results assess student performance and retention across multiple test points and the effects of explicit vs. implicit vocabulary instruction.

Chapter 5 synthesizes findings from both projects, offering a more extensive discussion on the thesis as a whole. It revisits LexVet-Esp’s validation and discriminative power, explores trends in VR-assisted vocabulary retention, and highlights methodological limitations. The chapter concludes by summarizing the study’s contributions and suggesting areas for future exploration, particularly regarding long-term vocabulary retention and the integration of learning technologies in LSP instruction.

CHAPTER 2: LITERATURE REVIEW

This chapter provides an analysis of the most relevant literature and theoretical frameworks that are related to this study. It is organized into four main sections, starting with an overview of LSP before focusing on the subfield of Spanish for Specific Purposes (SSP). Then, a summary of two connected areas is presented. The first is an explanation of a Language Needs Analysis (LNA), which is an LSP research methodology, and the second is an overview of Task-Based Language Teaching (TBLT), a language teaching methodology associated with the results of an LNA. The second main section centers on the implementation of VR and AI in language education with emphasis on the use of these technologies in LSP. The third main section is focused on the area of vocabulary acquisition and retention in the field of Second Language Acquisition (SLA). The fourth and final section presents the research questions associated with this project.

2.1 Language for Specific Purposes

To understand what LSP is, specialty languages must first be defined. Gómez de Enterría (2009) defines specialty languages as the languages used in the fields of science, technology, and the professions. These languages are used to transmit specialized knowledge, and Gómez de Enterría (2009) stated that all specialty languages can share common linguistic and functional characteristics. The field of LSP emerged as a way to address the language training of professionals who would use specialty language in their careers. Swales (2000) stated that the beginnings of LSP can be traced back to 1964, when a relationship between linguistic analysis and educational materials was first established. Initially, there was little emphasis on language educators needing expert knowledge of the specialized fields, and instead, LSP practitioners were skilled at conducting basic descriptions of target discourses. As the LSP field grew, there was a need for a more refined approach to researching and teaching LSP, and the focus shifted from basic language skills to discourse and genre analysis during the 1980s.

Nowadays, LSP maintains a unique relationship with other branches of applied linguistics, with close connections to discourse analysis and pragmatics as well as to business and technical communication, translator training, language assessment, and communicative language teaching (Swales, 2000). LSP’s status as a discipline and profession varies globally, with a limited presence in U.S. graduate programs due to its separation from language acquisition, language teaching methodology, psycholinguistics, and sociolinguistics (Swales, 2000). In the context of the U.S., LSP research faces several challenges, including the need for interdisciplinary integration, the difficulty of collecting authentic and relevant data, and the development of effective assessment tools for specific professional skills.
Collaboration between language departments and field-specific programs has become important as “land grant institutions…working in cohort with language departments are particularly well suited to spearhead domain-specific language programs that are methodologically sound and are based on established best practices within the LSP field” (Zeller & Velázquez-Castillo, 2018, p. 295). Adapting to new technologies and incorporating them into LSP curricula presents ongoing challenges. Other challenges to be considered are the global variability of the field’s status (i.e., its status and recognition in different countries) and the recognition of LSP as a discipline (Lafford, 2012). Additionally, there is often a lack of trained LSP practitioners and researchers with expertise in SLA.

Sánchez-López et al. (2017) conducted research in the context of U.S. higher education to determine the main areas of LSP that are relevant for research among LSP scholars. Through interviews with researchers, Modern Language Association (MLA) members, and department chairs from various institutions in the U.S., Sánchez-López et al. (2017) found that the areas of greatest interest for LSP research were business, culture, translation, academic purposes, service learning/community engagement, and health. These findings highlight the diverse professional domains in which language proficiency is essential, reinforcing the growing demand for specialized language instruction tailored to specific career fields.

Due to the increasing interest in applying language study to professional contexts, these types of courses appeal to many students (Pastor Cesteros, 2013). Students who enroll in LSP classes typically have a different motivation compared to general language learners, as LSP courses have lexical, grammatical, and textual properties that are also different from general language instruction.
To align with professional objectives, research suggests that LSP courses integrate meaningful material, authentic texts, and goal-oriented activities designed for real-world application (El Arbaoui, 2024; Nepravishta & Roseni, 2014). In response to this need, educators have developed specialized courses targeting specific fields, such as Chinese or Korean for Business, Spanish for Nurses, English for Home Care, Mandarin for Tourism, and Legal Arabic, among others (Helms et al., 2023; Trace et al., 2015).

2.1.1 Spanish for Specific Purposes

There has been an increasing interest in learning Spanish for Specific Purposes, with the most popular areas being science and technology, law, medicine, and tourism (Pastor Cesteros, 2013). The growth of the field of SSP is characterized by an emphasis on global context and collaboration. Significant contributions from Europe and Latin America highlight the importance of communication, connections, and collaboration among SSP scholars and practitioners (Lafford, 2022). Yu et al. (2020) stated that interdisciplinary work is indispensable for authentic and relevant LSP courses: through collaboration, LSP instructors can become more familiar with the specialty area, and the content expert can become more aware of the importance that language and culture have in their discipline. This interdisciplinary collaboration can occur at different points of the process, such as during course design or the delivery of the course, and it can be an intra-institutional or extra-institutional partnership (Yu et al., 2020). SSP has become an integral part of Spanish curricula in universities, particularly in the U.S., where there is a focus on developing students’ communicative competence to address the needs of marginalized Spanish-speaking communities (Lafford, 2024).
The integration of SSP in language curricula aims to provide students with the practical language skills necessary for specific professional contexts, enhancing their employability and effectiveness in diverse fields. In the U.S., the fields that are most researched are related to Science, Technology, Engineering, and Math (STEM), healthcare, business, education, and agriculture (Helms et al., 2023; Lafford, 2024; Miller De Rutté et al., 2024; Pérez, 2021; Salazar et al., 2024; Zeller et al., 2016).

Institutions are developing specialized SSP curricula and conducting extensive research on curriculum design and effective teaching methodologies (Pérez, 2021). Zeller and Velázquez-Castillo (2018) provide a deeper contextualization of the need for the development of SSP programs. In their study, they offer a review of how various institutions have addressed SSP offerings, and they discuss how there exists a cohort of undergraduate students eager to participate in a non-traditional SSP area, which is Spanish for Animal Health and Care. The authors delineated the steps undertaken to establish an undergraduate certificate program in this area. They conducted an LNA, identified tasks (e.g., illness treatment and care, health histories, preventative care, among others), gathered additional information about these tasks (i.e., language functions and samples), and created a scaffolded sequence for the identified tasks to facilitate the development of a curriculum. Additionally, the study articulated the critical components that must be considered during the creation of SSP curricula. These components included identifying linguistic characteristics, grammatical forms and structures, lexical attributes, and sociocultural factors that influence professional communication (Zeller & Velázquez-Castillo, 2018).
Despite the advancements in SSP, challenges, such as the need for qualified instructors, integration of technology, and accurate assessment tools, persist (Czerkawski & Berti, 2020). SSP practitioners find themselves in a challenging position, caught between the demands of the language and the extensive diversity of scientific disciplines they are expected to address. They are tasked with teaching a specialized language, even though they often lack familiarity with the scientific fields and/or the linguistic tools required to transmit information related to those fields, which leaves them struggling to bridge the gap between two domains they might not fully command (Dodigovic, 1993). Because of the great diversity of disciplines, in most cases, each individual instructor has developed their own approach to teaching.

To address these challenges, there has been a call for stronger connections between LSP scholars and practitioners. Professional development in SSP is supported through conferences and publications with notable organizations, such as the Asociación Europea de Lenguas para Fines Específicos (AELFE) and the Congreso Internacional de Español para Fines Específicos (CIEFE), playing key roles in advancing the field (Lafford, 2022). These efforts highlight the dynamic nature of SSP and the ongoing commitment to its advancement through research, education, and professional collaboration.

2.1.1.1 Spanish for Doctor of Veterinary Medicine Students

“An awareness of Hispanic culture and language is becoming increasingly necessary within the US borders, especially for those who work directly with a Spanish-speaking workforce” (Zeller & Velázquez-Castillo, 2018, p. 290), and as previously mentioned, the area of Spanish for Animal Health and Care is growing at the undergraduate level. Similarly, the SSP field has begun to train future veterinary professionals at the graduate level in the Spanish language that is needed to communicate with Spanish-speaking clients.
One initial attempt was carried out by Graves (2014, as cited in Zeller & Velázquez-Castillo, 2018), in which a bilingual workshop on cow reproduction was offered; however, this type of workshop did not focus on communication. Instead, training focused on providing translation of terms and phrases and did not address grammatical patterns or pragmatics of the language. Moreover, this training presented only one point of view, the English-speaking perspective, and did not consider speakers from other perspectives, such as the Spanish-speaking community. The lack of the Spanish speaker’s voice reinforces the privilege of dominant cultures while disadvantaging minority groups, and so, cultural competence must also be considered in these course offerings, as it is only then that importance will be given to examining and addressing biases (Zeller et al., 2023).

In another study, Landau et al. (2015) sent a questionnaire to students at all veterinary schools in the U.S. to understand students’ perceptions of the use and need for Spanish in the veterinary field. Students were asked about their experience in and ability with Spanish as well as their preparedness to communicate in medical terms. Results from this study indicated that there was a gap between students’ general and medical Spanish proficiency, which resulted in many of them feeling unprepared to give medical information to Spanish-speaking clients. Moreover, some students, who self-identified as informal interpreters, did not consider themselves fluent in Spanish, which highlights a potential discrepancy between perceived and actual language skills. On the other hand, the need for Spanish in the veterinary field was not apparent for many of the students surveyed, and the ones who used Spanish in this setting did not stop to consider the challenges and difficulties that non-English speakers experience (Landau et al., 2015). As a result, Landau et al.
(2015) stated that “there is room to improve professional communication competencies and diversity/multicultural awareness as identified by the North American Veterinary Medical Education Consortium” (p. 330).

Nowadays, there are U.S. institutions that provide Spanish language courses within their Doctor of Veterinary Medicine (DVM) curriculum. For example, Texas A&M University offers a course titled “Medical Spanish,” which is aimed primarily at teaching foundational Spanish skills to facilitate engagement with Spanish-speaking clientele (Zeller et al., 2023). Another institution is the University of California, Davis, which has integrated Spanish language courses into the DVM offerings in collaboration with the Spanish and Portuguese department, and Purdue University offers Spanish language instruction during lunch breaks to DVM students (Zeller et al., 2023).

An additional initiative within the field of Animal Sciences, although not directly affiliated with DVM offerings, is the collaborative program established between Texas Tech University, North Carolina State University, and Tarleton State University (Salazar et al., 2024). This program consists of three courses in Spanish administered across three consecutive semesters. The sequence of courses starts at a basic level, concentrating on vocabulary and grammatical structures, and advances to more complex tasks at an intermediate level in the second course. The final course covers discussions and simulations that reflect real-world scenarios related to agriculture and animal care in Spanish.

Another study concerned the Spanish for Veterinarians Language Program (SVLP) at a large western U.S. university (Forehand et al., 2023). Before developing the program, the researchers conducted a survey to determine if an eventual certificate in Spanish at the DVM level was of interest to the student body. A total of 791 students from across the U.S.
responded to the survey, 275 of whom were enrolled at the researchers’ institution. From this survey, it was found that there was great motivation for this type of coursework and that students were even willing to pay for classes out of pocket. Participants mentioned that their motivation to take these courses stemmed from the desire to improve client relations, increase confidence, and understand or potentially improve the animal’s health. The SVLP was designed based on an extensive LNA and using TBLT, both of which will be described in the next section, to focus on specific communicative tasks in Spanish so that students could develop the relevant language that they would need to use in their future careers (Zeller et al., 2023).

2.1.2 Language Needs Analysis

Chambers (1980) originally explained that “needs analysis should be concerned with the establishment of communicative needs and their realizations, resulting from an analysis of the communication in the target situation” (as cited by Basturkmen, 2010, p. 18). Basturkmen (2010) revised Chambers’ ideas 30 years later and stated that

needs analysis in [English for Specific Purposes; ESP] refers to a course development process. In this process the language and skills that the learners will use in their target professional or vocational workplace or in their study areas are identified and considered in relation to the present state of knowledge of the learners, their perceptions of their needs and the practical possibilities and constraints of the teaching context (p. 19).

There are several advantages to doing this type of analysis, and one of those advantages is the fact that LNAs employ multiple sources of information, incorporating insights from students, professionals, and the target community who directly engages with the specialized language in practice (Malicka et al., 2019).
One example of an LNA in the animal sciences field is work done by Zeller and Velázquez-Castillo (2018), who conducted a comprehensive needs analysis centered on the linguistic demands of the Spanish-speaking workforce in livestock environments. Their study involved observations of livestock work routines, interviews with practitioners and clients, and discussions with veterinarians and a farm manager, ensuring that the analysis reflected the real-world communication needs of professionals and the communities they serve. These analyses resulted in the development of a program for future professionals working with livestock farms and establishments. Additionally, the analysis of LNA data puts language tasks at the center. According to Candlin (1987), a task is “one of a set of differentiated, sequentiable, problem-posing activities involving learners and teachers” (as cited in Robinson, 2011, p. 6). Van den Branden (2006) adds that a task is “an activity in which a person engages to attain an objective, and which necessitates the use of language” (p. 4). Through the integration of perspectives from the target language community, a comprehensive and deep understanding arises regarding the specific types of language tasks learners must develop in order to meet the target language community members’ needs (Long, 2005). Therefore, it is through the LNA process that it will be possible to “relate instructional goals, processes, and practices to real-life performance outside the classroom” (Malicka et al., 2019, p. 79). Another example of a comprehensive task-based LNA related to Spanish for Veterinarians at the DVM level involved observations at veterinary clinics in Colorado, Texas, and Colombia, along with other data collection methods, including interviews with veterinary professionals and their Spanish-speaking clients (Zeller et al., forthcoming).
This research aimed to improve access to veterinary care for Spanish-speaking pet owners by identifying the clients’ communication needs and developing a curriculum tailored to these needs. The findings informed the creation of a Spanish language program that equips veterinarians with the necessary Spanish language skills to engage effectively with their clients. This program consists of five courses with a total of nine credits. Four of these courses are centered on the veterinary wellness appointment and include taking the health history, relaying diagnostics, and discussing the treatment plan. All four are conducted in Spanish. The fifth course, which is one credit, is focused on cultural awareness and access to care, and it is delivered in English. The analysis from the LNA also indicated the entrance proficiency level required for the program, which is a minimum of novice high on the American Council on the Teaching of Foreign Languages (ACTFL) proficiency scale. The proficiency level increases with each course so that veterinarians can build trust and develop stronger relationships with their clients, thereby improving the quality of care provided to the animals.

2.1.3 Task-Based Language Teaching

TBLT is the teaching methodology associated with the LNA research methodology. Ellis (2003) stated that “TBLT is an approach based on interactive and communicative tasks aiming at involving learners in meaningful communication and interaction enabling them to acquire linguistic structures as a result of engaging in authentic use” (as cited in Khatib & Dehghankar, 2018, p. 5). TBLT was developed out of the Communicative Language Teaching approach (Motlagh et al., 2014), which emphasizes interaction and communication as the primary goals of learning a new language (Qasserras, 2023). The core concept of the TBLT approach centers on the task itself (Richards & Rodgers, 2001).
El Arbaoui (2024) stated that the use of tasks encourages “effective and comprehensive language exposure and usage” (p. 253). TBLT has been widely studied in SLA, and research has shown that tasks aid in several SLA processes, such as understanding input and producing output, as tasks provide students with opportunities to engage in meaningful interaction (Long, 1985; Robinson, 2011). Through task-based activities, learners are encouraged to produce language output, aligning with the Output Hypothesis (Swain, 1985), which suggests that producing language helps learners notice gaps in their knowledge and improve their language skills. Additionally, tasks facilitate a focus on form by drawing learners’ attention to specific linguistic forms while, at the same time, maintaining a focus on meaning (Ellis, 2003). This approach promotes negotiation of meaning, allowing learners to clarify misunderstandings and enhance their comprehension and production skills (Skehan, 1998). TBLT considers numerous philosophical positions and empirical traditions from different fields, including education, applied linguistics, and psychology. Important considerations of this methodology relate to experiential learning, student-centeredness, and a process-oriented approach to syllabus creation (Nunan, 2014). These three characteristics mean that there is a focus on how learners acquire language skills and the strategies they use rather than just the final product or specific language items that students should master. It also proposes that when language is used to accomplish meaningful tasks, it significantly enhances the learning process. Furthermore, this approach highlights that language becomes more beneficial to learning when it holds personal significance or relevance for the learner (Motlagh et al., 2014). These principles underscore the importance of practical, purposeful language use in educational settings.
2.1.3.1 LNAs, TBLT, and LSP

LSP courses should be meticulously designed using evidence-based methodologies, such as the LNA, to address the language needs required by professionals in a certain domain, which also stresses the importance of adopting TBLT to replicate the real-world tasks that will be carried out outside of the classroom (Naudi, 2023). This approach allows learners to consistently engage with the language in both oral and written forms through activities focused on completing the task in full, often leaving aside grammatical perfection (Cédric, 2021; Naudi, 2023). Investigations conducted by Hattani (2020), Georgy (2023), and Nazari (2020) outline both advantages and disadvantages associated with the incorporation of TBLT and LSP and are discussed next. Among the advantages of incorporating TBLT in LSP courses are the facilitation of a significant contextual framework for learners as well as the promotion of a student-centered classroom environment. This approach enhances student engagement and contributes to a more stimulating and less repetitive classroom atmosphere (Hattani, 2020). Most of these instructional sessions are conducted in the target language so that students are immersed in the L2 during the class (Nazari, 2020). Due to the student-centered nature of the class, many activities are completed through collaborative engagement in pairs or groups to foster both cooperative and collaborative learning (Hattani, 2020). Research has shown that students find the TBLT nature of their LSP courses to be more engaging than general language courses, and they report an increased sense of self-efficacy when using their L2 (Hattani, 2020). Similarly, students articulate that this pedagogical approach allows them to identify more targeted learning objectives, providing them with better clarity regarding the skills and knowledge they are expected to develop during these classes (El Arbaoui, 2024).
Finally, students expressed a desire for instructors to adapt and implement this instructional methodology in other disciplines, including those not directly associated with language acquisition (El Arbaoui, 2024). Some disadvantages of incorporating TBLT and LSP include the time investment required for lesson planning and finding authentic materials, which may not be widely available or adapted for use by language learners (Georgy, 2023). Additionally, instructors must be aware of their role as facilitators of information while simultaneously monitoring and providing feedback without disrupting the tasks or obstructing communication (Georgy, 2023; Hattani, 2020). However, instructors have noted an increase in students’ motivation and self-esteem when employing this pedagogical approach, as it enables students to notice their relevance in real-world contexts (Hattani, 2020).

2.2 Technology in Language Education

In this next section, the role of technology in language education, at large, and in LSP is discussed. Technological instruments in the classroom should act as collaborators rather than obstacles to allow for the successful integration of supplementary resources and activities, and there are many possibilities to do so. As noted by Chacón Medina (2007), “the greatest potential of new information and communication technologies (ICTs) is derived from the capabilities of manipulation, storage, and distribution of information in an easy, fast, and accessible way for all people” (p. 25). ICTs can increase access to knowledge while optimizing educational processes. This characteristic makes technology indispensable in education, and the incorporation of technology into L2 teaching is becoming progressively common to help promote immersion in authentic contexts. Technological advancements have brought a variety of tools and platforms that make language learning more engaging and effective (Mayer, 2014).
Some of the most popular include multimedia tools, mobile apps, digital games, augmented reality (AR), and VR (Blyth, 2018; Godwin-Jones, 2011; Ibáñez & Delgado-Kloos, 2018; Pachler et al., 2010; Reinders & Wattana, 2015). Multimedia tools, such as videos and audio recordings, help learners improve their listening and speaking skills by exposing them to authentic language use (Pachler et al., 2010). Multimedia tools also help improve language comprehension and retention as they offer a variety of input types, such as audio, visual, and textual (Mayer, 2009). Apps like Duolingo and Babbel use smartphones and tablets to offer flexible, on-the-go learning opportunities, especially for learners with busy schedules or limited access to traditional classrooms (Kukulska-Hulme & Shield, 2008; Reinders & Hubbard, 2013). Mobile technologies use adaptive learning systems that personalize instruction to meet individual learners’ needs while offering tailored feedback and practice, which boost learning efficiency. These apps also provide interactive exercises and gamified experiences that keep learners motivated (Godwin-Jones, 2011). Digital games can create immersive environments for practicing language through problem-solving and storytelling (Reinders & Wattana, 2015), which increases learner engagement and motivation and leads to higher participation rates and greater persistence in language study (Sykes & Reinhardt, 2012). Another example of this idea is found in AR as AR overlays digital information onto the real world to help learners engage with vocabulary and phrases in context (Ibáñez & Delgado-Kloos, 2018), while VR provides immersive virtual environments where learners can practice language skills (Blyth, 2018). AR and VR deliver contextualized practice, which helps students transfer language skills to authentic communicative environments (Holden & Sykes, 2011). 
Additionally, digital games and VR can support collaborative learning by allowing learners to interact and communicate in virtual spaces to enhance language practice and foster social interaction (Thorne, 2008).

2.2.1 VR and Language Education

VR is one of the main technologies in this study, and this next section will focus on VR use in language education. VR is defined as “a simulation of a three-dimensional virtual environment, generated by a computer, in which the individual can engage with the aforementioned environment” (Peixoto et al., 2021, p. 48952). VR systems can be divided into three distinct categories: non-immersive, semi-immersive, and immersive. The principal differentiation among these categories is based on the degree of immersion experienced by the user, as well as the cost associated with these categories. As the level of immersion increases, there is a corresponding need for specialized equipment and advanced technological infrastructure to enable a more authentic and interactive environment (Peixoto et al., 2021), and all three levels of immersion have appeared in language teaching (Lin & Lan, 2015). Immersive VR settings can replicate real-world situations, enabling students to develop language competencies within authentic contexts that are otherwise challenging to recreate in conventional classrooms. This type of VR can enhance speaking and listening skills as learners can practice conversations in a safe space, which helps them build both confidence and proficiency (Lin & Lan, 2015). In the world of immersive VR, new concepts have emerged, like the idea of the metaverse. According to Anacona Ortiz et al. (2019), metaverses are characterized as “virtual worlds that allow users to let their imagination run free” (p. 62).
Within educational frameworks, metaverses are described as “real-time immersive 3D simulated environments whose ecosystem is well-suited to incorporate audiovisual notifications, resulting in impressive configurations for formative or pedagogical spaces” (Barráez-Herrera, 2022, p. 16). The transition towards immersive digital learning environments emphasizes the potential of VR technologies in augmenting student engagement and promoting authentic learning experiences. The effectiveness of VR in the field of L2 learning has been a topic of investigation with a focus on the outcomes of the implementation of these tools. VR-assisted language learning (VRALL) provides “sensory-rich environments, allowing [learners] to experience telepresence (i.e., the feeling of ‘being there’ in the target language country)” (Kaplan-Rakowski & Wojdynski, 2018, p. 124). This immersive modality possesses the capacity to augment learner motivation and engagement, particularly in situations where exposure to authentic linguistic environments is constrained. One study that investigated the application of VR in the field of language learning found that VR facilitates learning experiences while eliminating geographical barriers for L2 learners (Kaplan-Rakowski & Wojdynski, 2018). This study also found that 82% of participants wanted to keep studying a language after using VR. Similarly, a systematic review of immersive VR applications in higher education found improved educational outcomes due to the integration of authentic tasks, real-time feedback mechanisms, and interactive simulations (Radianti et al., 2020). In terms of the effects of VR on L2 vocabulary acquisition, there are relatively few studies despite the growing interest in the field (Agurto-Cabrera & Guevara-Vizcaíno, 2023; Moreno Martínez & Galván Malagón, 2020; Valero Franco & Berns, 2023). However, Legault et al.
(2019) found that immersive VR can be an effective tool for L2 vocabulary acquisition as VR scenarios promote contextualized vocabulary use and can improve memory retention through multisensory experiences. Additionally, providing these tools to learners allows them to use their motor skills while learning an L2, which is again achieved because of the replicas of real-world scenarios where learners can engage with vocabulary in meaningful ways (Legault et al., 2019). Other research investigated the advantages and limitations of using VR in L2 education. Advantages included having an authentic, real-life environment and the ability to engage with multiple senses, leading to higher motivation, retention, and achievement (Klimova, 2021). VR supports diverse learning styles, fosters creativity, and boosts self-confidence while reducing anxiety and encouraging active participation and the development of learner autonomy. However, there are some limitations to consider, such as the high cost of software and teachers’ and students’ potentially limited tech skills (Klimova, 2021). The success of VR applications relies on carefully designed content that follows cognitive and instructional design principles (Mayer, 2014). The challenge for educators and programmers, then, lies in the creation of VR experiences that integrate interactivity, immersion, and educational efficacy while ensuring accessibility and cost-effectiveness to promote wider adoption.

2.2.1.1 Grounded Cognition and Vocabulary Acquisition

VR provides not only an immersive environment for language learners but also sensorimotor experiences. This combination can be explained by the framework of grounded cognition, which emphasizes the important role of sensorimotor experiences in the language acquisition process while highlighting the importance of contextualized and experiential approaches in vocabulary acquisition.
Barsalou (2008) suggested that cognitive functions, including language comprehension, are fundamentally intertwined with the brain’s modal systems dedicated to perception, action, and introspection. This notion indicates that lexical items are not simply abstract representations. Instead, they are deeply rooted in sensorimotor experiences, reinforcing the importance of contextualized and experiential learning approaches in vocabulary acquisition. As individuals interact with their environment, the sensorimotor states linked to these experiences are encoded by memory systems for future representational use (Barsalou, 2023). Therefore, grounded learning methodologies emphasize the importance of contextual and experiential vocabulary acquisition, where lexical items are learned through active participation in authentic real-world contexts rather than through passive retention. The multimodal characteristics of this learning approach ensure that vocabulary components are associated with rich, interrelated representations that include visual, auditory, tactile, and emotional dimensions and can also be closely connected to cultural factors, as vocabulary acquisition is influenced by culturally specific situated activities and linguistic expressions (Barsalou, 2023). This theory supports the integration of VR in L2 education.

2.2.2 AI and Language Education

In addition to VR, AI is another technological tool that can be used in education. AI is defined “as the branch of computer science dedicated to creating systems capable of performing tasks that typically require human intelligence” (Macinska & Vinkler, 2021, p. 4). Generative AI (GenAI) is characterized by its ability to generate new content that has not existed previously. As such, GenAI can create output that may appear original, and one example of this is chatbots that can mimic human conversations (Macinska & Vinkler, 2021).
Chatbots are software applications that simulate human conversation by asking and answering questions via text or audio. They are not new as they have existed since the 1960s, and they have been integrated into different messaging apps for years (Son et al., 2023). In the context of L2 learning, chatbots have been programmed to engage in conversations with the learner to perform different tasks. Research has demonstrated that the use of chatbots increases the output of the learner and also enhances their speaking performance (Son et al., 2023). AI is revolutionizing language education by offering new ways to develop different linguistic skills. In their systematic review, Zhu and Wang (2025) highlighted different ways in which AI tools can be used in the process of language learning, including AI-powered writing evaluation systems that provide feedback on written work as well as other tools like ChatGPT. AI-driven avatars are also helping learners improve their pronunciation and speaking abilities. AI-enhanced language learning experiences have the potential to address multiple dimensions of language learning and proficiency development; however, GenAI was only released in an open access format in 2022, and much more research is needed to understand its full potential.

2.2.3 VR and AI in LSP

As this study is focused on vocabulary acquisition in LSP and integrates VR and AI, this next section presents relevant, albeit limited, research in this area. Li et al. (2022) studied the effects of an immersive VR experience on vocabulary acquisition in English for Geography. Through the interaction with multimodal and contextualized scenarios, such as the hydrologic cycle, researchers found that VR can improve incidental vocabulary learning and learner engagement as the participants were able to activate memory and comprehension due to the immersive environments (Li et al., 2022).
Additionally, these environments supported cognitive, behavioral, and social participation, which are essential for mastering specialized terminologies (Li et al., 2022). Similarly, Park (2022) studied vocabulary learning in a VR-enhanced real-world environment called the Digital Kitchen. This study combined TBLT with multimodality to engage learners’ senses (i.e., touch, smell, and taste), which are often absent in traditional or virtual settings. Results showed that learners in the Digital Kitchen achieved higher scores in vocabulary retention compared to those in traditional classroom settings. Park (2022) highlighted the importance of integrating physical and sensory experiences into language learning to enhance memory and language acquisition. Miller De Rutté (2024a, 2024b) investigated the role of VR in medical Spanish instruction. Undergraduate students enrolled in a medical Spanish course engaged in immersive VR simulations designed to replicate real-world professional interactions, allowing them to perform medical tasks in Spanish. Miller De Rutté (2024b) found increases in students’ motivation levels after using VR as students expressed a heightened ability to visualize themselves successfully performing professional tasks. Additionally, students emphasized specific skill development, particularly in conversation, listening, pronunciation, and critical thinking (Miller De Rutté, 2024a). The immersive nature of VR was found to deepen students’ self-reported levels of engagement and focus while amplifying course topics related to cultural awareness, pragmatics, and applied medical vocabulary. Participants expressed that the simulations created a more authentic and interactive learning environment, reinforcing their ability to connect language skills with professional application (Miller De Rutté, 2024a).
2.2.4 VR and AI in the Veterinary Field

Similar to the use of technology in language education, VR and AI have also been implemented in the veterinary field as they offer new solutions to improve education, diagnostics, and administrative efficiency (Appleby & Basran, 2022; Sobkowich, 2025). Specifically, VR enables veterinary students and professionals to practice surgical techniques and diagnostic procedures in a controlled, risk-free environment (Xu et al., 2023). These immersive simulations not only improve skill acquisition but also build confidence, preparing practitioners for real-life scenarios (Appleby & Basran, 2022; Sobkowich, 2025). VR is also used in anatomy education, where students can explore 3D models of pets’ anatomy derived from real cases, and this approach has been shown to improve interactive learning while linking anatomical studies to clinical relevance (Aghapour & Bockstahler, 2022). Regarding the use of AI in veterinary medicine, Sobkowich (2025) discussed the variety of ways in which AI is being used in the field. Examples include automating workflow for scheduling appointments, predicting surgery complications, using adaptive testing systems, or extracting data from patient records. In the education field, Sobkowich (2025) states that AI is being used with platforms like Khanmigo to provide adaptive and on-demand support for students, while intelligent learning management systems adjust lesson pacing and sequencing based on individual progress. Another tool that is being used is AI-driven chatbots to recreate clinical scenarios or practice client communication. However, despite these technological advancements, the application of VR and AI combined specifically for communication in veterinary contexts remains largely unexplored.
2.3 Vocabulary Acquisition in Language Learning

Vocabulary acquisition serves a fundamental function in the process of L2 learning with a direct influence on communicative competence and language proficiency. Vocabulary is part of the foundation for developing the four language skills (i.e., listening, reading, writing, and speaking). There are complex learning processes involved in vocabulary acquisition and retention, and retention processes are interconnected with the acquisition and transfer phases of the learning process in addition to memory and cognitive processes (Houston, 2001, as cited in Sanatullova-Allison, 2014). Sanatullova-Allison (2014) explained that

once an item is perceived, it enters primary memory (PM) with short-term storage. Rehearsal is necessary for the item to remain in PM and, if rehearsal is long enough, the item may enter secondary memory (SM), which is long-term storage […] Long-term memory is made of declarative and procedural knowledge: the former is the knowledge about facts and the latter is the knowledge about how to perform tasks (p. 2).

Different language learning and teaching principles have been implemented to promote L2 vocabulary acquisition and memory retention (Nation, 2022). For example, spaced repetition learning, where review intervals are lengthened after successful recall and shortened after mistakes, has been found to significantly improve long-term retention compared to studying in one continuous session (Sanatullova-Allison, 2014). Other research has shown that engaging in active recall, defined as a learning technique that involves actively retrieving information from memory rather than passively reviewing it (Pham et al., 2016), significantly improves memory retention, promotes deeper cognitive engagement, and leads to better learning outcomes than strategies such as re-reading or passive review (Karpicke & Roediger III, 2008).
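The spaced repetition rule described above (longer intervals after a successful recall, shorter ones after a mistake) can be illustrated with a minimal sketch. The doubling and halving factors, the one-day minimum, and the function name `next_interval` are illustrative assumptions and are not taken from the studies cited here.

```python
def next_interval(current_days: float, recalled: bool,
                  grow: float = 2.0, shrink: float = 0.5,
                  minimum: float = 1.0) -> float:
    """Return the number of days to wait before the next review of a word.

    Assumed rule: multiply the interval by `grow` after a successful recall
    and by `shrink` after a mistake, never going below `minimum` days.
    """
    if current_days <= 0:
        raise ValueError("current_days must be positive")
    factor = grow if recalled else shrink
    return max(minimum, current_days * factor)


# Hypothetical review history for one vocabulary item:
# recalled, recalled, forgotten, recalled.
interval = 1.0
history = []
for recalled in [True, True, False, True]:
    interval = next_interval(interval, recalled)
    history.append(interval)

print(history)
```

In this sketch, a single mistake moves the item back to an earlier, shorter interval instead of restarting the schedule, which is one of several possible design choices.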
Similarly, in terms of instructional techniques to teach vocabulary, research has found that intentional vocabulary teaching generally results in long-term retention as well as proficiency gains in the L2 (Schmitt & Schmitt, 2020). Other research has found that learners who actively participated in tasks outperformed those who did not engage in tasks (Joe, 1998, as cited in Sanatullova-Allison, 2014), which emphasizes the importance of task-based teaching approaches. Word frequency, how words are presented, what students do with the words, and what language is used when explaining vocabulary are other key considerations in instruction (Borawski, 2019). Presenting vocabulary in terms of frequency based on a spoken L2 corpus and using those words in a small and morphological word list has been found to improve the acquisition and retention of L2 vocabulary (Borawski, 2019). Therefore, vocabulary knowledge is a construct that includes form, meaning, and use, and knowing a word is not only recognizing or defining a word but using it in the correct context. Mastering all of these aspects enables a speaker to use the word with native-like fluency and accuracy (Schmitt & Meara, 1997). Previous research indicates four key aspects inherent to vocabulary knowledge, which include vocabulary size (the number of words known), collocational competence (the learner’s ability to recognize and use word combinations that naturally occur together), receptive vocabulary knowledge (the ability to recognize words), and productive vocabulary knowledge (the ability to use words appropriately) (Masrai, 2023; Nation, 2022). Apart from these core dimensions, vocabulary knowledge is also influenced by depth of word understanding, including semantic associations and morphological awareness (McBride-Chang et al., 2008; Samaraweera, 2025).
The relationship between these elements ensures that learners not only recognize and produce words but also interpret subtle meanings and contextual relevance. This process enhances their ability to communicate effectively in the target language.

2.3.1 Vocabulary Assessments

Evaluating L2 vocabulary comprehension is an essential element in language proficiency assessment. Over time, scholars have developed a variety of instruments to evaluate both the breadth and depth of learners’ vocabulary knowledge with different tools measuring different vocabulary aspects. These instruments include traditional word-list exams as well as more advanced assessments based on contextual vocabulary use that evaluate the learner’s capability to identify and use words within specific contexts. Some of the most common types of vocabulary tests are receptive vocabulary tests, productive vocabulary tests, and computer-adaptive testing. Initial research in vocabulary assessment focused on measuring receptive vocabulary through breadth, which refers to the number of words learners know, with common tools like the Vocabulary Size Test (Nation, 1990; Nation & Beglar, 2007). However, it was later recognized that vocabulary depth, or how well learners understand word meanings, associations, and usage, is also important to assess (Pignot-Shahov, 2012; Sun et al., 2023). Productive vocabulary tests assess the learner’s ability to actively use words in different contexts, such as translation (Fitzpatrick, 2007). One example of this type of test is the Productive Levels Test (Laufer & Nation, 1999), which evaluates vocabulary knowledge based on word-frequency bands. For this test, participants reorganize letters to create a word, and there are a possible total of eight words across five levels (Fitzpatrick, 2007).
Computer-adaptive testing, such as the Computer Adaptive Test of Size and Strength (CATSS), assesses both breadth and depth of vocabulary knowledge by measuring active recall, passive recall, active recognition, and passive recognition while adapting dynamically to the learner’s proficiency level (Pignot-Shahov, 2012).

However, the most widely used method is the receptive vocabulary test, which measures learners’ ability to recognize words and is the type of test used in this study. This type of test does not require the learner to produce language, which makes it accessible to a wide range of proficiency levels (Amenta et al., 2020). A common receptive vocabulary assessment is the Vocabulary Levels Test (Nation, 1990). It encompasses various levels of difficulty, from basic to more advanced. The assessment requires learners to match words with their corresponding meanings, and each level presents a distinct set of words that align with frequency bands in the English language (e.g., the 1,000 most frequently encountered words, the 2,000 most frequently encountered words, etc.).

The Vocabulary Size Test, on the other hand, was created by Nation & Beglar (2007) to assess learners’ vocabulary size. This test consists of a series of word recognition tasks designed to evaluate the number of words a learner can comprehend. The items are sourced from various frequency bands in the English language, and the test provides an estimation of the total number of words with which a learner is familiar across the different frequency bands.

Another widely known assessment is the LexTALE test, introduced by Lemhöfer & Broersma (2012), which evaluates the receptive vocabulary size of advanced English learners. The LexTALE uses a lexical decision task format, requiring learners to decide whether the presented words are real English words or invented terms.
The assessment is designed to be both quick and reliable and incorporates a scoring system that determines a learner’s vocabulary size based on their performance. The LexTALE can differentiate between varying levels of language proficiency and is a validated measure.

Building upon the success of the LexTALE, other researchers have created similar instruments for other languages. Izura et al. (2014) developed the Lextale-Esp, a test adapted for the Spanish language. The test consists of 60 items (40 real words and 20 non-words) and can differentiate across Spanish language proficiency levels, including at the higher ends of the proficiency scale. Amenta et al. (2020) developed the LexITA to measure receptive vocabulary size in Italian, and it has been validated as an effective assessment for evaluating vocabulary knowledge across a spectrum of proficiency levels, like its equivalents in Spanish and English.

These three tools, the LexTALE, Lextale-Esp, and LexITA, are distinguished from others by their efficiency and accuracy. They are structured for quick administration, generally requiring five to ten minutes to complete, while also enabling objective scoring through a simple yes/no format. This study built on the strengths of these existing assessments, as it followed their methodology. However, it further adapted their work to create a specialized vocabulary test that addresses the needs of veterinary Spanish learners.

2.4 Research Questions

The purpose of this study was to understand the effects of the use of VR and AI on Spanish vocabulary acquisition and retention in a course on Spanish for Veterinarians. The following research questions (RQs) guided the study:

RQ1: Does the use of a VR and AI platform in the Spanish for Veterinarians classroom affect vocabulary acquisition in L2 learners?

RQ2: Does the use of explicit versus implicit vocabulary learning activities in a VR and AI environment influence learners’ vocabulary acquisition and retention?
CHAPTER 3: THE DESIGN AND VALIDATION OF THE LEXVET-ESP TEST

3.1 Introduction to Project 1

The overarching goal of this thesis was to understand technology’s influence on vocabulary acquisition and retention in students enrolled in courses on Spanish for Veterinarians. However, there is no validated vocabulary assessment to measure vocabulary acquisition in this area. Therefore, the main objective of this first part of the research was the creation of a receptive vocabulary acquisition test to serve as a data collection measure in the subsequent chapter. The LexVet-Esp test, as this new test is now called, was developed and validated following the procedures of previous research (Amenta et al., 2020; Izura et al., 2014; Lemhöfer & Broersma, 2012), as described in the following subsections. The specific research questions guiding this first project were as follows:

RQ1: How accurately does the LexVet-Esp predict Spanish vocabulary knowledge and proficiency?

RQ2: How well does the LexVet-Esp differentiate between native and non-native Spanish speakers?

3.2 Methodology

3.2.1 Materials

Ninety words were extracted from a spoken corpus on Spanish for Veterinarians (Zeller et al., forthcoming). The corpus was based on veterinary medicine observations from four different locations (Colorado, Texas, Mexico, and Colombia) and included observations in both Spanish and English. For this project, only the Spanish interactions were analyzed. These interactions occurred between different personnel in a veterinary clinic (e.g., veterinary technician, veterinarian, etc.) and the Spanish-speaking client. The selection of the 90 word items was based on frequency distributions as described in previous studies (Amenta et al., 2020; Izura et al., 2014; Lemhöfer & Broersma, 2012).
Words ranged from very high-frequency words (e.g., perro/dog or doctor/doctor), which are anticipated to be familiar to most Spanish speakers, to very low-frequency words (e.g., férula/splint or sarro/tartar), which are likely to be recognized exclusively by highly proficient primary language speakers. The characteristics of the word items, including their distribution, length, and frequency per million words, are presented in Table 1. Overall, 26 words had a frequency of less than one occurrence per million words, 23 had a frequency of one to five occurrences per million, 14 had a frequency of 6 to 10 occurrences per million, 17 had a frequency of 11 to 20 occurrences per million, eight had a frequency of 21 to 100 per million, and two words (comer/to eat and mirar/to look) had a frequency greater than 100 per million. Most of the words were nouns (n=50), and there were equal numbers of verbs (n=20) and adjectives (n=20).

Table 1
Characteristics of the word items included in the initial test with 90 words

Distribution    Length (letters)    Frequency per million
min             3                   22.22
1st quartile    6                   111.11
median          8                   144.44
3rd quartile    9.75                88.88
max             15                  33.33
mean            8.13                144.44
SD              2.58                677.77

Next, a compilation of 90 non-words was generated. These non-words were pseudowords that followed patterns similar to those of real words. For example, verbs in Spanish end in -ar, -er, or -ir, and non-word verbs followed that same pattern (e.g., real word - comer; non-word - riñonar). ChatGPT 3.5 was used to generate the non-word items: it was given the authentic word list and prompted to create 90 non-words, including nouns, verbs, and adjectives, similar to the real word list. It was also prompted to use different parameters, such as including suffixes analogous to various grammatical categories (e.g., adding -mente for an adverb) or verbs with Spanish characteristics (e.g., ending in -ar, -er, -ir).
ChatGPT 3.5 generated non-words across parts of speech, including nouns (toráciz, cachorreo, parasíticoz), adjectives (pulgario, abdominado, anestesiático), and verbs (autolimitantear, alérgizer, brinquir). After these 90 non-word items were created, they were reviewed to verify that the words did not exist. To do this, two dictionaries were consulted: the Real Academia Española (2014) and the Real Academia Nacional de Medicina (2012). Non-words were searched in both dictionaries to verify that they did not exist in either of them. If they did exist, a new non-word was created to replace them. There were only ten words that ChatGPT 3.5 generated as non-words but were, in fact, real words. They included ótico (otic), lipémico (lipemic), intratorácico (intrathoracic), cremar (to cremate), sondeo (probing), remisivo (remissive), venenoso (poisonous), blandir (brandish), and esterilizante (sterilizing). Table 2 shows the distribution, length, and frequency per million words of the selected non-words. Note that the frequency per million words is listed as N/A because these are non-words, and so frequency per million words cannot be calculated.

Table 2
Characteristics of the non-word items included in the initial test with 90 non-words

Distribution    Length (letters)    Frequency per million
min             3                   N/A
1st quartile    6                   N/A
median          8                   N/A
3rd quartile    9.75                N/A
max             15                  N/A
mean            8.13                N/A
SD              2.58                N/A

3.2.2 Procedure and Participants

To determine which real words and non-words were to be included in the vocabulary assessment, the list of all words was distributed to students at different proficiency levels in the form of a written paper test. The test (Appendix A) was administered in person so that the participants did not have access to the internet and could not consult a dictionary or online translator to verify whether a word existed or not.
In the first part of the test, participants answered a short questionnaire where they indicated the level of the class they were taking, their age, race, and what their primary language was. Then, participants were given a written explanation of the test with an example of four common words (sí - yes; sacapuntas - pencil sharpener; bien - good; casa - house) so that they could better understand how the test functioned (Amenta et al., 2020). The instructions for this first part of the test were provided in English.

This test was given to students enrolled at a large western university in the U.S. Participants were divided into two main groups. The first group consisted of primary Spanish speakers (n=23), and the second group included both heritage Spanish speakers and Spanish L2 speakers (n=180). The second group was further divided according to the level of the class in which the participants were enrolled, following previous research. There were 49 participants at the 100-level, 35 at the 200-level, 63 at the 300-level, 30 at the 400-level, and three at the 500-level. Participants were then given the 180 items (90 real words and 90 non-words) in random order, and they were asked to identify the real words on the list. All participants received the same order of real words and non-words. Participants were not given a specific time limit to complete the test; however, no participant took more than 15 minutes to complete it.

3.2.3 Data Analysis

The first part of the assessment collected demographic data. These data were analyzed using descriptive statistics and were used to separate the participants into two main groups (native speakers vs. heritage and L2 speakers) as well as across the different proficiency levels. The vocabulary test was then analyzed using a series of statistical tests to determine which 60 real words and 30 non-words were to be included in the final version.
First, following previous research (Lemhöfer & Broersma, 2012; Brysbaert, 2013), the tests were scored using the following formula:

\[ \text{Score} = N_{\text{yes to words}} - 2 \times N_{\text{yes to non-words}} \]

This formula was used because it penalizes random guessing: on a test with twice as many real words as non-words, a participant who answered arbitrarily (i.e., responding affirmatively to half of the real words and half of the non-words) would obtain a score close to zero. This formula also allows participants to earn a negative score if they incorrectly categorize non-words as real words.

Then, point-biserial correlations were calculated and an Item Response Theory (IRT) analysis was conducted. Point-biserial correlation coefficients range from -1.0 to +1.0. For the purposes of this study, a positive correlation indicated that participants who performed well on the overall test also tended to answer the specific item correctly. A negative correlation, however, implied that the item was not successfully measuring the intended outcome, as it suggested that proficient participants were performing less favorably on the specific item than their lower-proficiency counterparts.

An IRT analysis was also conducted to determine if items on the assessment were accurately measuring the proficiency of the participants. This type of analysis provided information related to both the difficulty level of the item and its discriminatory power. Discriminatory power refers to the steepness of the item response curve as it transitions from “not known” at the lower end of the ability spectrum to “known” at the upper end.

Once the most effective test items were selected, test performance across groups of participants was compared using an Analysis of Variance (ANOVA) to determine if the test could discriminate across proficiency levels. The R package ltm (Rizopoulos, 2006) was utilized for all analyses in this study.

3.3 Results

Participants’ (n=203) ages ranged from 18 to 70, with a mean age of 21.65.
See Table 3 for mean, standard deviation (SD), and age range by class level. The majority of participants self-identified as White (n = 142; 69.95%), followed by Hispanic/Latino (n = 44; 21.67%), other (n = 11; 5.42%), and Black/African American (n = 6; 2.96%). When asked about their primary language, the majority replied that English was their L1 (n = 178; 87.68%), while a smaller portion indicated that Spanish was their first language (n = 25; 12.32%).

Table 3
Distribution of age according to the level of the class

Group              Mean     SD      Range
100                21.51    7.53    47 (65-18)
200                18.94    8.50    52 (70-18)
300                20.25    1.38    7 (25-18)
400                18.74    9.75    51 (70-20)
500                25.5     3.62    8 (30-22)
Native speakers    29.8     8.81    22 (43-21)

3.3.1 Point-biserial Correlation

To select the items to be included in the final test, the quality of both the real words and non-words was analyzed by calculating the point-biserial correlation between the responses to each item and the cumulative scores of the participants. All items had a positive correlation (ranging from r = 0.08 for the word hez/feces to r = 0.68 for the word glándula/gland). Since there were only positive correlations, no items could be eliminated, and an IRT analysis needed to be conducted next.

3.3.2 Item Response Theory

The results from the IRT analysis produced charts with difficulty of the word item along the x-axis and probability of correctly answering the item along the y-axis. This information then allowed for the selection of items that spanned a broad spectrum of difficulty using discriminatory power, which was measured by the steepness of the response curve at its midpoint. A steeper curve indicated greater discriminatory power of the item. Figures 1 and 2 show the IRT charts for real words and non-words.

Figure 1
IRT of real words

Figure 2
IRT of non-words

Based on the findings from the IRT analysis, a selection of 60 real words and 30 non-words of varying difficulty levels with strong discriminatory power was made.
This selection process involved ranking the items based on their difficulty levels and choosing those with optimal discriminatory power at approximately 1/30th of the total range covered by the items (Amenta et al., 2020; Izura et al., 2014). This process occurs so that the most difficult words can be excluded to ensure better discrimination between participants at the highest proficiency levels. Table 4 presents a selection of sample words, where the intercept represents the IRT item difficulty, and z1 indicates the discriminative power of each word relative to the others.

Table 4
Discriminative Power and IRT Scores for Selected Items

Item                        Intercept          z1
Cremación (cremation)       1,338,778,108      16,326,639
Irritar (to irritate)       0.508833260        11,035,564
Remisar (non-word)          -4,156,332,499     0.1710680
Enroncharar (non-word)      -4,156,332,499     0.1710680

Based on these results, the final word list was determined, and descriptive statistics can be seen in Table 5. The table presents the distribution of word and non-word lengths. Word length ranged from three to 14 letters, with a median of nine, while non-word length spanned from seven to 15 letters, with a median of 11. The frequency per million varied across words, averaging 16,821.35, whereas non-words do not have frequency values assigned.

Table 5
Descriptive statistics of the final words

                Words                                         Non-words
Distribution    Length (letters)    Frequency per million    Length (letters)    Frequency per million
min             3                   2,320.19                 7                   N/A
1st quartile    7                   2,320.19                 9                   N/A
median          9                   8,120.65                 11                  N/A
3rd quartile    11                  23,201.86                12                  N/A
max             14                  102,088.17               15                  N/A
mean            8.98                16,821.35                10.74               N/A
SD              2.94                21,472.48                2.53                N/A

3.3.3 Comparisons across proficiency levels

The formula for scoring the final test as described previously was used to calculate participants’ scores.
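As a minimal illustrative sketch (not the code used in this study, whose analyses were run in R), the scoring rule from Section 3.2.3 can be written in Python. The function name `lexvet_score` and the example response counts are hypothetical and serve only to show how the penalty for "yes" responses to non-words behaves.

```python
def lexvet_score(yes_to_words: int, yes_to_nonwords: int) -> int:
    """Score = N(yes to words) - 2 * N(yes to non-words)."""
    if yes_to_words < 0 or yes_to_nonwords < 0:
        raise ValueError("counts must be non-negative")
    return yes_to_words - 2 * yes_to_nonwords


# 180-item version (90 real words, 90 non-words):
# every real word accepted and no non-word accepted
best = lexvet_score(90, 0)
# no real word accepted and every non-word accepted
worst = lexvet_score(0, 90)

# 60-word / 30-non-word version: "yes" to half of each list
guessing = lexvet_score(30, 15)

print(best, worst, guessing)
```

Because each accepted non-word costs two points, arbitrary "yes" responses cancel out only when real words outnumber non-words two to one, as in the 60/30 version.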
Possible scores ranged from -180 (if a participant selected all non-words as real and all real words as non-words) to 90 (if a participant accurately identified all real words and refrained from selecting any non-words as real). Table 6 shows the groups of participants by level along with their mean score, standard deviation, and the range of test scores.

Table 6
Test scores by proficiency group

Group              Mean score    SD       Range
100                -5.59         16.52    63 (-44 – 19)
200                1.97          20.66    114 (-85 – 29)
300                15.28         17.59    96 (-30 – 66)
400                18.06         18.36    71 (-25 – 46)
500                47.67         25.10    54 (18 – 72)
Native speakers    66.20         11.63    28 (52 – 80)

A one-way ANOVA was conducted to determine the difference in mean scores across the six proficiency groups. Overall, there was a statistically significant difference among the groups (F(5, 197) = 25.852, p < 0.001). Tukey post hoc analyses revealed significant differences between all pairs of levels except 100 vs. 200 (p = 0.4089), 300 vs. 400 (p = 0.9759), and 500 vs. native speakers (p = 0.5451).

3.4 Discussion

The objective of the initial investigation was to create and validate an instrument to test the acquisition of receptive vocabulary relevant to Spanish for Veterinarians. To do so, test items capable of effectively differentiating across proficiency levels were statistically determined. Unlike the LexITA (Amenta et al., 2020) or the Lextale-Esp (Izura et al., 2014), which measure general vocabulary, this instrument focuses on technical terms for veterinary professionals. This section discusses the effectiveness of the LexVet-Esp in assessing specialized vocabulary knowledge by comparing its results to existing assessments, evaluating the strengths and limitations of the statistical analyses, and determining its ability to differentiate proficiency levels among participants. Finally, limitations and future directions are presented.
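For readers less familiar with the one-way ANOVA reported in Section 3.3.3, the following Python sketch shows how the F statistic and its degrees of freedom are obtained from group data. It is an illustration under stated assumptions, not the analysis code of this study: the function name `one_way_anova` and the toy scores are hypothetical, and only the group sizes are taken from Section 3.2.2.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy scores for three groups (not study data)
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)

# Degrees of freedom implied by the six group sizes in Section 3.2.2
sizes = [49, 35, 63, 30, 3, 23]
study_df = (len(sizes) - 1, sum(sizes) - len(sizes))

print(f_stat, df_b, df_w, study_df)
```

The degrees of freedom computed from the six group sizes match those reported for the F test, F(5, 197).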
3.4.1 Comparison of the LexVet-Esp with LexITA and Lextale-Esp

While the LexITA (Amenta et al., 2020) and Lextale-Esp (Izura et al., 2014) evaluate general receptive vocabulary proficiency, the LexVet-Esp has been specifically developed to assess field-specific vocabulary knowledge within the area of Spanish for Veterinarians. Although all three assessments use a combination of real words and non-words to measure vocabulary recognition, the LexVet-Esp prioritizes technical terminology in veterinary consultations, which ensures its relevance for professionals in the field. To prioritize this specialized terminology, the selection of real word items was guided by a spoken corpus of veterinary interactions in Spanish. The selection of non-words was likewise informed by the approaches used in the LexITA and Lextale-Esp but employed artificial intelligence alongside manual validation to ensure that the non-words resembled real words in form and structure while lacking semantic value (Amenta et al., 2020; Izura et al., 2014). This step is critical in mitigating guessing behavior, as participants are forced to distinguish between real words and non-words, which is a fundamental aspect of assessing receptive vocabulary (Ha, 2021; Masrai, 2023). The incorporation of non-words ensured that participants were required to possess actual lexical knowledge to distinguish real words from non-words, thereby enhancing the reliability of the assessment.

3.4.2 Addressing the Limitations of Point-Biserial Correlations with IRT

All test items displayed positive point-biserial correlations, indicating that participants who performed well overall were more likely to correctly identify individual items. However, while this confirmed a basic relationship between item difficulty and participant ability, the analysis did not provide additional information to differentiate item performance across varying proficiency levels.
This limitation aligns with prior observations in the LexITA (Amenta et al., 2020) and Lextale-Esp (Izura et al., 2014), where point-biserial analysis was useful for initial evaluation but insufficient for deeper item validation.

Given the need for a more precise assessment of individual items, IRT was employed to provide a more comprehensive analysis of item difficulty and discrimination across proficiency levels. The IRT analysis demonstrated the assessment’s capacity to distinguish among learners of varying proficiency levels, with both simpler and more challenging items effectively differentiating individuals with lower proficiency from those with higher proficiency. The broad variety of word difficulties incorporated into the final assessment was necessary to guarantee that the evaluation was sensitive to the full spectrum of language proficiency levels.

3.4.3 Differentiation of Proficiency Levels

The LexVet-Esp scores demonstrated significant differences across proficiency levels. Higher-proficiency groups, particularly those at the 500-level and native speakers, tended to perform better in correctly identifying vocabulary items. This finding partially supports the hypothesis that vocabulary acquisition is progressive (Nation, 2022), with learners expanding their lexicon to include increasingly complex and less commonly used terminology as they advance in their studies (Borawski, 2019).

Lower-proficiency learners showed limitations in identifying specialized terminology. This trend aligns with the Lextale-Esp findings, where participants at lower proficiency levels struggled to identify infrequent vocabulary items (Izura et al., 2014). The data in the current study reflect similar patterns, reinforcing the notion that specialized vocabulary knowledge is not solely a product of exposure but also an indicator of overall linguistic proficiency (Masrai, 2023).
Additionally, and consistent with previous studies (Amenta et al., 2020; Izura et al., 2014), participants who frequently selected non-words tended to score lower overall. This result aligns with prior research suggesting that non-word recognition can serve as a general indicator of language proficiency (Roche & Harrington, 2013).

Overall, the results indicate the test’s ability to differentiate learners ranging from novices to advanced speakers. This finding underscores the instrument’s potential for assessing specialized language acquisition with a level of precision comparable to established vocabulary assessments.

3.4.4 Limitations

First, while the sample size was considered sufficient for analyses, the reliance on a single data collection site limits the extent to which the findings can be generalized. Incorporating participants from various geographical areas and/or backgrounds, especially those with different levels of exposure to the Spanish language, would significantly enhance the validity of the results. This consideration is especially relevant when reflecting on previous assessments, such as LexITA and Lextale-Esp, which featured participants from multiple universities and diverse linguistic backgrounds, strengthening the credibility of their conclusions (Amenta et al., 2020; Izura et al., 2014). Additionally, the small number of participants at the 500 level (n=3) represents a clear limitation, as it restricts insight into higher-proficiency learners. Another limitation is that proficiency level was measured by course enrollment level, which may not be the most accurate measure of language proficiency. Future research should incorporate a standardized proficiency test to determine the level of participants.
CHAPTER 4: MEASURING VOCABULARY ACQUISITION WITH THE LEXVET-ESP IN A VIRTUAL REALITY ENVIRONMENT

4.1 Introduction to Project 2

The principal objective of Project 2 was to implement the LexVet-Esp in a Spanish for Veterinary Medicine course to test participants’ vocabulary acquisition and retention before and after using a platform that combined VR and AI. The project also aimed to compare the results of two distinct instructional modalities for vocabulary acquisition: explicit versus implicit vocabulary exercises. Explicit vocabulary instruction involves direct explanation of word meanings, uses, and forms, followed by focused practice. In contrast, implicit instruction integrates vocabulary exposure within wider communicative activities, avoiding direct emphasis on the words. By contrasting these approaches, the project attempted to gain insights into the roles of explicit and implicit instruction in vocabulary acquisition in a VR/AI-enhanced educational platform. Information regarding the platform and activities is detailed in the upcoming sections. The specific research questions guiding this project were as follows:

1. What is the impact of a VR/AI platform on the acquisition of receptive vocabulary in a specialized language course on Spanish for Veterinary Medicine?
2. How does the use of explicit versus implicit vocabulary activities impact students’ vocabulary learning in a VR/AI-based context in a course on Spanish for Veterinary Medicine?
3. How effective is VR/AI-based vocabulary practice for long-term retention of specialized vocabulary in Spanish for Veterinary Medicine?

4.2 Methodology

4.2.1 Participants and Context

Participants in this project were students enrolled in the Spanish for Veterinary Medicine certificate in the DVM program at a large western university in the U.S. Out of all the students taking the certificate, fifteen completed all phases of this project and were included in the study.
Participation in this study was optional and did not impact students’ final grades. Participants were enrolled in the second two-credit course of the elective certificate. The course met weekly for 50 minutes, and students were expected to complete two hours of work outside of class time. The course followed a flipped classroom model in which students were required to complete a series of online preparatory activities prior to each face-to-face class. These preparatory materials included tasks designed to develop reading comprehension, listening proficiency, vocabulary, and speaking skills and to familiarize students with essential thematic content and vocabulary pertinent to the in-person class. The face-to-face sessions prioritized active, communicative use of the language. During class sessions, students engaged in structured speaking exercises, discussions, and task-oriented learning activities that reinforced the material addressed in the preparatory phase.

4.2.2 Materials

A non-immersive VR/AI platform, MeTabi, was used for this project. MeTabi is an online language learning platform dedicated to enhancing students’ linguistic proficiency within specified professional contexts. It does not require headsets or spatial interaction, only a computer or phone. Currently, it is available in over ten languages and uses Language Coaches, or characters that use AI, to help create a scaffolded learning process for students. This framework permits the development of a variety of activities that focus on four essential skills: oral comprehension, written comprehension, oral production, and written production. As MeTabi is an AI-enhanced but screen-based system, learners benefit from immediate feedback generated by the AI system but do not experience real-time sensory immersion.
For the purposes of this study, a virtual veterinary clinic was built in the MeTabi environment (Figures 3 and 4), and MeTabi was implemented into the Spanish for Veterinary Medicine course during the second half of the semester. Students used MeTabi to complete their required out-of-class work.

Figure 3
One of the consultation rooms in MeTabi

Figure 4
The clinic’s laboratory with different diagnostic tools

From week nine to week twelve, students completed explicit vocabulary activities. Examples can be seen in Figures 5-9. Figure 5 shows an activity in which participants watched a video about colors, textures, pet behaviors, etc., in which images appeared alongside a list of words describing those images. At the end of the video, the participants completed a drag and drop activity where they moved each word into the correct category. Figure 6 shows a picture matching activity, and Figure 7 is a crossword puzzle. Figures 8 and 9 represent two parts of the same activity. For this activity, participants were presented with 18 flashcards, each containing a picture and the corresponding word in Spanish. The first part of the activity required participants to translate the words into English, and in the second part, they were asked to complete a dialogue between the client and the doctor using the words from the flashcards (Figures 8 and 9).

Figure 5
Drag and drop activity

Figure 6
Matching activity

Figure 7
Crossword puzzle

Figure 8
Part 1 of the flashcard activity

Figure 9
Part 2 of the flashcard activity

During the remaining three weeks, students completed implicit vocabulary exercises. Vocabulary was introduced through listening comprehension activities, oral activities with the Language Coaches, and pronunciation practice. No words were highlighted or explained to the students; instead, the focus was on seeing how words were used in context. An example of this type of activity can be seen in Figure 10.
In this example, participants watched a video after which they were asked to create five questions that they would ask their client in order to gather additional information about their pet. Another type of activity involved an AI Language Coach, as seen in Figure 11. For this activity, the participants were given a prompt that guided them to simulate a conversation in which they needed to use specific vocabulary that appeared in previous activities, even though those activities focused on grammar or listening comprehension. Another example activity, seen in Figure 12, included participants listening to three different clients, taking notes regarding the pet’s health history, and then recommending a diagnostic procedure. This entire activity required participants to utilize vocabulary seen in prior activities without explicit instruction.

Figure 10
Video with listening comprehension and writing questions

Figure 11
Language Coach activity

Figure 12
Listening and speaking activity

4.2.3 Procedure

The LexVet-Esp was implemented to evaluate students’ vocabulary acquisition and was made up of two parts. The first part included a short questionnaire where students indicated their age, race/ethnicity, and their L1. This was followed by an explanation of the test with an example of common words so that they could better understand how the test functioned (Appendix B). The instructions for this part of the test were provided in English. Then, participants completed the LexVet-Esp by identifying real words from the list of 90 total words, 60 of which were real words and 30 were non-