Affiliations: College of Engineering and Computer Science
Team Leader:
Faculty Mentor: Joseph J. LaViola Jr., PhD
Team Size: 6
Open Spots: 0
Team Member Qualifications:
Applicants should have experience with Python, including scripting, working with APIs, and using libraries such as LangChain, Pandas, Matplotlib, and OpenAI. Experience with Visual Language Models, particularly in areas such as prompt engineering, API usage, and context handling, is also preferred. Experience in Unity development, especially involving VR interaction, scripting, and API integration, is beneficial. Proficiency with GitHub, including command-line operations such as cloning repositories, pulling and pushing changes, merging branches, and submitting pull requests, is encouraged. Strong research writing skills, including the use of tools like Zotero and Overleaf for citation management, are advantageous. Finally, applicants should be able to take clear notes and report progress effectively.
Description:
This project explores the integration of Large Language Models (LLMs) and Visual Language Models (VLMs) for recognizing user actions in Virtual Reality (VR). The research will expand the features of our existing conversational pipelines, such as a system for real-time voice interaction with virtual agents through an ASR → LLM → TTS loop, built with Unity and Python. Students will contribute to a multimodal action/gesture recognition system that incorporates visual context into XR applications, allowing those applications to leverage the user's state to improve immersive, interactive experiences.
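To give applicants a sense of the kind of work involved, a minimal ASR → LLM → TTS round trip might look like the sketch below. This is an illustration only, not the team's actual implementation; it assumes the OpenAI Python SDK with an OPENAI_API_KEY environment variable, and the file names and model choices are hypothetical.

# Minimal ASR -> LLM -> TTS round trip (illustrative sketch only).
# Assumes the OpenAI Python SDK (pip install openai); file names and
# model choices below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()

# 1. ASR: transcribe the user's recorded utterance to text.
with open("user_utterance.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. LLM: generate the virtual agent's reply from the transcript.
reply = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=[
        {"role": "system", "content": "You are a virtual agent in a VR scene."},
        {"role": "user", "content": transcript.text},
    ],
)
reply_text = reply.choices[0].message.content

# 3. TTS: synthesize the reply to audio for playback in the VR client.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply_text,
)
with open("agent_reply.mp3", "wb") as out_file:
    out_file.write(speech.content)

In a system like the one described above, a Unity front end would typically capture microphone audio, hand it to a Python service running a loop of this shape, and play the synthesized reply back through the virtual agent.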