Real-time video call with a Large Language Model
Ava is a Proof of Concept (PoC) app that enables real-time video calls with a Large Language Model (ChatGPT).
CodeLink
AI & ML Lead
Full-stack Developer
Web
Prompt Engineering
Firebase Crashlytics
Cloudflare
Dreambooth
Text to Speech & Speech to Text
Low-Range Adaptation
LLM (ChatGPT)
Stable Diffusion
ThreeJS
Is it possible for users to have a real-time video call with ChatGPT?
Create a PoC that allows users to interact with an LLM via a real-time video call.
The CodeLink internal team worked autonomously to design and build the Android and iOS mobile applications.
The project kicked off in March of 2023 and is still ongoing.
We began by conducting a deep dive to determine how we could use Speech-to-Text and Text-to-Speech to enable voice-based interactions with the LLM, as well as real-time speech animations like lip sync, head tilt, and blinks to make the model more life-like. Once the model was built, we optimized it to balance training, inference, and avatar quality. We used various Google Cloud Platform services to host the training and inference servers. The final PoC allows users to chat with the 3D model on different topics, such as learning English.