CharStream.xyz: like Character.AI, but with real-time video avatars.
We could only implement voice and text avatars during the hackathon, but our plans for the future are much bigger.
CharStream.xyz: Real-Time Interactive Video Avatars Powered by AI
Imagine Character.AI, but instead of just text, you're interacting with a live, animated video avatar of your chosen character. That's the core concept of CharStream.xyz.
What it is:
- Personalized AI Companions: CharStream lets users hold real-time conversations with AI characters, each with a unique personality and backstory.
- Video-Driven Interaction: Unlike traditional chatbots, CharStream uses animated video avatars, adding a visual and emotional dimension that makes the experience more immersive and engaging.
- Real-Time Responsiveness: The AI characters respond in real time, with synthesized speech and synchronized video animation, creating a sense of natural conversation.
- Character Customization: We aim to let users further personalize their AI companions through future features like NFT-based add-ons that modify the character's behavior and personality, e.g. making them act "drunk" or "stoned" (see the sketch after this list).
- Accessibility and Empathy: The inherently non-judgmental nature of AI provides a safe and empathetic space for users to interact, potentially fostering emotional connection and companionship.
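To make the customization idea concrete, one way an add-on could work is that a purchased NFT maps to a small persona modifier that gets appended to the character's system prompt. Everything below (the `Addon` type, the example token IDs, the `buildPersona` helper) is a hypothetical sketch for illustration, not our shipped implementation:

```ts
// Hypothetical shape for an NFT-backed persona add-on (illustrative only).
type Addon = {
  tokenId: string;        // the NFT that unlocks this modifier (assumed IDs)
  label: string;          // e.g. "drunk", "stoned"
  promptModifier: string; // appended to the character's system prompt
};

const ADDONS: Addon[] = [
  {
    tokenId: "0x01",
    label: "drunk",
    promptModifier: "Slur your words slightly and ramble good-naturedly.",
  },
  {
    tokenId: "0x02",
    label: "stoned",
    promptModifier: "Respond slowly, get distracted, and find things funny.",
  },
];

// Compose the base persona with whichever add-ons the user owns.
function buildPersona(basePrompt: string, ownedTokenIds: Set<string>): string {
  const active = ADDONS.filter((a) => ownedTokenIds.has(a.tokenId));
  return [basePrompt, ...active.map((a) => a.promptModifier)].join("\n");
}
```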
How it works:
1. User Input: The user speaks into their microphone.
2. Speech-to-Text Conversion: OpenAI's audio API converts the user's speech into text.
3. Natural Language Understanding (NLU): An AI agent analyzes the text to understand the user's intent, sentiment, and key entities.
4. Language Model Response: OpenAI's large language model (LLM) generates a response based on the NLU output and the character's defined persona, which is stored and managed with Recall Network.
5. Text-to-Speech Synthesis: OpenAI's audio API converts the LLM's text response into natural-sounding speech.
6. Video Animation: A video animation agent controls the real-time animation of the video avatar, including lip-syncing, facial expressions, and potentially gestures, synchronized with the synthesized speech.
7. Real-Time Output: The user sees and hears the animated avatar's response in real time, creating a seamless conversational experience.
8. Contextual Memory: Recall Network gives the AI agents memory and contextual awareness, so the characters can remember previous conversations.
9. User Authentication: Clerk manages user accounts and secures access to the platform.
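As a rough illustration, here's a minimal server-side sketch of one turn of that loop using the OpenAI Node SDK. The model names, the `personaPrompt` string, and the in-memory `history` array (a stand-in for Recall Network, whose API isn't shown here) are all assumptions for the sake of the example:

```ts
import fs from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Stand-ins for the character persona and Recall-backed memory (assumed).
const personaPrompt =
  "You are Captain Vale, a dry-witted airship pilot. Stay in character.";
const history: { role: "user" | "assistant"; content: string }[] = [];

async function oneTurn(audioPath: string): Promise<Buffer> {
  // 1. Speech-to-text: transcribe the user's microphone capture.
  const transcript = await openai.audio.transcriptions.create({
    file: fs.createReadStream(audioPath),
    model: "whisper-1",
  });

  // 2. LLM response, conditioned on the persona and prior turns.
  history.push({ role: "user", content: transcript.text });
  const chat = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model choice
    messages: [{ role: "system", content: personaPrompt }, ...history],
  });
  const reply = chat.choices[0].message.content ?? "";
  history.push({ role: "assistant", content: reply });

  // 3. Text-to-speech: synthesize the reply for playback and lip-sync.
  const speech = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: reply,
  });
  return Buffer.from(await speech.arrayBuffer());
}
```

In the full pipeline, the video animation agent would consume this synthesized audio to drive the avatar's lip-sync and expressions in real time.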
To recap the stack: we used OpenAI to power the bots, with its audio API handling speech-to-text and text-to-speech, and we use Recall Network to store the memories that power the bot agents and characters. We also planned NFT add-ons that make a character act more drunk or more stoned, which we could not fully implement in the given time, but it's very fun and you should try it out. The project is built on Next.js, we use Clerk for authentication, and we had a lot of fun building it.
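For completeness, wiring Clerk into a Next.js project is mostly configuration. A minimal sketch, assuming a recent @clerk/nextjs release with the standard middleware entry point (the matcher pattern here is illustrative):

```ts
// middleware.ts — protect the app with Clerk (minimal sketch).
import { clerkMiddleware } from "@clerk/nextjs/server";

export default clerkMiddleware();

export const config = {
  // Run on all routes except static assets and Next.js internals.
  matcher: ["/((?!_next|.*\\..*).*)", "/(api|trpc)(.*)"],
};
```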