A collection of applications built with FastRTC. Click on the tags below to find the app you're looking for!
audio
video
llm
computer-vision
real-time-api
voice chat
code generation
stopword
transcription
SambaNova
Groq
ElevenLabs
Kyutai
Agentic
-
🗣️{ .lg .middle }👀{ .lg .middle } Gemini Audio Video Chat {: data-tags="audio,video,real-time-api"}
Stream BOTH your webcam video and audio feeds to Google Gemini. You can also upload images to augment your conversation!
:octicons-arrow-right-24: Demo
-
🗣️{ .lg .middle } Google Gemini Real Time Voice API {: data-tags="audio,real-time-api,voice-chat"}
Talk to Gemini in real time using Google's voice API.
:octicons-arrow-right-24: Demo
-
🗣️{ .lg .middle } OpenAI Real Time Voice API {: data-tags="audio,real-time-api,voice-chat"}
Talk to ChatGPT in real time using OpenAI's voice API.
:octicons-arrow-right-24: Demo
-
🤖{ .lg .middle } Hello Computer {: data-tags="llm,stopword,sambanova"}
Say "computer" before asking your question!
:octicons-arrow-right-24: Demo
-
🤖{ .lg .middle } Llama Code Editor {: data-tags="audio,llm,code-generation,groq,stopword"}
Create and edit HTML pages with just your voice! Powered by Groq!
-
🗣️{ .lg .middle } SmolAgents with Voice {: data-tags="audio,llm,voice-chat,agentic"}
Build a voice-based smolagent to find a coworking space!
-
🗣️{ .lg .middle } Talk to Claude {: data-tags="audio,llm,voice-chat"}
Use the Anthropic and Play.Ht APIs to have an audio conversation with Claude.
:octicons-arrow-right-24: Demo
-
🎵{ .lg .middle } LLM Voice Chat {: data-tags="audio,llm,voice-chat,groq,elevenlabs"}
Talk to an LLM with ElevenLabs!
:octicons-arrow-right-24: Demo
-
🎵{ .lg .middle } Whisper Transcription {: data-tags="audio,transcription,groq"}
Have Whisper transcribe your speech in real time!
:octicons-arrow-right-24: Demo
-
🤖{ .lg .middle } Talk to SambaNova {: data-tags="llm,stopword,sambanova"}
Talk to Llama 3.2 with the SambaNova API.
:octicons-arrow-right-24: Demo
-
🗣️{ .lg .middle } Hello Llama: Stop Word Detection {: data-tags="audio,llm,code-generation,stopword,sambanova"}
A code editor built with Llama 3.3 70B that is triggered by the phrase "Hello Llama". Build a Siri-like coding assistant in 100 lines of code!
-
🗣️{ .lg .middle } Audio Input/Output with mini-omni2 {: data-tags="audio,llm,voice-chat"}
Build a GPT-4o-like experience with mini-omni2, an audio-native LLM.
-
🗣️{ .lg .middle } Kyutai Moshi {: data-tags="audio,llm,voice-chat,kyutai"}
Kyutai's Moshi is a novel speech-to-speech model for modeling human conversations.
-
🗣️{ .lg .middle } Talk to Ultravox {: data-tags="audio,llm,voice-chat"}
Talk to Fixie.AI's audio-native Ultravox LLM with the transformers library.
-
🗣️{ .lg .middle } Talk to Llama 3.2 3B {: data-tags="audio,llm,voice-chat"}
Use the Lepton API to make Llama 3.2 talk back to you!
-
🤖{ .lg .middle } Talk to Qwen2-Audio {: data-tags="audio,llm,voice-chat"}
Qwen2-Audio is a state-of-the-art audio-to-text LLM developed by Alibaba.
-
📷{ .lg .middle } YOLOv10 Object Detection {: data-tags="video,computer-vision"}
Run the YOLOv10 model on a user webcam stream in real time!
-
📷{ .lg .middle } Video Object Detection with RT-DETR {: data-tags="video,computer-vision"}
Upload a video and stream out frames with detected objects, powered by the RT-DETR model.
-
🔈{ .lg .middle } Text-to-Speech with Parler {: data-tags="audio"}
Stream out audio generated by Parler TTS!
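
Several of the apps above ("Hello Computer", "Hello Llama") only respond after a trigger phrase is spoken. The sketch below illustrates that gating idea on already-transcribed text. It is plain Python, not the FastRTC API: the function name `extract_command` and its parameters are hypothetical, and it assumes a speech-to-text step has already produced the transcript.

```python
import re
from typing import Optional


def extract_command(transcript: str, stop_word: str = "computer") -> Optional[str]:
    """Return the text spoken after the stop word, or None if there is nothing to act on.

    Hypothetical helper for illustration only; it operates on a transcript
    string and does not touch audio.
    """
    # Match the stop word as a whole word or phrase, case-insensitively,
    # and swallow any punctuation or whitespace that follows it.
    pattern = re.compile(rf"\b{re.escape(stop_word)}\b[\s,.:!?]*", re.IGNORECASE)
    match = pattern.search(transcript)
    if match is None:
        return None  # stop word never spoken: ignore the utterance
    command = transcript[match.end():].strip()
    return command or None  # stop word with nothing after it: keep waiting


if __name__ == "__main__":
    print(extract_command("Hey computer, what is the weather?"))
    print(extract_command("Hello Llama make a button", stop_word="hello llama"))
```

Only the text after the trigger phrase would be forwarded to the LLM; utterances without the phrase are dropped, which keeps background speech from triggering a reply.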