<div class="grid cards" markdown>

- :speaking_head:{ .lg .middle } __Audio Input/Output with mini-omni2__

    ---

    Build a GPT-4o-like experience with mini-omni2, an audio-native LLM.

    <video width=98% src="https://github.com/user-attachments/assets/58c06523-fc38-4f5f-a4ba-a02a28e7fa9e" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/mini-omni2-webrtc)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/mini-omni2-webrtc/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Talk to Claude__

    ---

    Use the Anthropic and Play.Ht APIs to have an audio conversation with Claude.

    <video width=98% src="https://github.com/user-attachments/assets/650bc492-798e-4995-8cef-159e1cfc2185" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-claude)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-claude/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Kyutai Moshi__

    ---

    Kyutai's Moshi is a novel speech-to-speech model for modeling human conversations.

    <video width=98% src="https://github.com/user-attachments/assets/becc7a13-9e89-4a19-9df2-5fb1467a0137" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Hello Llama: Stop Word Detection__

    ---

    A code editor built with Llama 3.3 70b that is triggered by the phrase "Hello Llama".
    Build a Siri-like coding assistant in 100 lines of code!

    <video width=98% src="https://github.com/user-attachments/assets/3e10cb15-ff1b-4b17-b141-ff0ad852e613" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor/blob/main/app.py)

- :robot:{ .lg .middle } __Llama Code Editor__

    ---

    Create and edit HTML pages with just your voice! Powered by SambaNova Systems.

    <video width=98% src="https://github.com/user-attachments/assets/a09647f1-33e1-4154-a5a3-ffefda8a736a" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/llama-code-editor)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/llama-code-editor/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Talk to Ultravox__

    ---

    Talk to Fixie.AI's audio-native Ultravox LLM with the transformers library.

    <video width=98% src="https://github.com/user-attachments/assets/e6e62482-518c-4021-9047-9da14cd82be1" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-ultravox)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-ultravox/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Talk to Llama 3.2 3b__

    ---

    Use the Lepton API to make Llama 3.2 talk back to you!

    <video width=98% src="https://github.com/user-attachments/assets/3ee37a6b-0892-45f5-b801-73188fdfad9a" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/llama-3.2-3b-voice-webrtc)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/llama-3.2-3b-voice-webrtc/blob/main/app.py)

- :robot:{ .lg .middle } __Talk to Qwen2-Audio__

    ---

    Qwen2-Audio is a SOTA audio-to-text LLM developed by Alibaba.

    <video width=98% src="https://github.com/user-attachments/assets/c821ad86-44cc-4d0c-8dc4-8c02ad1e5dc8" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-qwen-webrtc)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-qwen-webrtc/blob/main/app.py)

- :camera:{ .lg .middle } __Yolov10 Object Detection__

    ---

    Run the Yolov10 model on a user's webcam stream in real time!

    <video width=98% src="https://github.com/user-attachments/assets/c90d8c9d-d2d5-462e-9e9b-af969f2ea73c" controls style="text-align: center"></video>

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/webrtc-yolov10n)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/webrtc-yolov10n/blob/main/app.py)

- :camera:{ .lg .middle } __Video Object Detection with RT-DETR__

    ---

    Upload a video and stream out frames with detected objects, powered by the RT-DETR model.

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/rt-detr-object-detection-webrtc)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/rt-detr-object-detection-webrtc/blob/main/app.py)

- :speaker:{ .lg .middle } __Text-to-Speech with Parler__

    ---

    Stream out audio generated by Parler TTS!

    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/parler-tts-streaming-webrtc)

    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/parler-tts-streaming-webrtc/blob/main/app.py)

</div>
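
All of the audio conversation demos above (mini-omni2, Claude, Moshi, Ultravox, Llama 3.2, Qwen2-Audio) share the same skeleton: a `WebRTC` component in send-receive audio mode whose handler is wrapped in `ReplyOnPause`, so the model answers whenever the speaker pauses. The snippet below is a minimal sketch of that pattern rather than the code of any one Space; `generate_reply` is a hypothetical stand-in for the actual model call, and details such as `rtc_configuration` and the exact audio-frame format vary per demo (see each Space's `app.py`).

```python
import gradio as gr
import numpy as np
from gradio_webrtc import WebRTC, ReplyOnPause


def generate_reply(audio: tuple[int, np.ndarray]):
    """Hypothetical handler: receives (sample_rate, samples) captured until the
    user pauses, and yields (sample_rate, samples) chunks of the spoken reply."""
    sample_rate, samples = audio
    # ... run the model here (mini-omni2, Claude + Play.Ht, Moshi, Ultravox, ...) ...
    yield sample_rate, np.zeros((1, sample_rate), dtype=np.int16)  # placeholder: 1 s of silence


with gr.Blocks() as demo:
    audio = WebRTC(label="Stream", mode="send-receive", modality="audio")
    # ReplyOnPause invokes the handler each time voice activity detection sees a pause.
    audio.stream(
        fn=ReplyOnPause(generate_reply),
        inputs=[audio],
        outputs=[audio],
        time_limit=60,
    )

demo.launch()
```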
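
The webcam demos (Yolov10, and RT-DETR on uploaded video) follow the analogous video pattern: the handler receives one frame as a numpy array, runs the detector, and returns the annotated frame to stream back. Again a hedged sketch, with `detect_objects` standing in for the real model call:

```python
import gradio as gr
import numpy as np
from gradio_webrtc import WebRTC


def detect_objects(frame: np.ndarray) -> np.ndarray:
    """Hypothetical detector: annotate the incoming frame (e.g. draw Yolov10 or
    RT-DETR boxes) and return it for streaming back to the browser."""
    # ... run the detection model on `frame` here ...
    return frame


with gr.Blocks() as demo:
    stream = WebRTC(label="Stream", mode="send-receive", modality="video")
    # Each incoming webcam frame is passed through the handler and streamed back out.
    stream.stream(fn=detect_objects, inputs=[stream], outputs=[stream], time_limit=10)

demo.launch()
```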