mirror of
https://github.com/HumanAIGC-Engineering/gradio-webrtc.git
synced 2026-02-04 09:29:23 +08:00
Add code (#173)
@@ -14,11 +14,11 @@
}
</style>

A collection of VAD models ready to use with FastRTC. Click on the tags below to find the VAD model you're looking for!

A collection of Turn Taking Algorithms and Voice Activity Detection (VAD) models ready to use with FastRTC. Click on the tags below to find the model you're looking for!

<div class="tag-buttons">
<button class="tag-button" data-tag="pytorch"><code>pytorch</code></button>
<button class="tag-button" data-tag="vad-models"><code>VAD Model</code></button>
<button class="tag-button" data-tag="turn-taking-algorithm"><code>Turn-taking Algorithm</code></button>
</div>

<script>
@@ -41,27 +41,60 @@ document.querySelectorAll('.tag-button').forEach(button => {
});
</script>

## Gallery

<div class="grid cards" markdown>

- :speaking_head:{ .lg .middle }:eyes:{ .lg .middle } __Your VAD Model__
{: data-tags="pytorch"}
- :speaking_head:{ .lg .middle }:eyes:{ .lg .middle } __Walkie Talkie__
{: data-tags="turn-taking-algorithm"}

---

Description
The user's turn ends when they finish a sentence with the word "over".
For example, "Hello, how are you? Over." would end the user's turn and trigger the response.
This is intended as a simple reference implementation of a custom turn-taking algorithm.

Install Instructions
```bash
pip install fastrtc-walkie-talkie
```

Usage
<video width=98% src="https://github.com/user-attachments/assets/d94c1b91-5430-48b0-801d-15e17bdad2a0" controls style="text-align: center"></video>

[:octicons-arrow-right-24: Demo](Your demo here)
[:octicons-arrow-right-24: Demo](https://github.com/freddyaboulton/fastrtc-walkie-talkie/blob/main/scratch.py)

[:octicons-code-16: Repository](Code here)
[:octicons-code-16: Repository](https://github.com/freddyaboulton/fastrtc-walkie-talkie/blob/main/src/fastrtc_walkie_talkie/__init__.py)

</div>

## How to add your own VAD model
## What is this for?

By default, FastRTC uses the `ReplyOnPause` class to handle turn-taking. However, you may want to tweak this behavior to better fit your use case.

In this gallery, you can find a collection of turn-taking algorithms and VAD models that you can use to customize the turn-taking behavior to your needs. Each card contains install and usage instructions.

## How to add your own Turn-taking Algorithm or VAD model

### Turn-taking Algorithm

1. Typically you will want to subclass the `ReplyOnPause` class and override the `determine_pause` method.

```python
import numpy as np

from fastrtc.reply_on_pause import ReplyOnPause, AppState


class MyTurnTakingAlgorithm(ReplyOnPause):
    def determine_pause(self, audio: np.ndarray, sampling_rate: int, state: AppState) -> bool:
        # Replace this call with your own logic for deciding when a pause has occurred.
        return super().determine_pause(audio, sampling_rate, state)
```

2. Then package your class into a pip-installable package and publish it to [PyPI](https://pypi.org/).

3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/turn_taking_gallery.md) to add your algorithm to the gallery!

!!! tip "Example Implementation"
See the [Walkie Talkie](https://github.com/freddyaboulton/fastrtc-walkie-talkie/) package for an example implementation of a turn-taking algorithm.
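
The Walkie Talkie rule described in the gallery card (the turn ends when a sentence finishes with the word "over") can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the package's actual code: `ends_with_over` is a hypothetical helper, and it assumes you already have a text transcript of the user's audio from a speech-to-text model of your choice. A `ReplyOnPause` subclass could call a helper like this from `determine_pause` and return its result.

```python
import re

# Hypothetical helper for illustration; not part of FastRTC or fastrtc-walkie-talkie.
# Matches the standalone word "over" followed only by punctuation or whitespace.
_OVER_PATTERN = re.compile(r"\bover\W*$", re.IGNORECASE)


def ends_with_over(transcript: str) -> bool:
    """Return True if the last word of the transcript is "over"."""
    return bool(_OVER_PATTERN.search(transcript.strip()))


print(ends_with_over("Hello, how are you? Over."))  # True
print(ends_with_over("It is over there"))           # False
```

Matching on a word boundary keeps words such as "moreover" from ending the turn early, and ignoring trailing punctuation makes the check tolerant of how the transcriber punctuates the sentence.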

### VAD Model

1. Your model can be implemented in **any** framework you want, but it must implement the `PauseDetectionModel` protocol.
```python
@@ -105,4 +138,7 @@ document.querySelectorAll('.tag-button').forEach(button => {
stream.ui.launch()
```

3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/vad_gallery.md) to add your model to the gallery! Ideally, your model package should be pip-installable so others can try it out easily.
3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/turn_taking_gallery.md) to add your model to the gallery! Ideally, your model package should be pip-installable so others can try it out easily.

!!! tip "Package Naming Convention"
It is recommended to name your package `fastrtc-<package-name>` so developers can easily find it on [PyPI](https://pypi.org/search/?q=fastrtc-).
@@ -29,7 +29,7 @@ nav:
- Deployment: deployment.md
- Advanced Configuration: advanced-configuration.md
- Speech-to-Text Gallery: speech_to_text_gallery.md
- VAD Gallery: vad_gallery.md
- Turn-taking Gallery: turn_taking_gallery.md
- Utils: utils.md
- Frequently Asked Questions: faq.md
extra_javascript: