This commit is contained in:
Freddy Boulton
2025-03-13 19:56:37 -04:00
committed by GitHub
parent 4fb28f3bf2
commit c9b67726ba
2 changed files with 47 additions and 11 deletions


@@ -14,11 +14,11 @@
}
</style>
A collection of VAD models ready to use with FastRTC. Click on the tags below to find the VAD model you're looking for!
A collection of Turn Taking Algorithms and Voice Activity Detection (VAD) models ready to use with FastRTC. Click on the tags below to find the model you're looking for!
<div class="tag-buttons">
<button class="tag-button" data-tag="pytorch"><code>pytorch</code></button>
<button class="tag-button" data-tag="vad-models"><code>VAD Model</code></button>
<button class="tag-button" data-tag="turn-taking-algorithm"><code>Turn-taking Algorithm</code></button>
</div>
<script>
@@ -41,27 +41,60 @@ document.querySelectorAll('.tag-button').forEach(button => {
});
</script>
## Gallery
<div class="grid cards" markdown>
- :speaking_head:{ .lg .middle }:eyes:{ .lg .middle } __Your VAD Model__
{: data-tags="pytorch"}
- :speaking_head:{ .lg .middle }:eyes:{ .lg .middle } __Walkie Talkie__
{: data-tags="turn-taking-algorithm"}
---
Description
The user's turn ends when they finish a sentence with the word "over".
For example, "Hello, how are you? Over." would end the user's turn and trigger the response.
This is intended as a simple reference implementation of a custom turn-taking algorithm.
Install Instructions
```bash
pip install fastrtc-walkie-talkie
```
Usage
<video width=98% src="https://github.com/user-attachments/assets/d94c1b91-5430-48b0-801d-15e17bdad2a0" controls style="text-align: center"></video>
[:octicons-arrow-right-24: Demo](Your demo here)
[:octicons-arrow-right-24: Demo](https://github.com/freddyaboulton/fastrtc-walkie-talkie/blob/main/scratch.py)
[:octicons-code-16: Repository](Code here)
[:octicons-code-16: Repository](https://github.com/freddyaboulton/fastrtc-walkie-talkie/blob/main/src/fastrtc_walkie_talkie/__init__.py)
</div>
## How to add your own VAD model
## What is this for?
By default, FastRTC uses the `ReplyOnPause` class to handle turn-taking. However, you may want to tweak this behavior to better fit your use case.
In this gallery, you can find a collection of turn-taking algorithms and VAD models that you can use to customize the turn-taking behavior to your needs. Each card contains install and usage instructions.
## How to add your own Turn-taking Algorithm or VAD model
### Turn-taking Algorithm
1. Typically you will want to subclass the `ReplyOnPause` class and override the `determine_pause` method.
```python
import numpy as np
from fastrtc.reply_on_pause import ReplyOnPause, AppState

class MyTurnTakingAlgorithm(ReplyOnPause):
    def determine_pause(self, audio: np.ndarray, sampling_rate: int, state: AppState) -> bool:
        # Placeholder: defers to the default logic. Put your own pause detection here.
        return super().determine_pause(audio, sampling_rate, state)
```
2. Then package your class into a pip-installable package and publish it to [PyPI](https://pypi.org/).
3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/turn_taking_gallery.md) to add your algorithm to the gallery!
!!! tip "Example Implementation"
See the [Walkie Talkie](https://github.com/freddyaboulton/fastrtc-walkie-talkie/) package for an example implementation of a turn-taking algorithm.
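
To make the Walkie Talkie idea concrete, here is a minimal sketch of the end-of-turn check on its own. It assumes you already have a text transcript of the user's audio (for example from a speech-to-text model of your choice). The helper name `turn_has_ended` is hypothetical and is not part of `fastrtc` or `fastrtc-walkie-talkie`, whose actual implementation may differ.

```python
import re

# Hypothetical helper for illustration only; not a fastrtc API.
# Matches "over" as a whole word at the very end of the text,
# optionally followed by punctuation or whitespace.
_TRAILING_OVER = re.compile(r"\bover\W*$", re.IGNORECASE)


def turn_has_ended(transcript: str) -> bool:
    """Return True when the last word of the transcript is "over"."""
    return bool(_TRAILING_OVER.search(transcript.strip()))
```

A `determine_pause` override could call a check like this once a transcript is available and fall back to `super().determine_pause(...)` otherwise. How the transcript is produced is left open here.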
### VAD Model
1. Your model can be implemented in **any** framework you want, but it must implement the `PauseDetectionModel` protocol.
```python
@@ -105,4 +138,7 @@ document.querySelectorAll('.tag-button').forEach(button => {
stream.ui.launch()
```
3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/vad_gallery.md) to add your model to the gallery! Ideally your model package should be pip installable so others can try it out easily.
3. Open a [PR](https://github.com/freddyaboulton/fastrtc/edit/main/docs/turn_taking_gallery.md) to add your model to the gallery! Ideally your model package should be pip installable so others can try it out easily.
!!! tip "Package Naming Convention"
It is recommended to name your package `fastrtc-<package-name>` so developers can easily find it on [PyPI](https://pypi.org/search/?q=fastrtc-).