Add ReplyOnStopWords (#35)

* add code

* fix dependencies

* add code:
This commit is contained in:
Freddy Boulton
2024-12-11 18:25:53 -08:00
committed by GitHub
parent b1e4326ae3
commit 6c983482b8
14 changed files with 368 additions and 18 deletions

@@ -36,6 +36,19 @@
[:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py)
- :speaking_head:{ .lg .middle } __Hello Llama: Stop Word Detection__
---
A code editor built with Llama 3.3 70b that is triggered by the phrase "Hello Llama".
Build a Siri-like coding assistant in 100 lines of code!
<video width=98% src="https://github.com/user-attachments/assets/3e10cb15-ff1b-4b17-b141-ff0ad852e613" controls style="text-align: center"></video>
[:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor)
[:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor/blob/main/app.py)
- :robot:{ .lg .middle } __Llama Code Editor__
---

@@ -15,11 +15,16 @@ Stream video and audio in real time with Gradio using WebRTC.
pip install gradio_webrtc
```
to use built-in pause detection (see [Audio Streaming](https://freddyaboulton.github.io/gradio-webrtc/user-guide/#reply-on-pause)), install the `vad` extra:
To use built-in pause detection (see [ReplyOnPause](/user-guide/#reply-on-pause)), install the `vad` extra:
```bash
pip install gradio_webrtc[vad]
```
For stop word detection (see [ReplyOnStopWords](/user-guide/#reply-on-stopwords)), install the `stopword` extra:
```bash
pip install gradio_webrtc[stopword]
```
## Examples
See the [cookbook](/cookbook)

@@ -65,6 +65,54 @@ and passing it to the `stream` event of the `WebRTC` component.
5. Set a `time_limit` to control how long a conversation will last. If the `concurrency_limit` is 1 (default), only one conversation will be handled at a time.
### Reply On Stopwords
Use the `ReplyOnStopWords` class to run your AI model whenever one of a set of "stop words", such as "Hey Siri" or "computer", is detected.
The API is similar to `ReplyOnPause` with the addition of a `stop_words` parameter.
=== "Code"
``` py title="ReplyOnStopWords"
import gradio as gr
import numpy as np
from gradio_webrtc import ReplyOnStopWords, WebRTC


def response(audio: tuple[int, np.ndarray]):
    """This function must yield audio frames"""
    ...
    # `generated_audio` and `sampling_rate` stand in for your model's output
    for numpy_array in generated_audio:
        yield (sampling_rate, numpy_array, "mono")


with gr.Blocks() as demo:
    gr.HTML(
        """
        <h1 style='text-align: center'>
        Chat (Powered by WebRTC ⚡️)
        </h1>
        """
    )
    with gr.Column():
        with gr.Group():
            webrtc = WebRTC(
                mode="send",
                modality="audio",
            )
        # Run `response` only after one of the stop words has been detected
        webrtc.stream(ReplyOnStopWords(response,
                                       input_sample_rate=16000,
                                       stop_words=["computer"]), # (1)
                      inputs=[webrtc],
                      outputs=[webrtc], time_limit=90,
                      concurrency_limit=10)

demo.launch()
```
1. The `stop_words` can be single words or pairs of words. Be sure to include common misspellings of your word for more robust detection, e.g. "llama", "lamma". In my experience, it's best to use two very distinct words like "ok computer" or "hello iris".
=== "Notes"
1. The `stop_words` can be single words or pairs of words. Be sure to include common misspellings of your word for more robust detection, e.g. "llama", "lamma". In my experience, it's best to use two very distinct words like "ok computer" or "hello iris".
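
To illustrate why listing misspelled variants helps, here is a minimal standalone sketch. It assumes detection works by comparing transcribed speech against the `stop_words` list, which this page does not confirm; `normalize` and `find_stop_phrase` are hypothetical helpers written for this example and are not part of `gradio_webrtc`.

```python
import re
from typing import Optional


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_stop_phrase(transcript: str, stop_words: list[str]) -> Optional[str]:
    """Return the first stop phrase found in the transcript, or None.

    Hypothetical helper: it matches whole words only, so "llamas" does
    not trigger the phrase "llama".
    """
    tokens = normalize(transcript)
    for phrase in stop_words:
        target = normalize(phrase)
        n = len(target)
        if n == 0:
            continue  # skip empty entries
        # Slide a window of n tokens across the transcript
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                return phrase
    return None


# Include likely mis-transcriptions alongside the intended phrase
stop_words = ["hello llama", "hello lamma", "hello lama"]
print(find_stop_phrase("Hello, Lamma! Write a sort function.", stop_words))
print(find_stop_phrase("Tell me about llamas.", stop_words))
```

In this sketch the first call matches only because the "lamma" variant is in the list; with `["hello llama"]` alone, the same utterance would go undetected.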
### Stream Handler
`ReplyOnPause` is an implementation of a `StreamHandler`. The `StreamHandler` is a low-level