Add ReplyOnStopWords (#35)

* add code * fix dependencies * add code:
2026-02-04 17:39:23 +08:00 · 2024-12-11 18:25:53 -08:00
parent b1e4326ae3
commit 6c983482b8
14 changed files with 368 additions and 18 deletions
--- a/docs/cookbook.md
+++ b/docs/cookbook.md
@@ -36,6 +36,19 @@
    
    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py)

+-   :speaking_head:{ .lg .middle } __Hello Llama: Stop Word Detection__
+
+    ---
+
+    A code editor built with Llama 3.3 70b that is triggered by the phrase "Hello Llama".
+    Build a Siri-like coding assistant in 100 lines of code!
+
+    <video width=98% src="https://github.com/user-attachments/assets/3e10cb15-ff1b-4b17-b141-ff0ad852e613" controls style="text-align: center"></video>
+
+    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor)
+    
+    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor/blob/main/app.py)
+
 -   :robot:{ .lg .middle } __Llama Code Editor__

    ---
--- a/docs/index.md
+++ b/docs/index.md
@@ -15,11 +15,16 @@ Stream video and audio in real time with Gradio using WebRTC.
 pip install gradio_webrtc
 ```

-to use built-in pause detection (see [Audio Streaming](https://freddyaboulton.github.io/gradio-webrtc/user-guide/#reply-on-pause)), install the `vad` extra:
+To use built-in pause detection (see [ReplyOnPause](/user-guide/#reply-on-pause)), install the `vad` extra:

 ```bash
 pip install gradio_webrtc[vad]
 ```

+For stop word detection (see [ReplyOnStopWords](/user-guide/#reply-on-stopwords)), install the `stopword` extra:
+```bash
+pip install gradio_webrtc[stopword]
+```
+
 ## Examples
 See the [cookbook](/cookbook)
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
@@ -65,6 +65,54 @@ and passing it to the `stream` event of the `WebRTC` component.

    5. Set a `time_limit` to control how long a conversation will last. If the `concurrency_count` is 1 (default), only one conversation will be handled at a time.

+
+### Reply On Stopwords
+
+You can configure your AI model to run whenever one of a set of "stop words" is detected, like "Hey Siri" or "computer", with the `ReplyOnStopWords` class.
+
+The API is similar to `ReplyOnPause` with the addition of a `stop_words` parameter.
+
+=== "Code"
+    ``` py title="ReplyOnStopWords"
+    import gradio as gr
+    import numpy as np
+    from gradio_webrtc import WebRTC, ReplyOnStopWords
+
+    def response(audio: tuple[int, np.ndarray]):
+        """This function must yield audio frames"""
+        ...
+        for numpy_array in generated_audio:
+            yield (sampling_rate, numpy_array, "mono")
+
+
+    with gr.Blocks() as demo:
+        gr.HTML(
+        """
+        <h1 style='text-align: center'>
+        Chat (Powered by WebRTC ⚡️)
+        </h1>
+        """
+        )
+        with gr.Column():
+            with gr.Group():
+                audio = WebRTC(
+                    mode="send",
+                    modality="audio",
+                )
+        # Run `response` only after one of the stop words is heard
+        audio.stream(ReplyOnStopWords(response,
+                                input_sample_rate=16000,
+                                stop_words=["computer"]), # (1)
+                      inputs=[audio],
+                      outputs=[audio], time_limit=90,
+                      concurrency_limit=10)
+
+    demo.launch()
+    ```
+    ```
+
+    1. The `stop_words` can be single words or pairs of words. Be sure to include common misspellings of your word for more robust detection, e.g. "llama", "lamma". In my experience, it's best to use two very distinct words like "ok computer" or "hello iris". 
+    
+=== "Notes"
+    1. The `stop_words` can be single words or pairs of words. Be sure to include common misspellings of your word for more robust detection, e.g. "llama", "lamma". In my experience, it's best to use two very distinct words like "ok computer" or "hello iris". 
+
 ### Stream Handler

 `ReplyOnPause` is an implementation of a `StreamHandler`. The `StreamHandler` is a low-level