Add some utils fns, add moshi to cookbook, fix querySelector, support async functions in ReplyOnPause (#29)

* add
* add code
2026-02-05 18:09:23 +08:00 · 2024-12-04 15:14:19 -05:00
parent c85c117576
commit 868e0bfa64
9 changed files with 158 additions and 10 deletions
--- a/docs/cookbook.md
+++ b/docs/cookbook.md
@@ -24,6 +24,18 @@
    
    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-claude/blob/main/app.py)

+-   :speaking_head:{ .lg .middle } __Kyutai Moshi__
+
+    ---
+
+    Kyutai's Moshi is a novel speech-to-speech model for modeling human conversations.
+
+    <video width=98% src="https://github.com/user-attachments/assets/becc7a13-9e89-4a19-9df2-5fb1467a0137" controls style="text-align: center"></video>
+
+    [:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi)
+    
+    [:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py)
+
 -   :robot:{ .lg .middle } __Llama Code Editor__

    ---
--- a/docs/index.md
+++ b/docs/index.md
@@ -22,7 +22,4 @@ pip install gradio_webrtc[vad]
 ```

 ## Examples
-1. [Object Detection from Webcam with YOLOv10](https://huggingface.co/spaces/freddyaboulton/webrtc-yolov10n) 📷
-2. [Streaming Object Detection from Video with RT-DETR](https://huggingface.co/spaces/freddyaboulton/rt-detr-object-detection-webrtc) 🎥
-3. [Text-to-Speech](https://huggingface.co/spaces/freddyaboulton/parler-tts-streaming-webrtc) 🗣️
-4. [Conversational AI](https://huggingface.co/spaces/freddyaboulton/omni-mini-webrtc) 🤖🗣️
+See the [cookbook](/cookbook)
--- a/docs/utils.md
+++ b/docs/utils.md
@@ -0,0 +1,54 @@
+# Utils
+
+## `audio_to_bytes`
+
+Convert an audio tuple containing a sample rate and NumPy array data into bytes.
+Useful for sending data to external APIs from a `ReplyOnPause` handler.
+
+Parameters
+```
+audio : tuple[int, np.ndarray]
+    A tuple containing:
+        - sample_rate (int): The audio sample rate in Hz
+        - data (np.ndarray): The audio data as a numpy array
+```
+
+Returns
+```
+bytes
+    The audio data encoded as bytes, suitable for transmission or storage
+```
+
+Example
+```python
+>>> sample_rate = 44100
+>>> audio_data = np.array([0.1, -0.2, 0.3])  # Example audio samples
+>>> audio_tuple = (sample_rate, audio_data)
+>>> audio_bytes = audio_to_bytes(audio_tuple)
+```
+
+## `audio_to_file`
+
+Save an audio tuple containing a sample rate and NumPy array data to a file.
+
+Parameters
+```
+audio : tuple[int, np.ndarray]
+    A tuple containing:
+        - sample_rate (int): The audio sample rate in Hz
+        - data (np.ndarray): The audio data as a numpy array
+```
+Returns
+```
+str
+    The path to the saved audio file
+```
+Example
+
+```python
+>>> sample_rate = 44100
+>>> audio_data = np.array([0.1, -0.2, 0.3])  # Example audio samples
+>>> audio_tuple = (sample_rate, audio_data)
+>>> file_path = audio_to_file(audio_tuple)
+>>> print(f"Audio saved to: {file_path}")
+```
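
For readers who want a feel for what such helpers involve, here is a minimal sketch of encoding a `(sample_rate, data)` tuple as WAV bytes and saving it to a temporary file. It is not the library's implementation: `wav_bytes_sketch` and `wav_file_sketch` are hypothetical names, and the sketch assumes mono, 16-bit integer samples and WAV output, none of which the docs above specify. It relies only on `.tobytes()` and `.itemsize`, which both `np.ndarray` and the standard library's `array.array` provide, so the demo uses `array.array` to stay dependency-free.

```python
import array
import io
import tempfile
import wave


def wav_bytes_sketch(audio) -> bytes:
    """Encode a (sample_rate, samples) tuple as mono WAV bytes (hypothetical helper)."""
    sample_rate, data = audio
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)              # assumption: mono audio
        wav.setsampwidth(data.itemsize)  # bytes per sample, e.g. 2 for int16
        wav.setframerate(sample_rate)
        # assumption: samples are already little-endian integers, as WAV expects
        wav.writeframes(data.tobytes())
    return buffer.getvalue()


def wav_file_sketch(audio) -> str:
    """Write the WAV bytes to a temporary file and return its path (hypothetical helper)."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(wav_bytes_sketch(audio))
        return f.name


# Four 16-bit samples; a NumPy int16 array would work the same way here.
samples = array.array("h", [0, 1000, -1000, 0])
audio_tuple = (44100, samples)
wav_bytes = wav_bytes_sketch(audio_tuple)
file_path = wav_file_sketch(audio_tuple)
```

Float arrays such as the `[0.1, -0.2, 0.3]` example above would need to be scaled and converted to an integer type before being passed to a sketch like this one.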