mirror of
https://github.com/HumanAIGC-Engineering/gradio-webrtc.git
synced 2026-02-05 18:09:23 +08:00
Add some utils fns, add moshi to cookbook, fix querySelector, support async functions in ReplyOnPause (#29)
* add
* add code
@@ -24,6 +24,18 @@

[:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-claude/blob/main/app.py)

- :speaking_head:{ .lg .middle } __Kyutai Moshi__

---

Kyutai's moshi is a novel speech-to-speech model for modeling human conversations.

<video width=98% src="https://github.com/user-attachments/assets/becc7a13-9e89-4a19-9df2-5fb1467a0137" controls style="text-align: center"></video>

[:octicons-arrow-right-24: Demo](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi)

[:octicons-code-16: Code](https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py)

- :robot:{ .lg .middle } __Llama Code Editor__

---
@@ -22,7 +22,4 @@ pip install gradio_webrtc[vad]
```

## Examples
1. [Object Detection from Webcam with YOLOv10](https://huggingface.co/spaces/freddyaboulton/webrtc-yolov10n) 📷
2. [Streaming Object Detection from Video with RT-DETR](https://huggingface.co/spaces/freddyaboulton/rt-detr-object-detection-webrtc) 🎥
3. [Text-to-Speech](https://huggingface.co/spaces/freddyaboulton/parler-tts-streaming-webrtc) 🗣️
4. [Conversational AI](https://huggingface.co/spaces/freddyaboulton/omni-mini-webrtc) 🤖🗣️
See the [cookbook](/cookbook)
54
docs/utils.md
Normal file
@@ -0,0 +1,54 @@
# Utils

## `audio_to_bytes`

Convert an audio tuple containing sample rate and numpy array data into bytes.
Useful for sending data to external APIs from a `ReplyOnPause` handler.

Parameters
```
audio : tuple[int, np.ndarray]
    A tuple containing:
    - sample_rate (int): The audio sample rate in Hz
    - data (np.ndarray): The audio data as a numpy array
```

Returns
```
bytes
    The audio data encoded as bytes, suitable for transmission or storage
```

Example
```python
>>> import numpy as np
>>> sample_rate = 44100
>>> audio_data = np.array([0.1, -0.2, 0.3])  # Example audio samples
>>> audio_tuple = (sample_rate, audio_data)
>>> audio_bytes = audio_to_bytes(audio_tuple)
```

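As a rough illustration of what turning an audio tuple into bytes involves, the sketch below encodes a `(sample_rate, samples)` tuple as an in-memory WAV file using only the Python standard library. `wav_bytes_from_audio` is a hypothetical stand-in, not the `audio_to_bytes` implementation; the WAV container, the 16-bit mono layout, and the assumption that samples are floats in [-1, 1] are choices made for this example only.

```python
import io
import struct
import wave
from typing import Sequence


def wav_bytes_from_audio(audio: tuple[int, Sequence[float]]) -> bytes:
    """Hypothetical stand-in: encode (sample_rate, samples) as 16-bit mono WAV bytes."""
    sample_rate, samples = audio
    # Assumption: samples are floats in [-1.0, 1.0]; clip, then scale to int16.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample (int16)
        wav_file.setframerate(sample_rate)
        # Pack all samples as little-endian signed 16-bit integers.
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()


# Works with a plain list or any 1-D sequence of floats, such as a numpy array.
audio_bytes = wav_bytes_from_audio((44100, [0.1, -0.2, 0.3]))
```

The resulting bytes object can be passed to anything that accepts a file payload, such as an HTTP request body.
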
## `audio_to_file`

Save an audio tuple containing sample rate and numpy array data to a file.

Parameters
```
audio : tuple[int, np.ndarray]
    A tuple containing:
    - sample_rate (int): The audio sample rate in Hz
    - data (np.ndarray): The audio data as a numpy array
```

Returns
```
str
    The path to the saved audio file
```

Example
```python
>>> import numpy as np
>>> sample_rate = 44100
>>> audio_data = np.array([0.1, -0.2, 0.3])  # Example audio samples
>>> audio_tuple = (sample_rate, audio_data)
>>> file_path = audio_to_file(audio_tuple)
>>> print(f"Audio saved to: {file_path}")
```

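The save-and-return-a-path pattern described for `audio_to_file` can be sketched with the standard library alone. `save_audio_to_temp_wav` is a hypothetical helper, not the library's implementation; the WAV format, the temporary-file location, and the float-to-int16 conversion are assumptions made for illustration.

```python
import struct
import tempfile
import wave
from typing import Sequence


def save_audio_to_temp_wav(audio: tuple[int, Sequence[float]]) -> str:
    """Hypothetical helper: write (sample_rate, samples) to a temp WAV file and return its path."""
    sample_rate, samples = audio
    # Assumption: samples are floats in [-1.0, 1.0]; clip, then scale to int16.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        path = tmp.name
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample (int16)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return path


file_path = save_audio_to_temp_wav((44100, [0.1, -0.2, 0.3]))
print(f"Audio saved to: {file_path}")
```

Because the file is created with `delete=False`, the caller in this sketch is responsible for removing it once it is no longer needed.
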