mirror of
https://github.com/FunAudioLLM/CosyVoice.git
synced 2026-02-04 09:29:25 +08:00
add cosyvoice2 offline inference
This commit is contained in:
11
README.md
11
README.md
@@ -246,6 +246,17 @@ docker run -d --runtime=nvidia -p 50000:50000 cosyvoice:v1.0 /bin/bash -c "cd /o
|
||||
cd fastapi && python3 client.py --port 50000 --mode <sft|zero_shot|cross_lingual|instruct>
|
||||
```
|
||||
|
||||
#### Using Nvidia TensorRT-LLM for deployment
|
||||
|
||||
Using TensorRT-LLM to accelerate cosyvoice2 llm could give 4x acceleration comparing with huggingface transformers implementation.
|
||||
To quick start:
|
||||
|
||||
``` sh
|
||||
cd runtime/triton_trtllm
|
||||
docker compose up -d
|
||||
```
|
||||
For more details, you could check [here](https://github.com/FunAudioLLM/CosyVoice/tree/main/runtime/triton_trtllm)
|
||||
|
||||
## Discussion & Communication
|
||||
|
||||
You can directly discuss on [Github Issues](https://github.com/FunAudioLLM/CosyVoice/issues).
|
||||
|
||||
Reference in New Issue
Block a user