227 Commits

Author SHA1 Message Date
lyuxiang.lx
9504c3f88b fix flow matching training for zero shot inference 2024-08-01 10:54:27 +08:00
lyuxiang.lx
02f941d348 update model inference 2024-07-24 19:18:09 +08:00
lyuxiang.lx
a13411c561 add stream code 2024-07-23 00:02:30 +08:00
刘悦
f2939a9a50 Update file_utils.py
fix flake8
2024-07-22 16:02:18 +08:00
刘悦
cf43100f66 Update file_utils.py
add speed_change function
2024-07-22 11:44:39 +08:00
zhuyunfeng
6ae6ba3f77 Update frontend.py
Fix the bug in handling anomalies for synthetic text ending with Chinese and English commas.
2024-07-14 22:33:56 +08:00
lyuxiang.lx
44aea805ea add train cfg in flow matching 2024-07-11 17:36:59 +08:00
lyuxiang.lx
6cebcb3410 move use_spk_embedding to processor 2024-07-11 13:15:34 +08:00
lyuxiang.lx
0fd15bb12b use spk_embedding when sft 2024-07-10 17:49:32 +08:00
lyuxiang.lx
a723ea375e Merge branch 'main' of github.com:FunAudioLLM/CosyVoice into main 2024-07-10 16:42:11 +08:00
lyuxiang.lx
793a24862c add constant lr scheduler 2024-07-10 16:37:25 +08:00
cyz
225b56de05 FIX: resolve an error raised when generating audio with natural-language control; the exception was: AttributeError: 'CosyVoiceFrontEnd' object has no attribute 'en_tn_model' 2024-07-10 12:02:41 +08:00
lyuxiang.lx
6a3e44242a keep only embedding mean as spk embedding 2024-07-10 00:21:56 +08:00
lyuxiang.lx
ee9e87b4d3 add empty cache 2024-07-09 23:48:23 +08:00
lyuxiang.lx
7981796523 add WeTextProcessing 2024-07-09 23:37:54 +08:00
passerbya
69026d83bb Append a full stop by default when the text does not end with punctuation 2024-07-09 17:42:40 +08:00
passerbya
f9fe31f200 Synthesis fails when the text contains no punctuation 2024-07-09 17:26:19 +08:00
passerbya
95b8866f3c Prefer ttsfrd; fall back to WeTextProcessing when ttsfrd is not available 2024-07-09 17:25:55 +08:00
passerbya
88c8bf7b9e Switch the frontend to WeTextProcessing 2024-07-09 08:22:06 +08:00
passerbya
2f496104ec A half-width period causes synthesis to fail: RuntimeError: torch.cat(): expected a non-empty list of Tensors
text='小明因为感冒,鼻子不通,讲话总带着齉音.'
  File "/usr/local/data/CosyVoice/cosyvoice/cli/cosyvoice.py", line 62, in inference_zero_shot
    return {'tts_speech': torch.concat(tts_speeches, dim=1)}
RuntimeError: torch.cat(): expected a non-empty list of Tensors

Cause: self.frontend.text_normalize(tts_text, split=True) returns an empty result
2024-07-09 08:17:34 +08:00
lyuxiang.lx
62c71075ac update dockerfile 2024-07-08 16:40:46 +08:00
lyuxiang.lx
50c7b06ea9 compatible when ttsfrd is not available 2024-07-07 13:13:42 +08:00
lyuxiang.lx
71238461f0 remove academic and change to iic/CosyVoice_ttsfrd 2024-07-07 12:19:34 +08:00
lyuxiang.lx
834053940d update modelscope model 2024-07-06 01:53:58 +08:00
lyuxiang.lx
0379f38dd9 add cpuruntime in provider 2024-07-05 21:41:50 +08:00
lyuxiang.lx
3910efd6d3 add submodule 2024-07-04 21:40:58 +08:00
lyuxiang.lx
076829ab84 add cosyvoice code 2024-07-04 21:15:12 +08:00