LiteAvatar
We introduce an audio2face model for a real-time 2D chat avatar that runs at 30 fps on CPU-only devices, with no GPU acceleration required.
Pipeline
- An efficient ASR model from ModelScope for audio feature extraction.
- A mouth parameter prediction model that takes audio features as input and generates voice-synchronized mouth movements.
- A lightweight 2D face generator that renders the mouth movements and is efficient enough for real-time inference on mobile devices (see the sketch below).
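To make the data flow concrete, here is a minimal Python sketch of how the three stages connect. Every function name, feature dimension, and frame size below is an illustrative assumption rather than the repository's actual API; zero-filled stubs stand in for the real models.

import numpy as np

def extract_audio_features(audio, sample_rate=16000):
    # Stage 1 stub: the real pipeline uses a Paraformer-based ASR model;
    # here we fake one 512-dim feature vector per 30 fps video frame.
    num_frames = int(len(audio) / sample_rate * 30)
    return np.zeros((num_frames, 512), dtype=np.float32)

def predict_mouth_params(features):
    # Stage 2 stub: map each audio feature vector to mouth parameters.
    return np.zeros((features.shape[0], 32), dtype=np.float32)

def render_frames(data_dir, mouth_params):
    # Stage 3 stub: the lightweight 2D generator renders one RGB frame
    # per parameter vector, conditioned on the avatar data in data_dir.
    return np.zeros((mouth_params.shape[0], 256, 256, 3), dtype=np.uint8)

audio = np.zeros(16000 * 2, dtype=np.float32)  # two seconds of silence
features = extract_audio_features(audio)
frames = render_frames("data/sample_data", predict_mouth_params(features))
print(frames.shape)  # (60, 256, 256, 3): 60 frames for 2 s at 30 fps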
Data Preparation
Get the sample avatar data located at ./data/sample_data.zip and extract it to a directory of your choice.
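For example, with the standard unzip tool (the target path here is our choice, not prescribed by the repository):

unzip ./data/sample_data.zip -d /path/to/sample_data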
Installation
We recommend Python 3.10 and CUDA 11.8. Then build the environment as follows:
pip install -r requirements.txt
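If you prefer an isolated environment, one option (our suggestion, not prescribed by the repository; the environment name is arbitrary) is conda:

conda create -n lite-avatar python=3.10
conda activate lite-avatar
pip install -r requirements.txt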
Inference
python lite_avatar.py --data_dir /path/to/sample_data --audio_file /path/to/audio.wav --result_dir /path/to/result
The resulting MP4 video will be saved in the directory given by --result_dir.
Interactive demo
A real-time interactive video chat demo powered by our LiteAvatar algorithm is available at OpenAvatarChat.
Acknowledgement
We are grateful to the following open-source projects used in this work:
- Paraformer and FunASR for audio feature extraction.
Citation
If you find this project useful, please ⭐️ star the repository and cite our related paper:
@inproceedings{ZhuangQZZT22,
author = {Wenlin Zhuang and Jinwei Qi and Peng Zhang and Bang Zhang and Ping Tan},
title = {Text/Speech-Driven Full-Body Animation},
booktitle = {Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, {IJCAI}},
pages = {5956--5959},
year = {2022}
}