LiteAvatar
We introduce an audio2face model for a real-time 2D chat avatar that runs at 30 fps on CPU-only devices, with no GPU acceleration required.
Pipeline
- An efficient ASR model from ModelScope for audio feature extraction.
- A mouth parameter prediction model that takes audio features as input and generates voice-synchronized mouth movements.
- A lightweight 2D face generator that renders the mouth movements and is efficient enough for real-time inference on mobile devices (see the sketch below).
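To make the data flow concrete, here is a minimal Python sketch of how the three stages connect. Every function name, feature dimension, and frame size below is an illustrative assumption rather than the repository's actual API; zero-filled stubs stand in for the real models.

import numpy as np

def extract_audio_features(audio, sample_rate=16000):
    # Stage 1 stub: the real pipeline uses a Paraformer-based ASR model;
    # here we fake one 512-dim feature vector per 30 fps video frame.
    num_frames = int(len(audio) / sample_rate * 30)
    return np.zeros((num_frames, 512), dtype=np.float32)

def predict_mouth_params(features):
    # Stage 2 stub: map each audio feature vector to mouth parameters.
    return np.zeros((features.shape[0], 32), dtype=np.float32)

def render_frames(data_dir, mouth_params):
    # Stage 3 stub: the lightweight 2D generator renders one RGB frame
    # per parameter vector, conditioned on the avatar data in data_dir.
    return np.zeros((mouth_params.shape[0], 256, 256, 3), dtype=np.uint8)

audio = np.zeros(16000 * 2, dtype=np.float32)  # two seconds of silence
features = extract_audio_features(audio)
frames = render_frames("data/sample_data", predict_mouth_params(features))
print(frames.shape)  # (60, 256, 256, 3): 60 frames for 2 s at 30 fps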
Data Preparation
Get the sample avatar data located at ./data/sample_data.zip and extract it to a directory of your choice.
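For example, with the standard unzip tool (the target path here is our choice, not prescribed by the repository):

unzip ./data/sample_data.zip -d /path/to/sample_data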
Installation
We recommend Python 3.10 and CUDA 11.8. Then build the environment as follows:
pip install -r requirements.txt
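If you prefer an isolated environment, one option (our suggestion, not prescribed by the repository; the environment name is arbitrary) is conda:

conda create -n lite-avatar python=3.10
conda activate lite-avatar
pip install -r requirements.txt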
Inference
python lite_avatar.py --data_dir /path/to/sample_data --audio_file /path/to/audio.wav --result_dir /path/to/result
The resulting MP4 video will be saved in the directory given by --result_dir.
Interactive demo
A real-time interactive video chat demo powered by our LiteAvatar algorithm is available at OpenAvatarChat.
Acknowledgement
We are grateful to the following open-source projects used in this work:
- Paraformer and FunASR for audio feature extraction.
Citation
If you find this project useful, please ⭐️ star the repository and cite our related paper:
@inproceedings{ZhuangQZZT22,
author = {Wenlin Zhuang and Jinwei Qi and Peng Zhang and Bang Zhang and Ping Tan},
title = {Text/Speech-Driven Full-Body Animation},
booktitle = {Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, {IJCAI}},
pages = {5956--5959},
year = {2022}
}