
Draft training code

We provide the draft training code here. Unfortunately, the data preprocessing code is still being reorganized.

Setup

We trained our model on a single NVIDIA A100 with batch size=8 and gradient_accumulation_steps=4 for more than 200,000 steps (20w+). Using multiple GPUs should accelerate training.
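
As a rough illustration of what these settings imply, the sketch below computes the effective batch size per optimizer update. It assumes the usual gradient accumulation semantics (gradients from several micro-batches are summed before one optimizer step); the function name `effective_batch_size` and the `num_devices` parameter are hypothetical and are not taken from the training scripts.

```python
def effective_batch_size(per_device_batch: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    Assumes standard gradient accumulation: each device processes
    `accumulation_steps` micro-batches of `per_device_batch` samples
    before the optimizer steps once.
    """
    return per_device_batch * accumulation_steps * num_devices


# Settings reported above: one A100, batch size 8, 4 accumulation steps.
print(effective_batch_size(8, 4))                 # 32

# Hypothetical 4-GPU run with the same per-device settings.
print(effective_batch_size(8, 4, num_devices=4))  # 128
```

If you add GPUs and want to keep the effective batch size close to the single-GPU setup, one option is to reduce gradient_accumulation_steps accordingly.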

Data preprocessing

You can refer to the inference code, which crops the face images and extracts the audio features.

Finally, the data should be organized as follows:

./data/
├── images
│   ├── RD_Radio10_000
│   │   ├── 0.png
│   │   ├── 1.png
│   │   └── xxx.png
│   └── RD_Radio11_000
│       ├── 0.png
│       ├── 1.png
│       └── xxx.png
└── audios
    ├── RD_Radio10_000
    │   ├── 0.npy
    │   ├── 1.npy
    │   └── xxx.npy
    └── RD_Radio11_000
        ├── 0.npy
        ├── 1.npy
        └── xxx.npy
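
Before training, it can help to verify that the directory layout is consistent. The sketch below is not part of the repository: `check_dataset_layout` is a hypothetical helper, and it assumes that each frame `N.png` under `images/<clip>` is paired with a feature file `N.npy` under `audios/<clip>`, which the layout above suggests but does not state explicitly.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return a list of human-readable problems found under `root`.

    Assumption: images/<clip>/N.png pairs with audios/<clip>/N.npy.
    """
    root = Path(root)
    image_root, audio_root = root / "images", root / "audios"

    problems = [f"missing directory: {name}"
                for name in ("images", "audios")
                if not (root / name).is_dir()]
    if problems:
        return problems

    image_clips = {p.name for p in image_root.iterdir() if p.is_dir()}
    audio_clips = {p.name for p in audio_root.iterdir() if p.is_dir()}

    # Clips that exist on only one side.
    for clip in sorted(image_clips ^ audio_clips):
        missing_side = "audios" if clip in image_clips else "images"
        problems.append(f"{clip}: no matching folder under {missing_side}")

    # Within shared clips, compare frame indices by file stem.
    for clip in sorted(image_clips & audio_clips):
        frames = {p.stem for p in (image_root / clip).glob("*.png")}
        feats = {p.stem for p in (audio_root / clip).glob("*.npy")}
        for stem in sorted(frames - feats):
            problems.append(f"{clip}/{stem}.png has no {stem}.npy")
        for stem in sorted(feats - frames):
            problems.append(f"{clip}/{stem}.npy has no {stem}.png")
    return problems


if __name__ == "__main__":
    for line in check_dataset_layout("./data") or ["layout looks consistent"]:
        print(line)
```

The check only looks at file names; it does not open the images or validate the shape of the audio feature arrays.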

Training

After preparing the preprocessed data, simply run:

sh train.sh

TODO

  • Release the data preprocessing code
  • Release some novel training designs (after the technical report)