深度极客 (Deep Geek)
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Updated 2026-02-05 10:27:50 +08:00
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
Updated 2026-02-04 13:10:40 +08:00
Generate ARKit expressions from audio in real time
Updated 2025-10-24 13:53:58 +08:00
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Updated 2025-09-26 13:44:17 +08:00
Updated 2025-06-30 11:09:20 +08:00