From 08ae772afbf6d4411e99a30f067aa9e77d1ff41f Mon Sep 17 00:00:00 2001
From: Cui Junbo <92843231+Cuiunbo@users.noreply.github.com>
Date: Fri, 28 Jun 2024 13:30:30 +0800
Subject: [PATCH] Update README_zh.md

---
 README_zh.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README_zh.md b/README_zh.md
index 7cc254b..1a44d8c 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -578,6 +578,8 @@ print(answer)
 ```
 
+### Multi-GPU inference
+You can run MiniCPM-Llama3-V 2.5 by distributing the model's layers across multiple low-VRAM GPUs (12 GB or 16 GB). See this [tutorial](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) for detailed instructions on loading the model onto multiple low-VRAM GPUs and running inference.
 ### Mac inference
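The section added by this patch describes splitting a model's layers across several low-VRAM GPUs. A minimal sketch of that idea is below; it is not code from the MiniCPM-V repository, and the `split_layers` helper and `layers.N` key naming are hypothetical. In practice the linked tutorial's approach amounts to passing a mapping like this as `device_map` when loading the model with Hugging Face `from_pretrained`.

```python
def split_layers(num_layers, gpu_mem_gib):
    """Assign each layer index to a GPU, proportionally to its VRAM.

    Hypothetical helper for illustration only; real multi-GPU loading
    would follow the tutorial linked in the README.
    """
    total = sum(gpu_mem_gib)
    device_map = {}
    next_layer = 0
    for gpu, mem in enumerate(gpu_mem_gib):
        if gpu == len(gpu_mem_gib) - 1:
            # Give the last GPU whatever remains, avoiding rounding gaps.
            share = num_layers - next_layer
        else:
            share = round(num_layers * mem / total)
        for layer in range(next_layer, next_layer + share):
            device_map[f"layers.{layer}"] = gpu
        next_layer += share
    return device_map

# Example: two 12 GB cards and one 16 GB card sharing 40 layers.
dm = split_layers(40, [12, 12, 16])
print(len(dm))                           # 40
print(dm["layers.0"], dm["layers.39"])   # 0 2
```

Proportional assignment keeps each card's share of layers roughly in line with its memory budget, which is the point of mixing 12 GB and 16 GB GPUs.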