From 08ae772afbf6d4411e99a30f067aa9e77d1ff41f Mon Sep 17 00:00:00 2001
From: Cui Junbo <92843231+Cuiunbo@users.noreply.github.com>
Date: Fri, 28 Jun 2024 13:30:30 +0800
Subject: [PATCH] Update README_zh.md

---
 README_zh.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README_zh.md b/README_zh.md
index 7cc254b..1a44d8c 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -578,6 +578,8 @@ print(answer)
 ```
 
+### Multi-GPU inference
+You can run MiniCPM-Llama3-V 2.5 by distributing the model's layers across multiple low-VRAM GPUs (12 GB or 16 GB). See this [tutorial](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) for detailed instructions on loading the model onto multiple low-VRAM GPUs and running inference.
 ### Mac inference
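The section added by this patch describes splitting a model's layers across several low-VRAM GPUs. A minimal sketch of that idea is below; it is not code from the MiniCPM-V repository, and the `split_layers` helper and `layers.N` key naming are hypothetical. In practice the linked tutorial's approach amounts to passing a mapping like this as `device_map` when loading the model with Hugging Face `from_pretrained`.

```python
def split_layers(num_layers, gpu_mem_gib):
    """Assign each layer index to a GPU, proportionally to its VRAM.

    Hypothetical helper for illustration only; real multi-GPU loading
    would follow the tutorial linked in the README.
    """
    total = sum(gpu_mem_gib)
    device_map = {}
    next_layer = 0
    for gpu, mem in enumerate(gpu_mem_gib):
        if gpu == len(gpu_mem_gib) - 1:
            # Give the last GPU whatever remains, avoiding rounding gaps.
            share = num_layers - next_layer
        else:
            share = round(num_layers * mem / total)
        for layer in range(next_layer, next_layer + share):
            device_map[f"layers.{layer}"] = gpu
        next_layer += share
    return device_map

# Example: two 12 GB cards and one 16 GB card sharing 40 layers.
dm = split_layers(40, [12, 12, 16])
print(len(dm))                           # 40
print(dm["layers.0"], dm["layers.39"])   # 0 2
```

Proportional assignment keeps each card's share of layers roughly in line with its memory budget, which is the point of mixing 12 GB and 16 GB GPUs.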