From 7369d5e003843762a13fc1d0e25202149aa79cc2 Mon Sep 17 00:00:00 2001
From: yiranyyu <2606375857@qq.com>
Date: Fri, 24 May 2024 11:58:41 +0800
Subject: [PATCH] update

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c39e121..9894d32 100644
--- a/README.md
+++ b/README.md
@@ -588,7 +588,7 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
 ### Inference with llama.cpp
-MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail. This implementation supports smooth inference of 6~8 token/s on mobile phone1.
+MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail. This implementation supports smooth inference of 6~8 token/s on mobile phones1.
 1. Test environment:Xiaomi 14 pro + Snapdragon 8 Gen 3