From c0c5581f227a7e01f675c37348d05f5d6ad19be7 Mon Sep 17 00:00:00 2001
From: yiranyyu <2606375857@qq.com>
Date: Fri, 24 May 2024 11:57:33 +0800
Subject: [PATCH] update readme

---
 README.md    | 11 ++++++++---
 README_zh.md | 12 ++++++++----
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 3c6d7a6..c39e121 100644
--- a/README.md
+++ b/README.md
@@ -25,6 +25,7 @@
 
 ## News
 
+* [2024.05.24] MiniCPM-Llama3-V 2.5 supports [llama.cpp](#inference-with-llamacpp) now, providing smooth inference of 6-8 tokens/s on mobile phones. Try it now!
 * [2024.05.23] 🔍 We've released a comprehensive comparison between Phi-3-vision-128k-instruct and MiniCPM-Llama3-V 2.5, including benchmarks evaluations, and multilingual capabilities 🌟📊🌍. Click [here](./docs/compare_with_phi-3_vision.md) to view more details.
 * [2024.05.20] We open-soure MiniCPM-Llama3-V 2.5, it has improved OCR capability and supports 30+ languages, representing the first end-side MLLM achieving GPT-4V level performance! We provide [efficient inference](#deployment-on-mobile-phone) and [simple fine-tuning](./finetune/readme.md). Try it now!
 * [2024.04.23] MiniCPM-V-2.0 supports vLLM now! Click [here](#vllm) to view more details.
@@ -51,7 +52,7 @@
   - [Inference on Mac](#inference-on-mac)
   - [Deployment on Mobile Phone](#deployment-on-mobile-phone)
   - [WebUI Demo](#webui-demo)
-  - [Inference with llama.cpp](#llamacpp)
+  - [Inference with llama.cpp](#inference-with-llamacpp)
   - [Inference with vLLM](#inference-with-vllm)
 - [Fine-tuning](#fine-tuning)
 - [TODO](#todo)
@@ -586,8 +587,12 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
 ```
 
 
-### Inference with llama.cpp
-MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail.
+### Inference with llama.cpp
+MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail. This implementation supports smooth inference of 6-8 tokens/s on mobile phones<sup>1</sup>.
+
+
+1. Test environment: Xiaomi 14 Pro + Snapdragon 8 Gen 3
+
 
 ### Inference with vLLM
 
diff --git a/README_zh.md b/README_zh.md
index c35d8c7..fefdb21 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -28,8 +28,8 @@
 
 ## 更新日志
 
+* [2024.05.24] MiniCPM-Llama3-V 2.5 现在支持 [llama.cpp](#llamacpp-部署) 推理了！实现端侧 6-8 tokens/s 的流畅推理，欢迎试用！
 * [2024.05.23] 🔍 我们添加了Phi-3-vision-128k-instruct 与 MiniCPM-Llama3-V 2.5的全面对比，包括基准测试评估和多语言能力 🌟📊🌍。点击[这里](./docs/compare_with_phi-3_vision.md)查看详细信息。
-
 * [2024.05.20] 我们开源了 MiniCPM-Llama3-V 2.5，增强了 OCR 能力，支持 30 多种语言，并首次在端侧实现了 GPT-4V 级的多模态能力！我们提供了[高效推理](#手机端部署)和[简易微调](./finetune/readme.md)的支持，欢迎试用！
 * [2024.04.23] 我们增加了对 [vLLM](#vllm) 的支持，欢迎体验！
 * [2024.04.18] 我们在 HuggingFace Space 新增了 MiniCPM-V 2.0 的 [demo](https://huggingface.co/spaces/openbmb/MiniCPM-V-2)，欢迎体验！
@@ -55,7 +55,7 @@
   - [Mac 推理](#mac-推理)
   - [手机端部署](#手机端部署)
   - [本地WebUI Demo部署](#本地webui-demo部署)
-  - [llama.cpp部署](#llamacpp)
+  - [llama.cpp 部署](#llamacpp-部署)
   - [vLLM 部署 ](#vllm-部署-)
 - [微调](#微调)
 - [未来计划](#未来计划)
@@ -601,8 +601,12 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
 ```
 
 
-### llama.cpp 部署
-MiniCPM-Llama3-V 2.5 现在支持llama.cpp啦! 用法请参考我们的fork [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) .
+### llama.cpp 部署
+MiniCPM-Llama3-V 2.5 现在支持 llama.cpp 啦！用法请参考我们的 fork [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv)，在手机上可以支持 6-8 tokens/s 的流畅推理<sup>1</sup>。
+
+
+1. 测试环境：Xiaomi 14 Pro + Snapdragon 8 Gen 3
+
 
 ### vLLM 部署
 