update readme

This commit is contained in:
yiranyyu
2024-05-24 11:57:33 +08:00
parent 4b94ad6d14
commit c0c5581f22
2 changed files with 16 additions and 7 deletions


@@ -25,6 +25,7 @@
## News <!-- omit in toc -->
* [2024.05.24] MiniCPM-Llama3-V 2.5 now supports [llama.cpp](#inference-with-llamacpp), providing smooth inference at 6-8 tokens/s on mobile phones. Try it now!
* [2024.05.23] 🔍 We've released a comprehensive comparison between Phi-3-vision-128k-instruct and MiniCPM-Llama3-V 2.5, including benchmark evaluations and multilingual capabilities 🌟📊🌍. Click [here](./docs/compare_with_phi-3_vision.md) to view more details.
* [2024.05.20] We open-source MiniCPM-Llama3-V 2.5, with improved OCR capability and support for 30+ languages, making it the first end-side MLLM to achieve GPT-4V-level performance! We provide [efficient inference](#deployment-on-mobile-phone) and [simple fine-tuning](./finetune/readme.md). Try it now!
* [2024.04.23] MiniCPM-V-2.0 supports vLLM now! Click [here](#vllm) to view more details.
@@ -51,7 +52,7 @@
- [Inference on Mac](#inference-on-mac)
- [Deployment on Mobile Phone](#deployment-on-mobile-phone)
- [WebUI Demo](#webui-demo)
- [Inference with llama.cpp](#llamacpp)
- [Inference with llama.cpp](#inference-with-llamacpp)
- [Inference with vLLM](#inference-with-vllm)
- [Fine-tuning](#fine-tuning)
- [TODO](#todo)
@@ -586,8 +587,12 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
```
</details>
### Inference with llama.cpp<a id="llamacpp"></a>
MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more details.
### Inference with llama.cpp<a id="inference-with-llamacpp"></a>
MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more details. This implementation supports smooth inference at 6-8 tokens/s on mobile phones<sup>1</sup>.
<small>
1. Test environment: Xiaomi 14 Pro + Snapdragon 8 Gen 3
</small>
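For quick experimentation, below is a minimal sketch of a desktop workflow under stated assumptions: the `minicpmv-cli` binary name, its flags, and the GGUF file names mirror llama.cpp's llava-cli convention and are not confirmed here, so check the fork's `examples/minicpmv` directory for the exact commands.

```bash
# Sketch only; binary name, flags, and model file names are assumptions
# modeled on llama.cpp's llava-cli. See examples/minicpmv in the fork.
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
make  # CPU build; see the llama.cpp docs for Metal/CUDA builds

# Multimodal inference: a quantized language-model GGUF plus a vision
# projector GGUF (paths are placeholders).
./minicpmv-cli \
  -m ./models/minicpm-llama3-v-2_5-q4_k_m.gguf \
  --mmproj ./models/mmproj-model-f16.gguf \
  --image ./assets/demo.jpg \
  -p "Describe this image."
```

The mobile speeds quoted above refer to running the same llama.cpp build directly on the phone; refer to the fork's documentation for device-specific build and deployment steps.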
### Inference with vLLM<a id="vllm"></a>


@@ -28,8 +28,8 @@
## Update Log <!-- omit in toc -->
* [2024.05.24] MiniCPM-Llama3-V 2.5 now supports [llama.cpp](#llamacpp-部署) inference, delivering smooth on-device inference at 6-8 tokens/s. Try it now!
* [2024.05.23] 🔍 We've added a comprehensive comparison between Phi-3-vision-128k-instruct and MiniCPM-Llama3-V 2.5, covering benchmark evaluations and multilingual capabilities 🌟📊🌍. Click [here](./docs/compare_with_phi-3_vision.md) for details.
<!-- * [2024.05.22] We have further improved on-device inference speed, achieving a smooth 6-8 tokens/s experience. Try it now! -->
* [2024.05.20] We open-source MiniCPM-Llama3-V 2.5, with enhanced OCR capability and support for 30+ languages, achieving GPT-4V-level multimodal performance on end-side devices for the first time! We provide [efficient inference](#手机端部署) and [simple fine-tuning](./finetune/readme.md). Try it now!
* [2024.04.23] We've added support for [vLLM](#vllm). Try it now!
* [2024.04.18] We've added a [demo](https://huggingface.co/spaces/openbmb/MiniCPM-V-2) of MiniCPM-V 2.0 on HuggingFace Space. Try it now!
@@ -55,7 +55,7 @@
- [Inference on Mac](#mac-推理)
- [Deployment on Mobile Phone](#手机端部署)
- [Local WebUI Demo](#本地webui-demo部署)
- [Deployment with llama.cpp](#llamacpp)
- [Deployment with llama.cpp](#llamacpp-部署)
- [Deployment with vLLM](#vllm-部署-)
- [Fine-tuning](#微调)
- [Future Plans](#未来计划)
@@ -601,8 +601,12 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
```
</details>
### Deployment with llama.cpp<a id="llamacpp"></a>
MiniCPM-Llama3-V 2.5 now supports llama.cpp! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for usage.
### Deployment with llama.cpp<a id="llamacpp-部署"></a>
MiniCPM-Llama3-V 2.5 now supports llama.cpp! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for usage. It supports smooth inference at 6-8 tokens/s on mobile phones<sup>1</sup>.
<small>
1. Test environment: Xiaomi 14 Pro + Snapdragon 8 Gen 3
</small>
### Deployment with vLLM <a id='vllm'></a>
<details>