Update README.md

Author: Hongji Zhu
Date: 2024-08-08 11:20:14 +08:00
Committer: GitHub
Parent: 9129ad7cce
Commit: c27e0856b6

@@ -1504,7 +1504,7 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python test.py
</details>

### Deployment on Mobile Phone
-MiniCPM-Llama3-V 2.5 and MiniCPM-V 2.0 can be deployed on mobile phones with Android operating systems. 🚀 Click [MiniCPM-Llama3-V 2.5](http://minicpm.modelbest.cn/android/modelbest-release-20240528_182155.apk) / [MiniCPM-V 2.0](https://github.com/OpenBMB/mlc-MiniCPM) to install apk.
+MiniCPM-V 2.0 can be deployed on mobile phones with Android operating systems. 🚀 Click [MiniCPM-V 2.0](https://github.com/OpenBMB/mlc-MiniCPM) to install the apk.

### Inference with llama.cpp
MiniCPM-V 2.6 can run with llama.cpp now! See [our fork of llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpmv-main/examples/llava/README-minicpmv2.6.md) for more details. This implementation supports smooth inference of 16~18 tokens/s on iPad (test environment: iPad Pro + M4).
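For context, a minimal llama.cpp invocation might look like the sketch below. It assumes the fork has been built and the model converted to GGUF as described in the linked README; the `llama-minicpmv-cli` binary name, file paths, and flags are illustrative assumptions rather than content of this commit.

```shell
# Illustrative only: build the OpenBMB llama.cpp fork and convert MiniCPM-V 2.6 to GGUF
# (language model + vision projector files) per the linked README, then run the example binary.
./llama-minicpmv-cli \
  -m ./MiniCPM-V-2_6/ggml-model-Q4_K_M.gguf \
  --mmproj ./MiniCPM-V-2_6/mmproj-model-f16.gguf \
  -c 4096 \
  --image ./example.jpg \
  -p "What is in the image?"
```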
@@ -1515,9 +1515,9 @@ MiniCPM-V 2.6 can run with ollama now! See [our fork of ollama](https://github.c
### Inference with vLLM
<details>
-<summary> vLLM now officially supports MiniCPM-V 2.0, MiniCPM-Llama3-V 2.5 and MiniCPM-V 2.6, Click to see. </summary>
+<summary> vLLM now officially supports MiniCPM-V 2.6, MiniCPM-Llama3-V 2.5 and MiniCPM-V 2.0, click to see. </summary>

-1. Install vLLM(==0.5.4):
+1. Install vLLM (>=0.5.4):
```shell
pip install vllm
```
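Beyond the install step shown in this hunk, a hedged sketch of serving the model through vLLM's OpenAI-compatible server follows; the model id `openbmb/MiniCPM-V-2_6`, default port, and request payload are illustrative assumptions, and the project README plus the vLLM documentation remain the authoritative reference.

```shell
# Illustrative only: start vLLM's OpenAI-compatible server for MiniCPM-V 2.6.
# --trust-remote-code is required because the model ships custom modeling code.
python -m vllm.entrypoints.openai.api_server \
  --model openbmb/MiniCPM-V-2_6 \
  --trust-remote-code

# Send a chat completion request with an image URL (default port 8000 assumed).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openbmb/MiniCPM-V-2_6",
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "What is in the image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/example.jpg"}}
          ]
        }]
      }'
```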