diff --git a/README.md b/README.md
index e46b405..110dce4 100644
--- a/README.md
+++ b/README.md
@@ -614,7 +614,7 @@ MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](htt
 ### Inference with vLLM
-Click to see how to inference MiniCPM-V 2.0 with vLLM (MiniCPM-Llama3-V 2.5 coming soon)
+Click to see how to inference MiniCPM-V 2.0 and MiniCPM-Llama3-V 2.5 with vLLM
 Because our pull request to vLLM is still waiting for reviewing, we fork this repository to build and test our vLLM demo. Here are the steps:
 1. Clone our version of vLLM:
@@ -624,6 +624,7 @@ git clone https://github.com/OpenBMB/vllm.git
 2. Install vLLM:
 ```shell
 cd vllm
+git checkout minicpmv
 pip install -e .
 ```
 3. Install timm:
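Taken together, the setup steps in the patched README reduce to the shell sequence below. This is a sketch assembled from the hunks above: the clone URL and the `minicpmv` branch name come straight from the diff, while the exact timm install command is an assumption, since this section only shows the step heading "3. Install timm:".

```shell
# Clone the OpenBMB fork of vLLM (the upstream pull request is still under review)
git clone https://github.com/OpenBMB/vllm.git
cd vllm

# Check out the MiniCPM-V branch introduced by this patch
git checkout minicpmv

# Install vLLM from source in editable mode
pip install -e .

# Step 3 in the README: install timm (assumed plain install; the README may pin a version)
pip install timm
```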