diff --git a/README.md b/README.md
index 6469b63..7e856a7 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,7 @@ Join our 💬 WeChat
 
 MiniCPM-Llama3-V 2.5 🤗 🤖 | MiniCPM-V 2.0 🤗 🤖 |
- Technical Blog
+ MiniCPM-V 2.0 Technical Blog | MiniCPM-Llama3-V 2.5 Technical Report
 
@@ -29,6 +29,7 @@ Join our 💬 WeChat
 ## News
 
 #### 📌 Pinned
+* [2024.08.03] MiniCPM-Llama3-V 2.5 technical report is released! See [here](./docs/MiniCPM_Llama3_V_25_technical_report.pdf).
 * [2024.07.19] MiniCPM-Llama3-V 2.5 supports vLLM now! See [here](#vllm).
 * [2024.05.28] 🚀🚀🚀 MiniCPM-Llama3-V 2.5 now fully supports its feature in llama.cpp and ollama! Please pull the latest code **of our provided forks** ([llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md), [ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5)). GGUF models in various sizes are available [here](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf/tree/main). MiniCPM-Llama3-V 2.5 series is **not supported by the official repositories yet**, and we are working hard to merge PRs. Please stay tuned!
 * [2024.05.28] 💫 We now support LoRA fine-tuning for MiniCPM-Llama3-V 2.5, using only 2 V100 GPUs! See more statistics [here](https://github.com/OpenBMB/MiniCPM-V/tree/main/finetune#model-fine-tuning-memory-usage-statistics).
@@ -61,6 +62,7 @@ Join our 💬 WeChat
 - [Inference](#inference)
   - [Model Zoo](#model-zoo)
   - [Multi-turn Conversation](#multi-turn-conversation)
+  - [Inference on Multiple GPUs](#inference-on-multiple-gpus)
   - [Inference on Mac](#inference-on-mac)
   - [Deployment on Mobile Phone](#deployment-on-mobile-phone)
   - [Inference with llama.cpp](#inference-with-llamacpp)
diff --git a/README_en.md b/README_en.md
index 6469b63..7e856a7 100644
--- a/README_en.md
+++ b/README_en.md
@@ -13,7 +13,7 @@ Join our 💬 WeChat
 
 MiniCPM-Llama3-V 2.5 🤗 🤖 | MiniCPM-V 2.0 🤗 🤖 |
- Technical Blog
+ MiniCPM-V 2.0 Technical Blog | MiniCPM-Llama3-V 2.5 Technical Report
 
@@ -29,6 +29,7 @@ Join our 💬 WeChat
 ## News
 
 #### 📌 Pinned
+* [2024.08.03] MiniCPM-Llama3-V 2.5 technical report is released! See [here](./docs/MiniCPM_Llama3_V_25_technical_report.pdf).
 * [2024.07.19] MiniCPM-Llama3-V 2.5 supports vLLM now! See [here](#vllm).
 * [2024.05.28] 🚀🚀🚀 MiniCPM-Llama3-V 2.5 now fully supports its feature in llama.cpp and ollama! Please pull the latest code **of our provided forks** ([llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md), [ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5)). GGUF models in various sizes are available [here](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf/tree/main). MiniCPM-Llama3-V 2.5 series is **not supported by the official repositories yet**, and we are working hard to merge PRs. Please stay tuned!
 * [2024.05.28] 💫 We now support LoRA fine-tuning for MiniCPM-Llama3-V 2.5, using only 2 V100 GPUs! See more statistics [here](https://github.com/OpenBMB/MiniCPM-V/tree/main/finetune#model-fine-tuning-memory-usage-statistics).
@@ -61,6 +62,7 @@ Join our 💬 WeChat
 - [Inference](#inference)
   - [Model Zoo](#model-zoo)
   - [Multi-turn Conversation](#multi-turn-conversation)
+  - [Inference on Multiple GPUs](#inference-on-multiple-gpus)
   - [Inference on Mac](#inference-on-mac)
   - [Deployment on Mobile Phone](#deployment-on-mobile-phone)
   - [Inference with llama.cpp](#inference-with-llamacpp)
diff --git a/README_zh.md b/README_zh.md
index 8f5ca2c..424915d 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -14,7 +14,9 @@
 
 MiniCPM-Llama3-V 2.5 🤗 🤖 | MiniCPM-V 2.0 🤗 🤖 |
- MiniCPM-V 2.0 技术博客
+ MiniCPM-V 2.0 技术博客 |
+ MiniCPM-Llama3-V 2.5 技术报告
+
 
@@ -32,6 +34,7 @@
 
 #### 📌 置顶
 
+* [2024.08.03] MiniCPM-Llama3-V 2.5 技术报告已发布!欢迎点击[这里](./docs/MiniCPM_Llama3_V_25_technical_report.pdf)查看。
 * [2024.07.19] MiniCPM-Llama3-V 2.5 现已支持[vLLM](#vllm) !
 * [2024.05.28] 💥 MiniCPM-Llama3-V 2.5 现在在 llama.cpp 和 ollama 中完全支持其功能!**请拉取我们最新的 fork 来使用**:[llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) & [ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5)。我们还发布了各种大小的 GGUF 版本,请点击[这里](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf/tree/main)查看。请注意,**目前官方仓库尚未支持 MiniCPM-Llama3-V 2.5**,我们也正积极推进将这些功能合并到 llama.cpp & ollama 官方仓库,敬请关注!
 * [2024.05.28] 💫 我们现在支持 MiniCPM-Llama3-V 2.5 的 LoRA 微调,更多内存使用统计信息可以在[这里](https://github.com/OpenBMB/MiniCPM-V/tree/main/finetune#model-fine-tuning-memory-usage-statistics)找到。
@@ -65,6 +68,7 @@
 - [推理](#推理)
   - [模型库](#模型库)
   - [多轮对话](#多轮对话)
+  - [多卡推理](#多卡推理)
   - [Mac 推理](#mac-推理)
   - [手机端部署](#手机端部署)
   - [本地WebUI Demo部署](#本地webui-demo部署)
diff --git a/docs/MiniCPM_Llama3_V_25_technical_report.pdf b/docs/MiniCPM_Llama3_V_25_technical_report.pdf
new file mode 100644
index 0000000..91a2d9d
Binary files /dev/null and b/docs/MiniCPM_Llama3_V_25_technical_report.pdf differ