Commit message: Made the requested markdown formatting changes; the term "apk" was left unchanged, following the original wording in the README.

@@ -71,11 +71,13 @@ Join our <a href="docs/wechat.md" target="_blank"> 💬 WeChat</a>
 - [Citation](#citation)
 
 ## MiniCPM-Llama3-V 2.5 Common Module Navigation <!-- omit in toc -->
 
 You can click on the following table to quickly access the commonly used content you need.
 
 | Functional Categories | | | | | | | | |
 |:--------:|:------:|:--------------:|:--------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|
-| Inference | [Transformers](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) | [Ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5) | [Swift](./docs/swift_train_and_infer.md) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | [Xinference](./docs/xinference_infer.md) | [Gradio](./web_demo_2.5.py) | [Streamlit](./web_demo_streamlit-2_5.py) |
+| Inference | [Transformers](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) | [ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5) | [SWIFT](./docs/swift_train_and_infer.md) | [llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | [Xinference](./docs/xinference_infer.md) | [Gradio](./web_demo_2.5.py) | [Streamlit](./web_demo_streamlit-2_5.py) | [vLLM](#vllm) |
-| Finetune | [Finetune](./finetune/readme.md) | [Lora](./finetune/readme.md) | [Swift](./docs/swift_train_and_infer.md) | | | | | |
+| Finetune | [Finetune](./finetune/readme.md) | [Lora](./finetune/readme.md) | [SWIFT](./docs/swift_train_and_infer.md) | | | | | |
-| Edge Deployment | [Apk](http://minicpm.modelbest.cn/android/modelbest-release-20240528_182155.apk) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | | | | | | |
+| Edge Deployment | [apk](http://minicpm.modelbest.cn/android/modelbest-release-20240528_182155.apk) | [llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | | | | | | |
 | Quantize | [Bnb](./quantize/bnb_quantize.py) |
 
 ## MiniCPM-Llama3-V 2.5
@@ -76,11 +76,13 @@
 - [Citation](#引用)
 
 ## MiniCPM-Llama3-V 2.5 Quick Navigation <!-- omit in toc -->
 
 You can click on the table below to quickly access the commonly used MiniCPM-Llama3-V 2.5 content you need.
 
 | Functional Categories | | | | | | | | |
 |:--------:|:------:|:--------------:|:--------:|:-------:|:-----------:|:-----------:|:--------:|:-----------:|
-| Inference | [Transformers](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) | [Ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5) | [Swift](./docs/swift_train_and_infer.md) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | [Xinference](./docs/xinference_infer.md) | [Gradio](./web_demo_2.5.py) | [Streamlit](./web_demo_streamlit-2_5.py) |
+| Inference | [Transformers](https://github.com/OpenBMB/MiniCPM-V/blob/main/docs/inference_on_multiple_gpus.md) | [ollama](https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5) | [SWIFT](./docs/swift_train_and_infer.md) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | [Xinference](./docs/xinference_infer.md) | [Gradio](./web_demo_2.5.py) | [Streamlit](./web_demo_streamlit-2_5.py) | [vLLM](#vllm) |
-| Fine-tuning | [Finetune](./finetune/readme.md) | [Lora](./finetune/readme.md) | [Swift](./docs/swift_train_and_infer.md) | | | | | |
+| Fine-tuning | [Finetune](./finetune/readme.md) | [LoRA](./finetune/readme.md) | [Swift](./docs/swift_train_and_infer.md) | | | | | |
-| Android Deployment | [Apk install](http://minicpm.modelbest.cn/android/modelbest-release-20240528_182155.apk) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | | | | | | |
+| Android Deployment | [apk install](http://minicpm.modelbest.cn/android/modelbest-release-20240528_182155.apk) | [Llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-v2.5/examples/minicpmv/README.md) | | | | | | |
 | Quantization | [Bnb quantization](./quantize/bnb_quantize.py) |
@@ -1,5 +1,5 @@
-## Swift install
+## SWIFT install
-You can quickly install Swift using bash commands.
+You can quickly install SWIFT using bash commands.
 
 ``` bash
 git clone https://github.com/modelscope/swift.git
@@ -8,11 +8,11 @@ You can quickly install Swift using bash commands.
 pip install -e '.[llm]'
 ```
 
-## Swift Infer
+## SWIFT Infer
-Inference using Swift can be carried out in two ways: through a command-line interface and via Python code.
+Inference using SWIFT can be carried out in two ways: through a command-line interface and via Python code.
 
 ### Quick start
-Here are the steps to launch Swift from the Bash command line:
+Here are the steps to launch SWIFT from the Bash command line:
 
 1. Running the bash code below will download the MiniCPM-Llama3-V-2_5 model and run inference:
 ``` shell
@@ -43,8 +43,8 @@ CUDA_VISIBLE_DEVICES=0 swift infer --model_type minicpm-v-v2_5-chat
 --model_id_or_path /root/ld/ld_model_pretrain/MiniCPM-Llama3-V-2_5 \
 --dtype bf16
 ```
-### Python code with swift infer
+### Python code with SWIFT infer
-The following demonstrates using Python code to initiate inference with the MiniCPM-Llama3-V-2_5 model through Swift.
+The following demonstrates using Python code to initiate inference with the MiniCPM-Llama3-V-2_5 model through SWIFT.
 
 ```python
 import os
@@ -88,17 +88,17 @@ The following demonstrates using Python code to initiate inference with the Mini
 print(f'history: {history}')
 ```
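
The hunk above elides most of the example's body. For orientation only, here is a minimal, self-contained sketch of what Python inference through ms-swift looked like in that era; the model-type constant, image URL, and query are assumptions rather than the elided code, so check the SWIFT docs for the exact example.

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import (ModelType, get_default_template_type,
                       get_model_tokenizer, get_template, inference)
from swift.utils import seed_everything

# Matches the --model_type used in the shell example above.
model_type = ModelType.minicpm_v_v2_5_chat
template_type = get_default_template_type(model_type)

# Downloads the model on first use and spreads it across available GPUs.
model, tokenizer = get_model_tokenizer(model_type,
                                       model_kwargs={'device_map': 'auto'})
template = get_template(template_type, tokenizer)
seed_everything(42)

# Hypothetical inputs; replace with your own image path/URL and question.
images = ['https://example.com/panda.jpg']
query = 'What is in this image?'
response, history = inference(model, template, query, images=images)
print(f'response: {response}')
print(f'history: {history}')
```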
 
-## Swift train
+## SWIFT train
-Swift supports training on a local dataset; the training steps are as follows:
+SWIFT supports training on a local dataset; the training steps are as follows:
 1. Prepare the training data in the following format (a helper for writing such a file is sketched after this list):
 ```jsonl
 {"query": "这张图片描述了什么", "response": "这张图片有一个大熊猫", "images": ["local_image_path"]}
 {"query": "这张图片描述了什么", "response": "这张图片有一个大熊猫", "history": [], "images": ["image_path"]}
 {"query": "竹子好吃么", "response": "看大熊猫的样子挺好吃呢", "history": [["这张图有什么", "这张图片有大熊猫"], ["大熊猫在干嘛", "吃竹子"]], "images": ["image_url"]}
 ```
-2. Lora Tuning:
+2. LoRA Tuning:
 
-The lora targets are the k and v weights in the llm. Pay attention to eval_steps: you may need to set it to a very large value, such as 200000, because swift can hit a memory bug at evaluation time.
+The LoRA targets are the k and v weights in the LLM. Pay attention to eval_steps: you may need to set it to a very large value, such as 200000, because SWIFT can hit a memory bug at evaluation time.
 ```shell
 # Experimental environment: A100
 # 32GB GPU memory
@@ -117,17 +117,17 @@ CUDA_VISIBLE_DEVICES=0,1 swift sft \
 --eval_steps 200000
 ```
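
As referenced in step 1 above, a short helper for writing training records in that JSONL layout; the file name and sample texts are placeholders, not part of the SWIFT docs.

```python
import json

# Placeholder records mirroring the layout shown in step 1.
records = [
    {"query": "What does this picture show?",
     "response": "There is a giant panda in this picture.",
     "images": ["local_image_path"]},
    {"query": "Is bamboo tasty?",
     "response": "Judging by the panda, it seems tasty.",
     "history": [["What is in this picture?", "A giant panda."],
                 ["What is the panda doing?", "Eating bamboo."]],
     "images": ["image_url"]},
]

# ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable in the file.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```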
-## Lora Merge and Infer
+## LoRA Merge and Infer
-The lora weights can be merged into the base model and then loaded for inference.
+The LoRA weights can be merged into the base model and then loaded for inference.
 
-1. To load the lora weights for inference, run the following code:
+1. To load the LoRA weights for inference, run the following code:
 ```shell
 CUDA_VISIBLE_DEVICES=0 swift infer \
 --ckpt_dir /your/lora/save/checkpoint
 ```
-2. Merge the lora weights into the base model:
+2. Merge the LoRA weights into the base model:
 
-The following command loads the lora weights, merges them into the base model, saves the merged model to the lora checkpoint path, and then loads the merged model for inference.
+The following command loads the LoRA weights, merges them into the base model, saves the merged model to the LoRA checkpoint path, and then loads the merged model for inference.
 ```shell
 CUDA_VISIBLE_DEVICES=0 swift infer \
 --ckpt_dir your/lora/save/checkpoint \
@@ -10,7 +10,7 @@ pip install "xinference[all]"
 
 ### Quick start
 When running inference with Xinference, the model is downloaded on first launch.
-1. Start xinference in the terminal:
+1. Start Xinference in the terminal:
 ```shell
 xinference
 ```
@@ -37,7 +37,7 @@ Replica : 1
 
 ### Local MiniCPM-Llama3-V-2_5 Launch
 If you have already downloaded the MiniCPM-Llama3-V-2_5 model locally, you can proceed with Xinference inference following these steps:
-1. Start xinference
+1. Start Xinference
 ```shell
 xinference
 ```
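
To go further once the server from the step above is running, here is a hedged sketch of launching a model through Xinference's Python client; the default port 9997 and the model_name string are assumptions, so confirm them against the server's startup output and its registered model list.

```python
from xinference.client import Client

# Connect to the local Xinference server; 9997 is Xinference's usual default,
# but use whatever address the `xinference` command printed at startup.
client = Client("http://127.0.0.1:9997")

# Launch the locally downloaded model. The model_name value is hypothetical;
# check client.list_models() (or the web UI) for the exact registered name.
model_uid = client.launch_model(model_name="MiniCPM-Llama3-V-2_5")

# Grab a handle for issuing requests against the launched model.
model = client.get_model(model_uid)
print(f"launched model uid: {model_uid}")
```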