Merge remote-tracking branch 'origin/main'

waxnkw
2024-05-23 12:43:45 +08:00
4 changed files with 48 additions and 6 deletions

View File

@@ -44,6 +44,7 @@
- [Online Demo](#online-demo)
- [Install](#install)
- [Inference](#inference)
- [Hardware Requirements](#hardware-requirements)
- [Model Zoo](#model-zoo)
- [Multi-turn Conversation](#multi-turn-conversation)
- [Inference on Mac](#inference-on-mac)
@@ -453,6 +454,15 @@ pip install -r requirements.txt
## Inference
+### Hardware Requirements
+| Model | GPU Memory |
+|:----------------------|:-------------------:|
+| MiniCPM-Llama3-V 2.5 | 19 GB |
+| MiniCPM-Llama3-V 2.5 (int4) | 8 GB |
+| MiniCPM-V 2.0 | 8 GB |
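The int4 build is what makes single-GPU use practical. As a minimal sketch of loading it (assuming the `trust_remote_code` chat interface published on the MiniCPM-V Hugging Face model cards; exact arguments may differ between checkpoint revisions):

```python
# Minimal sketch: load the int4 checkpoint so peak GPU memory stays around 8 GB.
# Assumes the chat() interface from the MiniCPM-V Hugging Face model cards.
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-Llama3-V-2_5-int4"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model.eval()

image = Image.open("example.jpg").convert("RGB")  # any local test image
msgs = [{"role": "user", "content": "What is in this image?"}]

# chat() packs the image and message history into the model's prompt format.
answer = model.chat(image=image, msgs=msgs, tokenizer=tokenizer,
                    sampling=True, temperature=0.7)
print(answer)
```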
### Model Zoo
| Model | Description | Download Link |
|:----------------------|:-------------------|:---------------:|
@@ -589,13 +599,13 @@ python examples/minicpmv_example.py
### Simple Fine-tuning <!-- omit in toc -->
-We supports simple fine-tuning with Hugging Face for MiniCPM-V 2.0 and MiniCPM-Llama3-V 2.5.
+We support simple fine-tuning with Hugging Face for MiniCPM-V 2.0 and MiniCPM-Llama3-V 2.5.
[Reference Document](./finetune/readme.md)
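For orientation, here is a minimal sketch of what Hugging Face-style LoRA fine-tuning looks like with the PEFT library; the hyperparameters and target modules are illustrative assumptions, not the recipe from the reference document:

```python
# Illustrative LoRA setup with Hugging Face PEFT; values are assumptions,
# see ./finetune/readme.md for the supported fine-tuning recipe.
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-Llama3-V-2_5",
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections of the Llama3 backbone
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```

Training then proceeds as a standard Hugging Face training loop over image-text pairs, with only the adapter weights updated.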
### With the SWIFT Framework <!-- omit in toc -->
-We now support finetune MiniCPM-V series with the SWIFT framework. SWIFT supports training, inference, evaluation and deployment of nearly 200 LLMs and MLLMs . It supports the lightweight training solutions provided by PEFT and a complete Adapters Library including techniques such as NEFTune, LoRA+ and LLaMA-PRO.
+We now support fine-tuning the MiniCPM-V series with the SWIFT framework. SWIFT supports training, inference, evaluation and deployment of nearly 200 LLMs and MLLMs. It supports the lightweight training solutions provided by PEFT and a complete Adapters Library including techniques such as NEFTune, LoRA+ and LLaMA-PRO.
Best Practices: [MiniCPM-V 1.0](https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/minicpm-v最佳实践.md), [MiniCPM-V 2.0](https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/minicpm-v-2最佳实践.md)
@@ -618,9 +628,9 @@ Please contact cpm@modelbest.cn to obtain written authorization for commercial u
## Statement <!-- omit in toc -->
-As LMMs, OmniLMMs generate contents by learning a large amount of multimodal corpora, but they cannot comprehend, express personal opinions or make value judgement. Anything generated by OmniLMMs does not represent the views and positions of the model developers
+As LMMs, MiniCPM-V models (including OmniLMM) generate content by learning from a large amount of multimodal corpora, but they cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-V models does not represent the views and positions of the model developers.
-We will not be liable for any problems arising from the use of OmniLMM open source models, including but not limited to data security issues, risk of public opinion, or any risks and problems arising from the misdirection, misuse, dissemination or misuse of the model.
+We will not be liable for any problems arising from the use of MiniCPM-V models, including but not limited to data security issues, risks of public opinion, or any risks and problems arising from the misdirection, misuse, dissemination, or abuse of the model.
## Institutions <!-- omit in toc -->

View File

@@ -28,6 +28,7 @@
## Update Log <!-- omit in toc -->
+<!-- * [2024.05.22] We have further improved on-device inference speed, achieving a smooth 6-8 tokens/s experience. Welcome to try it! -->
* [2024.05.20] We open-sourced MiniCPM-Llama3-V 2.5, with stronger OCR capability, support for 30+ languages, and GPT-4V-level multimodal capability on end-side devices for the first time! We provide support for [efficient inference](#手机端部署) and [simple fine-tuning](./finetune/readme.md). Welcome to try it!
* [2024.04.23] We added support for [vLLM](#vllm). Welcome to try it!
* [2024.04.18] We launched a [demo](https://huggingface.co/spaces/openbmb/MiniCPM-V-2) of MiniCPM-V 2.0 on HuggingFace Space. Welcome to try it!

View File

@@ -0,0 +1,31 @@
## Phi-3-vision-128K-Instruct vs MiniCPM-Llama3-V 2.5
Comparison of Phi-3-vision-128K-Instruct and MiniCPM-Llama3-V 2.5 in terms of model size, hardware requirements, and performance on multiple popular benchmarks.
## Hardware Requirements
With int4 quantization, MiniCPM-Llama3-V 2.5 delivers smooth inference of 6-8 tokens/s with only 8 GB of GPU memory (a rough timing sketch follows the table below).
| Model | GPU Memory |
|:----------------------|:-------------------:|
| [MiniCPM-Llama3-V 2.5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/) | 19 GB |
| Phi-3-vision-128K-Instruct | 12 GB |
| [MiniCPM-Llama3-V 2.5 (int4)](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-int4/) | 8 GB |
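To sanity-check the 6-8 tokens/s figure on your own hardware, a rough timing sketch (again assuming the `chat()` interface from the Hugging Face model cards; this is a ballpark measurement, not a benchmark):

```python
# Rough throughput estimate: time one generation and divide output tokens
# by wall-clock seconds. Includes image encoding, so it understates pure decode speed.
import time
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-Llama3-V-2_5-int4"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model.eval()

image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": "Describe this image in detail."}]

start = time.time()
answer = model.chat(image=image, msgs=msgs, tokenizer=tokenizer)
elapsed = time.time() - start

n_tokens = len(tokenizer.encode(answer))
print(f"{n_tokens} tokens in {elapsed:.1f} s -> {n_tokens / elapsed:.1f} tokens/s")
```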
## Model Size and Performance
| | Phi-3-vision-128K-Instruct | MiniCPM-Llama3-V 2.5|
|:-|:----------:|:-------------------:|
| Size | **4B** | 8B |
| OpenCompass | 53.7 | **58.8** |
| OCRBench | 639.0 | **725.0**|
| RealworldQA | 58.8 | **63.5**|
| TextVQA | 72.2 | **76.6** |
| ScienceQA| **90.8** | 89.0 |
| POPE | 83.4 | **87.2** |

View File

@@ -29,5 +29,5 @@ uvicorn==0.24.0.post1
sentencepiece==0.1.99
accelerate==0.30.1
socksio==1.0.0
-gradio==4.31.4
-gradio_client==0.16.4
+gradio
+gradio_client