From 9efd3a2c2dd2d333c1fb22a201896937aa2c4514 Mon Sep 17 00:00:00 2001
From: yiranyyu <2606375857@qq.com>
Date: Fri, 31 May 2024 22:08:02 +0800
Subject: [PATCH] update readme

---
 README.md    | 14 ++++++++------
 README_en.md | 46 ++++++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/README.md b/README.md
index 79456ff..53fe882 100644
--- a/README.md
+++ b/README.md
@@ -53,8 +53,7 @@
 - [MiniCPM-Llama3-V 2.5](#minicpm-llama3-v-25)
 - [MiniCPM-V 2.0](#minicpm-v-20)
-- [Online Demo](#online-demo)
-- [Gradio-based Demo](#gradio-based-demo)
+- [Chat with Our Demo on Gradio](#chat-with-our-demo-on-gradio)
 - [Install](#install)
 - [Inference](#inference)
   - [Model Zoo](#model-zoo)
@@ -458,14 +457,17 @@ We deploy MiniCPM-V 2.0 on end devices. The demo video is the raw screen recordi
 | OmniLMM-12B | [Document](./omnilmm_en.md) |
+## Chat with Our Demo on Gradio
-## Online Demo
-Click here to try out the Demo of [MiniCPM-Llama3-V 2.5](https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) | [MiniCPM-V 2.0](https://huggingface.co/spaces/openbmb/MiniCPM-V-2).
+We provide online and local demos powered by HuggingFace [Gradio](https://www.gradio.app/guides/quickstart), a widely used framework for building interactive model demos. It supports streaming outputs, progress bars, queuing, alerts, and other useful features.
-## Gradio-based Demo
+### Online Demo
-We supports buliding local WebUI demo with [Gradio](https://www.gradio.app/guides/quickstart), which inherently supports queuing, streaming outputs, alerts, progress_bars and other useful features!
+Click here to try out the online demo of [MiniCPM-Llama3-V 2.5](https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) | [MiniCPM-V 2.0](https://huggingface.co/spaces/openbmb/MiniCPM-V-2) on HuggingFace Spaces.
+### Local WebUI Demo
+
+You can easily build your own local WebUI demo with Gradio using the following commands.
 ```shell
 pip install -r requirements.txt
diff --git a/README_en.md b/README_en.md
index d105750..53fe882 100644
--- a/README_en.md
+++ b/README_en.md
@@ -53,14 +53,13 @@
 - [MiniCPM-Llama3-V 2.5](#minicpm-llama3-v-25)
 - [MiniCPM-V 2.0](#minicpm-v-20)
-- [Online Demo](#online-demo)
+- [Chat with Our Demo on Gradio](#chat-with-our-demo-on-gradio)
 - [Install](#install)
 - [Inference](#inference)
   - [Model Zoo](#model-zoo)
   - [Multi-turn Conversation](#multi-turn-conversation)
   - [Inference on Mac](#inference-on-mac)
   - [Deployment on Mobile Phone](#deployment-on-mobile-phone)
-  - [WebUI Demo](#webui-demo)
   - [Inference with llama.cpp](#inference-with-llamacpp)
   - [Inference with vLLM](#inference-with-vllm)
 - [Fine-tuning](#fine-tuning)
@@ -458,9 +457,30 @@ We deploy MiniCPM-V 2.0 on end devices. The demo video is the raw screen recordi
 | OmniLMM-12B | [Document](./omnilmm_en.md) |
+## Chat with Our Demo on Gradio
+
+We provide online and local demos powered by HuggingFace [Gradio](https://www.gradio.app/guides/quickstart), a widely used framework for building interactive model demos. It supports streaming outputs, progress bars, queuing, alerts, and other useful features.
+
+### Online Demo
+
+Click here to try out the online demo of [MiniCPM-Llama3-V 2.5](https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) | [MiniCPM-V 2.0](https://huggingface.co/spaces/openbmb/MiniCPM-V-2) on HuggingFace Spaces.
+
+### Local WebUI Demo
+
+You can easily build your own local WebUI demo with Gradio using the following commands.
+
+```shell
+pip install -r requirements.txt
+```
+
+```shell
+# For NVIDIA GPUs, run:
+python web_demo_2.5.py --device cuda
+
+# For Mac with MPS (Apple silicon or AMD GPUs), run:
+PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
+```
-## Online Demo
-Click here to try out the Demo of [MiniCPM-Llama3-V 2.5](https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) | [MiniCPM-V 2.0](https://huggingface.co/spaces/openbmb/MiniCPM-V-2).
 ## Install
@@ -582,24 +602,6 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python test.py
 ### Deployment on Mobile Phone
 MiniCPM-V 2.0 can be deployed on mobile phones with Android operating systems. 🚀 Click [here](https://github.com/OpenBMB/mlc-MiniCPM) to install apk. MiniCPM-Llama3-V 2.5 coming soon.
-### WebUI Demo
-
-<details>
-<summary>Click to see how to deploy WebUI demo on different devices </summary>
-
-```shell
-pip install -r requirements.txt
-```
-
-```shell
-# For NVIDIA GPUs, run:
-python web_demo_2.5.py --device cuda
-
-# For Mac with MPS (Apple silicon or AMD GPUs), run:
-PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
-```
-</details>
-
 ### Inference with llama.cpp
 MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail. This implementation supports smooth inference of 6~8 token/s on mobile phones (test environment:Xiaomi 14 pro + Snapdragon 8 Gen 3).
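
The new README section above advertises Gradio's streaming outputs and queuing without showing what they involve. Below is a minimal, self-contained sketch of a streaming chat demo with queuing enabled; the echo-style reply and the interface title are placeholders and are not taken from `web_demo_2.5.py` or from this patch.

```python
import time

import gradio as gr


def stream_chat(message, history):
    """Yield growing partial replies so Gradio streams them to the browser."""
    reply = f"You said: {message}"  # placeholder; a real demo would call the MiniCPM-V model here
    partial = ""
    for ch in reply:
        partial += ch
        time.sleep(0.02)
        yield partial  # each yield updates the chat window incrementally


demo = gr.ChatInterface(stream_chat, title="Streaming chat sketch")
demo.queue()   # queue concurrent requests instead of rejecting them
demo.launch()  # serves a local WebUI, by default at http://127.0.0.1:7860
```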
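
The commands added by the patch pass `--device cuda` or `--device mps` to `web_demo_2.5.py` and set `PYTORCH_ENABLE_MPS_FALLBACK=1` on Macs. The sketch below shows one common way such a flag is wired up; the argument name comes from the commands above, while the checkpoint name and loading details (`AutoModel`, `float16`) are assumptions rather than a copy of the actual script.

```python
import argparse

import torch
from transformers import AutoModel, AutoTokenizer

parser = argparse.ArgumentParser()
parser.add_argument("--device", choices=["cuda", "mps"], default="cuda",
                    help="cuda for NVIDIA GPUs, mps for Apple-silicon/AMD GPUs on macOS")
args = parser.parse_args()

# With --device mps, PYTORCH_ENABLE_MPS_FALLBACK=1 (set in the README command) lets
# PyTorch fall back to the CPU for operators the MPS backend does not implement yet.
model_id = "openbmb/MiniCPM-Llama3-V-2_5"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True,
                                  torch_dtype=torch.float16)
model = model.to(args.device).eval()
```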