From 112bace165c6141c31626bb4ae3657242ed43941 Mon Sep 17 00:00:00 2001
From: yiranyyu <2606375857@qq.com>
Date: Fri, 31 May 2024 15:03:58 +0800
Subject: [PATCH] update readme

---
 README.md | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index d105750..79456ff 100644
--- a/README.md
+++ b/README.md
@@ -54,13 +54,13 @@
 - [MiniCPM-Llama3-V 2.5](#minicpm-llama3-v-25)
 - [MiniCPM-V 2.0](#minicpm-v-20)
 - [Online Demo](#online-demo)
+- [Gradio-based Demo](#gradio-based-demo)
 - [Install](#install)
 - [Inference](#inference)
   - [Model Zoo](#model-zoo)
   - [Multi-turn Conversation](#multi-turn-conversation)
   - [Inference on Mac](#inference-on-mac)
   - [Deployment on Mobile Phone](#deployment-on-mobile-phone)
-  - [WebUI Demo](#webui-demo)
   - [Inference with llama.cpp](#inference-with-llamacpp)
   - [Inference with vLLM](#inference-with-vllm)
 - [Fine-tuning](#fine-tuning)
@@ -462,6 +462,24 @@ We deploy MiniCPM-V 2.0 on end devices. The demo video is the raw screen recording
 ## Online Demo
 Click here to try out the Demo of [MiniCPM-Llama3-V 2.5](https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5) | [MiniCPM-V 2.0](https://huggingface.co/spaces/openbmb/MiniCPM-V-2).
 
+## Gradio-based Demo
+
+We support building a local WebUI demo with [Gradio](https://www.gradio.app/guides/quickstart), which natively provides queuing, streaming outputs, alerts, progress bars, and other useful features!
+
+
+```shell
+pip install -r requirements.txt
+```
+
+```shell
+# For NVIDIA GPUs, run:
+python web_demo_2.5.py --device cuda
+
+# For Mac with MPS (Apple silicon or AMD GPUs), run:
+PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
+```
+
+
 ## Install
 
 1. Clone this repository and navigate to the source folder
@@ -582,24 +600,6 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 python test.py
 ### Deployment on Mobile Phone
 MiniCPM-V 2.0 can be deployed on mobile phones with Android operating systems. 🚀 Click [here](https://github.com/OpenBMB/mlc-MiniCPM) to install apk. MiniCPM-Llama3-V 2.5 coming soon.
 
-### WebUI Demo
-
-<details>
-<summary>Click to see how to deploy WebUI demo on different devices</summary>
-
-```shell
-pip install -r requirements.txt
-```
-
-```shell
-# For NVIDIA GPUs, run:
-python web_demo_2.5.py --device cuda
-
-# For Mac with MPS (Apple silicon or AMD GPUs), run:
-PYTORCH_ENABLE_MPS_FALLBACK=1 python web_demo_2.5.py --device mps
-```
-</details>
-
 ### Inference with llama.cpp
 MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more detail. This implementation supports smooth inference of 6~8 token/s on mobile phones (test environment:Xiaomi 14 pro + Snapdragon 8 Gen 3).
 
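For readers who want a feel for what the new "Gradio-based Demo" section describes, here is a minimal sketch of a Gradio app with the queuing and streaming behavior the added text mentions. It is an illustrative stand-in, not the actual `web_demo_2.5.py` from the repository: the `chat` function below simply echoes the prompt, where the real demo would load MiniCPM-Llama3-V 2.5 and yield partial model outputs.

```python
# Minimal Gradio sketch (illustrative only, not the real web_demo_2.5.py).
# Assumes `pip install gradio`; the echo logic stands in for the model call.
import time

import gradio as gr


def chat(message, history):
    # A real demo would run MiniCPM-Llama3-V 2.5 here and yield partial
    # outputs as they are generated; yielding strings is how Gradio streams.
    reply = f"(placeholder) You said: {message}"
    partial = ""
    for token in reply.split(" "):
        partial += token + " "
        time.sleep(0.05)  # simulate generation latency
        yield partial.strip()


demo = gr.ChatInterface(chat, title="MiniCPM-V WebUI (sketch)")
demo.queue()   # request queuing, as mentioned in the README text
demo.launch()  # run the script, then open the printed local URL
```

The actual demo adds the `--device cuda` / `--device mps` switch shown in the patch; on Macs with MPS, setting `PYTORCH_ENABLE_MPS_FALLBACK=1` lets operators not implemented for MPS fall back to the CPU.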