From 1c51a220f0b1696f747281be4aa8ef3611e3e7f2 Mon Sep 17 00:00:00 2001
From: yiranyyu <2606375857@qq.com>
Date: Tue, 14 Jan 2025 21:20:10 +0800
Subject: [PATCH] Update Demo

---
 README.md    | 7 +++----
 README_zh.md | 5 ++---
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 0be8937..98ead33 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@
- MiniCPM-o 2.6 🤗 CN🤖 US🤖 | MiniCPM-V 2.6 🤗 🤖 |
+ MiniCPM-o 2.6 🤗 🤖 | MiniCPM-V 2.6 🤗 🤖 |
  Technical Blog Coming Soon
@@ -131,8 +131,7 @@ Advancing popular visual capabilites from MiniCPM-V series, MiniCPM-o 2.6 can pr
 In addition to its friendly size, MiniCPM-o 2.6 also shows **state-of-the-art token density** (i.e., number of pixels encoded into each visual token). **It produces only 640 tokens when processing a 1.8M pixel image, which is 75% fewer than most models**. This directly improves the inference speed, first-token latency, memory usage, and power consumption. As a result, MiniCPM-o 2.6 can efficiently support **multimodal live streaming** on end-side devices such as iPad.

 - 💫 **Easy Usage.**
-MiniCPM-o 2.6 can be easily used in various ways: (1) [llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-omni/examples/llava/README-minicpmo2.6.md) support for efficient CPU inference on local devices, (2) [int4](https://huggingface.co/openbmb/MiniCPM-o-2_6-int4) and [GGUF](https://huggingface.co/openbmb/MiniCPM-o-2_6-gguf) format quantized models in 16 sizes, (3) [vLLM](#efficient-inference-with-llamacpp-ollama-vllm) support for high-throughput and memory-efficient inference, (4) fine-tuning on new domains and tasks with [LLaMA-Factory](./docs/llamafactory_train.md), (5) quick local WebUI demo setup with [Gradio](#chat-with-our-demo-on-gradio), and (6) online web demo on [CN](https://minicpm-omni-webdemo.modelbest.cn/
-) server and [US](https://minicpm-omni-webdemo-us.modelbest.cn/) server.
+MiniCPM-o 2.6 can be easily used in various ways: (1) [llama.cpp](https://github.com/OpenBMB/llama.cpp/blob/minicpm-omni/examples/llava/README-minicpmo2.6.md) support for efficient CPU inference on local devices, (2) [int4](https://huggingface.co/openbmb/MiniCPM-o-2_6-int4) and [GGUF](https://huggingface.co/openbmb/MiniCPM-o-2_6-gguf) format quantized models in 16 sizes, (3) [vLLM](#efficient-inference-with-llamacpp-ollama-vllm) support for high-throughput and memory-efficient inference, (4) fine-tuning on new domains and tasks with [LLaMA-Factory](./docs/llamafactory_train.md), (5) quick local WebUI demo setup with [Gradio](#chat-with-our-demo-on-gradio), and (6) online web demo on [server](https://minicpm-omni-webdemo-us.modelbest.cn/).

 **Model Architecture.**
@@ -1808,7 +1807,7 @@ We provide online and local demos powered by Hugging Face Gradio
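As context for the "Easy Usage" paragraph this hunk rewrites, the sketch below shows one plausible way to run single-image chat with the Hugging Face checkpoint locally. It is a minimal sketch, not part of the patch: the `model.chat(...)` call, its `msgs` format, and the example image path follow the model card's custom remote code and are assumptions here, not a verified API.

```python
# Minimal sketch: single-image chat with MiniCPM-o 2.6 via Transformers.
# The chat() method and its msgs layout come from the model's remote code
# (trust_remote_code=True) and are assumed, not guaranteed.
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-o-2_6"
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,        # loads the model's custom modelling/chat code
    torch_dtype=torch.bfloat16,
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")   # hypothetical local image
msgs = [{"role": "user", "content": [image, "What is in this image?"]}]

answer = model.chat(msgs=msgs, tokenizer=tokenizer)  # signature assumed from the model card
print(answer)
```

For the quantized or server-side paths listed in the paragraph (int4, GGUF via llama.cpp, vLLM), the linked READMEs in the patched text are the authoritative instructions; the snippet above only illustrates the plain Transformers route.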