From 3e3c6299b375108194ffb7e2fcd193815ec9744b Mon Sep 17 00:00:00 2001
From: root <403644786@qq.com>
Date: Thu, 18 Jul 2024 15:33:38 +0800
Subject: [PATCH] Revise the title, terminology, and section descriptions as
 requested
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/xinference_infer.md | 73 +++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 31 deletions(-)

diff --git a/docs/xinference_infer.md b/docs/xinference_infer.md
index 1b7bceb..b7be1c3 100644
--- a/docs/xinference_infer.md
+++ b/docs/xinference_infer.md
@@ -1,21 +1,26 @@
-## about xinference
-xinference is a unified inference platform that provides a unified interface for different inference engines. It supports LLM, text generation, image generation, and more.but it's not bigger than swift too much.
+## Xinference Infer
+Xinference is a unified inference platform that provides a single interface to different inference engines. It supports LLMs, text generation, image generation, and more, and it is not much heavier than Swift.
-## xinference install
+
+### Xinference install
+Xinference can be installed with a single pip command:
 ```shell
 pip install "xinference[all]"
 ```
-## quick start
-1. start xinference
+### Quick start
+On the first launch, Xinference downloads the model before inference can begin.
+1. Start Xinference in the terminal:
 ```shell
 xinference
 ```
-2. start the web ui.
+2. Start the web UI.
 3. Search for "MiniCPM-Llama3-V-2_5" in the search box.
-[alt text](../assets/xinferenc_demo_image/xinference_search_box.png)
-4. find and click the MiniCPM-Llama3-V-2_5 button.
-5. follow the config and launch the model.
+
+![alt text](../assets/xinferenc_demo_image/xinference_search_box.png)
+
+4. Find and click the MiniCPM-Llama3-V-2_5 button.
+5. Follow the config below and launch the model.
 ```plaintext
 Model engine : Transformers
 model format : pytorch
@@ -24,33 +29,39 @@ quantization : none
 N-GPU : auto
 Replica : 1
 ```
-6. after first click the launch button,xinference will download the model from huggingface. we should click the webui button.
-![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
-7. upload the image and chatting with the MiniCPM-Llama3-V-2_5
+6. After you click the launch button for the first time, Xinference will download the model from Hugging Face. Then click the WebUI button.
-## local MiniCPM-Llama3-V-2_5 launch
-1. start xinference
+![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
+
+7. Upload an image and chat with MiniCPM-Llama3-V-2_5.
+
+### Local MiniCPM-Llama3-V-2_5 Launch
+If you have already downloaded the MiniCPM-Llama3-V-2_5 model locally, you can run inference with Xinference by following these steps:
+1. Start Xinference:
 ```shell
-xinference
+    xinference
 ```
-2. start the web ui.
+2. Start the web UI.
 3. To register a new model, follow these steps: the settings highlighted in red are fixed and cannot be changed, whereas others are customizable according to your needs. Complete the process by clicking the 'Register Model' button.
+
 ![alt text](../assets/xinferenc_demo_image/xinference_register_model1.png)
 ![alt text](../assets/xinferenc_demo_image/xinference_register_model2.png)
-4. After completing the model registration, proceed to 'Custom Models' and locate the model you just registered.
-5. follow the config and launch the model.
-```plaintext
-Model engine : Transformers
-model format : pytorch
-Model size : 8
-quantization : none
-N-GPU : auto
-Replica : 1
-```
-6. after first click the launch button,xinference will download the model from huggingface. we should click the chat button.
-![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
-7. upload the image and chatting with the MiniCPM-Llama3-V-2_5
-## FAQ
+4. After completing the model registration, proceed to 'Custom Models' and locate the model you just registered.
+5. Follow the config below and launch the model.
+```plaintext
+    Model engine : Transformers
+    Model format : pytorch
+    Model size : 8
+    Quantization : none
+    N-GPU : auto
+    Replica : 1
+```
+6. After you click the launch button for the first time, Xinference will download the model from Hugging Face. Then click the chat button.
+![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
+7. Upload an image and chat with MiniCPM-Llama3-V-2_5.
+
+### FAQ
 1. Why can't the sixth step open the WebUI?
-maybe your firewall or mac os to prevent the web to open.
\ No newline at end of file
+
+   Your firewall or macOS settings may be preventing the web page from opening.
\ No newline at end of file
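
Besides the WebUI chat described in the patched doc, a launched Xinference model can also be queried over its OpenAI-compatible REST endpoint. The sketch below only builds the chat request payload for a vision model; the endpoint URL (`127.0.0.1:9997` is Xinference's default port) and the model UID `"MiniCPM-Llama3-V-2_5"` are assumptions — substitute the values from your own deployment.

```python
# Sketch: build an OpenAI-style chat payload with an inline base64 image,
# suitable for POSTing to a launched Xinference model. The model UID and
# endpoint below are placeholders for your own deployment.
import base64
import json


def build_chat_payload(model_uid: str, question: str, image_bytes: bytes) -> dict:
    """Return a chat-completions payload pairing a text question with an image."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model_uid,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {"url": "data:image/png;base64," + image_b64},
                    },
                ],
            }
        ],
    }


payload = build_chat_payload(
    "MiniCPM-Llama3-V-2_5", "What is in this image?", b"fake-image-bytes"
)
print(json.dumps(payload, indent=2))
# Send it with any HTTP client, e.g.:
# requests.post("http://127.0.0.1:9997/v1/chat/completions", json=payload)
```

This keeps the image inline as a data URL, so no file upload step is needed beyond reading the image bytes locally.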