Update to MiniCPM-V 2.6
docs/faqs.md (new file)

### FAQs

<details>
<summary>Q: How to choose between sampling and beam search for inference?</summary>

The quality of results from beam search and sampling can vary across scenarios. You can choose your decoding strategy based on the following considerations.

If you have any of the following needs, consider using sampling decoding:

1. You require faster inference speed.
2. You want streaming generation.
3. Your task calls for open-ended responses.

If your task requires deterministic answers, you may want to experiment with beam search to see whether it achieves better results.
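
In code, this choice typically comes down to a single flag. A minimal sketch, assuming the `model.chat` API from this repository's usage examples (with `model`, `msgs`, and `tokenizer` already set up as shown there; extra keyword arguments such as `temperature` or `num_beams` are assumed to be forwarded to the generation config):

```python
# Sampling decoding: faster, supports streaming, suits open-ended responses.
res = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer,
    sampling=True,     # enable sampling
    temperature=0.7    # assumed forwarded to the generation config
)

# Beam search: worth trying when the task has a deterministic answer.
res = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer,
    sampling=False,    # fall back to beam search decoding
    num_beams=3        # assumed forwarded to the generation config
)
```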
</details>

<details>
<summary>Q: How to ensure the model generates results of sufficient length?</summary>

We've observed that during multilingual inference with MiniCPM-V 2.6, generation sometimes ends prematurely. You can improve the results by passing a `min_new_tokens` parameter.

```python
res = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer,
    min_new_tokens=100  # suppress EOS until at least 100 new tokens are generated
)
```

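Note that `min_new_tokens` works by suppressing the end-of-sequence token until at least that many new tokens have been generated (standard Hugging Face `generate` behavior), so setting it very high can force rambling or repetitive output.
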
</details>