llava_v1.5_13b_qinstruct_preview_v0.1

Open-source repository: https://modelscope.cn/models/qfuture/llava_v1.5_13b_qinstruct_preview_v0.1

Model Details

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

1Nanyang Technological University, 2Shanghai Jiaotong University, 3Sensetime Research, 4I2R@A*STAR
*Equal contribution. #Corresponding author.

Quick Start

LLaVA-v1.5

Install LLaVA.

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install -e .
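
Optionally, you can confirm that the editable install succeeded by importing the utilities used in the examples below. This is only a minimal sanity check, not part of the official instructions:

# Sanity check: these imports should succeed once `pip install -e .` has finished.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

print(get_model_name_from_path("teowu/llava_v1.5_7b_qinstruct_preview_v0.1"))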

Simple Interactive Demos.

See the code and scripts below.

Example Code (Single Query)

from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"  # Q-Instruct-tuned LLaVA-v1.5 weights in Huggingface format
prompt = "Rate the quality of the image. Think step by step."
image_file = "fig/sausage.jpg"

# Pack the arguments expected by eval_model into a lightweight namespace.
args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
})()

eval_model(args)
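
To rate several images in one go, the same args pattern can be reused in a loop. The snippet below is only a minimal sketch built on the example above; the second image path is a placeholder:

from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"
prompt = "Rate the quality of the image. Think step by step."
# Placeholder list; replace with your own image paths.
image_files = ["fig/sausage.jpg", "fig/your_image.jpg"]

for image_file in image_files:
    args = type('Args', (), {
        "model_path": model_path,
        "model_base": None,
        "model_name": get_model_name_from_path(model_path),
        "query": prompt,
        "conv_mode": None,
        "image_file": image_file,
        "sep": ",",
    })()
    eval_model(args)  # prints the model's answer for this image

Note that eval_model loads the checkpoint on each call in the LLaVA reference implementation, so for larger batches the evaluation scripts below are more practical.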

Example Code (CLI Demo for Multi-turn Conversation)

python -m llava.serve.cli \
    --model-path teowu/llava_v1.5_7b_qinstruct_preview_v0.1 \
    --image-file "fig/sausage.jpg"

Note: Outputs may vary between runs because do_sample=True is enabled in conversation mode.

Quantitative Evaluations

Multi-choice question (MCQ) in Q-Bench.

python eval_scripts/llava_v1.5/eval_qbench_mcq.py

Image/Video Quality Assessment

Image Quality Assessment:

python eval_scripts/llava_v1.5/eval_image_quality.py

Video Quality Assessment:

python eval_scripts/llava_v1.5/eval_video_quality.py
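
A common way to turn an MLLM into a quality scorer, used in Q-Bench, is to compare the logits of opposing answer words ("good" vs. "poor") with a softmax instead of parsing generated text. The snippet below is only a minimal sketch of that idea, not the repository script; the function name and its inputs (the next-token logits and the two vocabulary ids) are illustrative:

import torch

def softmax_quality_score(next_token_logits, good_id, poor_id):
    # next_token_logits: 1-D tensor of vocabulary logits at the answer position
    # good_id / poor_id: vocabulary ids of the words "good" and "poor"
    pair = next_token_logits[[good_id, poor_id]]
    # Binary softmax; the probability of "good" serves as the quality score in [0, 1]
    return torch.softmax(pair.float(), dim=0)[0].item()

# Hypothetical usage with dummy logits over a tiny vocabulary
logits = torch.tensor([0.2, 2.5, -1.0, 0.7])
print(softmax_quality_score(logits, good_id=1, poor_id=2))  # close to 1.0, i.e. high quality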

mPLUG-Owl-2

Coming soon.

InternLM-XComposer-VL

Coming soon.

Model Zoo

All weights have been converted to Huggingface format and are fully compatible with the base repositories (LLaVA, mPLUG-Owl, InternLM-XComposer). After installing a base repository, simply replace the HF paths in its original evaluation scripts with the ones listed below to automatically download the Q-Instruct-tuned versions.

Released:

Coming Soon:

  • mPLUG-Owl-2 (mix)
  • InternLM-XComposer-VL (mix)
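
As a concrete example of the path swap described above, the only change to the single-query snippet in the Quick Start is the value of model_path (the base LLaVA-v1.5 path is shown for comparison):

# Original LLaVA-v1.5 weights (base repository)
# model_path = "liuhaotian/llava-v1.5-7b"

# Q-Instruct-tuned weights in Huggingface format
model_path = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"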

Training

At present, we only provide training scripts for LLaVA-v1.5. Please see the Training Docs for more details.

License

Researchers and open-source developers are free to use the Q-Instruct dataset and the fine-tuned weights provided for the four MLLMs. Commercial use is also allowed, but requires prior permission from our team. Please email haoning001@e.ntu.edu.sg to obtain permission for commercial use.
