llava_v1.5_13b_qinstruct_preview_v0.1

Anonymous user, July 31, 2024

Technical Information

Open-source address
https://modelscope.cn/models/qfuture/llava_v1.5_13b_qinstruct_preview_v0.1

Project Details

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

1Nanyang Technological University, 2Shanghai Jiaotong University, 3Sensetime Research, 4I2R@A*STAR
*Equal contribution. #Corresponding author.

Quick Start

LLaVA-v1.5

Install LLaVA.

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install -e .
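
Before running the demos, it can help to confirm that the editable install is visible to the current Python environment. The following is a minimal sketch, assuming the package is importable under the top-level name `llava` (the name used by the imports in the example code); the helper `is_installed` is hypothetical and not part of LLaVA.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the editable install exposes the top-level package "llava".
if is_installed("llava"):
    print("LLaVA is importable in this environment.")
else:
    print("LLaVA not found; re-run `pip install -e .` inside the LLaVA folder.")
```

Checking with `find_spec` avoids importing the package itself, so the check stays fast even though LLaVA pulls in heavy dependencies.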

Simple Interactive Demos.

See the code and scripts below.

Example Code (Single Query)

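# Helpers shipped with the LLaVA repository installed above.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"
prompt = "Rate the quality of the image. Think step by step."
image_file = "fig/sausage.jpg"

# eval_model reads its settings as attributes, so build a throwaway class instance.
args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    # Assumption: recent LLaVA releases also read these generation settings
    # from args; the values follow the upstream LLaVA README.
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()
eval_model(args)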
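
To ask several questions about the same image, the single-query pattern above can be wrapped in a small helper. This is a sketch under stated assumptions: `build_args` and `run_queries` are hypothetical names, and the generation fields (`temperature`, `top_p`, `num_beams`, `max_new_tokens`) are an assumption about what recent LLaVA releases read from `args`.

```python
from types import SimpleNamespace

MODEL_PATH = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"


def build_args(prompt, image_file, model_name, model_path=MODEL_PATH):
    """Bundle one query into an attribute-style object for eval_model."""
    return SimpleNamespace(
        model_path=model_path,
        model_base=None,
        model_name=model_name,
        query=prompt,
        conv_mode=None,
        image_file=image_file,
        sep=",",
        # Assumed generation settings (not listed in the example above).
        temperature=0,
        top_p=None,
        num_beams=1,
        max_new_tokens=512,
    )


def run_queries(prompts, image_file):
    """Run each prompt against one image; requires LLaVA to be installed."""
    # Imported lazily so build_args stays usable without LLaVA.
    from llava.mm_utils import get_model_name_from_path
    from llava.eval.run_llava import eval_model

    model_name = get_model_name_from_path(MODEL_PATH)
    for prompt in prompts:
        eval_model(build_args(prompt, image_file, model_name))


# Example call (needs LLaVA and the image file):
# run_queries(["Rate the quality of the image. Think step by step."],
#             "fig/sausage.jpg")
```

Note that `eval_model` loads the model on every call, so for many prompts the multi-turn CLI may be the faster option.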

Example Code (CLI Demo for Multi-turn Conversation)

python -m llava.serve.cli \
    --model-path teowu/llava_v1.5_7b_qinstruct_preview_v0.1 \
    --image-file "fig/sausage.jpg"

Note: The results may contain randomness as do_sample=True is enabled during conversation mode.
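
The effect of do_sample=True can be illustrated with a toy next-token choice. This is a minimal sketch, not LLaVA's decoding code: the logits are made-up numbers and the function names are hypothetical. Greedy decoding always picks the highest-scoring token, while sampling draws from the softmax distribution and so can differ between runs unless the random seed is fixed.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Deterministic choice: index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sampled(logits, rng, temperature=1.0):
    """Stochastic choice: draw an index according to softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


TOY_LOGITS = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens

rng = random.Random()  # unseeded: results vary from run to run
print("greedy :", [pick_greedy(TOY_LOGITS) for _ in range(5)])
print("sampled:", [pick_sampled(TOY_LOGITS, rng) for _ in range(5)])
```

Seeding the generator (for example `random.Random(0)`) makes the sampled sequence repeatable, which is the same reason fixed seeds are used when reproducible model outputs are needed.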

Quantitative Evaluations

Multi-choice question (MCQ) in Q-Bench.

python eval_scripts/llava_v1.5/eval_qbench_mcq.py

Image/Video Quality Assessment

Image Quality Assessment:

python eval_scripts/llava_v1.5/eval_image_quality.py

Video Quality Assessment:

python eval_scripts/llava_v1.5/eval_video_quality.py
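
The evaluation scripts themselves are not reproduced here. As an illustration only, quality assessment results are commonly summarized by the Spearman rank correlation (SRCC) and the Pearson linear correlation (PLCC) between model scores and human mean opinion scores (MOS). The sketch below assumes that convention; the function names and the toy numbers are hypothetical and not taken from the Q-Instruct scripts.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson linear correlation (PLCC); undefined for constant inputs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def spearman(xs, ys):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy data: made-up model scores and human MOS for four images.
predicted = [0.10, 0.40, 0.35, 0.80]
mos = [1.0, 3.0, 2.5, 4.5]
print("SRCC:", spearman(predicted, mos))
print("PLCC:", pearson(predicted, mos))
```

SRCC only rewards getting the ordering of images right, while PLCC also reflects how linearly the predicted scores track the human ratings.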

mPLUG-Owl-2

Coming soon.

InternLM-XComposer-VL

Coming soon.

Model Zoo

All weights are converted into Hugging Face format and are fully compatible with the base repositories (LLaVA, mPLUG-Owl, InternLM-XComposer). After installing the base repositories, you can change the HF path in the original evaluation scripts to the following ones so that the Q-Instruct-tuned versions are downloaded automatically.

Released:

Coming Soon:

  • mPLUG-Owl-2 (mix)
  • InternLM-XComposer-VL (mix)
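
Swapping the HF path in an evaluation script can be sketched as follows. This is an illustration under stated assumptions, not the layout of the actual scripts: the `--model-path` option, `parse_args`, and `load_q_instruct` are hypothetical names, while the default path is the one used in the Quick Start examples. The loader relies on `load_pretrained_model` from the LLaVA repository, which fetches the weights from the Hugging Face Hub when they are not cached locally.

```python
import argparse

# HF path of the Q-Instruct-tuned LLaVA-v1.5 weights used in Quick Start.
DEFAULT_MODEL_PATH = "teowu/llava_v1.5_7b_qinstruct_preview_v0.1"


def parse_args(argv=None):
    """Expose the HF path as an option instead of hard-coding it."""
    parser = argparse.ArgumentParser(description="Hypothetical evaluation entry point")
    parser.add_argument("--model-path", default=DEFAULT_MODEL_PATH,
                        help="Hugging Face path or local folder of the weights")
    return parser.parse_args(argv)


def load_q_instruct(model_path):
    """Load tokenizer, model, and image processor; requires LLaVA installed."""
    # Imported lazily so argument parsing works without LLaVA.
    from llava.mm_utils import get_model_name_from_path
    from llava.model.builder import load_pretrained_model

    return load_pretrained_model(
        model_path, None, get_model_name_from_path(model_path)
    )


# Example (needs LLaVA and network access for the first download):
# args = parse_args()
# tokenizer, model, image_processor, context_len = load_q_instruct(args.model_path)
```

Keeping the path in a single option means the same script can evaluate the base model and the tuned one by changing only the command line.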

Training

At present, we only provide the training scripts with LLaVA-v1.5. Please see Training Docs for more details.

License

Researchers and open-source developers are free to use the Q-Instruct dataset and the fine-tuned weights as provided for the four MLLMs. Commercial use is also allowed, but any commercial use must be approved by our team in advance. Please email haoning001@e.ntu.edu.sg to obtain permission for commercial use.

