LLaVA Visual Question Answering Model

Source repository
https://modelscope.cn/models/xingzi/llava_visual-question-answering
License
Apache License 2.0

Model Description

LLaVA: Large Language and Vision Assistant

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.

[Project Page] [Paper]

Visual Instruction Tuning
Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee (*Equal Contribution)

Generated by GLIGEN via "a cute lava llama with glasses" and box prompt

Operating Environment

Install

Clone the original repository

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA

Install Package

conda create -n llava python=3.10 -y
conda activate llava
# Change to the folder that contains pyproject.toml
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
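
After the install steps, it can help to confirm that the packages are visible to the active interpreter before loading any weights. The following is a minimal sketch, not part of the original project. It assumes the editable install exposes a top-level package named `llava` and that ModelScope is importable as `modelscope`; both names are assumptions, and `missing_packages` is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: "llava" (from `pip install -e .`) and "modelscope".
    missing = missing_packages(["llava", "modelscope"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("llava and modelscope are importable")
```

If a name is reported as not importable, the usual causes are that the conda environment is not activated or that `pip install -e .` was run outside the folder containing pyproject.toml.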

LLaVA Weights

The complete 13B weights required to run the model are provided. Use of the weights must comply with the LLaMA model license.

Code Example

The current implementation only supports a single-turn Q-A session, and the interactive CLI is WIP.
This also serves as an example for users to build customized inference scripts. (To run it, copy the ms_wrapper.py file into the original git repository path and add the following code.)

from modelscope.models import Model
from modelscope.pipelines import pipeline

model_id = 'xingzi/llava_visual-question-answering'
inference = pipeline('llava-task', model=model_id, model_revision='v1.1.0')

image_file = "https://llava-vl.github.io/static/images/view.jpg"
query = "What are the things I should be cautious about when I visit here?"
conv_mode = None
inputs = {'image_file': image_file, 'query': query}

output = inference(inputs, conv_mode=conv_mode)

print(output)

# Note: loading the model may take several minutes
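
Because the pipeline handles one question per call, a customized inference script can wrap the call in a small helper that validates the inputs first. The sketch below is one possible shape, not the project's own API. `build_inputs`, `ask`, and `fake_inference` are hypothetical names, and `fake_inference` is only a stand-in so the example runs without downloading the weights. The input keys and the `conv_mode` keyword are taken from the example above.

```python
from typing import Any, Callable, Dict, Optional


def build_inputs(image_file: str, query: str) -> Dict[str, str]:
    """Build the input dictionary in the shape used by the example above."""
    if not image_file.strip():
        raise ValueError("image_file must be a non-empty path or URL")
    if not query.strip():
        raise ValueError("query must be a non-empty question")
    return {"image_file": image_file, "query": query}


def ask(inference: Callable[..., Any], image_file: str, query: str,
        conv_mode: Optional[str] = None) -> Any:
    """Run one single-turn question through a pipeline-like callable."""
    return inference(build_inputs(image_file, query), conv_mode=conv_mode)


def fake_inference(inputs: Dict[str, str], conv_mode: Optional[str] = None) -> str:
    """Hypothetical stand-in for the real pipeline, so the sketch runs offline."""
    return f"[stub answer] {inputs['query']}"


if __name__ == "__main__":
    print(ask(
        fake_inference,
        "https://llava-vl.github.io/static/images/view.jpg",
        "What are the things I should be cautious about when I visit here?",
    ))
```

To use the real model, pass the `inference` object created by `pipeline(...)` in place of `fake_inference`.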

Citation

If you find LLaVA useful for your research and applications, please cite using this BibTeX:

@misc{liu2023llava,
      title={Visual Instruction Tuning},
      author={Liu, Haotian and Li, Chunyuan and Wu, Qingyang and Lee, Yong Jae},
      publisher={arXiv:2304.08485},
      year={2023},
}

Acknowledgement

Vicuna: the codebase we built upon, and our base model Vicuna-13B that has the amazing language capabilities!
