LLaVA Visual Question Answering Model

Anonymous user, July 31, 2024

Technical information

Open-source repository
https://modelscope.cn/models/xingzi/llava_visual-question-answering
License
Apache License 2.0

Project details

Model Description

LLaVA: Large Language and Vision Assistant

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.

[Project Page] [Paper]

Visual Instruction Tuning
Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee (*Equal Contribution)


Generated by GLIGEN via "a cute lava llama with glasses" and box prompt

Operating environment

Install

  1. git clone the original repository
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
  2. Install Package
conda create -n llava python=3.10 -y
conda activate llava
# Change into the folder that contains pyproject.toml
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
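
Before running the example, it can help to confirm that the active environment matches the install steps. The following is a minimal sketch, not part of the original instructions. It assumes that the editable install exposes a top-level package named `llava` (inferred from the repository name) and that any Python version from 3.10 upward is acceptable. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 10), package="llava"):
    """Return a list of human-readable problems; an empty list means OK.

    Assumptions (not from the original instructions): the package name
    'llava' and the "3.10 or newer" version rule.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_version):
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ expected, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec(package) is None:
        problems.append(
            f"package '{package}' is not importable; run 'pip install -e .'"
        )
    return problems
```

Calling `check_environment()` inside the activated `llava` conda environment should return an empty list if both checks pass. Any returned strings describe what is missing.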

LLaVA Weights

The complete 13B weights needed to run the model are provided. Use of the weights must comply with the LLaMA model license.

Code example

The current implementation only supports a single-turn Q-A session, and the interactive CLI is WIP.
This also serves as an example for users to build customized inference scripts. (To run it, copy the ms_wrapper.py file into the path of the original git repository and add the code below.)

from modelscope.models import Model
from modelscope.pipelines import pipeline

model_id = 'xingzi/llava_visual-question-answering'
inference = pipeline('llava-task', model=model_id, model_revision='v1.1.0')

image_file = "https://llava-vl.github.io/static/images/view.jpg"
query = "What are the things I should be cautious about when I visit here?"
conv_mode = None
inputs = {'image_file': image_file, 'query': query}

output = inference(inputs, conv_mode=conv_mode)

print(output)

# Note: loading the model may take a few minutes
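
Because each call is a single-turn session, several questions about the same image can be asked as independent calls. The following is a minimal sketch under one assumption: the pipeline object is callable with the same shape as in the example above, `inference(inputs, conv_mode=...)`. The helper `ask_many` is hypothetical and is not part of ModelScope or LLaVA.

```python
def ask_many(inference, image_file, queries, conv_mode=None):
    """Run one independent single-turn call per query and collect the results.

    `inference` is any callable with the call shape used in the example:
    inference({'image_file': ..., 'query': ...}, conv_mode=...).
    Returns a list of (query, output) pairs in the order the queries were given.
    """
    results = []
    for query in queries:
        inputs = {'image_file': image_file, 'query': query}
        # No history is passed between calls: each call sees only its own query.
        results.append((query, inference(inputs, conv_mode=conv_mode)))
    return results
```

With the `inference` pipeline from the example, `ask_many(inference, image_file, ["What is in the image?", "What season is it?"])` would issue two separate calls. The model is loaded once, when the pipeline is created, so the loading delay is not paid per question.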

Citation

If you find LLaVA useful for your research and applications, please cite using this BibTeX:

@misc{liu2023llava,
      title={Visual Instruction Tuning},
      author={Liu, Haotian and Li, Chunyuan and Wu, Qingyang and Lee, Yong Jae},
      publisher={arXiv:2304.08485},
      year={2023},
}

Acknowledgement

  • Vicuna: the codebase we built upon, and our base model Vicuna-13B that has the amazing language capabilities!
