Visual instruction tuning towards large language and vision models with GPT-4 level capabilities. [Project Page] [Paper]
The complete 13B weights required to run the model are provided; usage must comply with the LLaMA model license. The current implementation only supports a single-turn Q&A session, and the interactive CLI is WIP. If you find LLaVA useful for your research and applications, please cite using the BibTeX in the Citation section below.
Model Description
LLaVA: Large Language and Vision Assistant
Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee (*Equal Contribution)
Generated by GLIGEN via "a cute lava llama with glasses" and box prompt
Operating Environment
Install
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
conda create -n llava python=3.10 -y
conda activate llava
# enter the folder containing pyproject.toml
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
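The pip upgrade above is needed because editable installs driven by pyproject.toml (PEP 660) require pip 21.3 or newer. A small sketch to check the installed pip before running `pip install -e .`:

```python
# Check whether the installed pip is new enough for PEP 660
# editable installs (pip 21.3 was the first release to support them).
import importlib.metadata

try:
    pip_version = importlib.metadata.version("pip")
except importlib.metadata.PackageNotFoundError:
    pip_version = None  # pip metadata not found in this environment

if pip_version is not None:
    major, minor = (int(x) for x in pip_version.split(".")[:2])
    supports_pep660 = (major, minor) >= (21, 3)
    print(f"pip {pip_version}: "
          + ("OK for PEP 660" if supports_pep660
             else "run `pip install --upgrade pip` first"))
```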
LLaVA Weights
Code Example
This also serves as an example for users to build customized inference scripts.
(At runtime, copy the ms_wrapper.py file into the original git repository directory, then add the following code and run it.)
from modelscope.models import Model
from modelscope.pipelines import pipeline

model_id = 'xingzi/llava_visual-question-answering'
inference = pipeline('llava-task', model=model_id, model_revision='v1.1.0')
image_file = "https://llava-vl.github.io/static/images/view.jpg"
query = "What are the things I should be cautious about when I visit here?"
conv_mode = None
inputs = {'image_file': image_file, 'query': query}
output = inference(inputs, conv_mode=conv_mode)
print(output)
# Note: loading the model may take a few minutes
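Because the pipeline is single-turn, each question must be sent as an independent call with its own inputs dict. A minimal sketch of a helper that prepares those dicts (the helper name `build_inputs` is our own; only the `image_file`/`query` keys come from the example above):

```python
def build_inputs(image_file, queries):
    """Build one single-turn inputs dict per question; the pipeline
    keeps no conversation state between calls."""
    return [{'image_file': image_file, 'query': q} for q in queries]

batch = build_inputs(
    "https://llava-vl.github.io/static/images/view.jpg",
    ["What is shown in this image?",
     "What should I be cautious about when I visit here?"],
)
# Each dict is then passed to the pipeline independently, e.g.:
# for inputs in batch:
#     print(inference(inputs, conv_mode=None))
print(len(batch))
```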
Citation
@misc{liu2023llava,
      title={Visual Instruction Tuning},
      author={Liu, Haotian and Li, Chunyuan and Wu, Qingyang and Lee, Yong Jae},
      publisher={arXiv:2304.08485},
      year={2023},
}
Acknowledgement