InternLM-XComposer2 (浦语·灵笔2) Visual Question Answering 1.8B

Anonymous user, July 31, 2024

Technical information

Official website
https://www.shlab.org.cn/
Open-source repository
https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-vl-1_8b
License
other

Project details

InternLM-XComposer2

[GitHub Repo](https://github.com/InternLM/InternLM-XComposer) [Paper](https://arxiv.org/abs/2401.16420)

InternLM-XComposer2 is a vision-language large model (VLLM) based on InternLM2 for advanced text-image comprehension and composition.

We release the InternLM-XComposer2 series in two versions:

  • InternLM-XComposer2-VL: The pretrained VLLM with InternLM2 as the initialization of the LLM, achieving strong performance on various multimodal benchmarks.
  • InternLM-XComposer2: The finetuned VLLM for Free-form Interleaved Text-Image Composition.

Import from Transformers

To load the InternLM-XComposer2-VL-1.8B model using Transformers, use the following code:

import torch
from modelscope import AutoTokenizer, AutoModelForCausalLM

# Model ID as given in the original snippet; the ModelScope page listed above uses
# Shanghai_AI_Laboratory/internlm-xcomposer2-vl-1_8b.
ckpt_path = "internlm/internlm-xcomposer2-vl-1_8b"
# The tokenizer is not a torch module, so it is not moved to the GPU.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded as float32 and might cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()
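
The float16 advice above comes down to simple arithmetic: each float32 parameter takes 4 bytes and each float16 parameter takes 2. The sketch below estimates the weight footprint only, and it assumes the nominal 1.8B parameter count from the model name; the real total (vision encoder included) is likely somewhat higher, and activations add more on top. The names `NOMINAL_PARAMS` and `weight_memory_gib` are illustrative, not part of any library.

```python
# Rough estimate of the memory needed for model weights alone.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Assumption: nominal parameter count taken from the "1.8B" in the model name.
NOMINAL_PARAMS = 1.8e9


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate weight footprint in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gib(NOMINAL_PARAMS, dtype):.2f} GiB")
```

Under these assumptions, float16 halves the weight footprint, which is why it is the safer choice on GPUs with limited memory.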

Quickstart

We provide a simple example to show how to use InternLM-XComposer with Transformers.

import torch
from modelscope import AutoModel, AutoTokenizer

# Inference only, so gradient tracking is disabled globally.
torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-1_8b', trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2-vl-1_8b', trust_remote_code=True)

# `<ImageHere>` marks where the image is placed in the prompt.
query = '<ImageHere>Please describe this image in detail.'
image = './image1.webp'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
print(response)
# The image is a captivating photograph of a sunset over a mountainous landscape. The sky, painted in hues of orange and pink,
# serves as a backdrop for two silhouetted figures standing on the mountain. The text on the image, written in white, is a quote
# from Oscar Wilde, which reads, "Live life with no excuses, travel with no regret." This quote, combined with the serene setting,
# serves as a powerful reminder to embrace life's journey without hesitation or regret.
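
When the same call is repeated for several images, it can help to wrap it in a small helper. The following is a minimal sketch, not part of the InternLM-XComposer code: `build_query`, `ask_about_image`, and `StubModel` are hypothetical names introduced here, and the helper only forwards the same keyword arguments that the Quickstart passes to `model.chat`. The stub stands in for the real model so the sketch runs without a GPU or a download.

```python
IMAGE_TOKEN = "<ImageHere>"  # image placeholder used in the Quickstart query


def build_query(prompt: str) -> str:
    """Prefix the prompt with the image placeholder unless it is already present."""
    return prompt if IMAGE_TOKEN in prompt else IMAGE_TOKEN + prompt


def ask_about_image(model, tokenizer, image_path: str, prompt: str) -> str:
    """Call model.chat with the same keyword arguments as the Quickstart."""
    response, _ = model.chat(
        tokenizer,
        query=build_query(prompt),
        image=image_path,
        history=[],
        do_sample=False,
    )
    return response


class StubModel:
    """Hypothetical stand-in for the real model; it only echoes its inputs."""

    def chat(self, tokenizer, query, image, history, do_sample):
        return f"stub reply to {query!r} for {image}", history


print(ask_about_image(StubModel(), None, "./image1.webp", "Please describe this image in detail."))
```

With the real model and tokenizer loaded as in the Quickstart, the call would be made inside the same `torch.cuda.amp.autocast()` context.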

Open Source License

The code is licensed under Apache-2.0, while the model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (available in English and Chinese). For other questions or collaborations, please contact internlm@pjlab.org.cn.

