Model Description
DINOv2 is a self-supervised training method for vision models; see DINOv2 for more details. This model page provides a ViT-g backbone trained with the DINOv2 recipe and a matching linear classifier; combining the two modules performs image classification. The model also supports a feature-extraction mode that outputs only the features produced by the backbone, for use in downstream tasks. It is intended for natural-scene images and achieves SOTA performance on ImageNet-1k; objective metrics on ImageNet-1k are reported in the evaluation section below. The main reference paper is listed in the citation section below.
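For intuition on how the backbone and linear classifier fit together, here is a minimal sketch built on the official DINOv2 torch.hub entry point rather than this ModelScope pipeline; the `dinov2_vitg14` hub name, the 1536-dimensional embedding, and the `nn.Linear` head reflect the general DINOv2 linear-probe recipe and are assumptions, not this model's internal implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumptions, not this model's internals): a DINOv2 ViT-g/14
# backbone loaded from the official torch.hub entry point, plus a linear head.
backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitg14')
backbone.eval()

# ViT-g/14 produces 1536-dimensional image embeddings; ImageNet-1k has 1000 classes.
classifier = nn.Linear(1536, 1000)

image = torch.randn(1, 3, 224, 224)  # dummy, already-preprocessed image tensor
with torch.no_grad():
    features = backbone(image)       # feature-extraction mode: backbone features only
logits = classifier(features)        # classification mode: backbone + linear classifier
print(features.shape, logits.shape)  # torch.Size([1, 1536]) torch.Size([1, 1000])
```

In this card's pipeline, the same two stages correspond to the default classification mode (backbone plus linear classifier) and the feature-extraction mode (backbone output only).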
Expected Model Usage and Applicable Scope
How to Use the Model
Code Example
```python
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

## using cuda
# dinov2_pipe = pipeline(
#     Tasks.image_classification,
#     model="jp_lan/cv_vitg_classification_dinov2",
#     model_revision="v1.0.1",
#     device="cuda",
# )

## using cpu
dinov2_pipe = pipeline(
    Tasks.image_classification,
    model="jp_lan/cv_vitg_classification_dinov2",
    model_revision="v1.0.1",
    device="cpu",
)

input_image_file = 'https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/bird.JPEG'

## using the DINOv2 model to predict image labels
result = dinov2_pipe(input_image_file)
print("result is : ", result)

## using the DINOv2 model to extract features
# output = dinov2_pipe(input_image_file, output_features_only=True)
# print("feature length = ", len(output["feature"]))
```
Model Limitations and Possible Biases
Data Evaluation and Results
| Method | Backbone | Classifier | ImageNet-1k top-1 accuracy |
|--------|----------|------------|----------------------------|
| DINOv2 | ViT-g/14 | linear     | 86.5%                      |
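The 86.5% top-1 figure refers to the released backbone-plus-linear-classifier combination. As a rough, hypothetical sketch of how one might spot-check accuracy with the pipeline, the snippet below assumes a small list of (image, ground-truth label) pairs whose label strings match the pipeline's vocabulary, and that `OutputKeys.LABELS` is sorted by score; it is illustrative only, not the evaluation protocol behind the reported number.

```python
# Hypothetical accuracy spot-check; `samples` and its label strings are
# placeholders and must be replaced with labels from the pipeline's vocabulary.
samples = [
    ("https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/bird.JPEG", "<ground-truth label>"),
]

correct = 0
for image, gt_label in samples:
    pred = dinov2_pipe(image)
    top1 = pred[OutputKeys.LABELS][0]  # assumes labels sorted by descending score
    correct += int(top1 == gt_label)

print("top-1 accuracy:", correct / len(samples))
```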
Related Papers and Citation Information
```
@misc{oquab2023dinov2,
  title={DINOv2: Learning Robust Visual Features without Supervision},
  author={Oquab, Maxime and Darcet, Timothée and Moutakanni, Theo and Vo, Huy V. and Szafraniec, Marc and Khalidov, Vasil and Fernandez, Pierre and Haziza, Daniel and Massa, Francisco and El-Nouby, Alaaeldin and Howes, Russell and Huang, Po-Yao and Xu, Hu and Sharma, Vasu and Li, Shang-Wen and Galuba, Wojciech and Rabbat, Mike and Assran, Mido and Ballas, Nicolas and Synnaeve, Gabriel and Misra, Ishan and Jegou, Herve and Mairal, Julien and Labatut, Patrick and Joulin, Armand and Bojanowski, Piotr},
  journal={arXiv:2304.07193},
  year={2023}
}
```
Clone with HTTP
```
git clone https://www.modelscope.cn/jp_lan/cv_vitg_classification_dinov2.git
```