Description

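The Swin Transformer v2 model was pretrained on ImageNet-1k at a resolution of 256x256. It was introduced by Liu et al. in the paper "Swin Transformer V2: Scaling Up Capacity and Resolution" and first released on GitHub.

This copy is mirrored from the Hugging Face repository, for users who, for whatever reason, cannot download the model from the original repository.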

How to use

Installing dependencies

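# Install modelscope
!pip install modelscope

# The examples below also import transformers, torch and Pillow
!pip install transformers torch pillow

# Possibly needed when downloading the model with snapshot_download
# (the Runtime may need a restart after upgrading these packages)
!pip install urllib3 --upgrade
!pip install requests --upgrade
!pip install spotipy --upgrade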
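
Before running the examples, it can help to confirm which of the packages above are actually installed, and at which version. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` is hypothetical, and the package names are taken from the install commands in this section.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment
            versions[name] = None
    return versions


for name, version in installed_versions(
    ["modelscope", "urllib3", "requests", "spotipy"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

If a package still reports an old version after an upgrade, restarting the Runtime as noted above is the likely remedy.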

Feature extraction

from modelscope import snapshot_download
from transformers import AutoImageProcessor, Swinv2Model
from PIL import Image
import requests, torch

# Download the mirrored weights and load a sample image
model_dir = snapshot_download('Alien1996/Mirror_swinv2-base-patch4-window8-256')
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained(model_dir)
model = Swinv2Model.from_pretrained(model_dir)

# Resize and normalize the image into a batch of PyTorch tensors
inputs = image_processor(image, return_tensors="pt")

# Inference only, so gradients are not needed
with torch.no_grad():
    outputs = model(**inputs)

# Token features from the final stage: [batch, tokens, hidden size]
last_hidden_states = outputs.last_hidden_state
list(last_hidden_states.shape)
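
To make sense of the shape printed above, the arithmetic behind it can be sketched in plain Python. The image size (256) and patch size (4) come from the model name; the embedding width of 128 and the four stages with 2x2 patch merging between them are assumptions based on the usual Swin "base" layout and are not stated in this card. The function name `swin_output_shape` is hypothetical.

```python
def swin_output_shape(image_size=256, patch_size=4, embed_dim=128,
                      num_stages=4, batch_size=1):
    """Estimate the shape of last_hidden_state for a Swin-style backbone.

    Assumed (not stated in this card): embed_dim=128 and four stages,
    with a 2x2 patch merging step between consecutive stages.
    """
    grid = image_size // patch_size          # tokens per side after patch embedding
    merges = num_stages - 1                  # merging happens between stages
    final_grid = grid // (2 ** merges)       # each merge halves the grid side
    hidden_size = embed_dim * (2 ** merges)  # each merge doubles the channel width
    return [batch_size, final_grid * final_grid, hidden_size]


print(swin_output_shape())
```

Under these assumptions the sketch gives 64 tokens of width 1024 for a single image; if the model reports something different, the assumed configuration values are the first thing to check against `model.config`.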

Image classification

from modelscope import snapshot_download
from transformers import AutoImageProcessor, Swinv2ForImageClassification
from PIL import Image
import requests, torch

# Download the mirrored weights and load a sample image
model_dir = snapshot_download('Alien1996/Mirror_swinv2-base-patch4-window8-256')
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained(model_dir)
model = Swinv2ForImageClassification.from_pretrained(model_dir)

inputs = image_processor(image, return_tensors="pt")

# Inference only, so gradients are not needed
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# The model predicts one of the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
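
The snippet above reports only the single best class. When a ranked list with probabilities is more useful, the logits can be passed through a softmax and sorted. The following is a minimal sketch in plain Python; the helper name `top_k_predictions` and the toy logits and labels are hypothetical and serve only as illustration.

```python
import math


def top_k_predictions(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs from raw logits."""
    peak = max(logits)                           # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]            # softmax probabilities
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy values for illustration only; with the real model, pass
# logits[0].tolist() and model.config.id2label instead.
toy_logits = [0.5, 2.0, -1.0, 1.0]
toy_labels = {0: "class_a", 1: "class_b", 2: "class_c", 3: "class_d"}
for label, prob in top_k_predictions(toy_logits, toy_labels, k=2):
    print(f"{label}: {prob:.3f}")
```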

Citation

@article{DBLP:journals/corr/abs-2111-09883,
  author    = {Ze Liu and
               Han Hu and
               Yutong Lin and
               Zhuliang Yao and
               Zhenda Xie and
               Yixuan Wei and
               Jia Ning and
               Yue Cao and
               Zheng Zhang and
               Li Dong and
               Furu Wei and
               Baining Guo},
  title     = {Swin Transformer {V2:} Scaling Up Capacity and Resolution},
  journal   = {CoRR},
  volume    = {abs/2111.09883},
  year      = {2021},
  url       = {https://arxiv.org/abs/2111.09883},
  eprinttype = {arXiv},
  eprint    = {2111.09883},
  timestamp = {Thu, 02 Dec 2021 15:54:22 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2111-09883.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

[Mirror] microsoft/swinv2-base-patch4-window8-256
