EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything.

Comparative analysis of FastSAM and SAM

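Model description

EfficientSAM builds on SAM-leveraged masked image pretraining (SAMI): a masked image model with a lightweight encoder is trained to reconstruct features from SAM's ViT-H image encoder rather than raw image patches. The resulting general-purpose ViT backbones can be used for downstream tasks such as image classification, object detection, and segmentation. The pretrained lightweight encoder is then fine-tuned together with the SAM decoder for the segment anything task. SAMI pretraining can be applied to ViT-Tiny, ViT-Small, and ViT-Base models on ImageNet-1K and improves their generalization. A SAMI-pretrained ViT-Small reaches 82.7% top-1 accuracy after 100 epochs of fine-tuning on ImageNet-1K, outperforming other state-of-the-art image pretraining baselines. EfficientSAM has roughly 20x fewer parameters and runs roughly 20x faster than the original SAM while staying within 2 percentage points of its performance, and it clearly outperforms MobileSAM and FastSAM. Training has two stages: (1) SAMI pretraining on ImageNet (top of the figure) and (2) fine-tuning SAM on SA-1B (bottom of the figure).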
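
The reconstruction objective described above can be illustrated with a toy NumPy sketch. This is not the authors' training code: the function name `sami_reconstruction_loss`, the array shapes, and the use of a plain mean squared error over all tokens are assumptions made for illustration. The only idea taken from the description is that the lightweight encoder learns to reproduce SAM ViT-H features rather than raw image patches.

```python
import numpy as np

def sami_reconstruction_loss(teacher_features, student_features):
    """Mean squared error between teacher and student token features.

    Both inputs are assumed to have shape (num_tokens, feature_dim).
    The teacher stands in for frozen SAM ViT-H features, the student
    for the features reconstructed by the lightweight model.
    """
    teacher = np.asarray(teacher_features, dtype=np.float64)
    student = np.asarray(student_features, dtype=np.float64)
    if teacher.shape != student.shape:
        raise ValueError("teacher and student features must have the same shape")
    return float(np.mean((teacher - student) ** 2))

# Synthetic data only: random arrays replace real encoder outputs.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(196, 256))
student = teacher + 0.1 * rng.normal(size=(196, 256))
loss = sami_reconstruction_loss(teacher, student)
print(f"reconstruction loss: {loss:.4f}")
```

In a real setup the teacher features would come from the frozen SAM image encoder and the loss would be minimized by gradient descent on the lightweight encoder and its decoder; the sketch only shows what quantity is being compared.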

Instance segmentation results

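Intended use and scope

The model is broadly applicable: given a prompt (points, boxes, or text), it can segment most objects of interest in an image (the 80 COCO "things" categories).

How to use

With the ModelScope framework, the model can be called through a simple pipeline once an input image is provided.

Code example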

from modelscope.models import Model
from modelscope.pipelines import pipeline
from urllib import request
from PIL import Image

model = 'damo/cv_efficientsam-s_image-instance-segmentation_sa1b'
pipe = pipeline('efficient-sam-s-task', model=model)

image_path = './input.jpg'
image_url = 'http://k.sinaimg.cn/n/sinacn18/380/w1698h1082/20180810/b678-hhnunsq9451531.png/w700d1q75cms.jpg'
request.urlretrieve(image_url, image_path)

inputs = {
    'img_path': image_path,                    # path to the input image
    'device': 'cpu',                           # 'cpu' or 'cuda'
    'input_points': [[300, 200], [450, 220]],  # point prompts as [x, y] coordinates; pass one point per object to segment
    'input_labels': [1, 1],                    # one label per point: 0 = background, 1 = foreground
}
mask, masked_image = pipe(inputs)  # returns the object mask and the masked image

Image.fromarray(mask).save("./mask.png")
Image.fromarray(masked_image).save("./masked_image.png")

Example input and output:

Output
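
Because every point prompt needs a matching label, it can help to check the prompt before calling the pipeline. The sketch below uses only the dictionary keys that appear in the example above (`img_path`, `device`, `input_points`, `input_labels`); the helper name `build_point_prompt` and its validation rules are hypothetical and not part of ModelScope.

```python
def build_point_prompt(img_path, points, labels, device="cpu"):
    """Assemble the pipeline input dict, checking that points and labels line up.

    Hypothetical helper: the keys mirror the example in this document,
    the checks are an assumption about what a valid prompt looks like.
    """
    if len(points) != len(labels):
        raise ValueError("each point needs exactly one label")
    for point in points:
        if len(point) != 2:
            raise ValueError(f"expected [x, y], got {point!r}")
    for label in labels:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 (background) or 1 (foreground), got {label!r}")
    return {
        "img_path": img_path,
        "device": device,
        "input_points": [list(p) for p in points],
        "input_labels": list(labels),
    }

# Same prompt as in the example: two foreground points.
inputs = build_point_prompt("./input.jpg", [[300, 200], [450, 220]], [1, 1])
print(inputs)
```

The returned dictionary can then be passed to the pipeline in place of a hand-written one.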

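Model limitations and possible bias

Segmentation quality may degrade when an object of interest occupies a very small part of the image or is heavily occluded.

Training data

Segment Anything 1 Billion (SA-1B) is a dataset of 11 million diverse, high-resolution, privacy-protecting images and 1.1B high-quality segmentation masks collected with a data engine.

Evaluation and results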
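
One way to spot the small-object case is to measure how much of the image a returned mask covers. The sketch below is a hypothetical check, not part of the model: it assumes the mask is a NumPy-compatible array in which non-zero values mark the object, and the 2% threshold is an arbitrary illustration rather than a recommended value.

```python
import numpy as np

def mask_area_fraction(mask):
    """Return the fraction of mask elements that are non-zero."""
    mask = np.asarray(mask)
    if mask.size == 0:
        raise ValueError("mask is empty")
    return float(np.count_nonzero(mask)) / mask.size

# Synthetic 100x100 mask with a 10x10 object.
demo_mask = np.zeros((100, 100), dtype=np.uint8)
demo_mask[10:20, 30:40] = 255
fraction = mask_area_fraction(demo_mask)

SMALL_OBJECT_THRESHOLD = 0.02  # arbitrary value chosen for this illustration
if fraction < SMALL_OBJECT_THRESHOLD:
    print(f"object covers {fraction:.1%} of the image; the result may be less reliable")
```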

Instance segmentation results

Clone with HTTP

  git clone https://www.modelscope.cn/damo/cv_efficientsam-s_image-instance-segmentation_sa1b.git

Citation

If this model is helpful to you, please cite the related paper:

 @article{xiong2023efficientsam,
 title={EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything}, 
 author={Xiong, Yunyang and Varadarajan, Bala and Wu, Lemeng and Xiang, Xiaoyu and Xiao, Fanyi and Zhu, Chenchen and Dai, Xiaoliang and Wang, Dilin and Sun, Fei and Iandola, Forrest and Krishnamoorthi, Raghuraman and Chandra, Vikas}, 
 year={2023}, 
 month={Dec}, 
 language={en-US} 
 }
