Introduction
This is an experimental version of IP-Adapter-FaceID: instead of CLIP image embeddings, it uses a face ID embedding from a face recognition model, and it additionally applies LoRA to improve ID consistency. Conditioned on a face, IP-Adapter-FaceID can generate images of that identity in various styles using only text prompts.
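Conceptually, the face ID embedding (a fixed-size vector from the recognition model) is projected into a few extra context tokens that the U-Net's cross-attention layers see alongside the text embeddings, while LoRA weights on the attention layers help preserve identity. The sketch below only illustrates this idea; the class name, layer sizes, and token count (a 512-dim embedding mapped to 4 tokens of width 768, matching SD 1.5) are assumptions for illustration, not taken from the released checkpoint.

import torch
import torch.nn as nn

class FaceIDProjection(nn.Module):
    # Hypothetical sizes: a 512-dim face ID embedding (e.g. from ArcFace) is
    # mapped to 4 context tokens of width 768 (the SD 1.5 cross-attention dim).
    def __init__(self, embed_dim=512, cross_attention_dim=768, num_tokens=4):
        super().__init__()
        self.num_tokens = num_tokens
        self.cross_attention_dim = cross_attention_dim
        self.proj = nn.Sequential(
            nn.Linear(embed_dim, cross_attention_dim * num_tokens),
            nn.GELU(),
            nn.Linear(cross_attention_dim * num_tokens, cross_attention_dim * num_tokens),
        )
        self.norm = nn.LayerNorm(cross_attention_dim)

    def forward(self, faceid_embeds):
        # faceid_embeds: (batch, embed_dim) -> (batch, num_tokens, cross_attention_dim)
        tokens = self.proj(faceid_embeds)
        tokens = tokens.reshape(-1, self.num_tokens, self.cross_attention_dim)
        return self.norm(tokens)

# Example: one 512-dim face embedding becomes 4 ID tokens for cross-attention
proj = FaceIDProjection()
id_tokens = proj(torch.randn(1, 512))
print(id_tokens.shape)  # torch.Size([1, 4, 768])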
Usage
First, install the dependencies and use insightface to extract the face ID embedding (the example below also requires the diffusers and modelscope packages):
pip install mxnet
pip install "insightface>=0.2"
pip install ip_adapter
import cv2
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
from insightface.app import FaceAnalysis
from ip_adapter.ip_adapter_faceid import IPAdapterFaceID
from modelscope import snapshot_download
from PIL import Image

# Detect the face with insightface and extract its normalized ID embedding
app = FaceAnalysis(name="buffalo_l", root="/root/.insightface/models", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
app.prepare(ctx_id=0, det_size=(640, 640))

image = cv2.imread("/mnt/workspace/yk_dir/person.jpg")
faces = app.get(image)
faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
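# Note (not part of the original example): app.get() returns an empty list when
# no face is detected, so the faces[0] lookup above raises an IndexError in that
# case; checking the result before indexing makes the failure explicit, e.g.:
#   if len(faces) == 0:
#       raise ValueError("No face detected in the input image")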
# Download the base model, the fine-tuned VAE, and the IP-Adapter-FaceID weights from ModelScope
base_model_path = "AI-ModelScope/Realistic_Vision_V5.1_noVAE"
local_base = snapshot_download(base_model_path, revision="master")
vae_model_path = "zhuzhukeji/sd-vae-ft-mse"
local_vae = snapshot_download(vae_model_path, revision="master")
local_ip = snapshot_download("AI-ModelScope/IP-Adapter-FaceID", revision="master")
ip_ckpt = local_ip + "/ip-adapter-faceid_sd15.bin"

device = "cuda"
# Build the Stable Diffusion pipeline with the DDIM scheduler and fp16 weights
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(local_vae).to(dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    local_base,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    feature_extractor=None,
    safety_checker=None,
)
# Load the IP-Adapter-FaceID checkpoint into the pipeline
ip_model = IPAdapterFaceID(pipe, ip_ckpt, device)
# Generate images conditioned on the face ID embedding and the text prompt
prompt = "photo of a woman in red dress in a garden"
negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality, blurry"
images = ip_model.generate(
    prompt=prompt,
    negative_prompt=negative_prompt,
    faceid_embeds=faceid_embeds,
    num_samples=4,
    width=512,
    height=768,
    num_inference_steps=30,
    seed=2023,
)
images[0].save("woman.png")
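Since num_samples=4, the call above returns four candidate images but only the first is saved. A short follow-up (illustrative, not part of the original example) saves every sample:

# Save all generated samples rather than only the first one
for i, img in enumerate(images):
    img.save(f"woman_{i}.png")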
Limitations and Bias
- The model does not achieve perfect photorealism and ID consistency.
- The model's ability to generalize is limited by the training data, the base model, and the face recognition model.
Non-commercial use
This model is released exclusively for research purposes and is not intended for commercial use.