I am not a vision expert, so apologies in advance if my interpretation of the situation is incorrect. While working with Fastembed, I have observed that the image embeddings generated by the "Qdrant/clip-ViT-B-32-vision" model are not the same as those produced by the Hugging Face "openai/clip-vit-base-patch32" model or the OpenAI "ViT-B/32" model.
```python
import io
import urllib.request

import clip
import numpy as np
import torch
from fastembed import ImageEmbedding
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

N_TRIALS = 5
fastembed_trial_results = []
hf_trial_results = []

fastembed_model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
openai_clip_model, openai_preprocess = clip.load("ViT-B/32")
hf_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
hf_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def generate_sample_data(sample_size=5, image_size=600):
    """Get random images with given sample_size."""
    images = []
    for _ in range(sample_size):
        response = urllib.request.urlopen(f"https://picsum.photos/{image_size}")
        image = Image.open(io.BytesIO(response.read()))
        images.append(image)
    return images


def get_fastembed_embeddings(images):
    fastembed_embeddings = list(fastembed_model.embed(images))
    return np.vstack(fastembed_embeddings)


def get_openai_embeddings(images):
    image_tensors = torch.vstack([openai_preprocess(i).unsqueeze(0) for i in images])
    with torch.no_grad():
        openai_embeddings = openai_clip_model.encode_image(image_tensors).numpy()
    return openai_embeddings


def get_hf_embeddings(images):
    hf_model.eval()
    input_dict = hf_processor(images=images, return_tensors="pt")
    with torch.no_grad():
        hf_embeddings = hf_model.get_image_features(**input_dict).numpy()
    return hf_embeddings


for t in range(N_TRIALS):
    images = generate_sample_data()
    fastembed_embeddings = get_fastembed_embeddings(images)
    openai_embeddings = get_openai_embeddings(images)
    hf_embeddings = get_hf_embeddings(images)
    if np.allclose(fastembed_embeddings, openai_embeddings, atol=0.001):
        print(f"Trial {t} Fastembed same with openai")
        fastembed_trial_results.append(True)
    else:
        print(f"Trial {t} Fastembed is NOT the same with openai")
        fastembed_trial_results.append(False)
    if np.allclose(openai_embeddings, hf_embeddings, atol=0.001):
        print(f"Trial {t} Hf same with openai")
        hf_trial_results.append(True)
    else:
        print(f"Trial {t} Hf is NOT the same with openai")
        hf_trial_results.append(False)

print(f"Out of {N_TRIALS}, {sum(fastembed_trial_results)} are the same for Fastembed")
print(f"Out of {N_TRIALS}, {sum(hf_trial_results)} are the same for HF")
```
Here is the colab version:
What Python version are you on? e.g. python --version
Python 3.10.14
Version
0.2.7 (Latest)
What os are you seeing the problem on?
Linux, MacOS
Relevant stack traces and/or logs
```
Trial 0 Fastembed is NOT the same with openai
Trial 0 Hf same with openai
Trial 1 Fastembed is NOT the same with openai
Trial 1 Hf same with openai
Trial 2 Fastembed is NOT the same with openai
Trial 2 Hf same with openai
Trial 3 Fastembed is NOT the same with openai
Trial 3 Hf same with openai
Trial 4 Fastembed is NOT the same with openai
Trial 4 Hf same with openai
Out of 5, 0 are the same for Fastembed
Out of 5, 5 are the same for HF
```
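One thing that might be worth ruling out (this is speculation on my part, not something I have confirmed against Fastembed's internals): whether the Fastembed vectors are simply L2-normalized versions of the OpenAI ones. In that case `np.allclose` on the raw outputs would fail even though the embeddings are directionally identical. A self-contained sketch of that check, using synthetic arrays in place of the real model outputs:

```python
import numpy as np


def same_up_to_normalization(a, b, atol=1e-3):
    """Check whether the rows of `a` and `b` match after L2-normalizing both."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.allclose(a_unit, b_unit, atol=atol)


# Synthetic stand-ins: `raw` plays the role of the OpenAI embeddings and
# `normalized` the role of a hypothetical normalized Fastembed output.
rng = np.random.default_rng(42)
raw = rng.normal(size=(5, 512))
normalized = raw / np.linalg.norm(raw, axis=1, keepdims=True)

print(np.allclose(raw, normalized, atol=1e-3))    # raw values differ
print(same_up_to_normalization(normalized, raw))  # but directions match
```

If the real embeddings pass this check, the discrepancy is just a normalization convention; if not, the difference likely comes from preprocessing or the exported model weights.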
Versions used for the MRE above:
fastembed: 0.3.6
transformers: 4.42.3
openai-clip: 1.0.1
python: 3.10.14