I am not a vision expert, so apologies in advance if my interpretation of the situation is incorrect. While working with Fastembed, I have observed that the image embeddings generated by the "Qdrant/clip-ViT-B-32-vision" model are not the same as those produced by the Hugging Face "openai/clip-vit-base-patch32" model or the OpenAI "ViT-B/32" model.
```python
import io
import urllib.request

import clip
import numpy as np
import torch
from fastembed import ImageEmbedding
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

N_TRIALS = 5
fastembed_trial_results = []
hf_trial_results = []

fastembed_model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
openai_clip_model, openai_preprocess = clip.load("ViT-B/32")
hf_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
hf_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def generate_sample_data(sample_size=5, image_size=600):
    """Get random images with given sample_size."""
    images = []
    for _ in range(sample_size):
        response = urllib.request.urlopen(f"https://picsum.photos/{image_size}")
        image = Image.open(io.BytesIO(response.read()))
        images.append(image)
    return images


def get_fastembed_embeddings(images):
    fastembed_embeddings = list(fastembed_model.embed(images))
    return np.vstack(fastembed_embeddings)


def get_openai_embeddings(images):
    image_tensors = torch.vstack([openai_preprocess(i).unsqueeze(0) for i in images])
    with torch.no_grad():
        openai_embeddings = openai_clip_model.encode_image(image_tensors).numpy()
    return openai_embeddings


def get_hf_embeddings(images):
    hf_model.eval()
    input_dict = hf_processor(images=images, return_tensors="pt")
    with torch.no_grad():
        hf_embeddings = hf_model.get_image_features(**input_dict).numpy()
    return hf_embeddings


for t in range(N_TRIALS):
    images = generate_sample_data()
    fastembed_embeddings = get_fastembed_embeddings(images)
    openai_embeddings = get_openai_embeddings(images)
    hf_embeddings = get_hf_embeddings(images)
    if np.allclose(fastembed_embeddings, openai_embeddings, atol=0.001):
        print(f"Trial {t} Fastembed same with openai")
        fastembed_trial_results.append(True)
    else:
        print(f"Trial {t} Fastembed is NOT the same with openai")
        fastembed_trial_results.append(False)
    if np.allclose(openai_embeddings, hf_embeddings, atol=0.001):
        print(f"Trial {t} Hf same with openai")
        hf_trial_results.append(True)
    else:
        print(f"Trial {t} Hf is NOT the same with openai")
        hf_trial_results.append(False)

print(f"Out of {N_TRIALS}, {sum(fastembed_trial_results)} are the same for Fastembed")
print(f"Out of {N_TRIALS}, {sum(hf_trial_results)} are the same for HF")
```
Here is the colab version:
What Python version are you on? e.g. python --version
Python 3.10.14
Version
0.2.7 (Latest)
What os are you seeing the problem on?
Linux, MacOS
Relevant stack traces and/or logs
```
Trial 0 Fastembed is NOT the same with openai
Trial 0 Hf same with openai
Trial 1 Fastembed is NOT the same with openai
Trial 1 Hf same with openai
Trial 2 Fastembed is NOT the same with openai
Trial 2 Hf same with openai
Trial 3 Fastembed is NOT the same with openai
Trial 3 Hf same with openai
Trial 4 Fastembed is NOT the same with openai
Trial 4 Hf same with openai
Out of 5, 0 are the same for Fastembed
Out of 5, 5 are the same for HF
```
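One thing that might be worth ruling out (this is speculation on my part, not something I have confirmed against Fastembed's internals): whether the Fastembed vectors are simply L2-normalized versions of the OpenAI ones. In that case `np.allclose` on the raw outputs would fail even though the embeddings are directionally identical. A self-contained sketch of that check, using synthetic arrays in place of the real model outputs:

```python
import numpy as np


def same_up_to_normalization(a, b, atol=1e-3):
    """Check whether the rows of `a` and `b` match after L2-normalizing both."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.allclose(a_unit, b_unit, atol=atol)


# Synthetic stand-ins: `raw` plays the role of the OpenAI embeddings and
# `normalized` the role of a hypothetical normalized Fastembed output.
rng = np.random.default_rng(42)
raw = rng.normal(size=(5, 512))
normalized = raw / np.linalg.norm(raw, axis=1, keepdims=True)

print(np.allclose(raw, normalized, atol=1e-3))    # raw values differ
print(same_up_to_normalization(normalized, raw))  # but directions match
```

If the real embeddings pass this check, the discrepancy is just a normalization convention; if not, the difference likely comes from preprocessing or the exported model weights.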
Versions used for the MRE above:
fastembed: 0.3.6
transformers: 4.42.3
openai-clip: 1.0.1
python: 3.10.14