[Bug]: Inconsistent Case Sensitivity in BM25 Indices Between GPU and CPU Deployment #465

Open
nauyan opened this issue Feb 6, 2025 · 2 comments

nauyan commented Feb 6, 2025

What happened?

I'm not sure if this is the right place to ask, but I'm using the FastEmbedSparse class from langchain_qdrant, which is built on top of FastEmbed. I'm currently using the Qdrant/bm25 model through LangChain like this: from langchain_qdrant import FastEmbedSparse
The issue is that when I deploy the model on a GPU, the generated indices are case-insensitive, whereas on a CPU they are case-sensitive. I need them to be consistent across both.

Could you help me understand why this discrepancy occurs and how to ensure uniform behavior?

What is the expected behaviour?

It should generate the same indices for the same words, regardless of whether it runs on CPU or GPU.

A minimal reproducible example

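from langchain_qdrant import FastEmbedSparse
# provider, use_cuda, and LOCAL_FILES_ONLY are set per deployment (CPU or GPU)
sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25", providers=provider, cuda=use_cuda, parallel=0, local_files_only=LOCAL_FILES_ONLY)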

Then call the embed_documents or embed_query method to generate the sparse embeddings.
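
For reference, a minimal sketch of how the indices for several casings of one word could be collected with `embed_query`. The helper names `indices_by_casing`, `is_case_insensitive`, and `run_with_fastembed` are hypothetical, and the sketch assumes the object returned by `embed_query` exposes an `indices` list. The `FastEmbedSparse` import is kept inside `run_with_fastembed` so the helpers can be tried without downloading the model.

```python
from typing import Any, Dict, List


def indices_by_casing(embedder: Any, word: str) -> Dict[str, List[int]]:
    """Embed the lower, capitalized, and upper casing of `word`; return their indices."""
    variants = [word.lower(), word.capitalize(), word.upper()]
    # Assumption: embed_query returns an object with an `indices` attribute.
    return {v: list(embedder.embed_query(v).indices) for v in variants}


def is_case_insensitive(embedder: Any, word: str) -> bool:
    """True when every casing of `word` maps to the same indices."""
    results = list(indices_by_casing(embedder, word).values())
    return all(r == results[0] for r in results)


def run_with_fastembed() -> None:
    # Requires langchain-qdrant and fastembed (or fastembed-gpu) to be installed.
    from langchain_qdrant import FastEmbedSparse

    embedder = FastEmbedSparse(model_name="Qdrant/bm25")
    for variant, indices in indices_by_casing(embedder, "arab").items():
        print(f"word: {variant}, indices: {indices}")
```

Calling `run_with_fastembed()` in each environment and comparing the printed lines should show whether that environment treats casing as significant.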

What Python version are you on? e.g. python --version

Python 3.11

FastEmbed version

for GPU:
langchain-qdrant==0.1.3
fastembed-gpu==0.4.0

for CPU:
langchain-qdrant==0.1.3
fastembed==0.3.6

What OS are you seeing the problem on?

Linux

Relevant stack traces and/or logs

Here are the indices generated on CPU:
word: arab, indices: "1376768849"
word: Arab, indices: "1945608989"
word: ARAB, indices: "518081365"

Here are the indices generated on GPU:
word: ARAB, indices: "1376768849"
word: arab, indices: "1376768849"
word: Arab, indices: "1376768849"
Member

joein commented Feb 6, 2025

Hello @nauyan

The problem is not a difference between the CPU and GPU packages but a difference between the versions. The code in the two packages is identical; the only difference is that the CPU package uses onnxruntime and the GPU package uses onnxruntime-gpu.

Unfortunately, there were some bugs in our bm25 implementation, which we have addressed in later versions.
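
To confirm which version each environment actually runs, here is a small sketch using the standard library's `importlib.metadata`. The distribution names `fastembed` and `fastembed-gpu` are taken from the report above; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Run this in both the CPU and the GPU environment and compare the output.
print(installed_versions(["fastembed", "fastembed-gpu"]))
```

If the two environments print different versions, the differing indices are expected to come from that gap rather than from the hardware.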

Author

nauyan commented Feb 6, 2025

@joein Thanks for the reply. So how can I resolve this issue? I want consistent embeddings in both cases. I have noticed that the CPU embeddings are case-sensitive, which is why I am getting different indices on CPU.
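
One possible stopgap while the two environments run different versions, sketched under the assumption that lowercasing the input is acceptable for the use case: wrap the embedder so every text is lowercased before it reaches `embed_documents` or `embed_query`. The wrapper name `LowercasingSparse` is hypothetical and not part of langchain_qdrant or FastEmbed. It only removes the casing difference; other differences between versions may remain, so running the same later version in both environments would be the more direct route.

```python
from typing import Any, List


class LowercasingSparse:
    """Delegates to a wrapped sparse embedder after lowercasing the input text."""

    def __init__(self, inner: Any) -> None:
        # `inner` is any object with embed_documents and embed_query,
        # for example a FastEmbedSparse instance.
        self._inner = inner

    def embed_documents(self, texts: List[str]) -> Any:
        return self._inner.embed_documents([t.lower() for t in texts])

    def embed_query(self, text: str) -> Any:
        return self._inner.embed_query(text.lower())
```

With this wrapper, "arab", "Arab", and "ARAB" all reach the underlying model as "arab". If the wrapper is passed somewhere that expects langchain_qdrant's own sparse-embedding interface, it may need to subclass that interface, which is not shown here.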
