What happened?
I'm not sure if this is the right place to ask, but I'm using the langchain_qdrant.FastEmbedSparse library, which is built on top of FastEmbed. Currently, I'm using Qdrant/bm25 from LangChain like this: from langchain_qdrant import FastEmbedSparse
The issue I'm facing is that when deploying the model on a GPU, the generated indices are case-insensitive, whereas on a CPU, they are case-sensitive. I need them to be consistent across both.
Could you help me understand why this discrepancy occurs and how to ensure uniform behavior?
What is the expected behaviour?
It should generate the same indices for the same words, irrespective of CPU or GPU.
A minimal reproducible example
```python
from langchain_qdrant import FastEmbedSparse

sparse = FastEmbedSparse(providers=provider, cuda=use_cuda, parallel=0, local_files_only=LOCAL_FILES_ONLY)
```
Then you can use the embed_documents or embed_query method to generate sparse embeddings.
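For instance, a minimal sketch that reproduces the check from the logs below (assuming a CPU-only install and the default Qdrant/bm25 model; embed_query should return a sparse vector exposing .indices):

```python
from langchain_qdrant import FastEmbedSparse

# No providers/cuda arguments, so this runs on CPU with the default Qdrant/bm25 model.
sparse = FastEmbedSparse(parallel=0)

for word in ["arab", "Arab", "ARAB"]:
    vec = sparse.embed_query(word)  # sparse vector with .indices and .values
    print(f"word: {word}, indices: {vec.indices}")
```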
What Python version are you on? e.g. python --version
Python 3.11
FastEmbed version
for GPU:
langchain-qdrant==0.1.3
fastembed-gpu==0.4.0
for CPU:
langchain-qdrant==0.1.3
fastembed==0.3.6
What OS are you seeing the problem on?
Linux
Relevant stack traces and/or logs
Here are the embeddings generated on CPU:
word: arab, indices: "1376768849"
word: Arab, indices: "1945608989"
word: ARAB, indices: "518081365"
Here are the embeddings generated on GPU:
word: ARAB, indices: "1376768849"
word: arab, indices: "1376768849"
word: Arab, indices: "1376768849"
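If I understand the bm25 implementation correctly, each sparse index is just a hash of the raw token string (fastembed appears to use MurmurHash3 via mmh3, though that is my assumption), which would explain why casing changes the index unless the tokenizer lowercases first. A sketch of that assumption:

```python
import mmh3  # assumption: fastembed's bm25 derives token ids by hashing tokens

for word in ["arab", "Arab", "ARAB"]:
    # Different casings are different byte strings, so they hash to different ids.
    print(word, abs(mmh3.hash(word)))
```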
The problem is not between the CPU and GPU packages, but in the versions. The code in the cpu and gpu packages is identical; the only difference between them is that the CPU version uses onnxruntime and the GPU version uses onnxruntime-gpu.
Unfortunately, there were some bugs in our bm25 implementation, which we have addressed in later versions.
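If it helps to confirm the mismatch, a quick standard-library sketch to print which release each environment is actually running (the distribution name differs between the CPU and GPU installs):

```python
from importlib.metadata import PackageNotFoundError, version

# Exactly one of the two distributions should be present per environment.
for dist in ("fastembed", "fastembed-gpu"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        pass
```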
@joein Thanks for the reply. So how can I resolve this issue? I want consistent embeddings in both cases. I have noticed that the CPU embeddings are case-sensitive, which is why I am getting different CPU indices.
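Until both environments are on the same fastembed release, one possible stopgap (assuming the newer builds lowercase tokens, which the GPU output above suggests) is to normalise the case yourself before embedding; the helper name here is just illustrative:

```python
def embed_query_lowercased(sparse, text: str):
    # Lowercasing first makes old (case-sensitive) builds match the
    # case-insensitive indices seen on GPU, e.g. "Arab" -> "arab".
    return sparse.embed_query(text.lower())
```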