What happened?
I'm not sure if this is the right place to ask, but I'm using the langchain_qdrant.FastEmbedSparse library, which is built on top of FastEmbed. Currently, I'm using Qdrant/bm25 from LangChain like this: from langchain_qdrant import FastEmbedSparse
The issue I'm facing is that when deploying the model on a GPU, the generated indices are case-insensitive, whereas on a CPU, they are case-sensitive. I need them to be consistent across both.
Could you help me understand why this discrepancy occurs and how to ensure uniform behavior?
What is the expected behaviour?
It should generate the same indices for the same words, irrespective of CPU or GPU.
A minimal reproducible example
```python
from langchain_qdrant import FastEmbedSparse

sparse = FastEmbedSparse(providers=provider, cuda=use_cuda, parallel=0, local_files_only=LOCAL_FILES_ONLY)
```
Then you can use the embed_documents or embed_query method to generate sparse embeddings.
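For instance, a minimal sketch that reproduces the check from the logs below (assuming a CPU-only install and the default Qdrant/bm25 model; embed_query should return a sparse vector exposing .indices):

```python
from langchain_qdrant import FastEmbedSparse

# No providers/cuda arguments, so this runs on CPU with the default Qdrant/bm25 model.
sparse = FastEmbedSparse(parallel=0)

for word in ["arab", "Arab", "ARAB"]:
    vec = sparse.embed_query(word)  # sparse vector with .indices and .values
    print(f"word: {word}, indices: {vec.indices}")
```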
What Python version are you on? e.g. python --version
Python 3.11
FastEmbed version
for GPU:
langchain-qdrant==0.1.3
fastembed-gpu==0.4.0
for CPU:
langchain-qdrant==0.1.3
fastembed==0.3.6
What OS are you seeing the problem on?
Linux
Relevant stack traces and/or logs
Here are the embeddings generated on CPU:
word: arab, indices: "1376768849"
word: Arab, indices: "1945608989"
word: ARAB, indices: "518081365"
Here are the embeddings generated on GPU:
word: ARAB, indices: "1376768849"
word: arab, indices: "1376768849"
word: Arab, indices: "1376768849"
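If I understand the bm25 implementation correctly, each sparse index is just a hash of the raw token string (fastembed appears to use MurmurHash3 via mmh3, though that is my assumption), which would explain why casing changes the index unless the tokenizer lowercases first. A sketch of that assumption:

```python
import mmh3  # assumption: fastembed's bm25 derives token ids by hashing tokens

for word in ["arab", "Arab", "ARAB"]:
    # Different casings are different byte strings, so they hash to different ids.
    print(word, abs(mmh3.hash(word)))
```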
The problem is not between the CPU and GPU packages, but in the versions. The code in the cpu and gpu packages is identical; the only difference between them is that the CPU version uses onnxruntime and the GPU version uses onnxruntime-gpu.
Unfortunately, there were some bugs in our bm25 implementation, which we have addressed in later versions.
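If it helps to confirm the mismatch, a quick standard-library sketch to print which release each environment is actually running (the distribution name differs between the CPU and GPU installs):

```python
from importlib.metadata import PackageNotFoundError, version

# Exactly one of the two distributions should be present per environment.
for dist in ("fastembed", "fastembed-gpu"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        pass
```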
@joein Thanks for the reply. So how can I resolve this issue? I want consistent embeddings in both cases. I have noticed that the CPU embeddings are case-sensitive, which is why I am getting different CPU indices.
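Until both environments are on the same fastembed release, one possible stopgap (assuming the newer builds lowercase tokens, which the GPU output above suggests) is to normalise the case yourself before embedding; the helper name here is just illustrative:

```python
def embed_query_lowercased(sparse, text: str):
    # Lowercasing first makes old (case-sensitive) builds match the
    # case-insensitive indices seen on GPU, e.g. "Arab" -> "arab".
    return sparse.embed_query(text.lower())
```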