Description (Actual Behavior)
A limitation of the current similarity-based approach is that, when an input prompt has a very low similarity score, the recommendations may have nothing to do with the entered prompt (or be null), because the prompt contains a topic/term that is out of the distribution of the dataset used to train the sentence transformer. Hence, to identify these cases, we need to (1) log model responses (already pointed out in issue #20) and (2) keep track of low similarity scores, so we can detect these out-of-distribution cases and report edge cases regarding the sentence transformer used.
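For concreteness, here is a minimal sketch of how (1) and (2) could be wired together, assuming the service uses the sentence-transformers library in Python; the model name, threshold, corpus handling, and log format are illustrative, not the project's actual code:

```python
# Sketch: log every model response together with its best similarity score,
# flagging prompts that look out of distribution. All names are illustrative.
import json
import logging
from sentence_transformers import SentenceTransformer, util

logging.basicConfig(filename="recommendations.log", level=logging.INFO,
                    format="%(message)s")
logger = logging.getLogger("recommendations")

model = SentenceTransformer("all-MiniLM-L6-v2")
LOW_SIMILARITY_THRESHOLD = 0.1  # assumed cutoff; depends on the embedding used

def recommend(prompt: str, corpus: list[str], corpus_embeddings):
    prompt_embedding = model.encode(prompt, convert_to_tensor=True)
    scores = util.cos_sim(prompt_embedding, corpus_embeddings)[0]
    best_score = float(scores.max())
    # (1) log the model response and (2) keep track of the similarity score
    logger.info(json.dumps({
        "prompt": prompt,
        "best_score": best_score,
        "low_similarity": best_score < LOW_SIMILARITY_THRESHOLD,
    }))
    if best_score < LOW_SIMILARITY_THRESHOLD:
        return None  # likely out of distribution: no meaningful recommendation
    return corpus[int(scores.argmax())]
```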
Expected Behavior
Once a logging mechanism is in place, we should employ a log analytics component to detect the terms/topics with low similarity scores.
The goal is to detect edge cases and inform developers using the API.
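As a starting point, that analytics component could be a simple pass over the logs; this sketch assumes the JSON-lines log format from the sketch above:

```python
# Sketch: aggregate the logged records and surface the prompts that most
# often fall below the similarity threshold. File name and fields are assumed.
import json
from collections import Counter

def low_similarity_report(log_path: str = "recommendations.log") -> Counter:
    counts: Counter = Counter()
    with open(log_path) as log_file:
        for line in log_file:
            record = json.loads(line)
            if record.get("low_similarity"):
                counts[record["prompt"]] += 1
    return counts

# low_similarity_report().most_common(10) lists the recurring edge-case prompts.
```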
Possible Approach
We can provide a warning at the API level, so that people have a better sense of why some input prompts get no recommendations. This information can help developers consuming the API decide whether to select a different sentence transformer or fine-tune an existing one. The goal is to increase transparency for developers using the API.
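A rough sketch of what that warning could look like, assuming a FastAPI service (the endpoint, response schema, and `recommend_with_score` helper below are hypothetical):

```python
# Sketch: attach a warning to the API response when the best similarity
# score falls below the threshold. Endpoint and helper names are hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
LOW_SIMILARITY_THRESHOLD = 0.1  # assumed cutoff; depends on the embedding used

class RecommendationResponse(BaseModel):
    recommendations: list[str]
    warning: str | None = None  # set only when the prompt looks out of distribution

def recommend_with_score(prompt: str) -> tuple[list[str], float]:
    # Hypothetical stand-in for the real similarity search; it would return
    # the recommendations plus the best cosine similarity score.
    return [], 0.0

@app.get("/recommendations", response_model=RecommendationResponse)
def get_recommendations(prompt: str) -> RecommendationResponse:
    recommendations, best_score = recommend_with_score(prompt)
    warning = None
    if best_score < LOW_SIMILARITY_THRESHOLD:
        warning = (
            f"Best similarity score {best_score:.2f} is below "
            f"{LOW_SIMILARITY_THRESHOLD}; the prompt may be out of distribution "
            "for the current sentence transformer."
        )
    return RecommendationResponse(recommendations=recommendations, warning=warning)
```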
Steps to Reproduce
NA.
Context
Recommendation.
It depends on the embedding being used. For all-minilm-l6-v2, it would be less than 0.1.
I'd suggest testing our Swagger with a prompt that we know will result in recommendations, and then changing the prompt to things that are not in our embeddings. That way, you would find examples of low similarity scores for that sentence transformer.
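For a local version of that experiment (without going through Swagger), something like the following contrasts an in-distribution prompt with an out-of-distribution one; the corpus sentences here are made up:

```python
# Sketch: compare an in-domain prompt against a gibberish one and watch the
# best cosine similarity drop. Corpus contents are illustrative only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
corpus = model.encode(
    ["deploy a web application", "set up a database"], convert_to_tensor=True
)

for prompt in ["deploy my app", "zxqv frobnicate wug"]:
    scores = util.cos_sim(model.encode(prompt, convert_to_tensor=True), corpus)[0]
    print(prompt, "->", round(float(scores.max()), 3))
# The gibberish prompt should score much lower, often near the <0.1 range
# mentioned above for this model.
```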