Searches result in duplicate dictionary entries #374
@Madoshakalaka This is one more for you.
@aarppe sure thing 🤣
@Madoshakalaka, I think this is because results from the FST aren't being de-duplicated. See here:
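As a rough illustration of the de-duplication idea (this is not the linked code, which is not reproduced here), a minimal Python sketch might look like the following. The `dedupe` helper, the tuple layout of an analysis, and the `+TagA`/`+TagB` tags are all hypothetical placeholders, not the project's actual API or the FST's real tag set.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

# Hypothetical shape of one analyzer result: (lemma, tag string).
Analysis = Tuple[str, str]


def dedupe(
    items: Iterable[Analysis],
    key: Callable[[Analysis], Hashable] = lambda a: a,
) -> List[Analysis]:
    """Drop repeated items, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


# Placeholder analyses; the tags are made up for illustration.
analyses = [
    ("tânisi", "+TagA"),
    ("tânisi", "+TagA"),  # exact duplicate of the first analysis
    ("tânisi", "+TagB"),  # same lemma, different tags
]

by_analysis = dedupe(analyses)                    # removes only exact duplicates
by_lemma = dedupe(analyses, key=lambda a: a[0])   # collapses to one item per lemma
```

De-duplicating on the whole analysis keeps analyses that differ only in their tags, while de-duplicating on the lemma alone collapses them into one. Which key is appropriate depends on whether differently tagged analyses should map to different dictionary entries.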
This one is tricky, because according to the descriptive analyzer, there are TWO valid analyses with different tags. @aarppe, how should we handle this situation? According to the FST, having two results for
The latest FST gives in fact three analyses:
This is particularly tricky since the spell-relax corrections are tagged with […]. I had previously revised the LEXC file listing non-standard forms to include only those spelling deviations that cannot be dealt with by spell-relax rules, to avoid double analyses. I'll comment out the tan'si form in […].

What is trickier still is that there are two legitimate lemmas for tânisi: one is an interrogative/adverbial particle (the first one below), and the other is an interjection (the second one below):
So, we'd have to use a feature pair to match the second one, and the lack of any additional features (exact match) to match the first one.
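One way to picture this matching rule is the sketch below. It assumes that a "feature pair" can be modelled as a set of two extra tags attached to an analysis, and that an entry with no extra tags matches only analyses that also carry none. The `Entry` and `Analysis` types, the `matching_entries` function, and the `+FeatA`/`+FeatB` tags are hypothetical, not taken from the FST or the dictionary code.

```python
from typing import FrozenSet, List, NamedTuple


class Entry(NamedTuple):
    name: str
    lemma: str
    extra: FrozenSet[str]  # empty set: match only analyses without extra features


class Analysis(NamedTuple):
    lemma: str
    extra: FrozenSet[str]


def matching_entries(analysis: Analysis, entries: List[Entry]) -> List[Entry]:
    """Return entries whose lemma and extra features equal those of the analysis."""
    return [
        e for e in entries
        if e.lemma == analysis.lemma and e.extra == analysis.extra
    ]


# Placeholder feature pair; the real tags are not given in this thread.
PAIR = frozenset({"+FeatA", "+FeatB"})

entries = [
    Entry("particle", "tânisi", frozenset()),  # first lemma: exact match only
    Entry("interjection", "tânisi", PAIR),     # second lemma: needs the feature pair
]

plain = matching_entries(Analysis("tânisi", frozenset()), entries)
with_pair = matching_entries(Analysis("tânisi", PAIR), entries)
```

Under these assumptions each analysis selects exactly one of the two lemmas, so neither entry is shown twice.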
This behaviour is no longer the same. In fact, the above definitions for tan'si are no longer in the dictionary.
As part of recent fixes in […]. My gut feeling tells me that this is not the expected match for the MD entry, and that the MD entry should go with the other one. To achieve that, it would be sufficient to change the FST analysis for the entry in Maskwacis.tsv to add the
I'm seeing at least a few cases where searches result in the same Cree dictionary entry being presented twice, e.g. searching with tan'si for tânisi:
Of course, the dictionary entry for tânisi should be shown only once.
However, searching with tânisi gives only one result, as expected: