enhancement: redirections to locutions #2

Open
rubenperezm opened this issue Jan 19, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@rubenperezm
Owner

rubenperezm commented Jan 19, 2025

A page can contain four different kinds of ID:

  • Definition id (in the article tag)
  • Meaning id (in the li of the first ol within an article)
  • Locution Name id (locutions appear in an h3)
  • Locution Meaning id (in li of a locution's ol)

We currently handle the first, second and fourth types. The third type is not extracted, since most redirections to locutions point to one of their meanings. However, about 700 redirections (~0.5% of the total number of meanings) point to the locution name instead. For now we treat them as exceptions. This is not a big deal, since locutions rarely come up in the show, but we could probably handle them by extracting the h3 IDs and building a dict of the relations (key: locution name id, value: locution meaning id) for use in the next stage.
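A minimal sketch of that idea, using only the standard library. The markup it assumes (an `h3` carrying the locution name id, followed by an `ol` whose `li` items carry the locution meaning ids) is inferred from the list above and not checked against real pages. The names `LocutionIndex`, `build_locution_redirects` and `SAMPLE` are hypothetical, and resolving a name id to the first meaning of the locution is an assumption, not a decision made in this issue.

```python
from html.parser import HTMLParser


class LocutionIndex(HTMLParser):
    """Collect {locution name id: [locution meaning ids]} from one page."""

    def __init__(self):
        super().__init__()
        self.relations = {}      # h3 id -> ids of the li items in the ol that follows
        self._current_h3 = None  # id of the locution currently being read, if any
        self._ol_depth = 0       # 1 while directly inside that locution's ol

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            # A new locution starts; h3 tags without an id are ignored.
            self._current_h3 = attrs.get("id")
            self._ol_depth = 0
            if self._current_h3:
                self.relations[self._current_h3] = []
        elif tag == "ol" and self._current_h3:
            self._ol_depth += 1
        elif tag == "li" and self._ol_depth == 1 and attrs.get("id"):
            self.relations[self._current_h3].append(attrs["id"])

    def handle_endtag(self, tag):
        if tag == "ol" and self._ol_depth:
            self._ol_depth -= 1
            if self._ol_depth == 0:
                # Only the first ol after the h3 belongs to the locution.
                self._current_h3 = None


def build_locution_redirects(html):
    """Return {locution name id: locution meaning id} for one page."""
    parser = LocutionIndex()
    parser.feed(html)
    # Assumption: a redirect to a locution name resolves to its first meaning.
    return {name_id: ids[0] for name_id, ids in parser.relations.items() if ids}


# Made-up page fragment; the ids are placeholders, not real ones.
SAMPLE = """
<article id="A1">
  <ol><li id="M1">meaning</li></ol>
  <h3 id="N1">locution one</h3>
  <ol><li id="LM1">first</li><li id="LM2">second</li></ol>
  <h3 id="N2">locution two</h3>
  <ol><li id="LM3">only</li></ol>
</article>
"""

redirects = build_locution_redirects(SAMPLE)
```

In the next stage, a redirection target found in `redirects` could be swapped for the mapped meaning id before the usual lookup, so the ~700 exceptions would go through the same path as the other redirections.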

@rubenperezm rubenperezm added the enhancement New feature or request label Jan 19, 2025
@rubenperezm rubenperezm self-assigned this Jan 19, 2025
rubenperezm added a commit that referenced this issue Jan 27, 2025
We have left the abbreviations at the end. This issue was meant for context abbrs, not for word abbrs, but it is not a problem to have those in the meaning.
rubenperezm added a commit that referenced this issue Jan 27, 2025
fix: abbrs between meanings #2
@rubenperezm rubenperezm reopened this Jan 28, 2025