Subject heading remapping #88
Merged
niquerio force-pushed the subject-heading-remapping branch 2 times, most recently from aae4851 to 09be055 on August 14, 2024 21:00
niquerio force-pushed the subject-heading-remapping branch from a18f160 to b0b9687 on August 19, 2024 21:51
"Translation map" is a bit of a misnomer, since this is more than a one-to-one mapping. It is similar only in that it is config generated before indexing.
Subjects::Field determines whether a field is remediable or has already been remediated, and knows how to change it into its remediated or deprecated form. Subjects::RemediationMap turns the SubjectHeadingRemediation "Translation Map" into something Subjects::Field can work with.
Adds a method for finding the already remediated subject fields. It also adds all of the helper methods. The complexity around saving fields is there so that the expensive normalization operations are done as few times as possible.
The methods added are ones that are used to build the output for the subjects macros.
Adds the non_lc_subject_fields and remediated_lc_subject_fields methods. Also stores the output from subject_fields and lc_subject_fields.
remediated_lc_subjects and subject_browse_subjects are added to Subjects. The subject macros are added for them as well, and the specs are updated so that everything works.
Subjects handles figuring out what the facets are. There are two sets of facet-like objects. One is topics, which has everything including deprecated terms. It is used for searching. The other is subject_facets which only has approved terms.
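The difference between the two facet-like lists can be sketched roughly as follows. This is only an illustration: the `APPROVED_FOR` lookup, the placeholder headings, and the assumption that `topics` also carries the approved form of a deprecated heading are hypothetical and not taken from the PR.

```ruby
# ASSUMPTION: a simplified deprecated => approved lookup with placeholder headings.
# The real remediation map is described as more than one-to-one.
APPROVED_FOR = { "Deprecated term A" => "Approved term A" }.freeze

# Searchable list: every heading on the record, deprecated ones included,
# plus the approved form of any deprecated heading.
def topics(headings)
  (headings + headings.map { |h| APPROVED_FOR[h] }.compact).uniq
end

# Facet list: approved terms only; deprecated headings are swapped out.
def subject_facets(headings)
  headings.map { |h| APPROVED_FOR.fetch(h, h) }.uniq
end

headings = ["Deprecated term A", "Other heading"]
all_topics = topics(headings)
facets = subject_facets(headings)
```

Under these assumptions, a search on the deprecated term still matches the record, while the facet list never shows it.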
Saves the Subjects object for a record to the clipboard because there are many expensive operations. The commit also removes the skip_FAST macro because that logic has been moved to the Subjects class.
The delimiter needed for Library Search is " -- " because of line breaking problems. Catalog browse uses the LC standard of "--" without spaces.
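A minimal sketch of the clipboard idea, assuming a per-record context object that exposes a `clipboard` hash (as Traject's indexing context does). `FakeContext`, `ExpensiveSubjects`, and `subjects_for` are hypothetical stand-ins so the example runs without the gem; they are not the names used in the PR.

```ruby
# Stand-in for the per-record indexing context; only the clipboard hash matters here.
FakeContext = Struct.new(:clipboard)

# Hypothetical stand-in for the expensive Subjects object.
class ExpensiveSubjects
  @build_count = 0
  class << self
    attr_accessor :build_count
  end

  def initialize(record)
    self.class.build_count += 1 # count constructions to show the reuse
    @record = record
  end

  def topics
    @record.fetch(:subjects)
  end
end

# Build the object once per record and reuse it from the clipboard afterwards.
def subjects_for(record, context)
  context.clipboard[:subjects] ||= ExpensiveSubjects.new(record)
end

record = { subjects: ["Heading one", "Heading two"] }
context = FakeContext.new({})

first = subjects_for(record, context)
second = subjects_for(record, context)
```

Each macro that needs subject data can call the same accessor, so the costly work happens once per record rather than once per macro.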
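The two delimiter conventions could be applied along these lines. The constant names, the `join_heading` helper, and the sample heading are assumptions made for illustration; only the two delimiter strings come from the commit message.

```ruby
# Delimiters named in the commit message above.
LIBRARY_SEARCH_DELIMITER = " -- "
CATALOG_BROWSE_DELIMITER = "--"

# Hypothetical helper: join the parts of a subject heading with a delimiter,
# trimming whitespace and dropping empty parts first.
def join_heading(parts, delimiter)
  parts.map(&:strip).reject(&:empty?).join(delimiter)
end

parts = ["United States", "History", "Civil War, 1861-1865"]
search_form = join_heading(parts, LIBRARY_SEARCH_DELIMITER)
browse_form = join_heading(parts, CATALOG_BROWSE_DELIMITER)
```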
topicAllStr exists in the record so that the full list of searchable topics is viewable in Solr docs. topics aren't copied into topicStr anymore because topicStr is now populated by Subjects::subject_facets. remediated_lc_subject_display is added so we can have a separate field in catalog record displays.
niquerio force-pushed the subject-heading-remapping branch from b0b9687 to b0db399 on August 23, 2024 16:45
This PR adds logic for enabling subject heading remapping.
A "translation map" is generated from Authority records that explain the remapping rules. The map is set up like this:
So it is not a one-to-one mapping like other traject translation maps.
With this mapping, subject fields within a record are divided into the following groups: lc_subject_fields, remediable_subject_fields, already_remediated_subject_fields, and non_lc_subject_fields. With those groups, we can generate Solr fields to enable Library Search to not show deprecated terms in the UI, but still enable searching on deprecated terms and grouping the appropriate items together.
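One way to picture the four groups is the sketch below. It is hedged heavily: the `REMEDIATION` hash is a simplified one-to-one stand-in (the real map is described as more than one-to-one and its shape is not shown here), and `SubjectField`, `group_fields`, and the placeholder headings are hypothetical.

```ruby
# ASSUMPTION: simplified lookup from normalized deprecated heading to preferred heading.
REMEDIATION = {
  "deprecated term a" => "Preferred term A",
  "deprecated term b" => "Preferred term B"
}.freeze

# Normalized preferred headings, used to spot fields that are already remediated.
PREFERRED = REMEDIATION.values.map { |h| h.downcase.strip }.freeze

# Hypothetical field representation: a heading string and an "is LC" flag.
SubjectField = Struct.new(:heading, :lc) do
  def normalized
    heading.downcase.strip
  end
end

# Split a record's subject fields into the four groups named above.
def group_fields(fields)
  lc = fields.select(&:lc)
  {
    lc_subject_fields: lc,
    remediable_subject_fields: lc.select { |f| REMEDIATION.key?(f.normalized) },
    already_remediated_subject_fields: lc.select { |f| PREFERRED.include?(f.normalized) },
    non_lc_subject_fields: fields.reject(&:lc)
  }
end

fields = [
  SubjectField.new("Deprecated term A", true),
  SubjectField.new("Preferred term B", true),
  SubjectField.new("Unrelated heading", true),
  SubjectField.new("Local heading", false)
]
groups = group_fields(fields)
```

In this sketch the remediable and already remediated groups are subsets of the LC group, which is an assumption about how the groups relate rather than something the PR states.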