PHO-95: holdings with nil enum match any htitems #297
Merged
Background
A partner reported that an OCLC number they hold did not appear to match any items in HathiTrust in their "ETAS overlap report".
https://github.com/hathitrust/holdings-backend/blob/main/lib/calculate_format.rb#L17
https://github.com/hathitrust/holdings-backend/blob/main/lib/calculate_format.rb#L43-L47
Whether a holding matches a cluster (i.e. one or more catalog records sharing a set of equivalent OCNs) or a HathiTrust item is dependent on context.
organizations_with_holdings_but_no_matches.include?(@org) for copy_count: https://github.com/hathitrust/holdings-backend/blob/main/lib/overlap/multi_part_overlap.rb#L10-L14 in the production holdings table
matching_holdings, which does NOT account for the case when no member-reported enum matches any enum for items in HathiTrust: https://github.com/hathitrust/holdings-backend/blob/main/lib/overlap/multi_part_overlap.rb#L36-L40 ; https://github.com/hathitrust/holdings-backend/blob/main/lib/reports/etas_organization_overlap_report.rb#L82.
The intended semantics appear to be that a holding matches an item either if their n_enum matches, or if the holding has no reported n_enum.
This behavior is complex and confusing because it was replicated from the old holdings system without a comprehensive review of the requirements (that is, the requirements were taken to be the observed behavior of the old system).
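The difference between the two contexts can be sketched roughly as follows. This is a simplified illustration, not the code in multi_part_overlap.rb: the Holding and HtItem structs, the free-standing method signatures, the use of an empty string for "no reported enum", and the fallback count of 1 are all assumptions made for the example.

```ruby
# Hypothetical stand-ins; these are not the classes from holdings-backend.
Holding = Struct.new(:organization, :n_enum)
HtItem  = Struct.new(:item_id, :n_enum)

# A holding matches an item if the enums are equal, or if the holding
# reported no enum (modeled here as an empty string; an assumption).
def matching_holdings(holdings, org, ht_item)
  holdings.select do |h|
    h.organization == org && (h.n_enum == ht_item.n_enum || h.n_enum == "")
  end
end

# Organizations that have holdings in the cluster, none of which match
# any of the cluster's items.
def organizations_with_holdings_but_no_matches(holdings, ht_items)
  holdings.map(&:organization).uniq.reject do |org|
    ht_items.any? { |item| matching_holdings(holdings, org, item).any? }
  end
end

# Copy count with the fallback: an organization whose holdings match
# nothing is still counted once (the value 1 is an assumption).
def copy_count(holdings, ht_items, org, ht_item)
  count = matching_holdings(holdings, org, ht_item).count
  no_match_orgs = organizations_with_holdings_but_no_matches(holdings, ht_items)
  count.zero? && no_match_orgs.include?(org) ? 1 : count
end
```

In this sketch, an organization holding only "v.9" against items "v.1" and "v.2" gets an empty matching_holdings result for every item, yet a non-zero copy_count. That is the kind of discrepancy between reports described above.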
https://hathitrust.atlassian.net/browse/HT-2726
https://hathitrust.atlassian.net/browse/HT-2727
Changing the behavior overall is out of scope here (either changing the behavior for the ETAS overlap report so that it includes cases where no enum matches, or my preferred option of doing away with enum-based matching entirely, as we do with serials).
See also the investigation and notes on https://hathitrust.atlassian.net/jira/core/projects/TTO/board?groupBy=status&selectedIssue=TTO-168
This change
This change allows holdings with nil normalized enumeration (n_enum) to match HathiTrust items. This appears to be an unanticipated case with the data, but does not change the underlying semantics of matching. Such holdings/items will match and appear in the ETAS overlap report.
Separately, we should determine whether the data loading process is unexpectedly loading nil values for n_enum, or if this is a byproduct of data from the old system.
Because the cost report and production overlap table already consider such cases to match via organizations_with_holdings_but_no_matches, this should only affect the ETAS overlap report -- that is, the items/holdings already match for the cost report and production holdings table.
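As a minimal sketch of the nil-tolerant comparison described here: the helper name enum_matches? and its signature are hypothetical, and treating an empty string as "no reported enum" is an assumption, so this is not the actual diff.

```ruby
# Hypothetical helper, for illustration only.
# A holding's n_enum matches an item's n_enum when the holding has no
# reported enum (nil or empty string), or when the two values are equal.
def enum_matches?(holding_n_enum, item_n_enum)
  holding_n_enum.to_s.empty? || holding_n_enum == item_n_enum
end

enum_matches?(nil, "v.1")    # nil enum: matches any item
enum_matches?("", "v.1")     # empty enum: matches any item
enum_matches?("v.2", "v.1")  # differing enums: no match
```

Using to_s lets nil and the empty string take the same branch, so the matching semantics stay the same and only the representation of "no enum" is widened.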