DEV-1440 - write journal when indexing full file #62

Merged
merged 1 commit into from
Dec 11, 2024
46 changes: 32 additions & 14 deletions lib/cictl/index_command.rb
@@ -42,27 +42,20 @@ def all
        puts "5 second delay if you need it..."
        sleep 5
      end

      # Make sure there is a full MARC file to work on
      preflight(last_full_marc_file)

      logger.info "Empty full Solr records"
      solr_client.empty_records!
      logger.info "Load most recent set of deleted records into solr"
      if DeletedRecords.most_recent_non_empty_file
        logger.info "Found #{DeletedRecords.most_recent_non_empty_file}"
        solr_client.send_jsonl(DeletedRecords.most_recent_non_empty_file)
      else
        logger.error "Can't find any non_empty deleted_record files in #{DeletedRecords.save_directory}"
      end
      logger.info "Commit"
      solr_client.commit!
      logger.info "Using full marcfile #{last_full_marc_file}"
      # Calling the Thor "file" command.
      call_file_command last_full_marc_file

      load_deleted_records

      index_full_file(last_full_marc_file)

      # "since" command for a month starts on the last day of last month
      # because there will generally be both an "upd" and a "full" file.
      call_since_command last_full_marc_file.to_datetime
      logger.info "Commit"

Review comment (member, PR author): The individual date commands commit, so we don't need to commit again afterwards (although it doesn't hurt).

      solr_client.commit!
      postflight
    end

@@ -185,5 +178,30 @@ def index_deletes_for_date(date)
        logger.warn "could not find delfile '#{delfile}'"
      end
    end

    def load_deleted_records
      logger.info "Load most recent set of deleted records into solr"
      if DeletedRecords.most_recent_non_empty_file
        logger.info "Found #{DeletedRecords.most_recent_non_empty_file}"
        solr_client.send_jsonl(DeletedRecords.most_recent_non_empty_file)
      else
        logger.error "Can't find any non_empty deleted_record files in #{DeletedRecords.save_directory}"
      end
    end

    # Loads a full file and (assuming no exceptions were thrown)
    # records that to the journal
    def index_full_file(file)
      logger.info "Using full marcfile #{file}"
      # Calling the Thor "file" command.
      call_file_command file

      logger.info "Commit"

Review comment (member, PR author): Before, we were committing after loading deleted records, but not after we loaded the full file. We probably want to commit after we load the full file.

      solr_client.commit!

      journal = Journal.new(date: file.to_datetime.to_date, full: true)
      logger.info("write journal file #{journal.path}")
      journal.write!
    end
  end
end
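
The comment on `index_full_file` says the journal entry is written only if no exception was thrown while loading the file. A minimal sketch of that ordering is below. `SketchJournal`, its file naming scheme, and `index_and_journal` are hypothetical stand-ins for illustration, not the project's actual `Journal` API.

```ruby
require "date"
require "tmpdir"

# Hypothetical stand-in for the project's Journal class; the real class
# and its file naming may differ.
class SketchJournal
  attr_reader :path

  def initialize(dir:, date:, full: false)
    suffix = full ? "full" : "upd"
    @path = File.join(dir, "#{date.strftime("%Y%m%d")}_#{suffix}.txt")
  end

  # Creates an empty marker file recording that this date was indexed.
  def write!
    File.write(path, "")
  end
end

# Runs the indexing block first, then writes the journal entry.
# If the block raises, write! is never reached, so no entry is left behind.
def index_and_journal(dir:, date:)
  yield
  journal = SketchJournal.new(dir: dir, date: date, full: true)
  journal.write!
  journal
end
```

Because the write happens after the indexing step and its commit, a failed run leaves no journal entry, so a later run would presumably treat that date as still unprocessed.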
2 changes: 2 additions & 0 deletions lib/services.rb
@@ -104,4 +104,6 @@ def env_local_file
  Services.register(:collection_map) do
    CICTL::CollectionMap.new.to_translation_map
  end

  Services.register(:job_name) { ENV.fetch("JOB_NAME", $PROGRAM_NAME) }
end
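
The `:job_name` service reads `JOB_NAME` from the environment and falls back to the program name. A small sketch of that fallback behavior follows. The `job_name` helper and its injectable `env` argument are assumptions made so the example can run without touching the real environment; they are not part of the PR.

```ruby
# Hypothetical helper mirroring the ENV.fetch pattern in the diff.
# `env` can be ENV or any Hash-like object that responds to fetch(key, default).
def job_name(env = ENV, fallback = $PROGRAM_NAME)
  env.fetch("JOB_NAME", fallback)
end

job_name({"JOB_NAME" => "nightly_index"}, "cictl") # uses the configured name
job_name({}, "cictl")                              # falls back to "cictl"
```

Note that `fetch` only uses the default when the key is absent; a `JOB_NAME` that is set to an empty string is returned as-is.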
4 changes: 2 additions & 2 deletions spec/cictl/index_command_spec.rb
@@ -26,10 +26,10 @@ def metrics?
   describe "#index continue" do
     context "with no journal" do
       it "indexes all example records" do
-        update_file_count = CICTL::Examples.of_type(:upd).count
+        file_count = CICTL::Examples.of_type(:full, :upd).count
         CICTL::Commands.start(["index", "continue", "--quiet", "--log", test_log])
         expect(solr_count).to eq CICTL::Examples.all_ids.count
-        expect(Dir.children(HathiTrust::Services[:journal_directory]).count).to eq(update_file_count)
+        expect(Dir.children(HathiTrust::Services[:journal_directory]).count).to eq(file_count)
         expect(metrics?).to eq true
       end
     end