-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⚙️ Add auto-cleanup for GoodJob #574
Conversation
This commit introduces automatic clean-up of finished successful jobs. It is here to help with the overall performance of GoodJobs as well as the application database. Note the cleanup schedule is a guess on what's appropriate. <details><summary>Code Walk Through of Good Jobs Configuration</summary> The [README](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/README.md?plain=1#L280) explains the configuration for the Scheduler. ```markdown - `cleanup_interval_seconds` (integer) Number of seconds a Scheduler will wait before cleaning up preserved jobs. Defaults to `nil`. Can also be set with the environment variable `GOOD_JOB_CLEANUP_INTERVAL_SECONDS`. ``` Here's the implementation of [GoodJob::Configuration](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/configuration.rb#L211-L220) regarding the cleanup_interval_seconds. Which is in the `GoodJob::CleanupTracker`. ```ruby def cleanup_interval_seconds value = ( rails_config[:cleanup_interval_seconds] || env['GOOD_JOB_CLEANUP_INTERVAL_SECONDS'] || DEFAULT_CLEANUP_INTERVAL_SECONDS ) value.present? ? value.to_i : nil end ``` The [GoodJob::CleanupTracker](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/cleanup_tracker.rb#L23-L29) has a `#cleanup?` method that looks at either job counts or elapsed seconds. Which informs the `GoodJob::Scheduler`. ```ruby def cleanup? (cleanup_interval_jobs && job_count > cleanup_interval_jobs) || (cleanup_interval_seconds && last_at < Time.current - cleanup_interval_seconds) || false end ``` The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L180-L193) observes the tasks as they complete. And one of those is conditionally running `#cleanup`. ```ruby def task_observer(time, output, thread_error) error = thread_error || (output.is_a?(GoodJob::ExecutionResult) ? output.unhandled_error : nil) GoodJob._on_thread_error(error) if error instrument("finished_job_task", { result: output, error: thread_error, time: time }) return unless output @cleanup_tracker.increment if @cleanup_tracker.cleanup? cleanup else create_task end end ``` The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L233-L250)'s `#cleanup` method delegates the clean_up to the performer; which is a `GoodJob::JobPerformer`. ```ruby def cleanup @cleanup_tracker.reset future = Concurrent::Future.new(args: [self, @Performer], executor: executor) do |_thr_scheduler, thr_performer| Rails.application.executor.wrap do thr_performer.cleanup end end observer = lambda do |_time, _output, thread_error| GoodJob._on_thread_error(thread_error) if thread_error create_task end future.add_observer(observer, :call) future.execute end ``` The [GoodJob::JobPerformer](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/job_performer.rb#L60-L64) then runs the general process `GoodJob.cleanup_preserved_jobs` (which is available via the CLI). ```ruby def cleanup GoodJob.cleanup_preserved_jobs end ``` The [GoodJob.cleanup_preserved_jobs](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job.rb#L130-L153) method is the one that ultimately cleans up preserved jobs. Note that the `include_discarded` does some logical hoops with some grammatical antics (e.g. `old_jobs.not_discarded unless include_discarded`). We are not including discarded jobs so the query will limit to jobs that are not_discarded. ```ruby def self.cleanup_preserved_jobs(older_than: nil) configuration = GoodJob::Configuration.new({}) older_than ||= configuration.cleanup_preserved_jobs_before_seconds_ago timestamp = Time.current - older_than include_discarded = configuration.cleanup_discarded_jobs? ActiveSupport::Notifications.instrument("cleanup_preserved_jobs.good_job", { older_than: older_than, timestamp: timestamp }) do |payload| old_jobs = GoodJob::ActiveJobJob.where('finished_at <= ?', timestamp) old_jobs = old_jobs.not_discarded unless include_discarded old_jobs_count = old_jobs.count GoodJob::Execution.where(job: old_jobs).destroy_all payload[:destroyed_records_count] = old_jobs_count end end ``` </details> Related to: - notch8/adventist-dl#690
04f4229
to
2426ef9
Compare
This commit introduces automatic clean-up of finished successful Good Job jobs. It is here to help with the overall performance of GoodJobs as well as the application database. Note the cleanup schedule is a guess on what's appropriate. <details><summary>Code Walk Through of Good Jobs Configuration</summary> The [README](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/README.md?plain=1#L280) explains the configuration for the Scheduler. ```markdown - `cleanup_interval_seconds` (integer) Number of seconds a Scheduler will wait before cleaning up preserved jobs. Defaults to `nil`. Can also be set with the environment variable `GOOD_JOB_CLEANUP_INTERVAL_SECONDS`. ``` Here's the implementation of [GoodJob::Configuration](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/configuration.rb#L211-L220) regarding the cleanup_interval_seconds. Which is in the `GoodJob::CleanupTracker`. ```ruby def cleanup_interval_seconds value = ( rails_config[:cleanup_interval_seconds] || env['GOOD_JOB_CLEANUP_INTERVAL_SECONDS'] || DEFAULT_CLEANUP_INTERVAL_SECONDS ) value.present? ? value.to_i : nil end ``` The [GoodJob::CleanupTracker](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/cleanup_tracker.rb#L23-L29) has a `#cleanup?` method that looks at either job counts or elapsed seconds. Which informs the `GoodJob::Scheduler`. ```ruby def cleanup? (cleanup_interval_jobs && job_count > cleanup_interval_jobs) || (cleanup_interval_seconds && last_at < Time.current - cleanup_interval_seconds) || false end ``` The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L180-L193) observes the tasks as they complete. And one of those is conditionally running `#cleanup`. ```ruby def task_observer(time, output, thread_error) error = thread_error || (output.is_a?(GoodJob::ExecutionResult) ? output.unhandled_error : nil) GoodJob._on_thread_error(error) if error instrument("finished_job_task", { result: output, error: thread_error, time: time }) return unless output @cleanup_tracker.increment if @cleanup_tracker.cleanup? cleanup else create_task end end ``` The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L233-L250)'s `#cleanup` method delegates the clean_up to the performer; which is a `GoodJob::JobPerformer`. ```ruby def cleanup @cleanup_tracker.reset future = Concurrent::Future.new(args: [self, @Performer], executor: executor) do |_thr_scheduler, thr_performer| Rails.application.executor.wrap do thr_performer.cleanup end end observer = lambda do |_time, _output, thread_error| GoodJob._on_thread_error(thread_error) if thread_error create_task end future.add_observer(observer, :call) future.execute end ``` The [GoodJob::JobPerformer](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/job_performer.rb#L60-L64) then runs the general process `GoodJob.cleanup_preserved_jobs` (which is available via the CLI). ```ruby def cleanup GoodJob.cleanup_preserved_jobs end ``` The [GoodJob.cleanup_preserved_jobs](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job.rb#L130-L153) method is the one that ultimately cleans up preserved jobs. Note that the `include_discarded` does some logical hoops with some grammatical antics (e.g. `old_jobs.not_discarded unless include_discarded`). We are not including discarded jobs so the query will limit to jobs that are not_discarded. ```ruby def self.cleanup_preserved_jobs(older_than: nil) configuration = GoodJob::Configuration.new({}) older_than ||= configuration.cleanup_preserved_jobs_before_seconds_ago timestamp = Time.current - older_than include_discarded = configuration.cleanup_discarded_jobs? ActiveSupport::Notifications.instrument("cleanup_preserved_jobs.good_job", { older_than: older_than, timestamp: timestamp }) do |payload| old_jobs = GoodJob::ActiveJobJob.where('finished_at <= ?', timestamp) old_jobs = old_jobs.not_discarded unless include_discarded old_jobs_count = old_jobs.count GoodJob::Execution.where(job: old_jobs).destroy_all payload[:destroyed_records_count] = old_jobs_count end end ``` </details> Related to: - notch8/adventist-dl#690 - notch8/utk-hyku#574 - https://github.com/scientist-softserv/nnp/issues/309
@@ -12,6 +12,9 @@ ENGINE_MOUNT=authorities | |||
FCREPO_HOST=fcrepo | |||
FCREPO_PORT=8080 | |||
FCREPO_URL=http://fcrepo:8080/rest | |||
GOOD_JOB_CLEANUP_DISCARDED_JOBS=false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these values need to be on the server, you also should add them to the [server_name-deploy].tmpl.yml files
Such as:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShanaLMoore is correct here. I've added them so this can get merged and be part of the QA run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refer to comment
The maintainers yanked 0.3.2 version; so we're falling back to 0.3.1. Related to: - dryruby/json-canonicalization#2 - notch8/palni-palci#918
…lization 🧹 Pin to json-canonicalization 0.3.1
* upgrade bulkrax to be less likely to miss existing works when searching for them * ⚙️ Upgrade to v6.0.0 of Bulkrax --------- Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com>
…st-softserv/utk-hyku into adding-auto-clean-up-to-good-job
This commit introduces automatic clean-up of finished successful jobs. It is here to help with the overall performance of GoodJobs as well as the application database. Note the cleanup schedule is a guess on what's appropriate.
Code Walk Through of Good Jobs Configuration
The
README explains the configuration for the Scheduler.
Here's the implementation of
GoodJob::Configuration regarding the cleanup_interval_seconds. Which is in the
GoodJob::CleanupTracker
.The
GoodJob::CleanupTracker has a
#cleanup?
method that looks at either job counts or elapsed seconds. Which informs theGoodJob::Scheduler
.The
GoodJob::Scheduler observes the tasks as they complete. And one of those is conditionally running
#cleanup
.The
GoodJob::Scheduler's
#cleanup
method delegates the clean_up to the performer; which is aGoodJob::JobPerformer
.The
GoodJob::JobPerformer then runs the general process
GoodJob.cleanup_preserved_jobs
(which is available via the CLI).The
GoodJob.cleanup_preserved_jobs method is the one that ultimately cleans up preserved jobs.
Note that the
include_discarded
does some logical hoops with some grammatical antics (e.g.old_jobs.not_discarded unless include_discarded
). We are not including discarded jobs so the query will limit to jobs that are not_discarded.Related to:
Story
Refs #issuenumber
Expected Behavior Before Changes
Expected Behavior After Changes
Screenshots / Video
Notes