Distributed Orchestrator Service #387

Open
martin-traverse opened this issue Jul 17, 2023 · 1 comment
Labels
enhancement New feature or request platform-core
Milestone

Comments

@martin-traverse
Contributor

This feature will allow the orchestrator to run in distributed mode, with several instances of the orchestrator service running in a redundant, hot-hot configuration.

The existing job cache API is already designed for distributed operation. It uses a ticket system to grant tickets for performing operations such as adding, updating and removing jobs from the cache. The cache ensures that only a single ticket can be granted per job at any time.
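
To illustrate the ticket idea, here is a minimal in-process sketch built on a Java concurrent map. All names (`TicketCacheSketch`, `openTicket`, `closeTicket`) are hypothetical and are not taken from the actual job cache API; the sketch only shows how an atomic map operation can guarantee at most one ticket per job.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical names throughout; this is not the real job cache API
final class TicketCacheSketch {

    // A ticket records which job it covers, plus a unique id for the holder
    static final class Ticket {
        final String jobKey;
        final String ticketId;

        private Ticket(String jobKey, String ticketId) {
            this.jobKey = jobKey;
            this.ticketId = ticketId;
        }
    }

    // One entry per job that currently has a ticket outstanding
    private final ConcurrentMap<String, Ticket> tickets = new ConcurrentHashMap<>();

    // Try to obtain the ticket for a job; empty if another caller already holds it
    Optional<Ticket> openTicket(String jobKey) {
        Ticket candidate = new Ticket(jobKey, UUID.randomUUID().toString());
        // putIfAbsent is atomic, so only one caller can install a ticket per job key
        Ticket existing = tickets.putIfAbsent(jobKey, candidate);
        return existing == null ? Optional.of(candidate) : Optional.empty();
    }

    // Release a ticket; succeeds only if this exact ticket is still the one held
    boolean closeTicket(Ticket ticket) {
        return tickets.remove(ticket.jobKey, ticket);
    }
}
```

A caller would open a ticket, perform its add, update or remove operation only if the ticket was granted, and then close the ticket. A distributed implementation would need the same "grant once per job" guarantee from shared storage rather than from a single process's memory.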

The current implementation uses an in-process cache based on a simple Java concurrent map. To allow for running multiple orchestrator processes, a second implementation of the job cache is needed that will use a SQL database to coordinate locks, cache entries and tickets. This can re-use a lot of the foundation logic from the metadata service to handle a selection of common SQL dialects, so the job cache will support all the same dialects available for the metadata service.

As a first step, the job cache interface must be exposed as a plugin that can be configured in the TRAC platform config file using the LOCAL protocol, with an explicit contract for the cache API. Then the JDBC implementation can be added.
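
As a rough sketch of what "exposed as a plugin" could look like, the example below selects a cache implementation by protocol name. The interface, the registry and every method name here are assumptions made for illustration; the real TRAC plugin mechanism and cache contract may be shaped quite differently.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical contract; the real cache API is richer (tickets, updates, removal)
interface JobCacheContract {
    void put(String jobKey, String state);
    String get(String jobKey);
}

// In-process implementation, standing in for the LOCAL protocol
final class LocalJobCacheSketch implements JobCacheContract {

    private final Map<String, String> entries = new ConcurrentHashMap<>();

    @Override
    public void put(String jobKey, String state) {
        entries.put(jobKey, state);
    }

    @Override
    public String get(String jobKey) {
        return entries.get(jobKey);
    }
}

// Hypothetical registry mapping a configured protocol name to a cache factory
final class JobCacheRegistrySketch {

    private final Map<String, Supplier<JobCacheContract>> factories = new ConcurrentHashMap<>();

    void register(String protocol, Supplier<JobCacheContract> factory) {
        factories.put(protocol.toUpperCase(Locale.ROOT), factory);
    }

    // Look up the implementation for the protocol named in the config
    JobCacheContract create(String protocol) {
        Supplier<JobCacheContract> factory = factories.get(protocol.toUpperCase(Locale.ROOT));
        if (factory == null) {
            throw new IllegalArgumentException("No job cache plugin for protocol: " + protocol);
        }
        return factory.get();
    }
}
```

With this shape, a later JDBC implementation would only need to implement the same contract and register under its own protocol name, leaving the orchestrator code unchanged.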

Other implementations using e.g. Hazelcast or other in-memory distributed technologies are possible. However, a SQL implementation will still easily give latencies below 100 ms, which is more than sufficient. A SQL implementation also meets the core principles of simplicity and reducing technology dependencies.

@martin-traverse martin-traverse added enhancement New feature or request platform-core labels Jul 17, 2023
@martin-traverse martin-traverse added this to the 0.6 milestone Jul 17, 2023
@martin-traverse
Contributor Author

Local job cache is converted to a plugin by #414

@martin-traverse martin-traverse modified the milestones: 0.6, 0.7 Oct 14, 2024