Disk force bundling #235

Open
wants to merge 3 commits into
base: master

@leolegenie leolegenie commented Jan 2, 2025

This PR adds a 'disk-force-bundling' mode to the FileSystemRepository as a performance optimization.

With many concurrent transactions, some of our use cases show substantial contention on FileSystemRepository put operations. The bottleneck (in our cases at least) is how quickly the disks can serve the force operation (channel.force()). This is particularly true for spinning disks, but it also applies to SSDs. By collecting all disk forces that are pending within a given time frame, we can reduce the load on the disks and increase throughput substantially (in our cases from a few hundred tx/s to a few thousand tx/s).

The general idea of the implementation is that caller threads no longer wait directly on the previously synchronized put/writeToFile operations but instead get back a latch to wait on. The put/write operations place their data on a queue, which a separate thread processes in a loop, collecting all queued writes and doing only a single force per 'bundle' of writes.
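To make the idea concrete for reviewers, here is a minimal, self-contained sketch of the queue-plus-latch pattern described above. It is not the code in this PR: `ForceBundler`, `PendingWrite`, `submit` and `forceCount` are hypothetical names chosen for illustration, and the sketch leaves out shutdown handling and any time-window tuning.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; all names here are hypothetical, not the PR's classes. */
final class ForceBundler implements Runnable {

    /** One queued write plus the latch its caller waits on. */
    static final class PendingWrite {
        final ByteBuffer data;
        final CountDownLatch done = new CountDownLatch(1);
        volatile IOException failure;

        PendingWrite(ByteBuffer data) { this.data = data; }

        /** Blocks until the bundle containing this write has been forced. */
        void await() throws IOException, InterruptedException {
            done.await();
            if (failure != null) throw failure;
        }
    }

    private final FileChannel channel;
    private final BlockingQueue<PendingWrite> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger forces = new AtomicInteger();

    ForceBundler(FileChannel channel) { this.channel = channel; }

    /** Called by transaction threads: enqueue the data and return at once. */
    PendingWrite submit(ByteBuffer data) {
        PendingWrite write = new PendingWrite(data);
        queue.add(write);
        return write;
    }

    int forceCount() { return forces.get(); }

    @Override
    public void run() {
        List<PendingWrite> bundle = new ArrayList<>();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                bundle.clear();
                bundle.add(queue.take()); // block until at least one write is pending
                queue.drainTo(bundle);    // then collect everything else already queued
                IOException failure = null;
                try {
                    for (PendingWrite write : bundle) {
                        while (write.data.hasRemaining()) channel.write(write.data);
                    }
                    channel.force(false); // a single force for the whole bundle
                    forces.incrementAndGet();
                } catch (IOException e) {
                    failure = e;
                }
                // Release every caller in the bundle, reporting a failure if any.
                for (PendingWrite write : bundle) {
                    write.failure = failure;
                    write.done.countDown();
                }
            }
        } catch (InterruptedException e) {
            // Sketch only: writes still queued at this point are never released.
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller thread in this sketch would do `bundler.submit(buffer).await()`, so it still blocks until its data is forced, but many callers can share one `channel.force()` call.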

This 'bundling mode' is optional; by default, the previous behavior remains enabled.

The test case is added to the JDBC module because it is a kind of integration test. It should demonstrate that the new mode is functionally equivalent to the previous behavior, and it also serves as an indicator of the throughput improvements that can be expected.

Looking forward to feedback!

Regards, Leo

PS: All existing unit tests except PooledAlarmTimerTestJUnit still pass. On my end at least, PooledAlarmTimerTestJUnit was already failing on the base 6.0.1-SNAPSHOT version.
