Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Change recursive_mutex to mutex in DatabaseRotatingImp #5276

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/xrpld/nodestore/DatabaseRotating.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ class DatabaseRotating : public Database

/** Rotates the backends.

@param f A function executed before the rotation and under the same lock
@param f A function executed before the rotation
*/
virtual void
rotateWithLock(std::function<std::unique_ptr<NodeStore::Backend>(
Expand Down
12 changes: 11 additions & 1 deletion src/xrpld/nodestore/detail/DatabaseRotatingImp.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,9 +45,19 @@ DatabaseRotatingImp::rotateWithLock(
std::function<std::unique_ptr<NodeStore::Backend>(
std::string const& writableBackendName)> const& f)
{
// backendMutex_ prevents writableBackend_ and archiveBackend_ from changing
// while the main mutex_ is released during the callback.
std::lock_guard backendLock(backendMutex_);

auto const writableBackend = [&] {
std::lock_guard lock(mutex_);
return writableBackend_;
}();

auto newBackend = f(writableBackend->getName());

std::lock_guard lock(mutex_);

auto newBackend = f(writableBackend_->getName());
archiveBackend_->setDeletePath();
archiveBackend_ = std::move(writableBackend_);
writableBackend_ = std::move(newBackend);
Expand Down
12 changes: 5 additions & 7 deletions src/xrpld/nodestore/detail/DatabaseRotatingImp.h
Original file line number Diff line number Diff line change
Expand Up @@ -80,15 +80,13 @@ class DatabaseRotatingImp : public DatabaseRotating
sweep() override;

private:
// backendMutex_ is only needed when the *Backend_ members are modified.
// Reads are protected by the general mutex_.
std::mutex backendMutex_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this sounds like a typical single-write and one-or-more-read scenario, is it possible to use a single shared_mutex here instead of these two mutexes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, but there are risks. The biggest one is that I'd have to take a shared_lock at the start of rotateWithLock, and upgrade it to a unique_lock after the callback. If there is somehow ever a second caller to that function, or even a different caller that upgrades the lock, there is a potential deadlock.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bthomee @vvysokikh1 Ok, it took waaaaaaay longer than it should have because I kept trying clever things that didn't work or turned out unsupported, but I rewrote the locking, and changed to a shared mutex, and I think I've got a pretty foolproof solution here. And a unit test to exercise it.

But don't take my word for it. The point of code reviews is to spot the stuff I didn't consider.

std::shared_ptr<Backend> writableBackend_;
std::shared_ptr<Backend> archiveBackend_;
// This needs to be a recursive mutex because callbacks in `rotateWithLock`
// can call function that also lock the mutex. A current example of this is
// a callback from SHAMapStoreImp, which calls `clearCaches`. This
// `clearCaches` call eventually calls `fetchNodeObject` which tries to
// relock the mutex. It would be desirable to rewrite the code so the lock
// was not held during a callback.
mutable std::recursive_mutex mutex_;

mutable std::mutex mutex_;

std::shared_ptr<NodeObject>
fetchNodeObject(
Expand Down
Loading