DBZ-7631 Change hashing strategy to use partition columns #126

Merged: 1 commit merged on Mar 14, 2024

@samssh samssh commented Mar 12, 2024

To ensure the proper ordering of events within the same partition, it's necessary to utilize the partition columns when determining the queue index for event insertion.
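As a rough illustration of the idea, the sketch below maps a list of partition column values to a queue index. It is not the code from this PR: the body of `getPartitionQueueIndex` is not shown here, and the class name `PartitionQueueIndexSketch`, the method `queueIndex`, and the `queueCount` parameter are all hypothetical. It also assumes the key values have value-based `hashCode()` implementations (for example `String` or `Integer`).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
class PartitionQueueIndexSketch {

    // Maps a partition key (the values of the partition columns) to a queue index.
    // Equal keys always produce the same index, so events for one partition
    // land in the same queue and keep their relative order.
    static int queueIndex(List<?> partitionKeyValues, int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        // List.hashCode() combines the element hash codes in order;
        // floorMod keeps the result in [0, queueCount) even for negative hashes.
        return Math.floorMod(partitionKeyValues.hashCode(), queueCount);
    }

    public static void main(String[] args) {
        List<Object> key = Arrays.asList("customer-42", 7);
        int first = queueIndex(key, 4);
        int second = queueIndex(Arrays.asList("customer-42", 7), 4);
        // The same partition key maps to the same queue on every call.
        System.out.println(first == second);
    }
}
```

`Math.floorMod` is used instead of `%` so that a negative hash code cannot produce a negative index.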

@samssh
Contributor Author

samssh commented Mar 12, 2024

@@ -665,4 +665,8 @@ private List<Object> getPartitionKeys(PartitionUpdate pu) {
return values;
}

private int getPartitionQueueIndex(PartitionUpdate partitionUpdate) {

Can this method be extracted into a utility class in core module and re-used in all three modules?

@jpechane
Contributor

@samssh Thank you for the PR. I left a comment related to a small refactoring to re-use the code.
Still, this is quite a breaking change, and I can imagine other users preferring the current behaviour. Would it be possible to create a configuration option that would control the hashing strategy? Ideally an enum with two options that would allow additional expansion if needed.

@samssh
Contributor Author

samssh commented Mar 12, 2024

@jpechane You're welcome.

Regarding the method refactoring, unfortunately, it's not possible to extract the method to a utility class in the core module because for every version of Cassandra we have a different artifact that contains the PartitionUpdate class. If you agree, I can create an abstraction and develop a wrapper class to move more common logic to the core module, but this will likely require significant changes to the codebase.

And regarding configuring the behavior, that's fine, and I will add a configuration property with an enum as soon as possible.
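A configuration option of this kind might look roughly like the sketch below. The enum `HashingStrategy`, its constants `PREVIOUS` and `PARTITION_COLUMNS`, the `parse` helper, and the `previousKey` parameter are all hypothetical placeholders; they are not the property name or values actually added to the connector, and `PREVIOUS` simply stands in for whatever input was hashed before this change.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; enum constants and method names are hypothetical
// and are not taken from the connector's configuration.
class HashingStrategySketch {

    enum HashingStrategy {
        PREVIOUS,           // placeholder for whatever input was hashed before this change
        PARTITION_COLUMNS;  // hash the partition column values

        // Parses a configuration value case-insensitively.
        static HashingStrategy parse(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    // Picks the hash input according to the configured strategy.
    static int queueIndex(HashingStrategy strategy, Object previousKey,
                          List<?> partitionKeyValues, int queueCount) {
        int hash;
        switch (strategy) {
            case PARTITION_COLUMNS:
                hash = partitionKeyValues.hashCode();
                break;
            case PREVIOUS:
                hash = previousKey.hashCode();
                break;
            default:
                // Reached only if a new constant is added without a matching case.
                throw new IllegalArgumentException("Unsupported strategy: " + strategy);
        }
        // floorMod keeps the index in [0, queueCount) even for negative hashes.
        return Math.floorMod(hash, queueCount);
    }

    public static void main(String[] args) {
        HashingStrategy strategy = HashingStrategy.parse("partition_columns");
        List<Object> key = Arrays.asList("customer-42", 7);
        // With PARTITION_COLUMNS, the index depends only on the partition key.
        System.out.println(queueIndex(strategy, "source-a", key, 4)
                == queueIndex(strategy, "source-b", key, 4));
    }
}
```

Using an enum rather than a boolean leaves room for further strategies later: a new constant only needs one more `case` in the switch.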

@jpechane
Contributor

@samssh I see, that's unfortunate. Let's forget about the refactoring then.

@samssh samssh force-pushed the DBZ-7631 branch 2 times, most recently from c39a0a5 to 534904b Compare March 12, 2024 14:29
…guration property.

To ensure the proper ordering of events within the same partition, it's necessary to utilize the partition columns when determining the queue index for event insertion.
To maintain the previous behavior, a property has been added to the config to allow the selection of the hashing strategy.
@samssh
Contributor Author

samssh commented Mar 12, 2024

@jpechane It should be ready.

@jpechane jpechane merged commit 27bd243 into debezium:main Mar 14, 2024
3 checks passed
@jpechane
Contributor

@samssh Applied, thanks! Could you please create a PR against the core repo with docs update?

@samssh samssh deleted the DBZ-7631 branch March 14, 2024 08:56
@samssh
Contributor Author

samssh commented Mar 14, 2024

Thank you. Of course! I'll get on it as soon as possible.
