Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rust] Skip unnecessary calls to find_among #202

Merged
merged 2 commits into from
Nov 19, 2024

Conversation

jedav
Copy link
Contributor

@jedav jedav commented Oct 30, 2024

This change ports an optimization from the C stemmer generator to the rust generator.

In order to determine whether find_among and find_among_b are worth calling, the generated C checks that the current string is long enough to potentially match the shortest string literal in the among, then does some bitwise magic to check that the byte in the current string at the length of the shortest string in the among may match the byte at that position in at least one of the among strings. If the current string is too short or that byte doesn't match any of the amongs at that position, then we can fail-fast instead of calling find_among or find_among_b.

This change allows the generated rust stemmers to similarly skip find_among and find_among_b calls where we can quickly determine that the call could not succeed. In some stemming benchmarks I've been running, this improves the performance of the rust english stemmer by between 40% and 50%.

This change ports an optimization from the C stemmer generator to the rust generator.

In order to determine whether find_among and find_among_b are worth calling,
the generated C checks that the current string is long enough to potentially match
the shortest string literal in the among,
then does some bitwise magic to check that the characters in the current string
at length of the end of the shortest string in the among
are the same as the character at that position in at least one of the among strings.
If the current string is too short or that character doesn't match any of the amongs
at that position, then we can fail-fast instead of calling find_among.

This change allows the generated rust stemmers to similarly skip find_among calls
where we can quickly determine that find_among would not succeed.
In some stemming benchmarks I've been running, this improves the performance
of the rust english stemmer by between 40 and 50%.
@ojwb
Copy link
Member

ojwb commented Oct 31, 2024

Sounds good, but CI failed as this seems to result in different output for one of the arabic test vocabulary words so there's something not quite right in the implementation.

To reproduce locally, try make check_rust_arabic (or make check_rust to run all of them).

@jedav
Copy link
Contributor Author

jedav commented Oct 31, 2024

Ah, I see, I had missed make check. Will investigate, thanks for pointing this out.

@jedav
Copy link
Contributor Author

jedav commented Nov 1, 2024

OK, I think it's passing make check_rust after narrowing the numeric types in the bitwise operations in the comparison. The performance improvement is now only 27%, which makes some sense; previously it was skipping more find_amongs than it should have, resulting in words not being stemmed down as far as they should've been, while now it is performing those find_amongs and doing more work.

@jedav
Copy link
Contributor Author

jedav commented Nov 1, 2024

Ooh, I think understand it now, it was simplifying the !(...) != 0 to (...) == 0 that fixed it I think - in rust, ! on an integer is bitwise negation like ~ in C, not boolean negation like ! is in C.

@ojwb
Copy link
Member

ojwb commented Nov 5, 2024

I guess those ! semantics seemed like a good idea to someone, but it does rather set a trap for those coming from a C/C++ background.

The 40-50% gain did seem a bit surprising, but the C optimisation was a long time ago.

@jedav
Copy link
Contributor Author

jedav commented Nov 7, 2024

Yeah, the ! semantics definitely got me pretty good.

I wasn't as surprised with a nominal speedup of that magnitude because we were seeing the rust version as 60-70% slower than the C version in benchmarks, and it seemed possible that this optimization might be the main cause. It's still a big piece but we've found other sources for the remaining performance delta (mostly artifacts of the way we were wrapping the rust code for benchmarking, resulting in extra allocations).

Is there anything else you'd like to see changed here before this can be merged? I'm not totally clear on the process here.

@ojwb ojwb merged commit a10c9c4 into snowballstem:master Nov 19, 2024
18 checks passed
@ojwb
Copy link
Member

ojwb commented Nov 19, 2024

Sorry, was just on vacation and not really able to actually look at the changes.

Looks good, thanks.

@jedav
Copy link
Contributor Author

jedav commented Nov 20, 2024

Thanks Olly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants