Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more BTreeMap like apis to LiteMap #6068

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

arthurprs
Copy link
Contributor

@arthurprs arthurprs commented Feb 5, 2025

Add BTreeMap-like APIs to LiteMap to get closer to the std lib BTreeMap, which has familiarity benefits and allows easier integration of LiteMap into existing projects. This is a follow up of #5894

  • Add lm_extend to the traits. The vec implementation attempts have O(N) performance on the best case and an asymptotic worst case of O(NLog(N)), effectively avoiding quadratic worst cases. The end goal is to use this in the deserialization code path, which will also come in a follow-up PR.
  • Implements Extend for LiteMap. This is based on lm_extend.
  • Prevent quadratic worst case in the existing FromIter implementation by using the newly added lm_extend
  • Even though try_get_or_insert exists, the Entry API is a more familiar API with good performance, as it may avoid a potentially wasted second binary search.

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've been missing these! A couple small comments

utils/litemap/src/map.rs Show resolved Hide resolved
utils/litemap/src/map.rs Show resolved Hide resolved
@arthurprs arthurprs changed the title Add/change BTreeMap like apis to LiteMap Add more BTreeMap like apis to LiteMap Feb 5, 2025
@sffc
Copy link
Member

sffc commented Feb 5, 2025

I'm all in favor of Entry APIs

I've pushed back for a long time on a generic extend function because it is a performance footgun; I think it is even worse than O(N^2) in the worst case; could be O(N^2 log N) (because there are N insertions, and each insertion is N log N, with log N for the lookup and N for the array insert).

FromIterator is O(N log N) which is the optimal complexity: it collects things into a Vec and then sorts the Vec, and it uses the same Vec for its eventual storage.

We have extend_from_litemap which does the more efficient O(N) thing since it knows the input is sorted, and I would support adding extend_from_btreemap since we can make a similar assumption there. But I lean toward not having a O(N^2 log N) generic extend function.

@Manishearth
Copy link
Member

I agree around extend, it could be extend_sorted() which takes an ExactSizeIterator, perhaps, but that's about it.

@sffc
Copy link
Member

sffc commented Feb 5, 2025

If someone wants to use .extend(collection), they are better off doing .extend_from_litemap(collection.into_iter().collect()). There is an extra allocation, but the asymptotic performance is much faster.

If we did add .extend, I would want it to be implemented like that, I think.

@arthurprs
Copy link
Contributor Author

I agree that extend is tricky, but I was hoping that we could add a sort + dedup-based version to curb the worst case while allowing good drop-in experience.

I agree around extend, it could be extend_sorted() which takes an ExactSizeIterator, perhaps, but that's about it.

Let's add that, sounds good.

FromIterator is O(N log N) which is the optimal complexity: it collects things into a Vec and then sorts the Vec, and it uses the same Vec for its eventual storage.

I'm afraid I could be missing something obvious, but I don't think there is a sort step right now and it's N² worst case.

@arthurprs arthurprs requested a review from a team as a code owner February 5, 2025 23:28
@sffc
Copy link
Member

sffc commented Feb 5, 2025

I see, I thought FromIterator was N log N but it isn't currently implemented that way. Maybe it should be.

fn lm_sort_from_iter<I: IntoIterator<Item = (K, V)>>(iter: I) -> Self {

@arthurprs
Copy link
Contributor Author

arthurprs commented Feb 5, 2025

I fixed the quadratic behavior and implemented from_iter and extend as described in the first comment. A sorted fast-path with fallback to sort+dedup.

I benchmarked and I think it's a good direction. Let me know if you agree with this extend version otherwise I can take it out.

Edit: duplications can still lead to quadratic behavior, so I'm going remove extend.

Edit-2: I figured out. With some scratch space we can avoid all quadratic behavior. I think this is a good direction again.

Before

litemap/from_iter_rand/small
                        time:   [916.60 ns 941.77 ns 965.82 ns]
litemap/from_iter_rand/large
                        time:   [2.3779 s 2.3863 s 2.3946 s]
litemap/from_iter_sorted/small
                        time:   [84.010 ns 84.084 ns 84.155 ns]
litemap/from_iter_sorted/large
                        time:   [725.31 µs 739.88 µs 755.20 µs]

After

litemap/from_iter_rand/small
                        time:   [814.95 ns 830.50 ns 845.94 ns]
litemap/from_iter_rand/large
                        time:   [42.048 ms 43.138 ms 44.337 ms]
litemap/from_iter_sorted/small
                        time:   [76.707 ns 76.759 ns 76.808 ns]
litemap/from_iter_sorted/large
                        time:   [640.42 µs 652.98 µs 668.38 µs]
litemap/extend_rand/large
                        time:   [108.48 ms 110.88 ms 113.52 ms]
litemap/extend_rand_dups/large
                        time:   [173.80 ms 175.58 ms 177.54 ms]
litemap/extend_from_litemap_rand/large
                        time:   [2.0401 s 2.0488 s 2.0575 s]

@arthurprs
Copy link
Contributor Author

arthurprs commented Feb 6, 2025

Once this PR is sorted, I'll open another PR to improve the serde deserialization and avoid quadratic worst case whenever possible. As it stands it's a potential attack vector.

} else {
(size_hint_lower + 1) / 2
};
self.reserve_exact(reserve);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

This function is a bit hard to follow, and I'm not convinced it is always correct, especially the things involving sorted_len.

We handle some special cases (such as the append-items already being in order) but not others (such as when they are all less than the target).

How about you just append everything to the end and have a single call to self.sort_by, and assume that self.sort_by will have a fast path if the things are in order. Easy to review, and the worst case is still N log N.

Copy link
Contributor Author

@arthurprs arthurprs Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about you just append everything to the end and have a single call to self.sort_by, .... Easy to review, and the worst case is still N log N.

That was my thinking in 5bab562 too, but I realized that the required deduplication step (after sort) had quadratic behavior. That became clear when I added bench_extend_rand_dups.

This function is a bit hard to follow, and I'm not convinced it is always correct, especially the things involving sorted_len.

I can try to improve clarity and maybe an ASCII illustration as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants