Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory allocations in HttpFileSystembasedFindPackageByIdResource #6300

Open
wants to merge 5 commits into
base: dev
Choose a base branch
from

Conversation

zivkan
Copy link
Member

@zivkan zivkan commented Mar 5, 2025

Bug

Fixes: NuGet/Home#14095

Description

Following @ToddGrun's PR: #6265, in particular this comment thread: #6265 (comment)

I noticed that the code is creating one PackageInfo and PackageIdentity and ContentUri for every package version the server knows about. However, in a restore, typically only a single package version is needed. So, by creating those PackageIdentity and url strings only when needed, we can remove a whole lot of additional allocations over what the other PR did.

In addition to removing PackageInfo and associated types, I moved the JSON parsing to System.Text.Json instead of Newtonsoft.Json (which used JObject, no less). So, that provides us with just most of Todd's PR memory improvements as well.

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests code was refactored, no behavior changes
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc. N/A

}

return streamResults;
builder.Sort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

builder.Sort();

does it need to not add duplicates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our public documentation for the "package content" resource states that to download a package, the URL is auto generated from the pattern: {@id}/{LOWER_ID}/{LOWER_VERSION}/{LOWER_ID}.{LOWER_VERSION}.nupkg.

If the version list contains duplicates, suggesting that there are two different packages with the same version, there's no way for NuGet to download the different versions. The download URLs will be identical. Hence servers can't have two different packages with the same version.

So, I consider it a server-side bug if there exists any NuGet server that returns duplicate versions in the index.json.

While the version list is returned to the caller in this class' GetAllVersionsAsync API, all the other APIs only check if the version exists. If there are two instances of the same version, the result will still be true, so I don't see any harm done.

There's only a question if there is harm in GetAllVersionsAsync returning duplicates. That's certainly possible if someone calls .ToDictionary on the result, or uses Dictionary.Add, using the version as the key, or .Single, as these APIs will throw an exception.

But as long as there are no unhandled exceptions, maybe I'm not being customer focused enough, but I'm not worried about server version lists containing a duplicate. I would feel comfortable telling the customer that the NuGet server they're using is providing incorrect results, which Visual Studio or the dotnet CLI is reporting accurately.

I don't know if anyone finds my justification convincing 😁

@ToddGrun
Copy link
Contributor

ToddGrun commented Mar 5, 2025

I like this PR, it was a good find that the PackageInfo didn't really need to be in this cache.

string idInLowerCase = id.ToLowerInvariant();
var baseUri = _baseUris[retry % _baseUris.Count].OriginalString;

var builder = StringBuilderPool.Shared.Rent(256);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var builder = StringBuilderPool.Shared.Rent(256);

When you debug this, do you see many of the url exceeding 256 and thus not working well with the pool?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While talking to Jeff, he pointed out that the string isn't dynamic in any way (no if (something) builder.Add), so we can use string.Concat instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still probably preferred over potentially removing items from the stringbuilder pool, but I think netfx will end up allocating the array on the heap to hold the individual strings.

zivkan added 2 commits March 6, 2025 10:33
GetAllVersionsAsync probably doesn't need to be sorted.
This lets us use a HashSet for quicker lookups in all other APIs.
Don't change the baseUri used to generate nupkg download URIs
String.Concat instead of StringBuilder
@@ -480,13 +485,14 @@ private async Task<SortedDictionary<NuGetVersion, PackageInfo>> FindPackagesById
},
async httpSourceResult =>
{
var result = new SortedDictionary<NuGetVersion, PackageInfo>();
HashSet<NuGetVersion> result = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HashSet result = [];

maybe leave this unitialized and set it to a new dictionary in an else clause below?

@zivkan zivkan marked this pull request as ready for review March 6, 2025 00:35
@zivkan zivkan requested a review from a team as a code owner March 6, 2025 00:35
@zivkan zivkan requested review from jeffkl and donnie-msft March 6, 2025 00:35

return packageInfos.Keys;
return packageVersions;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since _packageVersionsCache now contains HashSet rather than SortedDictionary, this means the public API GetAllVersionsAsync may no longer be returning a sorted list of versions. So, changing this risks breaking behaviour.

However, FindPackageById is only used by restore, within NuGet.Client. PM UI, and anything that cares about unlisted package status, does not use this API. Restore handles this list not being sorted, and all tests appear to be passing, so I think the biggest risk are apps using our NuGet.Protocol package.

private readonly IReadOnlyList<Uri> _baseUris;
private string _chosenBaseUri;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, I would have liked to make it round-robin load balance across all the base URIs that the server provided in their service index. But that would be a change in behavior which might risk breaking customers. So, we re-use the base uri that the version list was obtained from, just like the old code.

@ToddGrun
Copy link
Contributor

ToddGrun commented Mar 6, 2025

I'd love to hear about what measurements you ended up taking and how big of an improvement this is

};
var baseUri = _chosenBaseUri ?? _baseUris[0].OriginalString;

string contentUri = string.Concat(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string contentUri = string.Concat(

Would an interpolated string make this easier to read?

string contentUri = $"{baseUri}{idInLowerCase}/{normalizedVersionString}/{idInLowerCase}.{normalizedVersionString}.nupkg";

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't string.concat supposed to be faster than interpolation or format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was playing around on sharplab and they look pretty similar (at least on netfx) and although different on core, I'm not really sure that one is better than the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that interpolation would be more readable, but this seems like a hotpath and a reasonably long string, so my understanding is Concat is more performant.

For the longest time, I understood concat or + for strings to be sloppy and bad performance. But a few years ago I learned that was somehow the opposite of reality!

Therefore, if my understanding is still up to date, I'd say leave it as is or compromise for readability with:
baseUri + idInLowerCase + "/" + normalizedVersionString + "/" + idInLowerCase ...

Copy link
Contributor

@ToddGrun ToddGrun Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with whatever is decided here, but I do think the perf of interpolated strings is better than you guys think:

A simple benchmark comparing String.Concat and interpolation actually indicates interpolation performance is about the same on net472 and actually better than String.Concat in net9.

net472

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
Concat 150.5 us 1.56 us 1.46 us 54.9316 - - 282 KB
ConcatIncludeLoopCounter 233.9 us 3.23 us 3.03 us 63.2324 - - 325 KB
Interpolated 147.3 us 0.87 us 0.81 us 54.9316 - - 282 KB
InterpolatedIncludeLoopCounter 162.0 us 1.14 us 1.01 us 56.3965 - - 289 KB

net9.0

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
Concat 43.86 us 0.642 us 0.536 us 28.2593 - - 289 KB
ConcatIncludeLoopCounter 73.05 us 1.286 us 1.140 us 44.1895 - - 451 KB
Interpolated 38.67 us 0.354 us 0.331 us 19.1040 - - 195 KB
InterpolatedIncludeLoopCounter 42.38 us 0.433 us 0.384 us 19.1040 - - 195 KB

@zivkan
Copy link
Member Author

zivkan commented Mar 6, 2025

I'd love to hear about what measurements you ended up taking and how big of an improvement this is

From: #6265 (comment)

branch GC stats total allocs GC Heap Alloc Ignore Free Stacks ConsumeFlatContainer
dev 252 MB 5,512,529
this PR 246 MB 2,139,213
my PR 244 MB 669,655

But keep in mind in the comment, from the other PR, I had the caveat that I amit I pretty much don't know what I'm doing.

Here's a PerfView diff of the dev branch and this PR branch heap alloc view:
image

Finally, I thought I'd try to use BenchmarkDotNet's MemoryAnalyzer. Here's a patch to the NuGet.Client repo with my benchmark code:
benchmark.zip

However, when I run it, I get these results:

Method Mean Error StdDev Ratio Gen0 Gen1 Allocated Alloc Ratio
Change 2.739 ms 0.0161 ms 0.0126 ms 1.00 328.1250 23.4375 1.99 MB 1.00
Original 2.656 ms 0.0246 ms 0.0205 ms 0.97 316.4063 23.4375 1.91 MB 0.96

So, this benchmark results make it look like my change has more allocations and is slower than the original, which frankly doesn't make any sense to me whatsoever.

What started as reviewing another PR turned into trying to implement further improvements, and while code changes took probably less than an hour, I've now been working probably 2 full days on collecting perf measurements. I was behind my assigned work for this sprint before I started any of this, so I just can't justify any more effort. If this PR is "acceptable" as-is, or with minor changes, I can do that. But I don't have the capacity to investigate any further why these perf measurements are so out of whack.

@ToddGrun
Copy link
Contributor

ToddGrun commented Mar 7, 2025

I'm very happy with this PR (for what that's worth). I definitely wasn't trying to steal a bunch of anyone on the nuget's team time (which is why I made the original PR myself). I went ahead and did some measurements based on this branch, and will share below. My measurements were based on the following steps: "git clean -xdff" in roslyn repo, delete the global nuget cache, open VS, start profiler, and open roslyn.sln, wait for the status bar to indicate everything is done, stop profiler.

I took two profiles each with and without this change and averaged the allocations/times. Filtering under FindPackagesByIdAsync, this change showed an allocation improve from 155 MB to 25 MB and CPU samples improve from 21000 ms to 3850 ms.


In reply to: 2705152207

{
var doc = await stream.AsJObjectAsync(token);
var json = await JsonSerializer.DeserializeAsync<FlatContainerVersionList>(stream, cancellationToken: token);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JsonSerializer.DeserializeAsync

This shows up as a very small amount of the remaining allocations, so there is definitely no need to pursue the other PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@zivkan
Copy link
Member Author

zivkan commented Mar 7, 2025

While falling asleep last night, I realized that my benchmark was reusing a single SourceRepository instance, which means the same HttpFileSystemBasedFindPackageByIdResource instance, which means the warmup would warm the cache, and then the benchmark was only testing the cache-hit code path! It also explains why my change was slower and using more memory, because the old code pre-cached the content URL and PackageIdentity for every package version, whereas my change allocates those types on demand.

Fixing the benchmark, I get different results:

Method Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
Change 206.4 ms 4.11 ms 5.20 ms 1.00 0.03 1000.0000 - 20.67 MB 1.00
Original 201.5 ms 4.02 ms 7.25 ms 0.98 0.04 1333.3333 666.6667 24.77 MB 1.20

I'm still very surprised the changed code is slower, wall clock time. But at least we're now correctly measuring the memory allocation improvements.

@zivkan zivkan requested a review from nkolev92 March 7, 2025 22:02
Copy link
Contributor

@donnie-msft donnie-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, and I look forward to seeing how perf improves as a result of this PR!

Not sure if I feel reassured or worried about test coverage given that no tests needed to be changed here. I did suggest 1 string related test change.

};
var baseUri = _chosenBaseUri ?? _baseUris[0].OriginalString;

string contentUri = string.Concat(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that interpolation would be more readable, but this seems like a hotpath and a reasonably long string, so my understanding is Concat is more performant.

For the longest time, I understood concat or + for strings to be sloppy and bad performance. But a few years ago I learned that was somehow the opposite of reality!

Therefore, if my understanding is still up to date, I'd say leave it as is or compromise for readability with:
baseUri + idInLowerCase + "/" + normalizedVersionString + "/" + idInLowerCase ...

{
var doc = await stream.AsJObjectAsync(token);
var json = await JsonSerializer.DeserializeAsync<FlatContainerVersionList>(stream, cancellationToken: token);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just name it cancellationToken?

cancellationToken);

return GetDependencyInfo(reader);
// package version not found
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider omitting comment - the code seems clear

@@ -278,21 +283,23 @@ public override async Task<bool> CopyNupkgToStreamAsync(
{
cancellationToken.ThrowIfCancellationRequested();

var packageInfos = await EnsurePackagesAsync(id, cacheContext, logger, cancellationToken);
var packageVersions = await GetAvailablePackageVersionsAsync(id, cacheContext, logger, cancellationToken);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please use an explicit type so readers immediately know this is a hashset.

{
var doc = await stream.AsJObjectAsync(token);
var json = await JsonSerializer.DeserializeAsync<FlatContainerVersionList>(stream, cancellationToken: token);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

{
var parsedVersion = NuGetVersion.Parse(version);
var normalizedVersionString = parsedVersion.ToNormalizedString();
var normalizedVersionString = version.ToNormalizedString().ToLowerInvariant();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable given we're calling ToLower on the Id as well.
However, can the tests be updated to match this? If I'm not mistaken, this test would fail after your change if the test data were uppercase (but it's not, so it won't fail now). Just making 1 uppercase character and amending this assertion would cover that?
https://github.com/NuGet/NuGet.Client/blob/d948b1a55a189466ccb627c7f754226d9f43505f/test/NuGet.Core.Tests/NuGet.Protocol.Tests/RemoteRepositories/HttpFileSystemBasedFindPackageByIdResourceTests.cs#L328C27-L328C28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants