
Remove some allocations under HttpFileSystemBasedFindPackageByIdResource.ConsumeFlatContainerIndexAsync #6265

Closed
wants to merge 7 commits

Conversation

ToddGrun
Contributor

@ToddGrun ToddGrun commented Feb 11, 2025

Bug

Fixes: NuGet/Home#14095

Description

Reduces allocations in HttpFileSystemBasedFindPackageByIdResource.ConsumeFlatContainerIndexAsync

  1. Moved Utf8JsonStreamReader to the build\Shared location so it can be compiled into multiple assemblies
  2. Created a new resource file for shared strings that Utf8JsonStreamReader uses.
  3. Switched to using Utf8JsonStreamReader for reading some JSON in HttpFileSystemBasedFindPackageByIdResource.
  4. Changed PackageInfo.ContentUri to be calculated on a deferred basis, as it's very expensive to compute and is used quite infrequently (I saw about 0.1% of PackageInfo objects created actually get queried for that data; a rough sketch follows this list).
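
Item 4 is the conceptually largest change; below is a hedged sketch of what "deferred" means here. The real PackageInfo is a private nested class inside HttpFileSystemBasedFindPackageByIdResource, and the class shape and flat-container URL layout shown are illustrative reconstructions, not code copied from the PR.

```csharp
using NuGet.Packaging.Core;

// Hedged sketch of the deferred ContentUri from item 4. Names and the URL layout
// (standard flat-container convention) are illustrative, not lifted from the PR.
internal sealed class PackageInfoSketch
{
    private readonly string _baseUri;

    public PackageInfoSketch(string baseUri, PackageIdentity identity)
    {
        _baseUri = baseUri;
        Identity = identity;
    }

    public PackageIdentity Identity { get; }

    // Computed on demand instead of eagerly for every entry in the index;
    // per the description, only ~0.1% of PackageInfo instances ever read this.
    public string ContentUri
    {
        get
        {
            string id = Identity.Id.ToLowerInvariant();
            string version = Identity.Version.ToNormalizedString().ToLowerInvariant();
            return $"{_baseUri}{id}/{version}/{id}.{version}.nupkg";
        }
    }
}
```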

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc.

@ToddGrun ToddGrun requested a review from a team as a code owner February 11, 2025 20:44
@microsoft-github-policy-service microsoft-github-policy-service bot added the Community PRs created by someone not in the NuGet team label Feb 11, 2025
var normalizedVersionString = Identity.Version.ToNormalizedString();
string idInLowerCase = Id.ToLowerInvariant();

var builder = StringBuilderPool.Shared.Rent(256);
Contributor Author

Roughly 50% of the string builders used in this codepath end up exceeding the 256 char limit, and thus aren't released back to the pool. I didn't address it in this PR, since deferring this calculation gets rid of the vast majority of the work, but it might be worth considering upping that 256 char limit used in StringBuilderPool.
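
For context, a minimal sketch of the rent/return pattern being described, with the capacity check that causes oversized builders to be dropped instead of pooled. This illustrates the general pattern only; it is not NuGet's actual StringBuilderPool implementation, and thread-safety is omitted.

```csharp
using System.Text;

// Illustrative pool, not the real StringBuilderPool. A builder is only handed back
// to the pool if it stayed within the capacity limit; anything that grew past
// 256 characters is dropped and becomes garbage, which is the behavior described above.
internal sealed class StringBuilderPoolSketch
{
    public static readonly StringBuilderPoolSketch Shared = new();

    private const int MaxPooledCapacity = 256;
    private StringBuilder? _cached;   // single-slot cache; a real pool is thread-safe

    public StringBuilder Rent(int capacity)
    {
        StringBuilder? sb = _cached;
        _cached = null;
        return sb ?? new StringBuilder(capacity);
    }

    public string ToStringAndReturn(StringBuilder sb)
    {
        string result = sb.ToString();
        if (sb.Capacity <= MaxPooledCapacity)   // oversized builders are not pooled
        {
            sb.Clear();
            _cached = sb;
        }
        return result;
    }
}
```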

@jeffkl jeffkl self-assigned this Feb 11, 2025
@ToddGrun
Contributor Author

Test insertion based on commit 1 to get speedometer numbers: https://devdiv.visualstudio.com/DevDiv/_git/VS/pullrequest/610228

@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Feb 19, 2025
@jeffkl jeffkl removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Feb 19, 2025
@jeffkl
Contributor

jeffkl commented Feb 26, 2025

I have pushed a change on behalf of the PR author to make Utf8JsonStreamReader a shared class.

NuGet.Common.PublicStrings
~static NuGet.Common.PublicStrings.Culture.get -> System.Globalization.CultureInfo
~static NuGet.Common.PublicStrings.Culture.set -> void
~static NuGet.Common.PublicStrings.Invalid_AttributeValue.get -> string
Member

this is just my personal opinion, but I don't like resource strings as public APIs. There's too much risk of runtime failures if new {x} parameters are added to the string and callers of the API don't update their string.Format argument list. We've even had this type of failure within the repo, so it'll be much worse if anyone using the NuGet Client SDK packages in their own app uses this string for some reason.

I don't think that duplicating the strings, and keeping them internal, in multiple assemblies is bad enough to prefer public APIs.

Contributor

See https://github.com/NuGet/NuGet.Client/pull/6265/files#r1970768798 for one complexity of not having these as public.

I totally understand it's unusual here, but these are the only options I could come up with:

  1. Have a new .resx in any assembly that needs shared code under a common namespace so that the shared code can compile against it.
  2. Hardcode this string and not have it localized at all
  3. Have a set of public strings in NuGet.Common.dll that we are confident won't change and are pretty generic (this one for example is just telling a user that an attribute value is invalid).
  4. InternalsVisibleTo/friend assembly. I wish you could grant friend access to individual classes...
  5. Make Utf8JsonStreamReader public so that our assemblies can share it
  6. Refactor entire assemblies and combine them
  7. Other?

In this case, it seemed like the least worst option.

Member

I want 6, but let's wait on that until we have a major version.

I'd prefer 1

Contributor

Okay I've pushed an update to use a shared resource that gets compiled into each assembly that needs it. At least the string is only defined in one place.

@jeffkl jeffkl requested review from nkolev92 and zivkan March 3, 2025 18:02
<Compile Include="$(SharedDirectory)\SharedStrings.Designer.cs">
<DesignTime>True</DesignTime>
<AutoGen>True</AutoGen>
<DependentUpon>$(SharedDirectory)\SharedStrings.resx</DependentUpon>
Member

Checking out this PR's branch and looking at Solution Explorer, SharedStrings.Designer.cs isn't nested under SharedStrings.resx. When editing the resx, VS edits the csproj with the change below, which fixes the file nesting:

Suggested change
<DependentUpon>$(SharedDirectory)\SharedStrings.resx</DependentUpon>
<DependentUpon>SharedStrings.resx</DependentUpon>

if (reader.ValueTextEquals(VersionsPropertyName))
{
reader.Read();
reader.ProcessStringArray(static (s, args) =>
Member

what is s? I think it needs a better name

@@ -109,6 +109,17 @@ internal void Skip()
}
}

internal void ProcessStringArray<TArg>(Action<string, TArg> callback, TArg arg)
Member

This is only used in one place, and where it was used, I couldn't guess how the callback worked without looking at this code. So, I'm not sure it's a good candidate for a shared method.
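
For readers following the thread, this is roughly what the helper does, written against a raw Utf8JsonReader rather than the PR's wrapper so it stands alone; it is a paraphrase, not the PR's code. It walks the current JSON string array and invokes the callback once per element, with TArg carrying caller state so the lambda can stay static and closure-free.

```csharp
using System;
using System.Text.Json;

internal static class JsonArraySketch
{
    // Sketch of the ProcessStringArray idea: iterate the string array the reader is
    // positioned on and hand each element to the callback along with caller state.
    internal static void ProcessStringArray<TArg>(
        ref Utf8JsonReader reader,
        Action<string, TArg> callback,
        TArg arg)
    {
        if (reader.TokenType != JsonTokenType.StartArray)
        {
            return;
        }

        while (reader.Read() && reader.TokenType != JsonTokenType.EndArray)
        {
            callback(reader.GetString()!, arg);
        }
    }
}
```

At the call site, a parameter name like versionString instead of s would make the callback self-describing, which also addresses the naming comment above.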

Contributor Author

The calling code was already nested fairly deeply, and this seemed to be somewhat useful, but I don't mind removing it.

Contributor Author

Note that as part of this, I added ReadTokenAsString since this class doesn't expose _reader and the existing ReadNextTokenAsString doesn't quite fit the caller's needs.

{
var doc = await stream.AsJObjectAsync(token);

var reader = new Utf8JsonStreamReader(stream);
Member

I'm wondering if we'll be better off creating a data type that matches the json schema (if we don't already have one), and "just" using System.Text.Json's built in JsonSerializer.DeserializeAsync(stream)?

We created Utf8JsonStreamReader because it's a super perf critical path, and the data types we have do not match the json schema, so to use the built in deserializer, we'd have to transform the data after deserialization, which would be additional memory allocations. Also, using System.Text.Json's JsonConverter causes System.Text.Json to load the entire json object in memory before calling the converter. When the root object uses a converter, that means the entire json document gets loaded in memory, which causes LOH allocations.

Frankly I'm having a hard time seeing how exactly this code reads the json and converts it into whatever data type the method's result is. I'm not sure if that's a reflection of the readability of the code, or just my mental state this morning. This code deserializes json like this, right? https://api.nuget.org/v3-flatcontainer/nuget.protocol/index.json

I feel like this code would be much more obvious if it was more like:

private static async Task<SortedDictionary<NuGetVersion, PackageInfo>> ConsumeFlatContainerIndexAsync(Stream stream, string id, string baseUri, CancellationToken cancellationToken)
{
    var versionsList = await JsonSerializer.DeserializeAsync<FlatContainerVersionsList>(stream, cancellationToken: cancellationToken);
    SortedDictionary<NuGetVersion, PackageInfo> result = new();
    foreach (var versionString in versionsList.Versions)
    {
        NuGetVersion version = NuGetVersion.Parse(versionString);
        PackageInfo packageInfo = BuildModel(baseUri, id, versionString);
        result.Add(version, packageInfo);
    }

    return result;
}

One difference with this is that System.Text.Json would need to keep resizing the version list until the result is complete, whereas the current code only resizes the dictionary. On one hand resizing dictionaries is much more expensive than resizing lists, on the other hand I think dictionaries grow faster (and their initial size is definitely much larger) so might require fewer resizes. Overall, I can't say whether the CPU time will be reduced, but I guess this approach might cause more allocations because of the backing list resizing, maybe.

Anyway, I want to make sure that we considered reviewing the overall approach this code uses to deserialize the json, rather than trying to port the old Newtonsoft.Json code to NuGet's "proprietary" Utf8JsonStreamReader, which exists because of limitations of System.Text.Json's APIs and reading the assets file, and those same constraints might not apply here.

I also didn't look at the call stack. I see that PackageInfo is a private type to this class, which means it can't be returned directly from any public method. Are there other opportunities that will give us improvements in both performance and readability if we take a wider look at this class?

Member

@zivkan zivkan Mar 4, 2025

TL;DR the sorted dictionary is never needed. No more than a single PackageInfo is needed for any API call, and if the code is refactored to remove the dictionary, I don't think that private class provides anything. A method to generate the URL to a nupkg download on demand is sufficient.

Taking a quick look at HttpFileSystemBasedFindPackageByIdResource.cs:

  1. There's a public method Task<IEnumerable<NuGetVersion>> GetAllVersionsAsync(...). This means all the work that the inner class spends creating the sorted dictionary and the PackageInfo instances is wasted. It just returns the dictionary keys. It should be replaceable with something roughly similar to:

    var flatContainerVersions = await JsonSerializer.DeserializeAsync<FlatContainerVersionsList>(stream, cancellationToken: cancellationToken);
    flatContainerVersions.Versions.Sort();
    return flatContainerVersions.Versions;
  2. All other public methods only care about a single package version. So, creating the sorted dictionary is quite literally pointless 🤣

    These methods could instead get the List<string> of versions from the JSON and do an OrdinalIgnoreCase comparison to see if it's the same version, avoiding a whole lot of additional NuGetVersion allocations and removing the SortedDictionary allocation and CPU sort time (see the sketch after this list).
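
A rough illustration of point 2, with hypothetical method and parameter names; it assumes the index has already been deserialized into a List<string> of version strings.

```csharp
using System;
using System.Collections.Generic;
using NuGet.Versioning;

internal static class VersionLookupSketch
{
    // Point 2 as code: rather than parsing every version in the index and building a
    // SortedDictionary, compare the requested version's normalized string against the
    // raw strings from the JSON.
    internal static bool IndexContainsVersion(List<string> versionsFromIndex, NuGetVersion requested)
    {
        string normalized = requested.ToNormalizedString();

        foreach (string candidate in versionsFromIndex)
        {
            if (string.Equals(candidate, normalized, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }

        return false;
    }
}
```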

Contributor Author

I think this sounds very interesting, but probably beyond what I can contribute right now.

Without fully understanding your suggestion, I had thought that these dictionaries were calculated just once and stored into _packageInfoCache, and thus future requests on this object would hit that cache for the data.

Member

Somehow, I missed the fact that the dictionary gets cached in _packageInfoCache. But I think we can still change its type to ConcurrentDictionary<string, Task<List<string>>> to avoid even more allocations than what this PR is already doing.

But I think I lost an earlier point due to too much rambling.

I don't think that NuGet.Protocol needs to use our custom Utf8JsonStreamReader. If the class defines a record FlatContainerVersionList(IReadOnlyList<string> Versions);, we can use System.Text.Json's built-in deserializer, and not need to move or duplicate the custom Utf8JsonStreamReader.
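
For concreteness, a sketch of what that could look like; the record name comes from the comment above, and binding the lowercase "versions" property via JsonSerializerOptions is an assumption about the wiring, not confirmed code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Sketch only: a type matching the flat-container index schema, deserialized with
// System.Text.Json's built-in serializer instead of the custom Utf8JsonStreamReader.
internal sealed record FlatContainerVersionList(IReadOnlyList<string> Versions);

internal static class FlatContainerIndexSketch
{
    private static readonly JsonSerializerOptions s_options = new()
    {
        PropertyNameCaseInsensitive = true   // maps the JSON "versions" property to Versions
    };

    internal static async Task<IReadOnlyList<string>> ReadVersionsAsync(Stream stream, CancellationToken token)
    {
        FlatContainerVersionList? index =
            await JsonSerializer.DeserializeAsync<FlatContainerVersionList>(stream, s_options, token);

        return index?.Versions ?? Array.Empty<string>();
    }
}
```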

The reason I bring this up is because I consider custom JSON parsing code to be tech debt. I don't know if System.Text.Json will continue to make internal perf improvements, or if simple deserialization code is as performant as it's going to be in the next 10 years. But, by using Utf8JsonStreamReader rather than JsonSerializer.Deserialize<T>(), this code would miss out on any potential future changes. I would prefer to minimize use of Utf8JsonStreamReader to only the cases where we can't find a way to use System.Text.Json's built-in functionality.

@jeffkl do you want to talk about this tomorrow?

Member

I created a PR with my idea: #6300

Getting perf measurements is taking more time than the code changes though. I took PerfView traces of a console app that references Microsoft.VisualStudio.SDK with the ".NET Alloc" collector (non-sampling alloc).

| branch | GC stats total allocs | GC Heap Alloc Ignore Free Stacks ConsumeFlatContainer |
| ------ | --------------------- | ----------------------------------------------------- |
| dev | 252 MB | 5,512,529 |
| this PR | 246 MB | 2,139,213 |
| my PR | 244 MB | 669,655 |

I'm very much out of my comfort zone with trying to understand what PerfView is telling me, so this could be a complete misrepresentation. Plus, I've been told the non-sampling .NET Alloc collector has such a large impact that some people don't trust it, but I wasn't trying to measure time performance, only allocations, so I hope it's ok. I didn't want to use the sampling alloc collector, because if there's only a small number of allocations, it might not sample these code paths and give us a misrepresentation. I feel like the sampling alloc collector really needs a large number of traces to analyze. And finally, I only took a single trace for each branch, so run-to-run variance could completely invalidate the results as well.

Considering that the relative differences in the apparent allocations in the ConsumeFlatContainerIndex method don't match up with proportional improvements in the GC stats total allocs, I feel like there's a real error in my testing methodology. But it still fills me with hope that I'm headed in the right direction with a good perf benefit.

Contributor Author

@ToddGrun ToddGrun Mar 5, 2025

One thing I liked about Utf8JsonStreamReader was that it seemed to be pretty frugal with only reading from the stream what was necessary, but maybe that doesn't help much in this case as the stream is fully read through each time. I guess I'm still not seeing how JsonSerializer.DeserializeAsync is not allocating more than Utf8JsonStreamReader, although it does provide simpler code.

I always use sampling for the allocation profiles I create, but the numbers are usually quite a bit larger, so to get data, I'd probably just create a larger scenario to measure by cleaning out the package cache and then restoring roslyn while sample profiling. Another option could be to use a benchmarking test.
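
On the benchmarking-test option, a hypothetical sketch of what that could look like with BenchmarkDotNet's MemoryDiagnoser, which reports allocated bytes per operation; the payload, type, and method names are placeholders, not code from either PR.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]   // adds an "Allocated" column so allocation changes show up per run
public class FlatContainerIndexBenchmarks
{
    // Placeholder payload standing in for a real flat-container index.json response.
    private static readonly byte[] s_payload =
        Encoding.UTF8.GetBytes("{\"versions\":[\"1.0.0\",\"1.1.0\",\"2.0.0-preview.1\"]}");

    [Benchmark]
    public async Task<int> DeserializeIndex()
    {
        using var stream = new MemoryStream(s_payload);
        VersionListDto? index = await JsonSerializer.DeserializeAsync<VersionListDto>(stream);
        return index!.Versions.Count;
    }

    public sealed class VersionListDto
    {
        [JsonPropertyName("versions")]
        public List<string> Versions { get; set; } = new();
    }

    public static void Main() => BenchmarkRunner.Run<FlatContainerIndexBenchmarks>();
}
```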

It's late, but are there essentially three ideas in this PR?

  1. Use JsonSerializer.DeserializeAsync instead of Utf8JsonStreamReader
  2. Use a sorted ImmutableArray instead of a SortedDictionary
  3. Defer creating the package identity (and not needing the PackageInfo)

Maybe your PR should go ahead and then I can take a look and see if my PR built on it would provide any benefit? At that point, this PR would essentially just be changing the JsonSerializer.DeserializeAsync call to instead use Utf8JsonStreamReader (and the supporting code movement to do that)

Contributor

Happy to sync about this today, I'm not sold one way or the other.

Comment on lines +607 to +611
public string ContentUri
{
get
{
var normalizedVersionString = Identity.Version.ToNormalizedString();
Member

I'm guessing that ContentUri typically won't be used more than once. The dependency resolver will probably only get the property when it's about to make an HTTP request to the URL, and once it does that, it has no reason to ever request the URL again.

But I'm wondering if it's worthwhile caching in a field, just in case, so the second get is cheap?
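
Sketched against the property above, the caching being suggested would look something like this; the backing field and the BuildContentUri helper are hypothetical names standing in for the existing getter body.

```csharp
private string? _contentUri;   // hypothetical backing field

public string ContentUri
{
    get
    {
        // First access computes and caches; later gets return the stored value.
        // BuildContentUri() stands in for the getter body shown in the diff above.
        return _contentUri ??= BuildContentUri();
    }
}
```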

Contributor Author

I don't have the numbers on the calling patterns for this anymore, but it felt like even a single call was infrequent, and more than one call almost never happened, so the overhead associated with another field seemed worse than just recalculating.

@ToddGrun
Contributor Author

ToddGrun commented Mar 7, 2025

Going to close this one out in favor of Andy's PR: #6300

@ToddGrun ToddGrun closed this Mar 7, 2025