Remove some allocations under HttpFileSystemBasedFindPackageByIdResource.ConsumeFlatContainerIndexAsync #6265
Conversation
var normalizedVersionString = Identity.Version.ToNormalizedString();
string idInLowerCase = Id.ToLowerInvariant();

var builder = StringBuilderPool.Shared.Rent(256);
---
Roughly 50% of the string builders used in this codepath end up exceeding the 256 char limit, and thus aren't released back to the pool. I didn't address it in this PR, as delaying this calculation ends up getting rid of the vast majority of the work, but it might be worth considering upping that 256 char limit used in StringBuilderPool.
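To make the "oversized builders never return to the pool" behavior concrete, here is a minimal sketch in Python (a hypothetical stand-in for the C# StringBuilderPool; the class, names, and list-of-strings builder below are illustrative, not NuGet's actual API):

```python
class BuilderPool:
    """Toy pool: rented builders whose contents grew past max_capacity
    are silently dropped on release instead of being reused, mirroring
    the 256-char cutoff described in the comment above."""

    def __init__(self, max_capacity=256):
        self.max_capacity = max_capacity
        self._free = []

    def rent(self):
        # Reuse a pooled builder if one is available, else create fresh.
        return self._free.pop() if self._free else []

    def release(self, builder):
        # Oversized builders are discarded, which is why roughly half of
        # them never make it back to the pool in the scenario above.
        if sum(len(s) for s in builder) <= self.max_capacity:
            builder.clear()
            self._free.append(builder)

pool = BuilderPool(max_capacity=256)
b = pool.rent()
b.append("x" * 300)   # grows past the limit
pool.release(b)       # dropped, not pooled
```

Raising the capacity limit trades a larger retained buffer per pooled builder against a higher hit rate, which is exactly the tuning question raised above.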
Test insertion based on commit 1 to get speedometer numbers: https://devdiv.visualstudio.com/DevDiv/_git/VS/pullrequest/610228
I have pushed a change on behalf of the PR author to make
NuGet.Common.PublicStrings
~static NuGet.Common.PublicStrings.Culture.get -> System.Globalization.CultureInfo
~static NuGet.Common.PublicStrings.Culture.set -> void
~static NuGet.Common.PublicStrings.Invalid_AttributeValue.get -> string
---
this is just my personal opinion, but I don't like resource strings as public APIs. There's too much risk of runtime failures if new {x} parameters are added to the string and callers of the API don't update their string.Format argument list. We've even had this type of failure within the repo, so it'll be much worse if anyone using the NuGet Client SDK packages in their own app uses this string for some reason.
I don't think that duplicating the strings, and keeping them internal, in multiple assemblies is bad enough to prefer public APIs.
---
See https://github.com/NuGet/NuGet.Client/pull/6265/files#r1970768798 for one complexity of not having these as public.
I totally understand it's unusual here, but these are the only options I could come up with:
1. Have a new .resx in any assembly that needs shared code, under a common namespace, so that the shared code can compile against it
2. Hardcode this string and not have it localized at all
3. Have a set of public strings in NuGet.Common.dll that we are confident won't change and are pretty generic (this one for example is just telling a user that an attribute value is invalid)
4. InternalsVisibleTo/friend assembly. I wish you could grant friend access to individual classes...
5. Make Utf8JsonStreamReader public so that our assemblies can share it
6. Refactor entire assemblies and combine them
7. Other?
In this case, it seemed like the least worst option.
---
I want 6, but let's wait on that until we have a major version.
I'd prefer 1
---
Okay I've pushed an update to use a shared resource that gets compiled into each assembly that needs it. At least the string is only defined in one place.
<Compile Include="$(SharedDirectory)\SharedStrings.Designer.cs">
  <DesignTime>True</DesignTime>
  <AutoGen>True</AutoGen>
  <DependentUpon>$(SharedDirectory)\SharedStrings.resx</DependentUpon>
---
Checking out this PR's branch and looking at Solution Explorer, SharedStrings.Designer.cs isn't nested under SharedStrings.resx. And when editing the resx, VS edits the csproj with the change below, which fixes the file nesting:
- <DependentUpon>$(SharedDirectory)\SharedStrings.resx</DependentUpon>
+ <DependentUpon>SharedStrings.resx</DependentUpon>
if (reader.ValueTextEquals(VersionsPropertyName))
{
    reader.Read();
    reader.ProcessStringArray(static (s, args) =>
---
what is s? I think it needs a better name
build/Shared/Utf8JsonStreamReader.cs (Outdated)
@@ -109,6 +109,17 @@ internal void Skip()
    }
}

internal void ProcessStringArray<TArg>(Action<string, TArg> callback, TArg arg)
---
This is only used in one place, and where it was used, I couldn't guess how the callback worked without looking at this code. So, I'm not sure it's a good candidate for a shared method.
---
The calling code was already nested fairly deeply, and this seemed to be somewhat useful, but I don't mind removing it.
---
Note that as part of this, I added ReadTokenAsString, as this class doesn't expose _reader and the existing ReadNextTokenAsString doesn't quite fit the caller's needs.
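The ProcessStringArray shape under discussion passes the caller's state as an explicit argument so the callback never has to capture anything (which is what lets the C# call site use a `static` lambda with no closure allocation). A hedged Python sketch of that pattern, with illustrative names only:

```python
def process_string_array(values, callback, arg):
    """Invoke callback(item, arg) for every string in the array.
    The explicit 'arg' parameter carries caller state, so the callback
    itself can stay capture-free (the point of the C# static lambda)."""
    for value in values:
        callback(value, arg)

# Usage: accumulate parsed version strings into a caller-owned list
# without the callback closing over any enclosing scope.
results = []
process_string_array(["1.0.0", "2.0.0"], lambda s, acc: acc.append(s), results)
```

This also shows why the reviewer asked for a better name than `s`: at the call site nothing says the callback receives one element of the JSON string array.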
{
    var doc = await stream.AsJObjectAsync(token);

    var reader = new Utf8JsonStreamReader(stream);
---
I'm wondering if we'll be better off creating a data type that matches the json schema (if we don't already have one), and "just" using System.Text.Json's built-in JsonSerializer.DeserializeAsync(stream)?
We created Utf8JsonStreamReader because it's a super perf critical path, and the data types we have do not match the json schema, so to use the built-in deserializer we'd have to transform the data after deserialization, which would be additional memory allocations. Also, using System.Text.Json's JsonConverter causes System.Text.Json to load the entire json object in memory before calling the converter. When the root object uses a converter, that means the entire json document gets loaded in memory, which causes LOH allocations.
Frankly, I'm having a hard time seeing how exactly this code reads the json and converts it into whatever data type the method's result is. I'm not sure if that's a reflection of the readability of the code, or just my mental state this morning. This code deserializes json like this, right? https://api.nuget.org/v3-flatcontainer/nuget.protocol/index.json
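The streaming motivation above (read only what's needed, never hold the whole document) can be sketched in Python as a toy stand-in for the C# Utf8JsonStreamReader. The chunked scanner and its simplifying assumptions are mine, not NuGet's code: it assumes no escaped quotes and no `"` or `]` characters inside the version strings, which holds for semver:

```python
import io

def iter_versions(stream, chunk_size=8):
    """Yield the strings of the "versions" array from a JSON text
    stream, reading in small chunks instead of buffering the whole
    document (the LOH-avoidance idea described above)."""
    buf = ""
    in_array = False
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        buf += chunk
        if not in_array:
            key = buf.find('"versions"')
            if key < 0:
                buf = buf[-11:]  # keep a tail in case the key straddles chunks
                continue
            bracket = buf.find('[', key)
            if bracket < 0:
                continue
            buf = buf[bracket + 1:]
            in_array = True
        while True:
            start = buf.find('"')
            close = buf.find(']')
            if close != -1 and (start == -1 or close < start):
                return  # ']' before the next string: end of the array
            if start == -1:
                break  # nothing parseable yet; read another chunk
            end = buf.find('"', start + 1)
            if end == -1:
                break  # string not complete yet; read another chunk
            yield buf[start + 1:end]
            buf = buf[end + 1:]

data = '{"versions": ["1.0.0", "2.0.0", "3.0.0-beta"]}'
versions = list(iter_versions(io.StringIO(data)))
```

A real streaming reader (Utf8JsonReader in .NET) does full tokenization; this sketch only illustrates the memory-footprint argument, not the parser.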
I feel like this code would be much more obvious if it was more like:
private static async Task<SortedDictionary<NuGetVersion, PackageInfo>> ConsumeFlatContainerIndexAsync(Stream stream, string id, string baseUri, CancellationToken cancellationToken)
{
    var versionsList = await JsonSerializer.DeserializeAsync<FlatContainerVersionsList>(stream, cancellationToken: cancellationToken);
    SortedDictionary<NuGetVersion, PackageInfo> result = new();
    foreach (var versionString in versionsList.Versions)
    {
        NuGetVersion version = NuGetVersion.Parse(versionString);
        PackageInfo packageInfo = BuildModel(baseUri, id, versionString);
        result.Add(version, packageInfo);
    }
    return result;
}
One difference with this is that System.Text.Json would need to keep resizing the version list until the result is complete, whereas the current code only resizes the dictionary. On one hand resizing dictionaries is much more expensive than resizing lists; on the other hand I think dictionaries grow faster (and their initial size is definitely much larger), so they might require fewer resizes. Overall, I can't say whether the CPU time will be reduced, but I guess this approach might cause more allocations because of the backing list resizing, maybe.
Anyway, I want to make sure that we considered reviewing the overall approach this code uses to deserialize the json, rather than trying to port the old Newtonsoft.Json code to NuGet's "proprietary" Utf8JsonStreamReader, which exists because of limitations of System.Text.Json's APIs and reading the assets file, and those same constraints might not apply here.
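The resize trade-off above can be made concrete with a back-of-envelope sketch (an assumption-laden model, not measured .NET behavior): count how many buffer resizes a doubling growth policy performs while appending n items, for two different starting capacities.

```python
def count_resizes(n, initial_capacity, growth=2):
    """Count resizes a doubling-growth buffer performs appending n items.
    Models the 'fewer but larger resizes' argument; real List<T> and
    Dictionary growth policies differ in detail."""
    capacity, resizes = initial_capacity, 0
    for size in range(1, n + 1):
        if size > capacity:
            capacity *= growth
            resizes += 1
    return resizes

# Appending 1000 items: starting at capacity 4 resizes 8 times
# (4 -> 8 -> ... -> 1024); starting at capacity 64 resizes only 4 times.
small_start = count_resizes(1000, 4)
large_start = count_resizes(1000, 64)
```

So a larger initial capacity trims the resize count logarithmically; whether that beats the per-resize cost difference between a list copy and a dictionary rehash is exactly the "can't say without measuring" point above.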
I also didn't look at the call stack. I see that PackageInfo is a private type to this class, which means it can't be returned directly from any public method. Are there other opportunities that will give us improvements in both performance and readability if we take a wider look at this class?
---
TL;DR the sorted dictionary is never needed. No more than a single PackageInfo is needed for any API call, and if the code is refactored to remove the dictionary, I don't think that private class provides anything. A method to generate the URL to a nupkg download on demand is sufficient.
Taking a quick look at HttpFileSystemBasedFindPackageByIdResource.cs:
- There's a public method Task<IEnumerable<NuGetVersion>> GetAllVersionsAsync(...). This means all the work that the inner class spends creating the sorted dictionary and the PackageInfo instances is wasted: it just returns the dictionary keys. It should be replaceable with something roughly similar to:
  var flatContainerVersions = await JsonSerializer.DeserializeAsync<FlatContainerVersionsList>(stream, cancellationToken: cancellationToken);
  flatContainerVersions.Versions.Sort();
  return flatContainerVersions.Versions;
- All other public methods only care about a single package version. So, creating the sorted dictionary is quite literally pointless 🤣 These methods could instead get the List<string> list of versions from the JSON and do an OrdinalIgnoreCase comparison to see if it's the same version, thereby avoiding a whole lot of additional NuGetVersion allocations and removing the SortedDictionary allocation and CPU sort time.
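The suggested GetAllVersionsAsync shape can be sketched as follows (a Python stand-in; the single-key `{"versions": [...]}` schema matches the flat-container index cited above, but the function name and string sort are illustrative — the C# suggestion sorts parsed NuGetVersion values, not raw strings):

```python
import json

def get_all_versions(index_json: str) -> list[str]:
    """Deserialize a flat-container index document and return the
    sorted version list, with no dictionary and no per-version
    package-info objects."""
    doc = json.loads(index_json)
    versions = doc["versions"]
    versions.sort()  # stand-in for sorting parsed NuGetVersion values
    return versions

versions = get_all_versions('{"versions": ["2.0.0", "1.0.0"]}')
```

The point of the refactor is that sorting a flat list once is strictly less work than building and maintaining a sorted dictionary whose values are never read by this code path.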
---
I think this sounds very interesting, but probably beyond what I can contribute right now.
Without fully understanding your suggestion, I had thought that these dictionaries were calculated just once and stored into _packageInfoCache, and thus future requests on this object would hit that cache for the data.
---
Somehow, I missed the fact that the dictionary gets cached in _packageInfoCache. But I think we can still change its type to ConcurrentDictionary<string, Task<List<string>>> to avoid even more allocations than what this PR is already doing.
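The caching idea above — keep only the raw version-string list per package id, parsed once and reused — can be sketched in Python (a hypothetical stand-in for ConcurrentDictionary<string, Task<List<string>>>; names and the fetch callback are illustrative):

```python
# Cache of version-string lists keyed by package id; the JSON is parsed
# once and later callers reuse the same list with no re-allocation.
_package_versions_cache: dict[str, list[str]] = {}

def get_versions(package_id: str, fetch) -> list[str]:
    """fetch() stands in for downloading and parsing the flat-container
    index; it runs only on the first request for this id."""
    if package_id not in _package_versions_cache:
        _package_versions_cache[package_id] = fetch()
    return _package_versions_cache[package_id]

calls = []
def fake_fetch():
    calls.append(1)
    return ["1.0.0", "2.0.0"]

first = get_versions("newtonsoft.json", fake_fetch)
second = get_versions("newtonsoft.json", fake_fetch)  # cache hit, no fetch
```

The C# version caches a Task so concurrent first requests share one in-flight download; this single-threaded sketch only shows the list-instead-of-dictionary payload.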
But I think I lost an earlier point due to too much rambling.
I don't think that NuGet.Protocol needs to use our custom Utf8JsonStreamReader. If the class defines a record FlatContainerVersionList(IReadOnlyList<string> Versions);, we can use System.Text.Json's built-in deserializer and not need to move or duplicate the custom Utf8JsonStreamReader.
The reason I bring this up is that I consider custom JSON parsing code to be tech debt. I don't know if System.Text.Json will continue to make internal perf improvements, or if simple deserialization code is as performant as it's going to be in the next 10 years. But by using Utf8JsonStreamReader rather than JsonSerializer.Deserialize<T>(), this code would miss out on any potential future changes. I would prefer to minimize use of Utf8JsonStreamReader to only the things where we can't find a way to use System.Text.Json's built-in functionality.
@jeffkl do you want to talk about this tomorrow?
---
I created a PR with my idea: #6300
Getting perf measurements is taking more time than the code changes though. I took PerfView traces of a console app that references Microsoft.VisualStudio.SDK with the ".NET Alloc" collector (non-sampling alloc).
branch | GC stats total allocs | GC Heap Alloc Ignore Free Stacks ConsumeFlatContainer
---|---|---
dev | 252 MB | 5,512,529
this PR | 246 MB | 2,139,213
my PR | 244 MB | 669,655
I'm very much out of my comfort zone trying to understand what PerfView is telling me, so this could be a complete misrepresentation. Plus, I've been told the non-sampling .NET Alloc collector has such a large impact that some people don't trust it, but I wasn't trying to measure time performance, only allocations, so I hope it's ok. I didn't want to use the sampling alloc collector, because if there's only a small number of allocations, it might not sample these code paths and give us a misrepresentation. I feel like the sampling alloc collector really needs a large number of traces to analyze. And finally, I only took a single trace for each branch, so run-to-run variance could completely invalidate the results as well.
Considering that the relative differences in the apparent allocations in the ConsumeFlatContainerIndex method don't match up with proportional improvements in the GC stats total allocs, I feel like there's a real error in my testing methodology. But it still fills me with hope that I'm headed in the right direction with a good perf benefit.
---
One thing I liked about Utf8JsonStreamReader was that it seemed to be pretty frugal with only reading from the stream what was necessary, but maybe that doesn't help much in this case as the stream is fully read through each time. I guess I'm still not seeing how JsonSerializer.DeserializeAsync is not allocating more than Utf8JsonStreamReader, although it does provide simpler code.
I always use sampling for the allocation profiles I create, but the numbers are usually quite a bit larger, so to get data, I'd probably just create a larger scenario to measure by cleaning out the package cache and then restoring roslyn while sample profiling. Another option could be to use a benchmarking test.
It's late, but are there essentially three ideas in this PR?
- Use JsonSerializer.DeserializeAsync instead of Utf8JsonStreamReader
- Use a sorted ImmutableArray instead of a SortedDictionary
- Defer creating the package identity (and not needing the PackageInfo)
Maybe your PR should go ahead and then I can take a look and see if my PR built on it would provide any benefit? At that point, this PR would essentially just be changing the JsonSerializer.DeserializeAsync call to instead use Utf8JsonStreamReader (and the supporting code movement to do that)
---
Happy to sync about this today, I'm not sold one way or the other.
public string ContentUri
{
    get
    {
        var normalizedVersionString = Identity.Version.ToNormalizedString();
---
I'm guessing that ContentUri typically won't be used more than once. The dependency resolver will probably only get the property when it's about to make an HTTP request to the URL, and once it does that, it has no reason to ever request the URL again.
But I'm wondering if it's worthwhile caching in a field, just in case, so the second get is cheap?
---
I don't have the numbers on the calling patterns for this still, but it felt like ContentUri was called once infrequently, and more than once almost never, so the overhead associated with another field seemed worse than just recalculating.
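The trade-off being discussed — recompute the URL on each access rather than pay for a cache field that ~99.9% of instances never need — can be sketched in Python. This is a hypothetical stand-in for PackageInfo.ContentUri; the flat-container nupkg URL layout shown is an assumption for illustration:

```python
class PackageInfo:
    """Toy model of deferring ContentUri: the property is computed on
    demand with no backing field, the right call when the value is
    rarely read and almost never read twice."""

    def __init__(self, base_uri: str, package_id: str, version: str):
        self._base_uri = base_uri
        self._id = package_id.lower()
        self._version = version

    @property
    def content_uri(self) -> str:
        # Recomputed per access; no per-instance cache field allocated.
        return f"{self._base_uri}{self._id}/{self._version}/{self._id}.{self._version}.nupkg"

info = PackageInfo("https://api.nuget.org/v3-flatcontainer/", "NuGet.Protocol", "6.0.0")
```

Caching in a field would only win if second reads were common; per the 0.1% figure quoted in this PR's description, they aren't.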
Going to close this one out in favor of Andy's PR: #6300
Bug
Fixes: NuGet/Home#14095
Description
Reduces allocations in HttpFileSystemBasedFindPackageByIdResource.ConsumeFlatContainerIndexAsync:
- Moves Utf8JsonStreamReader to the build\Shared location so it can be compiled into multiple assemblies
- Uses Utf8JsonStreamReader for reading some JSON in HttpFileSystemBasedFindPackageByIdResource
- Allows PackageInfo.ContentUri to be calculated on a deferred basis, as it's very expensive to calculate and is used quite infrequently (I saw about 0.1% of PackageInfo objects created actually get queried for that data)
PR Checklist