Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protocols in providers #465

Merged
merged 9 commits into from
Jan 27, 2024
Merged

Protocols in providers #465

merged 9 commits into from
Jan 27, 2024

Conversation

hannahhoward
Copy link
Collaborator

@hannahhoward hannahhoward commented Jan 25, 2024

Goals

  • Allow specifying providers with specific protocols
  • Cleanup the FixedPeers on request concept that was previously inerted a bit haphazard
  • Update to new boxo fork (on 0.15.0)

Implementation

  • Removed retriever.CandidateFinder -- this was a confusing concept, since there is already a types.CandidateFinder that is different. I instead changed this to types.CandidateSource.
  • retriever.CandidateFinder had two methods --> FindCandidates and FindCandidatesAsync. However, we only used FindCandidatesAsync. Moreover, this is a bit of a weird name, since while the method takes a callback, it is not actually an Async method in the sense that it is intended to be a blocking call. So on types.CandidateSource, we now have only one method FindCandidates, and it has the signature of the old FindCandidatesAsync. (note to keep in line with the interface name, I'm also find to rename this SourceCandidates if it feels important)
  • created a new Provider type -- this is similar to a types.RetrievalCandidate, but without the cid (I considered going back and removing this rootCID entirely but I think maybe we use it in several places? Also does it always make sense for it to be the root cid?)
  • modified the DirectCandidateFinder so it can be used with a fixed set of Providers with their own protocols, and you now specify an option if you want to add the protocol discovery features (or just return candidates immediately)
  • added a new syntax for specifying a peer+protocol combo -- essentially at the end of your multiaddr string, you add a +. This does remove + as an option to use in your multiaddr, but I'm not immediately aware of any cases where this is an allowed character in any of the allowed libraries (other than maybe "httppath" which isn't merged? Not even sure usually + is used in query strings).
  • updated ParseProviderStrings & GetDescriptorString to handle the parsing
  • update feat/lassie-fork over in boxo to be rebased on v0.15.0 and update dependency here
  • this is also supported at the CLI level

For Discussion

  • I originally tried to put CandidateSource directly on types.RetrievalRequest but then that gets confusing with GetDescriptorString (we'd have to make candidateSource itself have a descriptorString)
  • I then tried making CandidateSource a method on types.RetrievalRequest but that led to a lot of import cycling and I didn't get to the end of that :)
  • not sure the best naming here open to input.

instead of retriever.CandidateFinder, use types.CandidateSource, converging on single method
FindCandidates
instead of using peer.AddrInfo, protocols now have providers
@hannahhoward hannahhoward requested a review from rvagg January 25, 2024 02:56
@codecov-commenter
Copy link

codecov-commenter commented Jan 25, 2024

Codecov Report

Attention: 24 lines in your changes are missing coverage. Please review.

Comparison is base (f2e96a7) 75.63% compared to head (ab1e2b5) 75.79%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #465      +/-   ##
==========================================
+ Coverage   75.63%   75.79%   +0.16%     
==========================================
  Files          87       87              
  Lines        6657     6656       -1     
==========================================
+ Hits         5035     5045      +10     
+ Misses       1341     1335       -6     
+ Partials      281      276       -5     
Files Coverage Δ
cmd/lassie/flags.go 51.02% <100.00%> (ø)
cmd/lassie/main.go 47.66% <100.00%> (ø)
pkg/indexerlookup/candidatesource.go 66.29% <100.00%> (ø)
pkg/indexerlookup/options.go 31.66% <ø> (ø)
pkg/internal/itest/mocknet/mocknet.go 76.51% <100.00%> (ø)
pkg/internal/itest/testpeer/generator.go 75.00% <100.00%> (ø)
pkg/internal/testutil/gen.go 63.28% <100.00%> (+0.58%) ⬆️
pkg/internal/testutil/mockcandidatefinder.go 83.33% <100.00%> (ø)
pkg/internal/testutil/verifier.go 96.49% <100.00%> (ø)
pkg/retriever/assignablecandidatefinder.go 100.00% <100.00%> (ø)
... and 7 more

... and 4 files with indirect coverage changes

@@ -258,8 +259,24 @@ func ParseProviderStrings(v string) ([]peer.AddrInfo, error) {
return nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this make sense now? seems a reasonable assumption if they're specifying the address in http:// form

Suggested change
}
}
protocols = append(protocols, metadata.IpfsGatewayHttp{})

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had this but then that means GetDescriptorString produces a +http which is a bit odd :P

@@ -200,6 +201,21 @@ func TestRequestStringRepresentations(t *testing.T) {
_, err := ParseProviderStrings("http://127.0.0.1:5000/nope")
require.ErrorContains(t, err, "paths not supported")
})

t.Run("fixed peer, protocol included", func(t *testing.T) {
pps, err := ParseProviderStrings("/ip4/127.0.0.1/tcp/5000/p2p/12D3KooWBSTEYMLSu5FnQjshEVah9LFGEZoQt26eacCEVYfedWA4+bitswap")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some multiaddr aficionado is going to find this one day and get all twitchy

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some day, but not this day :P

@@ -258,8 +259,24 @@ func ParseProviderStrings(v string) ([]peer.AddrInfo, error) {
return nil, err
}
} else {
parts := strings.Split(v, "+")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's likely possible to break multiaddr with the + assumption, you could probably base64 encode the peer ID and end up with + in the /p2p/ component. Not sure how much it's worth worrying about since this is pretty niche use, although this can come in through the CLI and I don't think we really have a handle on how that might be used.

So, I wonder if it shouldn't be fatal below - if we hit the default, print a warning and fall-back to addrString = v and protocols = []?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think I got a decent solution to this -- check the change

Copy link
Member

@rvagg rvagg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, I'll let you decide how to deal with my feedback, I'm fine either way

@rvagg
Copy link
Member

rvagg commented Jan 25, 2024

TestAssignableCandidateFinder race test failures in CI probably worth investigating!

hannahhoward and others added 3 commits January 25, 2024 18:27
fix race in DirectCandidateSource when not using libp2p
Co-authored-by: Rod Vagg <rod@vagg.org>
@hannahhoward hannahhoward merged commit bf83e4b into main Jan 27, 2024
9 checks passed
@hannahhoward hannahhoward deleted the feat/protocols-in-providers branch January 27, 2024 01:37
@bajtos
Copy link
Contributor

bajtos commented Jan 29, 2024

  • Is there any documentation on the new provider format? (We are using the HTTP interface provided by the Lassie Daemon.)
  • The change looks fully backwards compatible, could you please confirm?

@rvagg
Copy link
Member

rvagg commented Jan 29, 2024

Not documented publicly (yet) but it impacts the "providers" query string and CLI argument, it's backward compatible and this just adds the ability to tack on protocols after the end of the provider ID string, just add a +http, +bitswap, +graphsync to limit the protocol(s) that will be attempted with the provider. They are additive, so:

  • provider=/dns/beep.boop.com/tcp/3747/p2p/12D3KooWDXAVxjSTKbHKpNk8mFVQzHdBDvR4kybu582Xd4Zrvagg will do what it does now, query the peer and see what it supports and go from there
  • provider=/dns/beep.boop.com/tcp/3747/p2p/12D3KooWDXAVxjSTKbHKpNk8mFVQzHdBDvR4kybu582Xd4Zrvagg+bitswap will skip the query and go straight to attempting an http retrieval to the address
  • provider=/dns/beep.boop.com/tcp/3747/p2p/12D3KooWDXAVxjSTKbHKpNk8mFVQzHdBDvR4kybu582Xd4Zrvagg+bitswap+graphsync` will queue up the provider on both bitswap and graphsync protocols and end up attempting them both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants