-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore black box tests removed from sequencer-relayer #1008
Labels
Comments
github-merge-queue bot
pushed a commit
that referenced
this issue
Apr 25, 2024
## Summary The sequencer-relayer has been updated to submit blobs directly to the Celestia app via its gRPC interface rather than using the Celestia node as a proxy. ## Background There have been issues experienced in the past when using the Celestia node for blob submissions related to sequences and timeouts. We want to avoid such issues by "cutting out the middle man" and controlling the flow of submissions directly inside the relayer. ## Changes This PR adds a lot of new proto files and their corresponding generated Rust sources, although they have been reduced to the minimal set of messages/rpcs required for this new feature. Aside from these, most of the new code has been isolated to a new module `celestia_client` inside the `relayer` module. It should be reasonably self-contained, with a view to possibly extracting it to a separate crate in the future, and/or providing a generic client API should we wish to add handling for more chains in the future. At a high level, the new client is constructed via a builder, which may need multiple attempts to create the client in the case that the Celestia app is unreachable. Retries are controlled via `tryhard`, maxing out at 30s and with infinite attempts. A shutdown signal can cancel this, allowing for graceful exit. The builder retrieves initial info from the Celestia app (e.g. chain ID, gas costs). The immutable chain ID is held as a member of the client, but the other values (being variable) are cached in the `State` struct and can be retrieved and cached again should an attempt to submit blobs fail. The `CelestiaClient` only has one public method: `try_submit`. This is called in the existing `relayer::write::submit_with_retry` function in much the same way as previously. The main difference here is that the client's `try_submit` method is also passed a tokio watch receiver containing the optional last error. When first submitting a given batch of blobs, the watch channel holds `None`. On each failed attempt, the `on_retry` closure sends the error returned by `try_submit`. On a fresh batch of blobs, a new watch channel is constructed, again initialized with `None`. The purpose of providing the last error to the client is that it can be interrogated for a failure due to the fee being too low. Such an error holds information on the required fee, and on reattempt, the client uses this value in preference to its own calculated fee. The general flow inside `try_submit` is as follows: - if the previous submit attempt failed, fetch the cost info from the Celestia app and cache it - fetch our account number and sequence number from the Celestia app - create, sign and send the blob transaction via the Celestia `BroadcastTx` method - poll the Celestia app once per second via the `GetTx` method ## Testing Most of the new functionality has unit tests. The `CelestiaClient` mainly has member methods calling gRPC requests, however handling the responses is delegated to free functions. As well as code clarity, this allows for comprehensive unit tests of these functions without the need for running/mocking a Celestia server. However, there are still missing tests for the `try_submit` method. That might be best done via the new `astria-grpc-mock` crate, or at a higher level by refactoring the existing black box tests (almost all of which have been removed for now). I have run the sequencer against a local network running on k8s and observed the blob submission to behave as expected via the logging. This includes temporarily changing the fee to a low value for the first submission attempt, which caused the retry behaviour to identify and use the correct fee from the error response of the first attempt. I have also had the smoke test pass locally. ## Breaking Changelist - `ASTRIA_SEQUENCER_RELAYER_CELESTIA_BEARER_TOKEN` has been replaced by `ASTRIA_SEQUENCER_RELAYER_CELESTIA_APP_KEY_FILE` which expects a path to a hex-encoded secp256k1 secret key corresponding to an account on the Celestia chain. - `ASTRIA_SEQUENCER_RELAYER_CELESTIA_ENDPOINT` has been replaced by `ASTRIA_SEQUENCER_RELAYER_CELESTIA_APP_GRPC_ENDPOINT` which represents the endpoint of the gRPC server on the Celestia app. ## Further Discussion - integration testing/black box testing. [Discussed, agreed to restore them in follow-up. Ticketed at #1008.] - the polling period for `GetTx` after a successful `BroadcastTx` is hard-coded to one second. Seems like this should be configurable via an env var. [Discussed, and the period is now variable depending upon the result of `GetTx`. Also, given that the Celestia app will likely be managed by us, we can control its rate-limits, meaning there's less likelihood of needing to change the polling period in the relayer.] ## Related Issues Closes #921.
github-merge-queue bot
pushed a commit
that referenced
this issue
Apr 30, 2024
## Summary Removes `astria-celestia-client` as a direct dependency from astria-conductor and astria-sequencer-relayer. ## Background We are no longer depending on any of the extension-traits defined in astria-celestia-client. It is merely used for constructing Celestia namespaces and reexporting of celestia crates. ## Changes - Move the `namespace_v0*` construction utilities to `astria-core` behind a `celestia` feature. - Remove `astria-celestia-client` from `astria-conductor` and `astria-sequencer-relayer`. - The `astria-celestia-mock` crate remains until #1008 is done. ## Testing Not necessary, just refactoring. All tests still pass.
github-merge-queue bot
pushed a commit
that referenced
this issue
Apr 30, 2024
## Summary A new configuration env var has been added to allow filtering data submission to a subset of rollups. ## Background This allows more flexibility in what is actually submitted to the Celestia chain. ## Changes A new env var was added, mapped to a new field `Config::rollup_id_filter`. This represents a list of rollups whose transactions should be included in blobs submitted to Celestia. Rollups not in the list do not have their data submitted. However, if the list is empty, no filtering is applied; all rollups have their data submitted. The collection of filtered rollup IDs is passed down to the `write::conversion::convert` function where sequencer blocks are converted to blobs. The blobs are filtered inside this function, partly since this is relatively early in the flow, but also so that the `ConversionInfo` struct can carry information about blobs being excluded. This data is logged at the start of `write::submit_blobs` in an info-level log message. It might be worthwhile adding metrics around the number of excluded rollups per submission, but I didn't see a clear benefit to that and it can be added in a follow-up PR if required. ## Testing - Added unit tests covering parsing of the config string to a collection of `RollupId`s. - ~Added a black box test where a sequencer block is constructed with several transactions, half from included rollups and half from excluded ones. Existing black box tests ensure that when the filter is empty, no filtering is done.~ These all got removed in #963, and should be restored as part of #1008. ## Breaking Changelist - Added `ASTRIA_SEQUENCER_RELAYER_ONLY_INCLUDE_ROLLUPS` which can be empty or can be a comma-separated list.
github-merge-queue bot
pushed a commit
that referenced
this issue
May 3, 2024
## Summary The black box tests were removed in a recent PR. This PR reinstates them using the new mock gRPC framework `astria-grpc-mock`. ## Background The tests would have needed heavy refactoring in #963, and we wanted to avoid adding a lot more code to that already-large PR. We decided to temporarily delete them and reinstate them using `astria-grpc-mock` to mock responses from the Celestia app. ## Changes The tests removed in #963 and a single test briefly added in #1001 then removed again have been restored. I also added a new test to check the relayer shuts down in a timely manner. Previously the tests leaned heavily on counts of blobs received by the mock Celestia app. The current tests retain these checks, but also query the `/status` endpoint of the sequencer-relayer to confirm its state. There was also a single test previously which checked the state of the postsubmit.json file. I didn't think this was necessary given that we're querying the state in a more "black box" way now (via the http server), but I didn't remove the function to perform this check (`TestSequencerRelayer::assert_state_files_are_as_expected`) pending a decision on whether to reinstate that check or not inside the `later_height_in_state_leads_to_expected_relay` test. As @SuperFluffy predicted, all the new protobuf packages added in #963 had to be added to the `pbjson_build` builder to generate the serde impls required for using them in `astria-grpc-mock`. This accounts for the vast majority of the new LoC in this PR. I made a few small changes to the mock framework: - added `Mock::up_to_n_times` to avoid failures due to a single-use mock response being sent multiple times - added `DynamicResponse` to support constructing mock responses which can be passed the relevant gRPC request - changed `MockGuard::wait_until_satisfied` to take `self` by ref rather than value, since if it consumes self and `wait_until_satisfied` times out, the `MockGuard` panics in its `Drop` impl, meaning we don't get a good indication from e.g. a `tokio::time::timeout` as to what went wrong Most of the new functionality lives in `MockCelestiaAppServer` and `TestSequencerRelayer`. For the former, all gRPCs except for `BroadcastTx` and `GetTx` are set up to always return valid-enough responses - i.e. these don't need to be mounted individually in every test case. In the case of `MockCelestiaAppServer`, the new functionality is a combination of support for mounting mock responses and functions with timeouts to query the `/status`, `/healthz` and `/readyz` http endpoints of the sequencer-relayer. ## Testing These changes are tests. Closes #1008.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
We need to restore the black box tests removed as part of #963. Ensure we also restore the single test added in f236c79 and subsequently deleted when merging
main
into #1001.They should be re-implemented using the new grpc-mock crate.
The text was updated successfully, but these errors were encountered: