-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add StateSync
component
#650
base: new-state-sync
Are you sure you want to change the base?
Conversation
8731b2b
to
d54e186
Compare
I need to tidy up the code and update the documentation, but the basic idea with this updated version is that we have callback functions (defined in These callbacks could also work as a sort of event reaction, as the user could redefine them and they would be called on each sync with all the new and updated node information. |
6baeafa
to
50938fb
Compare
50938fb
to
2f26007
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Not a full review but I left some comments inline. Overall, I think the approach works - but we need to clean this up quite a bit and make sure the comments are accurate.
Also, I would suggest moving out some "auxiliary changes" (e.g., implementing interior mutability) into another PR so that this PR could focus on the state sync related logic.
/// Callback to be executed when a nullifier is received in the sync response. It receives the | ||
/// nullifier update received from the node and the list of transaction updates that were committed | ||
/// in the block. | ||
/// | ||
/// It returns two optional notes (one input and one output) that should be updated in the store and | ||
/// an optional transaction ID if a transaction should be discarded. | ||
pub type OnNullifierReceived = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comments are outdated here. Also, why do we need this callback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need it, the Client
just assumes all nullifiers are relevant (i.e. the default implementation always returns true). The purpose of this callback is so that the sync component is more extensible and so that users can react to nullifiers arriving in the sync process.
We could remove it if we want a simplified version of the component on this first iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably get rid of this for now as I'm not 100% sure we'll need it. It'll simplify the PR a bit and if we do need it, should be pretty simple to add it in the future.
Just created two auxiliary PRs (#726 and #727) that will merge into the You can review the new PRs first and wait to review this one until those are merged so that is easier. |
6a4bf21
to
fa830a5
Compare
The number of files to review has been reduced after merging the auxiliary PRs, the remaining changes are more relevant to the addition of the component. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! I left some comments mostly about docs (feel free to disregard the more stylistic ones if you think the current version is better) and code structure, but the overall refactor and functionality looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good! Not a full review still, but I left some comments inline. The main ones are about bringing the code up to date with the latest changes (e.g., not having nullifiers included in the state sync request).
nullifiers_tags.append( | ||
&mut state_sync_update | ||
.note_updates | ||
.updated_input_notes() | ||
.filter(|note| { | ||
note.is_committed() | ||
&& !nullifiers_tags.contains(&get_nullifier_prefix(¬e.nullifier())) | ||
}) | ||
.map(|note| get_nullifier_prefix(¬e.nullifier())) | ||
.collect::<Vec<_>>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be converted into a helper method on StateSyncUpdate
? Maybe something like:
impl StateSyncUpdate {
pub fn append_nullifier_tags(&self, target: &mut Vec<u16>) {
...
}
}
But also, with some of the latest updates in miden-node
, we no longer need to send nullifier tags with the state sync requests but rather do them at the end of the sync process via the GetNullifiersByPrefix
endpoint.
We should probably integrate these changes here as it may simplify a few things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! This will be removed in #751 and merged into this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ended up changing it in this PR in case #751 gets closed after this discussion
let current_block_num = (u32::try_from(current_partial_mmr.num_leaves() - 1) | ||
.expect("The number of leaves in the MMR should be greater than 0 and less than 2^32")) | ||
.into(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not get current_block_num
here simply as:
let current_block_num = state_sync_update.block_num;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I have to move this conversion to the state_sync_update
construction so that the block num is up to date in the first call.
state_sync_update.block_num = response.block_header.block_num(); | ||
|
||
// We don't need to continue if the chain has not advanced, there are no new changes | ||
if response.block_header.block_num() == current_block_num { | ||
return Ok(false); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can do what I described in the last comment, we could change this to be something like:
if response.block_header.block_num() == state_sync_update.block_num {
return Ok(false);
}
state_sync_update.block_num = response.block_header.block_num();
And then we may not even need the current_block_num
variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current_block_num
variable is also used in the request before this line so I think we need it anyways.
Edit: Sorry, I think I misunderstood the suggestion the first time I responded. I was just doing an overview of all comments and this is totally possible. I'll be removing this variable.
let account_updates = | ||
self.account_state_sync(accounts, &response.account_hash_updates).await?; | ||
|
||
state_sync_update.account_updates = account_updates; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something, but wouldn't this overwrite account updates received in the previous step?
But also, is there a reason to fetch updated public account states on every step rather than do that at the end the sync for all updated public accounts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something, but wouldn't this overwrite account updates received in the previous step?
Yes, I should have used extend
here.
But also, is there a reason to fetch updated public account states on every step rather than do that at the end the sync for all updated public accounts?
Yes, this would be better. I'll change it so that the public account fetch is done at the end.
crates/rust-client/src/sync/mod.rs
Outdated
self.store | ||
.apply_state_sync(state_sync_update) | ||
.await | ||
.map_err(ClientError::StoreError)?; | ||
|
||
if response.chain_tip == response.block_header.block_num() { | ||
Ok(SyncStatus::SyncedToLastBlock(sync_summary)) | ||
} else { | ||
Ok(SyncStatus::SyncedToBlock(sync_summary)) | ||
} | ||
} | ||
|
||
// HELPERS | ||
// -------------------------------------------------------------------------------------------- | ||
|
||
/// Returns the [`NoteUpdates`] containing new public note and committed input/output notes and | ||
/// a list or note tag records to be removed from the store. | ||
async fn committed_note_updates( | ||
&mut self, | ||
committed_notes: Vec<CommittedNote>, | ||
block_header: &BlockHeader, | ||
) -> Result<(NoteUpdates, Vec<NoteTagRecord>), ClientError> { | ||
// We'll only pick committed notes that we are tracking as input/output notes. Since the | ||
// sync response contains notes matching either the provided accounts or the provided tag | ||
// we might get many notes when we only care about a few of those. | ||
let relevant_note_filter = | ||
NoteFilter::List(committed_notes.iter().map(CommittedNote::note_id).copied().collect()); | ||
|
||
let mut committed_input_notes: BTreeMap<NoteId, InputNoteRecord> = self | ||
.store | ||
.get_input_notes(relevant_note_filter.clone()) | ||
.await? | ||
.into_iter() | ||
.map(|n| (n.id(), n)) | ||
.collect(); | ||
|
||
let mut committed_output_notes: BTreeMap<NoteId, OutputNoteRecord> = self | ||
.store | ||
.get_output_notes(relevant_note_filter) | ||
.await? | ||
.into_iter() | ||
.map(|n| (n.id(), n)) | ||
.collect(); | ||
|
||
let mut new_public_notes = vec![]; | ||
let mut committed_tracked_input_notes = vec![]; | ||
let mut committed_tracked_output_notes = vec![]; | ||
let mut removed_tags = vec![]; | ||
|
||
for committed_note in committed_notes { | ||
let inclusion_proof = NoteInclusionProof::new( | ||
block_header.block_num(), | ||
committed_note.note_index(), | ||
committed_note.merkle_path().clone(), | ||
)?; | ||
|
||
if let Some(mut note_record) = committed_input_notes.remove(committed_note.note_id()) { | ||
// The note belongs to our locally tracked set of input notes | ||
|
||
let inclusion_proof_received = note_record | ||
.inclusion_proof_received(inclusion_proof.clone(), committed_note.metadata())?; | ||
let block_header_received = note_record.block_header_received(block_header)?; | ||
|
||
removed_tags.push((¬e_record).try_into()?); | ||
|
||
if inclusion_proof_received || block_header_received { | ||
committed_tracked_input_notes.push(note_record); | ||
} | ||
} | ||
|
||
if let Some(mut note_record) = committed_output_notes.remove(committed_note.note_id()) { | ||
// The note belongs to our locally tracked set of output notes | ||
|
||
if note_record.inclusion_proof_received(inclusion_proof.clone())? { | ||
committed_tracked_output_notes.push(note_record); | ||
} | ||
} | ||
|
||
if !committed_input_notes.contains_key(committed_note.note_id()) | ||
&& !committed_output_notes.contains_key(committed_note.note_id()) | ||
{ | ||
// The note is public and we are not tracking it, push to the list of IDs to query | ||
new_public_notes.push(*committed_note.note_id()); | ||
} | ||
} | ||
|
||
// Query the node for input note data and build the entities | ||
let new_public_notes = | ||
self.fetch_public_note_details(&new_public_notes, block_header).await?; | ||
self.update_mmr_data().await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we separate update_mmr_data()
from apply_state_sync()
? I would have expected that whatever is being done inside update_mmr_data()
is done as a part of apply_state_sync()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's one of the followup points but I was planning to do it in a future PR as this one was getting pretty big.
crates/rust-client/src/sync/mod.rs
Outdated
pub(crate) fn get_nullifier_prefix(nullifier: &Nullifier) -> u16 { | ||
(nullifier.inner()[3].as_int() >> FILTER_ID_SHIFT) as u16 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in one of the previous comments, we should be able to get rid of this function now.
// STATE SYNC UPDATE | ||
// ================================================================================================ | ||
|
||
/// Contains all information needed to apply the update in the store after syncing with the node. | ||
#[derive(Default)] | ||
pub struct StateSyncUpdate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably move all child structs of this struct (e.g., BlockUpdates
, NoteUpdates
, AccountUpdates
, TransactionUpdates
) into this file. Unless they are being use din multiple places?
#[derive(Debug, Clone, Default)] | ||
pub struct BlockUpdates { | ||
/// New block headers to be stored, along with a flag indicating whether the block contains | ||
/// notes that are relevant to the client and the MMR peaks for the block. | ||
block_headers: Vec<(BlockHeader, bool, MmrPeaks)>, | ||
/// New authentication nodes that are meant to be stored in order to authenticate block | ||
/// headers. | ||
new_authentication_nodes: Vec<(InOrderIndex, Digest)>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, and this is probably fine for now, but for a client that is online and syncs very frequently with the network, we may end up storing a lot of extra data. Ideally, if a block header doesn't have relevant notes, we shouldn't store it, unless it is the last block in the chain. Thought, I wonder if this has any implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but wouldn't we need to store some minimal data for the chain mmr? even if the block isn't relevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these changes needed in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change from u32
to usize
is because of a cast_possible_truncation
clippy error, this file wasn't being included in previous versions of the web client so clippy didn't check this file.
I'll be updating this PR with the new state sync rpc request to fix the CI tests. Originally I wanted to wait for #751 but the stream functionality is currently being discussed. |
8a5a6a6
to
3dc16d5
Compare
This PR adds a new
StateSync
component that encompasses all the logic for the client's synchronization with the node. The idea is that this component can be instantiated separate from the client and can be ran without modifying the state of the client. The component exposes async_state
method that will do the necessary rpc calls and return aStateSyncUpdate
with all the changes that should be applied to the store.state_sync
methodThe new
state_sync
method no longer exposes the "steps" of the rpc response. It returns all the necessary updates to be up to date with the node in a single call.Additionally, the component doesn't have direct access to the store. This is because the sync process now is indepentent on the state of the store and users can pass the information they want the component tu use and simulate syncs without the store changing. For this reason, the client will gather all the needed information up front and give it to the component, including the headers of every tracked account, all the unspent notes and the current partial MMR. The only way the store can be accessed during the sync process is using the callbacks explained in the next section.
Callbacks
The component receives two callbacks that will be called in specific events during the sync:
OnNoteReceived
: This callback will be called on each new committed note that is received in the sync state endpoint response. It receives aCommittedNote
that contains the Id of the note being committed and an optionalInputNoteRecord
that corresponds to the state of the note in the node (only if the note is public), this allows the callback to check whether the new public note is relevant to the client or not. The callback should return a boolean indicating if this newCommittedNote
event should be applied to the store or not. The default implementation ison_note_received
.OnNullifierReceived
: This callback will be called on each new nullifier that is received in the sync state endpoint response. It receives aNullifierUpdate
that contains the new nullifier along with its block number. The callback should return a boolean indicating if this newNullifierUpdate
event should be applied to the store or not. There's no default implementation, the client will simply returntrue
for every nullifier update.Other changes
rpc_api
in the client was changed fromBox
toArc
so it can be shared with the component. This also means that I had to change the base struct to give it interior mutability.crates/rust-client/src/sync/mod.rs
to the newcrates/rust-client/src/sync/state_sync.rs
.TODOs
apply_mmr_changes
part inside the component. I feel like it could be better documented and it should fit nicely inside thenote_state_sync
logic (it's just a different note state change).