
feat(infra): concurrent materializer tests #1243

Open

wants to merge 27 commits into main

Conversation

@LePremierHomme (Contributor) commented Dec 28, 2024

This PR introduces concurrent tests in the materializer, enabling the generation of traceable workloads without significantly altering how test scenarios are written.

Existing tests have been refactored to use make_testnet instead of with_testnet.

Progress

  • scenario 1: native coin transfer (docker_tests::benches::test_native_coin_transfer)
  • scenario 2: contract deployment (docker_tests::benches::test_contract_deployment)
  • scenario 3: contract call (docker_tests::benches::test_contract_call)
  • framework: assert that block gas limit isn't the TPS bottleneck
  • framework: TPS reporting
  • framework: latency reporting
  • framework: reach 1,000 max concurrency
  • framework: reduce repeated scenario test code
  • framework: standardize report publication and regression checks
  • framework: NonceManager was introduced because get_transaction_count proved unreliable; this needs to be double-checked

For follow-up PRs

  • Create manifest files dynamically
  • Introduce profiling features
  • Introduce tracing / statistical analysis features


@LePremierHomme LePremierHomme requested a review from a team as a code owner December 28, 2024 20:52
@LePremierHomme LePremierHomme marked this pull request as draft December 28, 2024 20:56
@LePremierHomme LePremierHomme changed the title feat(tests): concurrent materializer tests [WIP] feat(tests): concurrent materializer tests Dec 28, 2024
@LePremierHomme LePremierHomme changed the title [WIP] feat(tests): concurrent materializer tests feat(test): concurrent materializer tests Jan 7, 2025
@LePremierHomme LePremierHomme marked this pull request as ready for review January 7, 2025 12:23
@LePremierHomme LePremierHomme changed the title feat(test): concurrent materializer tests feat(infra): concurrent materializer tests Jan 7, 2025
}

pub async fn record(&mut self, label: String) {
let duration = self.start_time.unwrap().elapsed();
Contributor

I feel expect here is better, if you assume the caller knows that calling "start" should happen first.

Contributor Author

I'll revise this API once the reporting summary is more solid.


#[derive(Default)]
pub struct NonceManager {
nonces: Arc<Mutex<HashMap<H160, U256>>>,
Contributor

I think this is a bottleneck as well: every address is waiting on the same lock. Maybe this might help: https://github.com/xacrimon/dashmap

Contributor Author

Yes, this is just a temporary solution, I was hoping to remove it entirely. If not, I'll optimize it.
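For illustration, a std-only sketch of the shared-lock pattern being discussed (simplified types: String and u64 stand in for H160 and U256; this is not the actual PR code). Every caller contends on the same Mutex, which is the contention point the review raises:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Simplified stand-in for the NonceManager under review: one global lock
// guards the whole address -> nonce map, so all senders serialize on it.
#[derive(Default, Clone)]
pub struct NonceManager {
    nonces: Arc<Mutex<HashMap<String, u64>>>,
}

impl NonceManager {
    // Returns the next nonce for `addr`, incrementing the stored value
    // under the single shared lock.
    pub fn next(&self, addr: &str) -> u64 {
        let mut nonces = self.nonces.lock().unwrap();
        let entry = nonces.entry(addr.to_string()).or_insert(0);
        let nonce = *entry;
        *entry += 1;
        nonce
    }
}
```

Sharding the map per address (or removing the manager entirely, as suggested above) would avoid serializing unrelated senders on one lock.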

Contributor

I think it would be good to try the built-in NonceManager from Ethers. Was there any problem with that?

Contributor

Please do not use dashmap. Last time I checked ~6 months ago, it still had soundness issues for async code.

@karlem karlem self-requested a review January 20, 2025 17:48
@karlem (Contributor) commented Jan 21, 2025

I have a suggestion about how to improve the framework design to make it cleaner and more intuitive. While the current implementation works, there are some areas where terminology and structure can be refined to improve clarity and usability. Consider the following approach:

  1. BenchmarkRunner

The BenchmarkRunner should:

Orchestrate the entire benchmarking process.

Execute each BenchmarkStep sequentially within the specified duration limit.

struct BenchmarkRunner {
    steps: Vec<BenchmarkStep>,
    max_duration: Duration,
}

impl BenchmarkRunner {
    fn new(steps: Vec<BenchmarkStep>, max_duration: Duration) -> Self;
    fn run(&self) -> BenchmarkResult;
}
  2. BenchmarkStep

The BenchmarkStep would be similar to the current ExecutionStep, but it would encapsulate a specific test function, making it more modular and allowing different functions to run within a single test.

struct BenchmarkStep<F>
where
    F: Fn(TestInput) -> TestResult + Send + Sync + 'static {
    concurrency: usize,      // Number of concurrent test executions (N)
    run_duration: Duration,  // Execution time duration (in seconds)
    test_fn: Arc<F>,
}

impl<F> BenchmarkStep<F>
where
    F: Fn(TestInput) -> TestResult + Send + Sync + 'static
{
    fn execute(&self, stop_flag: Arc<AtomicBool>) -> StepResult;
}

The stop_flag (using AtomicBool) is used to stop execution gracefully if the overall test time has expired or in case of any other issue. This is just a suggestion—other mechanisms, such as a Signal abstraction, could also be considered.
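A minimal sketch of that stop_flag pattern, assuming a synchronous worker loop (names and shape are illustrative, not the proposed final API): the worker checks the shared flag between iterations and exits gracefully once the runner sets it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Run `test_fn` repeatedly until the shared stop flag is set, returning
// how many executions completed. In the real runner the flag would be set
// by a timer once the overall benchmark duration expires.
pub fn run_until_stopped(stop_flag: &AtomicBool, mut test_fn: impl FnMut()) -> usize {
    let mut completed = 0;
    while !stop_flag.load(Ordering::Relaxed) {
        test_fn(); // one test execution
        completed += 1;
    }
    completed
}
```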

  3. Test Input and Result

The TestInput structure can remain as it is, without the current Bencher, simplifying the design.

struct TestResult {
    pub test_id: usize,
    pub step_id: usize,
    pub tx_hash: Option<H256>,
    pub tx_tracker: TransactionTracker,
    pub err: Option<anyhow::Error>,
}
  4. TransactionTracker

Instead of the existing Bencher, a TransactionTracker can be introduced to provide a clearer API. The current API has the potential for errors if start is forgotten, leading to incorrect results.

The new method should automatically set the submission time to ensure correct tracking without requiring manual intervention.

struct TransactionTracker {
    submission_time: Instant,
    mempool_time: Option<Instant>,
    block_time: Option<Instant>,
}

impl TransactionTracker {
    fn new() -> Self;
    fn mark_mempool(&mut self);
    fn mark_block(&mut self);
    fn get_mempool_latency(&self) -> Option<Duration>;
    fn get_block_latency(&self) -> Option<Duration>;
}
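A std-only sketch of how those signatures might be filled in (illustrative semantics: new records the submission time automatically, as described above, so there is no start call to forget):

```rust
use std::time::{Duration, Instant};

pub struct TransactionTracker {
    submission_time: Instant,
    mempool_time: Option<Instant>,
    block_time: Option<Instant>,
}

impl TransactionTracker {
    // Submission time is captured on construction, eliminating the
    // forgotten-`start` failure mode of the current Bencher.
    pub fn new() -> Self {
        Self {
            submission_time: Instant::now(),
            mempool_time: None,
            block_time: None,
        }
    }

    pub fn mark_mempool(&mut self) {
        self.mempool_time = Some(Instant::now());
    }

    pub fn mark_block(&mut self) {
        self.block_time = Some(Instant::now());
    }

    pub fn get_mempool_latency(&self) -> Option<Duration> {
        self.mempool_time
            .map(|t| t.duration_since(self.submission_time))
    }

    pub fn get_block_latency(&self) -> Option<Duration> {
        self.block_time
            .map(|t| t.duration_since(self.submission_time))
    }
}
```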
  5. StepResult

The StepResult should pre-calculate average latencies and other useful statistics for each step, making it equivalent to the current StepSummary.

struct StepResult {
    step_id: usize,
    tests: Vec<TestResult>,
    avg_mempool_latency: Duration,
    avg_block_latency: Duration,
    // Additional useful statistics for the step
}

impl StepResult {
    fn new(results: Vec<TestResult>) -> Self;
}
  6. Execution Engine

The execution engine should support concurrent execution of the test function for a specified duration, allowing precise control over execution time.

fn run_concurrent<F>(concurrency: usize, run_duration: Duration, test_fn: F, stop_flag: Arc<AtomicBool>)
where
    F: Fn(TestInput) -> TestResult + Send + Sync + 'static;

The stop_flag ensures the execution stops when the total benchmark duration is reached or when other termination conditions occur.

  7. BenchmarkResult

The BenchmarkResult serves as the overall execution summary, aggregating results from all benchmark steps.

struct BenchmarkResult {
    steps: Vec<StepResult>,
}

Conclusion

This revised design primarily improves terminology and clarity, making the framework more cohesive and intuitive. The key benefits of the proposed approach include:

Encapsulation: Each BenchmarkStep holds its own test function, making it easier to run varied tests within a single benchmark.

Clarity: Replacing Bencher with TransactionTracker simplifies the API and eliminates potential misuses.

Intuitive Structure: The separation of responsibilities across BenchmarkRunner, BenchmarkStep, and TransactionTracker makes the design easier to understand and maintain.

Overall, this proposal aligns closely with the current design but improves cohesion, intuitiveness, and robustness.

@karlem (Contributor) left a comment

This is the first major review batch (1/2). Tomorrow, a smaller set of reviews will follow.

Outstanding reviews:

  • The tests in benches.rs
  • Thoroughly review summary.rs



let step_results = Arc::new(tokio::sync::Mutex::new(Vec::new()));
let execution_start = Instant::now();
loop {
if execution_start.elapsed() > step.duration {
Contributor

maybe?

while execution_start.elapsed() < step.duration {

.await
.unwrap();
tx = tx.gas(gas_estimation);
assert!(gas_estimation <= max_tx_gas_limit);
Contributor

Should this fail the whole test run?

Contributor Author

I don't see a reason for allowing this to pass and then having to deal with partial failures. However, this shouldn't normally fail, and if it does, it should fail consistently.

}
input.bencher.mempool();

let receipt = pending
Contributor

Maybe we should add a timeout here in case it isn't included?

Contributor Author

Yes, will be added.
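One possible shape for such a timeout, sketched here as a std-only polling loop with a deadline (the real code would more likely wrap the pending-transaction await in tokio::time::timeout; the poll closure and the u64 receipt are hypothetical stand-ins):

```rust
use std::time::{Duration, Instant};

// Poll for the receipt until it appears or the deadline passes, instead of
// waiting forever on a transaction that is never included.
pub fn wait_for_receipt<F>(mut poll: F, timeout: Duration) -> Option<u64>
where
    F: FnMut() -> Option<u64>, // stand-in for querying the receipt
{
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(receipt) = poll() {
            return Some(receipt);
        }
        if Instant::now() >= deadline {
            return None; // transaction not included in time
        }
    }
}
```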

@karlem (Contributor) commented Jan 22, 2025

Both reviews (2/2) are now complete. That should be all for now :)

// Copyright 2022-2024 Protocol Labs
// SPDX-License-Identifier: Apache-2.0, MIT

pub struct Signal(tokio::sync::Semaphore);
Contributor

Uses a loom::sync::Mutex internally and is hence not signal safe, since pthread_mutex_lock is not signal safe. Please correct me if I am wrong.


I was just browsing the code base, but I'm curious what your concern is here, i.e. what signal safety means in this context.

It is a generally working pattern to hold blocking locks that guard short sections executed in an async context. Blocking primitives are used consistently in the tokio codebase, one example being https://github.com/tokio-rs/tokio/blob/4b3da20c9847b202cf110f7b7772fd4674edaecf/tokio/src/sync/barrier.rs#L142-L148, and there is some info here https://tokio.rs/tokio/tutorial/shared-state under the "Tasks, threads, and contention" paragraph.

Specifically, in the semaphore the lock guards a section that doesn't yield by itself and should be very fast to complete (I expect sub-1us), so preemption by the OS is very unlikely.

That said, there is actually no waiting in this wrapper.


I guess you meant that if this is used directly in the signal interrupt handler, then it doesn't protect from re-entrancy.

Contributor

I think Signal needs some more context. My assumption from a single pass over the code was that it handles UNIX signals.

Contributor Author

As @dshulyak mentioned, there's no actual scheduler waiting here (as initially planned), so introducing this primitive to wrap Semaphore turned out to be confusing and overkill. I downgraded it to a simple AtomicBool wrapper here: d00f2e1
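A sketch of what such an AtomicBool wrapper could look like (assumed shape, not necessarily the code in d00f2e1):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// A one-shot flag: `send` marks the signal as fired, `received` checks it.
// No scheduler waiting is involved, unlike the earlier Semaphore wrapper.
pub struct Signal(AtomicBool);

impl Signal {
    pub fn new() -> Self {
        Signal(AtomicBool::new(false))
    }

    pub fn send(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn received(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}
```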

@LePremierHomme LePremierHomme linked an issue Feb 10, 2025 that may be closed by this pull request
@drahnr (Contributor) left a comment

Reviewable status: 0 of 27 files reviewed, 34 unresolved discussions (waiting on @cryptoAtwill, @dshulyak, @karlem, @LePremierHomme, and @raulk)


fendermint/testing/materializer/src/concurrency/reporting/dataset.rs line 43 at r6 (raw file):

    let median = if count % 2 == 0 {
        (sorted_data[count / 2 - 1] + sorted_data[count / 2]) / 2.0

Nit: that's not a median, pick one, I'd argue the else branch is all we need
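For reference, both conventions under discussion, sketched over an already-sorted slice: the even-count branch in the PR interpolates the two middle elements, while the else branch alone always picks a single existing element (the lower median for even counts).

```rust
// Interpolated median: averages the two middle elements for an even count.
fn interpolated_median(sorted: &[f64]) -> f64 {
    let n = sorted.len();
    if n % 2 == 0 {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    } else {
        sorted[n / 2]
    }
}

// Lower median: always returns an element actually present in the data.
fn lower_median(sorted: &[f64]) -> f64 {
    sorted[(sorted.len() - 1) / 2]
}
```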


fendermint/testing/materializer/src/concurrency/signal.rs line 4 at r6 (raw file):

// SPDX-License-Identifier: Apache-2.0, MIT

pub struct Signal(tokio::sync::Semaphore);

A better name and some documentation would be great


fendermint/testing/materializer/tests/docker_tests/benches.rs line 248 at r6 (raw file):

            deploy_tx.set_gas(gas_estimation);
            assert!(gas_estimation <= max_tx_gas_limit);

The setup code deserves a few comments


fendermint/testing/materializer/src/concurrency/reporting/summary.rs line 59 at r6 (raw file):

    }

    pub fn print(&self) {

Nit: move to std::fmt::Display implementation

@drahnr (Contributor) left a comment

Reviewable status: 0 of 28 files reviewed, 34 unresolved discussions (waiting on @cryptoAtwill, @karlem, @LePremierHomme, and @raulk)



Development

Successfully merging this pull request may close these issues.

Single-node benchmarking
6 participants