Allow Custom Data in Replay Files #78
Comments
Thanks for the detailed description. I want to understand the requirement better, though. From my point of view, the test environment should always be reproducible. In the example you gave, seed numbers in tests are usually fixed, so we always get the same result, so I don't really understand how that would fit here. Can you elaborate on your example a bit?
Hi @nicoddemus, thanks for your reply. Sure, I am happy to explain that a bit more.

We use a mix of stochastic testing, Monte Carlo testing, and property-based testing. Since it's virtually impossible to cover all possible scenarios, we introduce a random seed to always generate a new set of tests. This helps us explore a wider input space over multiple runs while still maintaining some level of control.

The key point is that our tests are not designed to be fully reproducible in the traditional sense (like always running with the same fixed inputs). Instead, we prioritize broader test coverage by allowing different runs to explore different edge cases. That said, if reproducibility is needed for debugging, we can reuse a specific seed value, ensuring the same test case can be re-executed when necessary.

If you need more explanation, please let me know :) I know it is a bit of a different scenario, but especially for property-based testing (like Hypothesis) this might help other users as well.
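To make the pattern concrete, here is a minimal standalone sketch (not using pytest-replay): draw a fresh seed on each run unless one is supplied for debugging, and report the seed so a failing run can be re-executed. The environment variable name here is purely illustrative.

```python
import os
import random

import numpy as np


def make_rng():
    # Reuse an explicitly provided seed (e.g. to reproduce a failure);
    # otherwise draw a fresh one so each run explores different inputs.
    seed = os.environ.get("TEST_SEED")  # illustrative knob, not a pytest-replay option
    seed = int(seed) if seed is not None else random.randint(0, 9_999_999_999)
    print(f"using seed {seed}")  # report the seed so the run can be repeated
    return np.random.default_rng(seed)


def test_rand_val():
    rng = make_rng()
    values = rng.standard_normal(1_000)
    # Stochastic check; a specific failure is reproduced by re-running with the reported seed.
    assert abs(values.mean()) < 0.5
```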
Thanks! Things are a bit clearer now. One thing that is not very clear to me yet is how you would use this functionality... can you write a simple example of your use case, assuming we have this feature? For the sake of this exercise, we can assume we have a `pytest_replay_metadata` fixture.
Yes, sure, no worries! I thought of two ways to implement/use this; either one is fine for me.
I believe the fixture solution might be more generic. As an example, with what you described I would use it something like this:

```python
import random

import numpy as np
import pytest


@pytest.fixture
def rng(pytest_replay_metadata, request):
    # Key the metadata by the test's node id so each test gets its own seed.
    current_test = request.node.nodeid
    current_seed = pytest_replay_metadata.metadata.get(current_test, {}).get("seed", None)
    if current_seed is None:
        # No recorded seed for this test: generate a fresh one for this run.
        current_seed = random.randint(0, 9_999_999_999)
    # Record the seed actually used so a replay run can reproduce it.
    pytest_replay_metadata.metadata[current_test] = {"seed": current_seed}
    return np.random.default_rng(seed=current_seed)


def test_rand_val(rng):
    assert run_model_foo(rng.standard_normal((1_000_000, 1_000_000))) < 0.3
```

That is just a toy example.
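One design choice worth noting in the sketch above: the metadata is keyed by `request.node.nodeid`, so each test records its own seed in the replay file rather than all tests sharing a single value.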
Hmm yeah OK, I think that clarifies things. If you would like to contribute that, please go ahead. One change from the above example, though:

```python
import random

import numpy as np
import pytest


@pytest.fixture
def rng(pytest_replay_metadata):
    current_seed = pytest_replay_metadata.metadata.get("seed", None)
    if current_seed is None:
        # No recorded seed: generate a fresh one for this run.
        current_seed = random.randint(0, 9_999_999_999)
    # Record the seed actually used so a replay run can reproduce it.
    pytest_replay_metadata.metadata["seed"] = current_seed
    return np.random.default_rng(seed=current_seed)


def test_rand_val(rng):
    assert run_model_foo(rng.standard_normal((1_000_000, 1_000_000))) < 0.3
```
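Assuming the fixture behaves like the rest of pytest-replay, a recording run (`--replay-record-dir`) would write the generated seed into the replay file, and a subsequent `--replay` run would find it there, so `rng` is rebuilt with the same seed and the original failure can be reproduced.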
Perfect! That works even better for me. I'll try to implement this feature in the coming days. Thanks for your input!
PR created!
New `replay_metadata` fixture letting the user associate extra metadata with each test. This allows users to attach extra information for reproducibility in some cases, for example when RNG is involved. Fixes #78

Co-authored-by: Bruno Oliveira <bruno@soliv.dev>
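For reference, a minimal sketch of using the merged fixture, assuming `replay_metadata` exposes the same `.metadata` mapping as the examples above (the exact API is the one defined in the PR):

```python
import random

import pytest


@pytest.fixture
def seed(replay_metadata):
    # Reuse a seed recorded in the replay file; otherwise generate and record one.
    value = replay_metadata.metadata.get("seed")
    if value is None:
        value = random.randint(0, 9_999_999_999)
        replay_metadata.metadata["seed"] = value
    return value
```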
Description
Hey folks! 👋
I have a use case where I need to add extra data to the replay file so I can properly reproduce my tests. One specific example is when a test involves random number generation—I need to store the seed that was used. Since each test might have a different seed, this would be super useful to ensure reproducibility.
Proposal
It would be great if `pytest-replay` allowed users to store custom data in the replay file. This could be done via a fixture or some other mechanism that lets users define extra metadata when a test runs. Then, when replaying, this data would be available to ensure the test environment is correctly reconstructed.
Implementation Idea
Limitations
I know there are some edge cases, like when a core dump happens before the data is written. But even with that limitation, I think this would be a really nice addition for a lot of users (at least for me! 😄).
Happy to Contribute 🚀
If this sounds like a useful feature, I’d be happy to create a PR and implement it! Let me know what you think. 😊
cc: @tadeu @nicoddemus @prusse-martin