Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize e2e tests using docker #2023

Closed
wants to merge 72 commits into from

Conversation

MartinquaXD
Copy link
Contributor

@MartinquaXD MartinquaXD commented Oct 27, 2023

Description

Our e2e test setup has some significant limitations at the moment:

  • 1 test at a time because we only have a single postgres DB
  • user needs to have anvil installed
  • user needs to have our docker-compose script running

Changes

Where to begin... 😅

  • spawn a dockerized version of anvil and postgres per test
  • made sure to properly clean up containers even on panics or manual aborts
  • lots of small changes to get rid of hard coded URLs / ports in tests
  • bumped the flyway version to significantly speed up test and make them actually reliable (old version could get stuck during migrations)
  • split e2e tests into 4 parallel github CI runners with 1 test each (because having 1 runner with n tests in parallel is actually slower now)
  • run CI with --nocapture again because --failure-output immediate might be related to tests getting stuck (weird 🤷‍♂️ )
  • reduce CI job timeout to 20 minutes
  • run e2e CI jobs with a timeout of 2 minutes per test and 4 retries per test to get rid of flaky test entirely, this might actually not needed anymore because of dockerized anvil but better safe than sorry, I guess
  • fixed a race condition that was caused by caching balances in 2 places (CachingBalancerFetcher and SolvableOrdersCache)
  • use websockets for e2e test code because they are more reliable (services spawned by e2e test still use http requests, though)

There are still a couple of hacky things left but the idea overall appears to be sound.

How to test

Build the migrations container image (once). This currently needed because I didn't find a nice way yet to build images from Dockerfiles using the new crate bollard yet.:
docker build -t main-migrations:latest -f ./docker/Dockerfile.migration .

Run the tests with either command:
cargo test -- --ignored local_node
cargo nextest run -- --ignored local_node

Celebrate fast tests:
Screenshot 2023-10-27 at 12 39 01

Please actually run the tests on your machines to see if we encounter any issues. Also try cancelling the tests and introduce panics in some tests to verify that the docker container cleanup is robust on multiple platforms.

Also @sunce86 I remember you are using windows. I did not yet implement signal handlers for aborting tests on windows because I'm not sure which signals to actually catch there. Do you know what gets sent when you ctrl-c something?

Related Issues

Fixes: #1845
Possibly also gets rid of the remaining flakyness of our test due to our switch to anvil.

@fedgiac
Copy link
Contributor

fedgiac commented Oct 27, 2023

This is great! I was able to run the tests on my machine at commit bc6ccd2, the setup works at the start but then some random test fails (not the same test as in CI, but it's similar, with first a few successes).
Then, if I try again all tests fail, the error being status_code: 409, message: "Cannot kill container: [...]: Container [...] is not running.

@MartinquaXD
Copy link
Contributor Author

As is this is not salvageable as an actual PR.
But it still makes sense to port individual changes from this PR to at least make some progress on this.

@MartinquaXD MartinquaXD closed this Feb 5, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Feb 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In-memory database for E2E tests
2 participants