[CI] Increase timeout further for controller_managers_srv test #1260

Conversation

christophfroehlich (Contributor)

The increased timeout from #1224 is not sufficient, see this job.

christophfroehlich added the backport-humble and backport-iron labels on Dec 29, 2023

codecov bot commented Dec 29, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (54ced2a) 47.82% compared to head (244e09b) 47.82%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1260   +/-   ##
=======================================
  Coverage   47.82%   47.82%           
=======================================
  Files          40       40           
  Lines        3448     3448           
  Branches     1869     1869           
=======================================
  Hits         1649     1649           
  Misses        456      456           
  Partials     1343     1343           
Flag        Coverage Δ
unittests   47.82% <ø> (ø)

Flags with carried forward coverage won't be shown.

saikishor (Member) left a comment

I think the need for this timeout increase comes from the specific test TestControllerManagerSrvs.list_large_number_of_controllers_with_chains that I added recently.

TEST_F(TestControllerManagerSrvs, list_large_number_of_controllers_with_chains)
{
/// The simulated controller chaining is:
/// test_controller_name_1 -> chain_ctrl_3 -> chain_ctrl_2 -> chain_ctrl_1
/// &&
/// test_controller_name_2 -> chain_ctrl_6 -> chain_ctrl_5 -> chain_ctrl_4
/// &&
/// test_controller_name_7 -> test_controller_name_8
/// &&
/// There are 100 more other basic controllers and 100 more different broadcasters to check for
/// crashing
/// NOTE: A -> B signifies that the controller A is utilizing the reference interfaces
/// exported from the controller B (or) the controller B is utilizing the expected interfaces
/// exported from the controller A

Do you think it makes sense to relax the test by reducing the number of controllers, or should we simply increase the timeout?
Any opinions on this are welcome. If we decide to reduce the number, I can open another PR for that.
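
To make the quoted topology concrete, here is a small standalone sketch (illustration only, not controller_manager code, and not asserting the actual activation order the controller manager uses) that stores the chains from the comment above and derives one dependency-consistent ordering with a plain topological walk; the real sorting additionally has to interleave the roughly 200 unchained controllers and broadcasters:

#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustration only: the chains quoted above, stored as uses[A] = B, meaning
// controller A uses the interfaces exported by controller B.
static const std::map<std::string, std::string> uses = {
  {"test_controller_name_1", "chain_ctrl_3"}, {"chain_ctrl_3", "chain_ctrl_2"},
  {"chain_ctrl_2", "chain_ctrl_1"}, {"test_controller_name_2", "chain_ctrl_6"},
  {"chain_ctrl_6", "chain_ctrl_5"}, {"chain_ctrl_5", "chain_ctrl_4"},
  {"test_controller_name_7", "test_controller_name_8"}};

// Append 'name' after everything it depends on (depth-first walk along its chain).
static void visit(
  const std::string & name, std::set<std::string> & done, std::vector<std::string> & order)
{
  if (done.count(name)) {return;}
  const auto it = uses.find(name);
  if (it != uses.end()) {visit(it->second, done, order);}
  done.insert(name);
  order.push_back(name);
}

int main()
{
  std::set<std::string> done;
  std::vector<std::string> order;
  for (const auto & entry : uses) {visit(entry.first, done, order);}
  for (const auto & name : order) {std::cout << name << "\n";}
  return 0;
}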

christophfroehlich (Contributor, Author)

I think the need for this timeout increase comes from the specific test TestControllerManagerSrvs.list_large_number_of_controllers_with_chains that I added recently.

Yes, I know; I already increased the timeout with #1224 after your change. But sometimes the shared runners are too busy and the jobs take too long ;)

Do you think it makes sense to relax the test by reducing the number of controllers, or should we simply increase the timeout? Any opinions on this are welcome. If we decide to reduce the number, I can open another PR for that.

If the test makes sense, this shouldn't be a problem. I'm just curious why it actually takes so long. There is no sleep or explicit timing in there, is there? Does the service call really take that much time?

saikishor (Member)

But sometimes the shared runners are too busy and the jobs take too long ;)

Got it. I wasn't aware of the shared runners causing this issue.

If the test makes sense, this shouldn't be a problem. I'm just curious why it actually takes so long. There is no sleep or explicit timing in there, is there? Does the service call really take that much time?

I'm not sure about this. The test configures each of the 200+ controllers, then the controller manager does the sorting and has to wait for the switching of the lists. Ideally this should be quick, but somehow it is taking some time.
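
If someone wants to narrow this down locally, a generic timing helper along these lines could be dropped into the test to see which phase dominates (this is a plain std::chrono sketch, not existing controller_manager code; the phase labels in the usage comment are hypothetical):

#include <chrono>
#include <functional>
#include <iostream>
#include <string>

// Illustration only: run a callable and print its wall-clock duration in milliseconds.
static void time_phase(const std::string & label, const std::function<void()> & phase)
{
  const auto start = std::chrono::steady_clock::now();
  phase();
  const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
    std::chrono::steady_clock::now() - start);
  std::cout << label << ": " << elapsed.count() << " ms" << std::endl;
}

// Hypothetical usage inside the test body:
//   time_phase("load + configure 200+ controllers", [&] { /* loading loop */ });
//   time_phase("switch / activate controllers", [&] { /* switch request */ });
//   time_phase("list_controllers service call", [&] { /* service client call */ });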

@@ -141,7 +141,7 @@ if(BUILD_TESTING)
test_chainable_controller
ros2_control_test_assets::ros2_control_test_assets
)
-  set_tests_properties(test_controller_manager_srvs PROPERTIES TIMEOUT 120)
+  set_tests_properties(test_controller_manager_srvs PROPERTIES TIMEOUT 240)
Contributor

That's 4 minutes... which for non-simulation / integration tests raises some 🚩. Do you have an idea how long this test takes under these two conditions?

  1. Ideal "decent" hardware
  2. A container limited to 1 or 2 CPUs

I believe the default free-tier GitHub Actions runners have 2 vCPUs...
However, if this were run on, for example, the highest-performance GitHub-hosted Linux runners with 64 vCPUs, these 4 minutes would cost over $1, which could add up quickly.
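
For the record, GoogleTest already prints the elapsed time of every test case in its console output and stores it in the XML report's time attribute, so comparing the two conditions mostly means running the test binary on both machines. If extra context should land in the CI artifacts as well, a sketch like the following (illustrative names, not part of the existing test) shows how custom key/value pairs can be attached to the same XML entry via RecordProperty:

#include <chrono>
#include <gtest/gtest.h>

// Illustration only: attach custom context to this test's entry in the JUnit/CTest XML
// so that timings gathered on different machines (e.g. a workstation vs. a 2-vCPU
// runner) are easier to compare side by side.
TEST(TimingContextExample, RecordsEnvironmentHints)
{
  const auto start = std::chrono::steady_clock::now();

  // ... the expensive test body would go here ...

  const auto elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
    std::chrono::steady_clock::now() - start).count();
  RecordProperty("num_controllers", 200);  // hypothetical figure, matching the test above
  RecordProperty("wall_time_ms", static_cast<int>(elapsed_ms));
}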

bmagyar (Member) commented Jan 3, 2024

I'd rather we find out why things take so long than increase the timeout even further.
