
Testing: shell script per test #2865

Merged: 23 commits merged into dev (from split_tests) on May 12, 2024
Conversation

@Lestropie (Member) commented Mar 18, 2024:

Committing what I've generated thus far to a draft PR; likely not able to finish off this week.

Closes #2789.
Supersedes #2825.

While this took a decent amount of time to work through, I'm hoping that it will make the addition of new tests a bit more accessible: one can add header comments explaining what a test does & why, and can actually make use of newline characters.


Previous behaviour:

  • For every file (one per command):
    • Generate initial "cleanup" test
    • For each line in file:
      • Generate test string identifier based on directory (binaries / scripts / unit tests), command name (from name of file), line index
      • Create test based on shell invocation

Desired behaviour (a CMake sketch follows this list):

  • For every file (one per test):
    • Copy test shell script into build directory
    • Generate test string identifier based on command subdirectory & test shell script name
    • Create test that invokes that shell script
      • Include trap to clean up any temporary files afterwards regardless of outcome
        Edit: Alternative solution for this purpose under development
    • Add labels to tests to indicate:
      • Classification (binaries / scripts / unit tests)
      • Command being evaluated / unit test being performed
      • Whether or not to include a particular script test in CI (#2825: "Quick" testing of Python scripts).
        Ideally, some string would be included in the comments at the head of the shell script file that can be parsed by cmake to discover that the test should have that label attributed to it.
  • Revise CI to make use of test labels rather than regex.
  • Revise documentation to make use of test labels rather than regex.
    Edit: Deferred to #2855 (Update testing README).
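
A minimal CMake sketch of the flow described above; the function name, identifier scheme and label handling are illustrative assumptions, not the PR's actual implementation:

function(add_bash_test CATEGORY SCRIPT)
  get_filename_component(test_name ${SCRIPT} NAME_WE)
  # Copy the test shell script into the build directory
  configure_file(${SCRIPT} ${CMAKE_CURRENT_BINARY_DIR}/${test_name} COPYONLY)
  # Test identifier derived from category subdirectory & script name
  add_test(NAME ${CATEGORY}_${test_name}
           COMMAND bash ${CMAKE_CURRENT_BINARY_DIR}/${test_name})
  # Any remaining arguments become CTest labels
  # (classification, command under test, CI inclusion)
  set_tests_properties(${CATEGORY}_${test_name} PROPERTIES LABELS "${ARGN}")
endfunction()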

@Lestropie self-assigned this Mar 18, 2024

clang-tidy review says "All clean, LGTM! 👍"

@Lestropie (Member, Author):

@daljit46: What are the chances of getting some help on the cmake side here? Suspect you'd get it done 100 times faster than I would.

@daljit46 (Member):

👍 Happy to help. I'll try to have a go at this today.

@daljit46 (Member):

> Copy test shell script into build directory

Why do we need to copy the script into the build directory?


clang-tidy review says "All clean, LGTM! 👍"

@daljit46 (Member):

Just made some minimal changes to the current CMake scripts to run binary tests. I relied on globbing to collect the files inside each folder, but it would probably be best not to use it in the final implementation.
For cleaning up, it's doing the same thing as the previous structure: run the cleanup command before running a test. We could probably run it after the test has finished or failed, but that would be slightly more involved (still doable).

@daljit46 (Member):

> Ideally, some string would be included in the comments at the head of the shell script file that can be parsed by cmake to discover that the test should have that label attributed to it.

If we don't rely on globbing, we could do this in CMake directly. IMO, that would be cleaner.

Globbing is disabled and each test is manually listed in the corresponding CMakeLists.txt.
Labels can be specified as extra arguments to the add_bash_binary_tests function.
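
A hypothetical invocation, assuming the test name comes first and any trailing arguments are attached to the generated test as CTest labels (the exact signature isn't reproduced in this thread):

# Register the mrconvert tests; "binary" and "quick" become CTest labels
add_bash_binary_tests(mrconvert binary quick)

Labelled tests can then be selected at run time, e.g. via ctest -L quick.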

clang-tidy review says "All clean, LGTM! 👍"

@Lestropie (Member, Author) commented Mar 25, 2024:

> Why do we need to copy the script into the build directory?

Previously, the lines of a file were read, and each was used to form a single test invocation, with the contents of the test being just part of the test string. Here, where an individual test can be a multi-line bash script, my thinking was that those files would be copied into the build directory, and the test string would involve executing that shell script. Not saying this absolutely has to be the case; it's just what I had expected would be your recommendation.

> For cleaning up, it's doing the same thing as the previous structure: run the cleanup command before running a test

So my plan here is different. The set of tests for a given command are not guaranteed to be executed all together, or in any particular order. Each test needs to be possible to run in isolation, and should ideally clean up after itself.
Therefore what I'd prefer in conjunction with these changes is that:

  1. Those manually added "cleanup" jobs be removed entirely;
  2. Each individual test use the -force option such that it can succeed even if there happens to be loose temporary data from some other test lying around;
  3. Each test to be executed include use of trap such that any temporary files generated by that test are deleted regardless of test success or failure; a sketch follows this list.
    (happy if there's an alternative solution here; I just want for there to be no *tmp* files lying around after running a test, rather than exclusively deleting them upon later execution of some other test)
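
A minimal sketch of what point 3 could look like at the head of each test script; the command names are purely illustrative, while the tmp* / *-tmp-* patterns match the cleanup globs used elsewhere in this PR:

#!/bin/bash
# Abort on the first failing command
set -e
# Delete this test's temporaries on exit: fires whether the test
# passes, fails, or is interrupted
trap 'rm -rf tmp* *-tmp-*' EXIT

# -force tolerates leftovers from any other test run in this directory
mrconvert dwi.mif tmp-out.mif -force
testing_diff_image tmp-out.mif dwi_expected.mif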

@Lestropie mentioned this pull request Mar 25, 2024
@Lestropie (Member, Author) commented May 7, 2024:

Put some effort into this one as I wish to be able to make use of this change to verify and finalise changes elsewhere.

  • Wait to see how long the additional Python script CI checks take to execute; cull list if required.
    Edit: Deferring activation of Python CI tests until dev discrepancies in test results are addressed.

  • @daljit46 Currently you've modified the cleanup such that rather than adding a cleanup "test" after the set of tests for one command, it now inserts one such cleanup "test" after every individual test. This makes the ctest terminal feedback messy as half of it is taken up by cleanup scripts. Is there no other way to manipulate ctest to clean up any intermediate files regardless of test script completion? If not, I think I'd rather manually insert a trap call at the head of every test shell script file.

  • File testing/README.md wasn't modified to reflect the cmake transition in #2689 (Change build system to CMake). @daljit46 is this something you can do?
    Edit: Deferred to #2855 (Update testing README).

@daljit46 (Member) commented May 7, 2024:

> @daljit46 Currently you've modified the cleanup such that rather than adding a cleanup "test" after the set of tests for one command, it now inserts one such cleanup "test" after every individual test. This makes the ctest terminal feedback messy as half of it is taken up by cleanup scripts. Is there no other way to manipulate ctest to clean up any intermediate files regardless of test script completion? If not, I think I'd rather manually insert a trap call at the head of every test shell script file.

Yes, that's quite verbose indeed. I just pushed a change that works around this problem by adding another level of indirection. Inspired by a StackOverflow post, I've added a wrapper script, RunTests.cmake (not super happy with the name, so let me know if you have a better suggestion), that calls a given bash test. This allows us to eliminate the cleanup test, because we can execute the cleanup command in the CMake script instead.
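
The wrapper's body isn't reproduced in this thread; a sketch consistent with the invocation visible in the ctest log later in this conversation (which passes BASH, FILE_PATH, CLEANUP_CMD and WORKING_DIRECTORY via -D) might look like:

# RunTest.cmake: run the bash test, then clean up regardless of its outcome
execute_process(
  COMMAND ${BASH} ${FILE_PATH}
  WORKING_DIRECTORY ${WORKING_DIRECTORY}
  RESULT_VARIABLE test_result
)
# The cleanup command contains globs, so hand it to a shell for expansion
execute_process(COMMAND ${BASH} -c "${CLEANUP_CMD}")
if(NOT test_result EQUAL 0)
  message(FATAL_ERROR "Test ${FILE_PATH} failed!")
endif()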

> File testing/README.md wasn't modified to reflect the cmake transition in #2689. @daljit46 is this something you can do?

Yes, that issue is tracked in #2855. I haven't completed that PR yet as we have a lot of ongoing changes to the testing logic, but I will do so when things get more stable.

This allows us to get rid of the cleanup tests and instead execute a custom
CMake script which will execute a given bash test. The script takes care
of cleaning up regardless of whether the test passes or not.
Lestropie added 2 commits May 8, 2024 11:31
- Run all test commands with -e flag to exit script as soon as any underlying command fails.
- Rename add_bash_tests() to add_bash_test() given that there is now only one test per file rather than many.
- For tests that ensure proper failure of a command, check that the appropriate error message is issued.
- Some newline fixes within test scripts.
- Add -force flag to some tests.
- Import changes to reference test data for fixel2voxel and sh2power.
@Lestropie (Member, Author):

The cleanup is for some reason not working. I've masked it with -force and explicit deletion of expected output directories on a per-test basis, but it's nevertheless not doing as intended.

@Lestropie (Member, Author):

I can at least confirm that, outside of the ctest environment, if I manually add bin/ from my build directory to my PATH, then testing_npyread.exe works and succeeds, whereas if I don't do that then testing_npyread.exe fails silently, just as occurs in the ctest logs (both locally and in CI).

@daljit46 force-pushed the split_tests branch 2 times, most recently from 5f081b4 to 77cc0d7 (May 10, 2024 21:56)
@daljit46 force-pushed the split_tests branch 2 times, most recently from 7fbc510 to e0fa479 (May 11, 2024 06:33)
@daljit46 (Member) commented May 11, 2024:

Ok, so I spent a lot of time figuring out why CTest is unable to execute npyread. Running ctest --verbose, I can see that despite my providing the exact same $PATH and arguments, running the exact same commands manually yields a different result. Either this is a bug in CTest itself or in my understanding of how execute_process is supposed to work. Either way, a clear lesson for me is that we should not use RunTest.cmake to execute the bash tests, because it makes debugging things very difficult. Instead, following your idea of using trap, we can just create a run_test.sh that replaces RunTests.cmake; something like this works:

#!/usr/bin/bash

# Wrapper that runs a single bash test script, cleaning up temporary
# files on exit regardless of whether the test passes or fails.
script_path=$1
working_dir=$2
additional_paths=$3

# The EXIT trap fires on success, failure, or interruption
trap 'rm -rf tmp* *-tmp-*' EXIT

cd "$working_dir" || exit 1

if [ -n "$additional_paths" ]; then
  export PATH="$additional_paths:$PATH"
fi

bash "$script_path"

Then we can move cmake/BashTests.cmake to testing/cmake and call run_tests from there. I've tested this and it works just fine. I'm happy to implement this if you agree.
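
Registering a test through that wrapper from CMake might then look like this (test name and paths are illustrative):

add_test(NAME npyread
         COMMAND bash ${CMAKE_CURRENT_SOURCE_DIR}/run_test.sh
                 ${CMAKE_CURRENT_SOURCE_DIR}/npyread   # script_path
                 ${CMAKE_CURRENT_BINARY_DIR}           # working_dir
                 "${PROJECT_BINARY_DIR}/bin")          # additional_paths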

Additionally, I just remembered that in #2437 we decided to disable npy tests on Windows because installing numpy was tricky. Here, we will run into the same problems so I'm not even sure if we want to make the effort to make the tests work on MSYS2.

Lestropie added a commit that referenced this pull request May 12, 2024
Resolves CI testing issues discussed in #2865.
@Lestropie (Member, Author):

I was getting some rather puzzling results myself yesterday. Wasn't able to sufficiently wrap my head around exactly what was going on to be able to make modifications or even post anything of use here.

There's clearly something going very wrong with the paths, given after your changes the MSYS2 CI produced this:
npyread: line 3: rm: command not found

On my own Windows system, with a clean build, I now have all tests failing...

> Then we can move cmake/BashTests.cmake to testing/cmake and call run_tests from there. I've tested this and it works just fine. I'm happy to implement this if you agree.

If you think it would be a worthwhile change in its own right, I'd be happy for the change to be made. I think that for the sake of getting this PR merged to facilitate progressing with others I'll do something different, but there's clearly something more fundamental going wrong that would be preferable to resolve longer term.

> Additionally, I just remembered that in #2437 we decided to disable npy tests on Windows because installing numpy was tricky. Here, we will run into the same problems so I'm not even sure if we want to make the effort to make the tests work on MSYS2.

I recall implementing code in all relevant commands to support the prospect of numpy being absent: the read tests should create an empty directory and then read an empty directory, issuing warnings each time, while the write test should still be able to write a bunch of data that then simply isn't checked. I recall doing this not just for the sake of not having tests fail on systems where numpy isn't installed, but actually for passing CI on MSYS2. These mechanisms are however not working, because the C++ binaries outright refuse to execute due to not finding the MRtrix3 core DLL (despite it being in PATH).

What I'm hoping will work instead in 36f333b is to not generate those tests in the first place if numpy is not available. Hopefully that can at least get CI clearing here.

@Lestropie (Member, Author):

36f333b doesn't work:

  • NPY tests get excluded on all systems, not just MSYS2 as intended.
  • Just like on my local system, all C++ binary tests now fail on the MSYS2 CI.
  • Would be better to use find_package(Python3 REQUIRED COMPONENTS NumPy); if(${Python3_NumPy_FOUND}), present since cmake 3.14 (a sketch follows this comment). But on my local Windows system that test passes, only for ctest to still fail.

Could do a sneaky sneaky and omit the NPY tests based on if(MINGW AND WIN32), but I'd rather not...
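
A sketch of that suggestion, dropping REQUIRED so that configuration doesn't abort when NumPy is absent (the test-registration call is hypothetical):

# Only generate NPY tests when the Python NumPy module is discoverable
find_package(Python3 COMPONENTS Interpreter NumPy)
if(Python3_NumPy_FOUND)
  add_bash_test(unit_tests npyread)
else()
  message(STATUS "NumPy not found: skipping NPY tests")
endif()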

@Lestropie (Member, Author):

On 1f4b8be:

$ ctest -L unittest --rerun-failed --output-on-failure --verbose
500: Test command: C:\msys64\mingw64\bin\cmake.exe "-D" "BASH=/usr/bin/bash.exe" "-D" "FILE_PATH=npyread" "-D" "CLEANUP_CMD=rm -rf /home/rob/src/mrtrix3/testing/unit_tests/tmp* /home/rob/src/mrtrix3/testing/unit_tests/*-tmp-*" "-D" "WORKING_DIRECTORY=/home/rob/src/mrtrix3/testing/unit_tests" "-P" "C:/msys64/home/rob/src/mrtrix3/cmake/RunTest.cmake"
500: Working Directory: C:/msys64/home/rob/src/mrtrix3/build_release/testing/unit_tests
500: Environment variables:
500:  PATH=/home/rob/src/mrtrix3/build_release/testing/unit_tests:/home/rob/src/mrtrix3/build_release/bin:/home/rob/src/mrtrix3/build_release/testing/tools:C:\msys64\home\rob\src\dcm2niix\build\bin
500:  C:\msys64\mingw64\bin
500:  C:\msys64\usr\local\bin
500:  C:\msys64\usr\bin
500:  C:\msys64\usr\bin
500:  C:\Windows\System32
500:  C:\Windows
500:  C:\Windows\System32\Wbem
500:  C:\Windows\System32\WindowsPowerShell\v1.0;C:\msys64\usr\bin\site_perl
500:  C:\msys64\usr\bin\vendor_perl
500:  C:\msys64\usr\bin\core_perl
500: Test timeout computed to be: 10000000
500: CMake Error at C:/msys64/home/rob/src/mrtrix3/cmake/RunTest.cmake:17 (message):
500:   Test npyread failed!

@Lestropie enabled auto-merge May 12, 2024 04:52
Lestropie added a commit that referenced this pull request May 12, 2024
Attempting to resolve residual errors in MSYS2 CI in #2865.
This is necessary to pass tests on MSYS2, as tests involving execution of mrinfo need to invoke the MRtrix3 command mrinfo rather than the Windows command of the same name.
@Lestropie merged commit 9371123 into dev May 12, 2024
5 of 6 checks passed
@Lestropie deleted the split_tests branch May 12, 2024 23:30
daljit46 added a commit that referenced this pull request May 15, 2024
Sanitizers workflow needs to be updated after #2865
Lestropie added a commit that referenced this pull request May 18, 2024
In #2678 the working directory for unit tests is set to the new directory testing/data/. But in #2865, bash tests are executed as-is, rather than being imported and executed on a per-line basis. Reconciling the two requires setting the absolute path to the test bash script.
@Lestropie mentioned this pull request Sep 13, 2024