Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

run.command(): Support for command-line piping #1392

Closed
Lestropie opened this issue Jul 11, 2018 · 5 comments
Closed

run.command(): Support for command-line piping #1392

Lestropie opened this issue Jul 11, 2018 · 5 comments
Labels
enhancement internship Issues suitable for student projects scripts

Comments

@Lestropie
Copy link
Member

When the Python script library was first provided, piping in/out of image data actually worked just as it does for binary executables; this was due to the use of shell=True in the run.command() function. This capability was revoked as this function became more advanced; this however now prevents the use of piped images as the input / output of Python scripts.

To restore support for this will require some gymnastics:

  1. Command-line arguments will need to have a flag indicating if they are an input or an output image, much like the .type_*() C++ functions.

  2. The app module will need to replace the value of any '-' input image argument with the contents of stdin. The run.command() function could either just display this, or replace any appearances of temporary image paths with '-'.
    This I think would be easier than trying to detect the '-' within run.command() (where information about whether it's an input or output would have been lost) and connect the processes's stdin to that of the main process.

  3. The run.command() function will need to be informed if the command call includes an output to a pipe, and store the contents of stdout from that call somewhere for the app module to write to stdout of the main process.
    This can't strictly be done based on the "last" run.command() call, since a script may contain multiple outputs and an output piped image may not necessarily appear as part of the very last one.

@Lestropie
Copy link
Member Author

Rethought this a little since it was re-raised:

Could have keyword arguments to the run.command() function indicating that the command string contains at least one user image input / output (would need to be one or the other). Then the run module could itself manage the gymnastics between "-" characters temporary image paths, terminal outputs, and stdin / stdout contents in an entirely enclosed way.

Alternatively, this could potentially be managed within the fsys.fromUser() function (which would also need to be informed as to whether the path is an input or output). This would technically be more "general" since there's no guarantee that a temporary image piped to/from a script will necessarily be an input to / output from run.command(). Such handling though probably exceeds what one would intuitively expect from the "filesystem" module.

Too many possible solutions without a clear winner :-/

@Lestropie
Copy link
Member Author

Okay, aborted on my first tentative attempt.

One of the issues is that a piped image provided as input to a Python script may need to be accessed both at the point of loading e.g. into the scratch directory, and also at completion of the script for the use of app.mrconvert_output_option() (which is used to properly update "command_history"). So based on that alone, support for this feature (without huge amounts of copy&paste code) would require much deeper embedding of identification of argument types, i.e. labelling of command-line arguments as input image / output image. Which starts to head toward a complete rewrite of the command-line parsing code and removal of argparse...

@Lestropie
Copy link
Member Author

In retrospect, after looking at the documentation more closely, it might be possible to support this without going to the extent of #1608.

argparse permits the input to the type kwarg to be any callable that accepts a string. Therefore custom callables could be provided for input and output images.

  • For input images, this would mean that a "-" input could be detected by the relevant function and be filled with content provided at stdin.

  • What I'm not sure about is whether or not this would solve the issue entirely for piping image data out. Here's what I think might be possible. The callable just latches the specified image path. Only upon executing the __str__() function of that option does it detect that the specified path is "-", generates the temporary file path, spits that out to stdout, and returns that path on function close.

@Lestropie Lestropie added the internship Issues suitable for student projects label Dec 12, 2022
@Lestropie
Copy link
Member Author

Lestropie commented Dec 12, 2022

  • Investigate history of this functionality of argparse; find out whether use of such will place a lower limit on Python version dependency for MRtrix3

  • Familiarize oneself with how image piping works in MRtrix3

  • Familiarise oneself with argparse and creation of custom type classes

    • Get basic no-operation custom class working in test executable script outside of MRtrix3
    • Add basic trivial functionality to custom class and make sure it has differential behaviour depending on user input
  • Add custom callable types
    (see relevant C++ code: type list and special handling):

    • Boolean
    • Integer sequence (comma-delimited)
    • Floating-point sequence (comma-delimited)
    • Input image
    • Output image
    • Input directory
    • Output directory
      • Will require special behaviour as -force no longer permits overwriting of non-empty directories; @Lestropie to investigate current vs. desired behaviour
    • Input tractogram
    • Output tractogram
  • Make use of callable types throughout various MRtrix3 commands

  • Make sure that existing script tests execute successfully

  • Consider adding integer & floating-point types that optionally impose lower & upper value limits as can be done in C++ (and gets reported in __print_full_usage__ and tested during command-line parsing)

  • Remove checks on input / output data that become redundant because of explicit handling of those operations in the CLI parser (@Lestropie responsibility)

  • Add support for piping image data (both input and output)

    • Good test case for input: mrconvert dwi.mif - | dwishellmath - mean shellmeans.mif
    • Good test case for output: dwishellmath dwi.mif mean - | mrconvert - shellmeans.mif
    • Test cases of multiple piped inputs / multiple piped outputs
    • Ensure that temporary piped images are written to the appropriate location given system configuration
    • If possible, make system robust to potentially reading from input image that may be a pipe multiple times
  • Add Continuous Integration (CI) tests for:

    • Piping of image data using Python scripts
    • Any scripts for which changing the command-line parsing breaks functionaity in the absence of script content changes
  • Add command-line parsing error if piped image is specified but pipe is unconnected (already a functionality in C++)

  • Refine behaviour of __print_full_usage__

Deferred:

  • Experiment with adoption of argparse.FileType for handling input and output files
    (will likely require modifications to the code of any scripts that make use of such)

@Lestropie
Copy link
Member Author

Closed by #2678.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement internship Issues suitable for student projects scripts
Projects
None yet
Development

No branches or pull requests

1 participant