
Update processing to be more general

The eval implementation assumes a certain form for its input,
which requires complex strings from the token to be parsed
into commands. This is unwanted, as programs can generate
quite complicated commands in string form, with different
shapes (for safety, for example): single quotes, double
quotes, backticks, and other characters that bash, Python
and JSON may interpret or parse differently.

The implementation in this commit changes the bash part to
leave the input string intact instead of splitting it into
pieces. Moreover, the Python side now uses the default
(non-shell) behaviour of subprocess and passes the command
as a list, which also helps to pass complex strings through
unchanged.

This was tested with strings containing escaped characters
in nested and non-nested forms, and is more robust.
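
For illustration only, a minimal sketch of the two call styles, using plain subprocess rather than the repository's execute() helper; the token contents and the bash stand-in for process_task.sh below are made up for this example:

import subprocess

# Hypothetical token; in the real worker this comes from the token database.
token = {'_id': 'token123',
         'input': "echo 'single quote' && echo \"double quote\" && echo `date`"}

# Old style (sketch): build one shell string; every quote and backtick in
# token['input'] is re-parsed by the shell and must be escaped correctly.
old_command = ('/usr/bin/time -v ./process_task.sh "' + token['input'] + '" '
               + token['_id'])

# New style (sketch): pass an argument list; token['input'] reaches
# process_task.sh as a single, intact argv element.
new_command = ['/usr/bin/time', '-v', './process_task.sh',
               token['input'], token['_id']]

# Stand-in for process_task.sh, showing the argument arrives unmodified:
result = subprocess.run(['bash', '-c', 'printf "%s\\n" "$1"', '_', token['input']],
                        capture_output=True, text=True)
print(result.stdout)  # prints the input exactly as stored in the token
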
lnauta committed Oct 11, 2024
1 parent 86e75cc commit 6ab7f4e
Showing 2 changed files with 14 additions and 7 deletions.
19 changes: 13 additions & 6 deletions examples/local-example.py
@@ -47,11 +47,20 @@ def process_task(self, token):
             print(key, value)
         print("-----------------------")
 
-        # Start running the main job
-        # /usr/bin/time -v ./process_task.sh [input] [tokenid] 2> logs_[token_id].err 1> logs_[token_id].out
-        command = "/usr/bin/time -v ./process_task.sh " + "\"" +token['input'] + "\" " + token['_id'] + " 2> logs_" + str(token['_id']) + ".err 1> logs_" + str(token['_id']) + ".out"
-        out = execute(command, shell=True)
+        # Start running the main job, the logging is done internally and saved below
+        # /usr/bin/time -v ./process_task.sh [input] [tokenid]
+        command = ["/usr/bin/time", "-v", "./process_task.sh", token['input'], token['_id']]
+        out = execute(command)
+
+        logsout = f"logs_{token['_id']}.out"
+        logserr = f"logs_{token['_id']}.err"
+
+        # write the logs
+        with open(logsout, 'w') as f:
+            f.write(out[2].decode('utf-8'))
+        with open(logserr, 'w') as f:
+            f.write(out[3].decode('utf-8'))
+
         self.subprocess = out[0]
 
         # Get the job exit code and done in the token
@@ -61,11 +70,9 @@ def process_task(self, token):
         # Attach logs in token
         curdate = time.strftime("%d/%m/%Y_%H:%M:%S_")
         try:
-            logsout = "logs_" + str(token['_id']) + ".out"
             log_handle = open(logsout, 'rb')
             token.put_attachment(logsout, log_handle.read())
 
-            logserr = "logs_" + str(token['_id']) + ".err"
             log_handle = open(logserr, 'rb')
             token.put_attachment(logserr, log_handle.read())
         except:
2 changes: 1 addition & 1 deletion examples/process_task.sh
@@ -22,7 +22,7 @@ echo $TOKENID
 echo $OUTPUT
 
 #Start processing
-eval $INPUT
+bash -c "$INPUT"
 if [[ "$?" != "0" ]]; then
     echo "Program interrupted. Exit now..."
     exit 1

3 comments on commit 6ab7f4e

@hailihu
Collaborator

The changes seem fine to me. Using "bash -c" instead of "eval" creates a subshell. I don't know if this may have unforeseen consequences, but at least it doesn't for the simple examples in the repo.
Can you specify the test case for which the old implementation causes problems and the new one fixes it?
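
To make the subshell point concrete, here is a small illustration (not from the repo; plain subprocess and a hypothetical INPUT value): with eval the command runs in the same shell and its side effects persist, while with bash -c they stay in the child shell.

import os
import subprocess

inp = 'cd /tmp'  # hypothetical INPUT that changes shell state

# eval runs INPUT in the same shell, so the cd persists for the rest of the script.
r_eval = subprocess.run(['bash', '-c', 'eval "$INPUT"; pwd'],
                        env={**os.environ, 'INPUT': inp},
                        capture_output=True, text=True)

# bash -c runs INPUT in a child shell, so the cd is gone afterwards.
r_bashc = subprocess.run(['bash', '-c', 'bash -c "$INPUT"; pwd'],
                         env={**os.environ, 'INPUT': inp},
                         capture_output=True, text=True)

print(r_eval.stdout.strip())   # /tmp (or the platform's equivalent)
print(r_bashc.stdout.strip())  # the original working directory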

@hailihu
Collaborator

@hailihu hailihu commented on 6ab7f4e Oct 24, 2024

I found the test cases in the Jira issue:

  • echo 'single quote' && echo "double quote" && echo `backtick command` && echo escaped backslash
  • echo "this is a test"; echo 'nesting with "double quotes" and `backticks`'

@lnauta
Member Author

@lnauta lnauta commented on 6ab7f4e Oct 24, 2024

As you can see, the comment section of GitHub also interprets special characters; similarly, eval and bash may or may not interpret special characters depending on the syntax.

In our case we have text coming from JSON objects into bash that should be preserved properly.
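
A minimal illustration of that point (hypothetical payload, plain json/subprocess): the string decoded from JSON is handed to bash as one argv element, so its quoting is interpreted only once, by bash -c.

import json
import subprocess

# Hypothetical token payload as it might arrive as JSON text.
raw = '{"_id": "token123", "input": "echo \'single\' && echo \\"double\\" && echo `date`"}'
token = json.loads(raw)

# No intermediate shell string: the decoded value goes straight into argv.
result = subprocess.run(['bash', '-c', token['input']],
                        capture_output=True, text=True)
print(result.stdout)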
