Allow '_ensure_save_result' to be skipped with a flag in any place it is called. #513

GeraldIr · 2023-12-04T10:44:13Z

Often I find myself being unable to start jobs that have different result nodes which are incompatible with save_result and there has to be some workaround implemented for handling these edge cases.

I'd like to propose an optional flag for methods such as create_job and download which allows ensure save result to be skipped so that no superfluous save_result node gets added.

I can also create a PR for this if a change like this is approved.

jdries · 2023-12-04T10:47:19Z

Do you have an example of these different result nodes? Maybe the check can also be improved? We are for instance thinking about a save_collection process, that could also have this problem.

…

________________________________ From: Gerald Walter Irsiegler ***@***.***> Sent: Monday, December 4, 2023 11:44 AM To: Open-EO/openeo-python-client ***@***.***> Cc: Subscribed ***@***.***> Subject: [Open-EO/openeo-python-client] Allow '_ensure_save_result' to be skipped with a flag in any place it is called. (Issue #513) Often I find myself being unable to start jobs that have different result nodes which are incompatible with save_result and there has to be some workaround implemented for handling these edge cases. I'd like to propose an optional flag for methods such as create_job and download which allows ensure save result to be skipped so that no superfluous save_result node gets added. I can also create a PR for this if a change like this is approved. — Reply to this email directly, view it on GitHub<#513>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABNJPSH5Q75CQFPLCIQLMK3YHWSQTAVCNFSM6AAAAABAFYEIOOVHI2DSMVQWIX3LMV43ASLTON2WKOZSGAZDGNJVGAYDCMI>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***> VITO Disclaimer: http://www.vito.be/e-maildisclaimer

GeraldIr · 2023-12-04T10:55:27Z

We are currently using save_ml_model which stores results of random forest regression, we debated including this functionality in save_result but decided the outputs were different enough to justify its own process (and this might in the future be the case for more things, like UDFs).

I think for more advanced users having the option would be a benefit, and making it optional would save the users relying on it currently from any trouble.

soxofaan · 2023-12-07T15:48:00Z

is save_ml_model the only use case you're talking about? Because we can easily whitelist that process alongside save_result in the logic.

soxofaan · 2023-12-07T15:50:30Z

ah no wait, save_ml_model is already whitelisted in MlModel.create_job() so I'm not sure I fully understand the problem here.

can you provide some examples in code snippets?

ValentinaHutter · 2024-09-23T13:54:26Z

I would like to skip the save_result for some of the sen2like processing and now created a PR to do so: #627 Not sure, if I miss something, I could not find a whitelist for save_ml_model to use it as an example.

jdries · 2024-09-23T14:40:25Z

Side note: if you do connection.create_job(your_cube.flat_graph()), no save_result will be added. This is the easy way around.
A better way forward would be to fetch the schema for the last process, and only add save_resultif the output is a datacube.
Now the thing is, sen2like claims the output is a datacube, so as a user, I would expect that I actually need to add a save_resultprocess, otherwise I won't get valid output. Also the process description does not mention anything in this regards.
@ValentinaHutter Is that process metadata correct?

ValentinaHutter · 2024-09-23T14:54:59Z

Thanks for pointing this out, then I will use the create job with the flat graph for now!
The reason for this is that sen2like processor natively generates a Sentinel2 like .SAFE folder holding all the results. In order to make use of other openeo processes on top of that, we then load the data from the .SAFE afterwards. So, when we only want to generate the .SAFE output, we would not need a save_result process afterwards. This is why the return value of sen2like is a datacube. This is described in the process metadata.

soxofaan · 2024-09-24T12:44:13Z

In order to make use of other openeo processes on top of that, we then load the data from the .SAFE afterwards. So, when we only want to generate the .SAFE output, we would not need a save_result process afterwards.

I'm not completely following: if you don't have a save_result in your first job, you don't really have result assets in your job results so how can you reference to that cube in another job?

Isn't it better to define a "SAFE" file format in your capabilities to be used in this case?

soxofaan · 2024-09-24T12:46:17Z

I could not find a whitelist for save_ml_model to use it as an example.

FYI: I was referring to this part in create_job of the MlModel class:

openeo-python-client/openeo/rest/mlmodel.py

Lines 108 to 110 in 26bef79

    
           if pg.result_node().process_id not in {"save_ml_model"}: 
        
               _log.warning("Process graph has no final `save_ml_model`. Adding it automatically.") 
        
               pg = pg.save_ml_model()

…cute_batch

soxofaan · 2024-09-26T13:30:24Z

added auto_add_save_result option to download/create_job/execute_batch with cfd9f1a

soxofaan added the collect feedback label Jul 18, 2024

soxofaan added a commit that referenced this issue Sep 26, 2024

Issue #513 add auto_add_save_result option to download/create_job/exe…

cfd9f1a

…cute_batch

soxofaan closed this as completed Sep 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow '_ensure_save_result' to be skipped with a flag in any place it is called. #513

Allow '_ensure_save_result' to be skipped with a flag in any place it is called. #513

GeraldIr commented Dec 4, 2023

jdries commented Dec 4, 2023 via email

GeraldIr commented Dec 4, 2023

soxofaan commented Dec 7, 2023

soxofaan commented Dec 7, 2023

ValentinaHutter commented Sep 23, 2024

jdries commented Sep 23, 2024 •

edited

Loading

ValentinaHutter commented Sep 23, 2024

soxofaan commented Sep 24, 2024

soxofaan commented Sep 24, 2024

soxofaan commented Sep 26, 2024

Allow '_ensure_save_result' to be skipped with a flag in any place it is called. #513

Allow '_ensure_save_result' to be skipped with a flag in any place it is called. #513

Comments

GeraldIr commented Dec 4, 2023

jdries commented Dec 4, 2023 via email

GeraldIr commented Dec 4, 2023

soxofaan commented Dec 7, 2023

soxofaan commented Dec 7, 2023

ValentinaHutter commented Sep 23, 2024

jdries commented Sep 23, 2024 • edited Loading

ValentinaHutter commented Sep 23, 2024

soxofaan commented Sep 24, 2024

soxofaan commented Sep 24, 2024

soxofaan commented Sep 26, 2024

jdries commented Sep 23, 2024 •

edited

Loading