Changes to improve exception handling #287

stevenrhall · 2025-01-26T14:51:59Z

Addresses issue #286

The Problem

Currently, there are many situations in which cyipopt produces predictable but cryptic errors in response to user input errors. Some examples:

If some of the indices returned by jacobianstructure are negative, one would expect an error message such as
```
ValueError: All jacobian row indices must be non-negative and less than m
```
Instead, Ipopt halts with status code -100 (Unrecoverable Exception), without any indication that this issue is an error in the indices.
If the hessian callback raises an exception, one would expect the exception to be raised eventually for the user to see. That doesn't happen, and instead the hessian is not calculated, and typically the Ipopt iteration will never converge. This bug is really insidious, because the user won't know that there's an exception in their code.
If a user specifies hessian indices that are not lower triangular, Ipopt exits with status code -12, Invalid Option, because it concludes that no hessian is provided, and expects the hessian_approximation option to be "limited-memory". It would be much better of course to get an error message:
```
ValueError: Indices are not lower triangular in hessianstructure
```

The Causes

There are multiple issues:

The jacobianstructure and hessianstructure user callbacks are called at problem instantiation only to get the length of the index arrays, with no error checking. Error checking is deferred until the Ipopt solver is called, and the error checking is done in the callbacks. That's a particularly problematic design choice, because an exception should not be raised in the callbacks. When an error is detected, the callbacks return False, which in turn leads Ipopt to return a failure status code. A quick fix would be to raise an exception in the callbacks, which would be caught in the try-except block. That's awkward, but actually works on MacOS, but causes a segfault on Linux, because it leaves the index arrays uninitialized.
There are bugs in the hessian_cb function. For one, the user_data pointer is cast as self = <object>user_data instead of self = <Problem>user_data. That changes how Cython name mangles variables, and the result is that the self.__exception = sys.exc_info() doesn't actually pass the exception back to the class, and so an exception in the user hessian callback never gets re-raised. Further, there's a missing return True that prevents the intermediate callback from stopping Ipopt execution.

Proposed Solution

My proposed solution moves all the index checking into the Problem.__init__ method, so that all errors are caught at problem instantiation. This effectively moves all the code from jacobian_struct_cb and hessian_struct_cb into the constructor, and also simplifies jacobian_cb and hessian_cb. All errors in the indices raise ValueError exceptions, with richer messages than the current log messages to provide better feedback to the user.

I've also fixed the logic error that silenced all exceptions in the user hessian callback.

In order to allow unit testing of these changes, I added a feature that should be useful to the user. I've added a Problem.info attribute, which holds the info dict return in a problem.solve(x0), so that if an exception is re-raised after the call to the Ipopt solver, it can be caught by the unit tests (or the user!), and both the return status and the state of the solver can be examined. Adding Problem.info does not break the current API, and could actually be viewed as a positive feature to help users debug their code. If not desired as a feature, it could be changed to Problem.__info.

I've also fixed a few typos here and there.

Readiness of the PR

This PR is not to master but to a feature branch, improve-exception-handling, so that if the PR is accepted, some polishing can be done before merging to main. I have some additional tests that aren't quite ready, and minor changes to the docs would be required as well. If you'd prefer a different approach, let me know.

The current PR does pass all the CI tests I believe. I've modifed the test_deriv_errors.py test to reflect my changes. An added bonus is that because all the indices are properly tested and an exception is raised if there's an error before the call the Ipopt solver, it's no longer necessary to distinguish pre and post Ipopt v3.14.13, and that is reflected in the tests.

The changes here produce generally better error messages for cyipopt. In many cases, errors in the callbacks produced predictable errors in Ipopt, but with error messages that would be somewhat cryptic. So for examples, errors in the hessianstructure callback produced status code -12 errors (Invalid Option), with the invalid option being hessian_approximation = “limited-memory”, which from a user perspective does not point to an error in the hessianstructure callback. With these changes, errors in the structure callbacks or exceptions in the other callbacks all produce meaningful exceptions and error messages.

moorepants · 2025-01-28T04:53:03Z

This will take a careful look to process. So it may take a little longer to review.

stevenrhall · 2025-01-28T11:16:58Z

This will take a careful look to process. So it may take a little longer to review.

No problem. I'm really looking for feedback so I can make needed changes if you want to move forward.

On my local repo, I've also made changes that allow you to get the coverage of the tests, including the cython code. Doing so allowed me to catch a couple of issues I had missed. It's a very modest change if you would like that incorporated. (2 lines in setup.py, which have no effect unless the proper environment variables are set, and a few lines in .coveragerc and pytest.ini. It would also take a line or two in the workflow files to set the envirnment variable for testing.)

moorepants · 2025-01-28T12:03:40Z

The main thing I want to think about is if this is somehow backwards incompatible. If the earlier error handling is relied on downstream, then we made need some caution.

moorepants · 2025-03-03T05:07:30Z

@stevenrhall can you comment on whether this has backwards incompatibilities? For example, if a downstream program relied on the current error propagation, i.e. looking for one of the return codes or the Cyipopt error, what will now happen?

moorepants · 2025-03-03T05:09:53Z

cyipopt/tests/unit/test_deriv_errors.py

-def ensure_unrecoverable_exception(instance):
-    # -100: Unrecoverable Exception
-    # *Should* be returned from errors in initialization
-    ensure_solve_status(instance, -100)


Removing these ensure_unrecoverable_exception feels like a break in backwards compatibility at first glance.

moorepants · 2025-03-03T05:10:36Z

cyipopt/tests/unit/test_deriv_errors.py

-    ensure_solve_status(instance, -100)
-
-
-@pytest.mark.skipif(pre_3_14_13, reason="Not caught in Ipopt < (3,14,13)")


Does this mean that the test would now pass on the older supported IPOPTs?

moorepants · 2025-03-03T05:13:37Z

So it seems you will return ValueErrors for things that used to return various IPOPT status messages. Downstream programs may have relied on these status messages. We can issue a warning that these will be changed to ValueErrors in a future release. This would give a single cyipopt release with warnings so that people have a chance to fix anything.

stevenrhall · 2025-03-03T12:56:46Z

So it seems you will return ValueErrors for things that used to return various IPOPT status messages. Downstream programs may have relied on these status messages. We can issue a warning that these will be changed to ValueErrors in a future release. This would give a single cyipopt release with warnings so that people have a chance to fix anything.

Yes, there's no doubt that these changes (and a few that I missed and should add) cause a change in behavior, which may mean that the changes proposed need to be changed to warnings for now and not be made errors until a major version increment. The appropriate warning category is probably RuntimeWarning.

Regarding the test suite, if you did want to use warnings rather than errors, then many of the changes I made in the test suite would have to be reverted, and then modified to detect that the correct warning was issued. That might require only a small change to ensure_solve_status, for example.

Please let me know if you would like me make changes in this direction.

moorepants · 2025-03-03T13:18:57Z

My preference is that we give warnings for at least a version for changes that are likely to cause downstream users to change their code. I prefer being conservative on such things, to minimize disruption. I will accept the new error handling design, but we should give warning.

stevenrhall · 2025-03-03T14:06:59Z

OK, I will take another pass at this. Will certainly take me a week or so. Thanks.

moorepants reviewed Mar 3, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changes to improve exception handling #287

Changes to improve exception handling #287

stevenrhall commented Jan 26, 2025 •

edited

Loading

moorepants commented Jan 28, 2025

stevenrhall commented Jan 28, 2025

moorepants commented Jan 28, 2025

moorepants commented Mar 3, 2025

moorepants Mar 3, 2025

moorepants Mar 3, 2025

moorepants commented Mar 3, 2025

stevenrhall commented Mar 3, 2025

moorepants commented Mar 3, 2025

stevenrhall commented Mar 3, 2025

		ensure_solve_status(instance, -100)


		@pytest.mark.skipif(pre_3_14_13, reason="Not caught in Ipopt < (3,14,13)")

Changes to improve exception handling #287

Are you sure you want to change the base?

Changes to improve exception handling #287

Conversation

stevenrhall commented Jan 26, 2025 • edited Loading

The Problem

The Causes

Proposed Solution

Readiness of the PR

moorepants commented Jan 28, 2025

stevenrhall commented Jan 28, 2025

moorepants commented Jan 28, 2025

moorepants commented Mar 3, 2025

moorepants Mar 3, 2025

Choose a reason for hiding this comment

moorepants Mar 3, 2025

Choose a reason for hiding this comment

moorepants commented Mar 3, 2025

stevenrhall commented Mar 3, 2025

moorepants commented Mar 3, 2025

stevenrhall commented Mar 3, 2025

stevenrhall commented Jan 26, 2025 •

edited

Loading