Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement error handling #31

Merged
merged 5 commits into from
May 22, 2024

Conversation

declark1
Copy link
Collaborator

@declark1 declark1 commented May 13, 2024

This PR implements initial error handling using 3 layers of errors:

  1. clients::Error is a low-level error that wraps grpc and http client errors.
    • Implements status_code() method to return the actual (http client) or mapped (grpc client) http status code. This is useful for pattern-matching specific types of errors.
  2. orchestrator::Error contains variants for issues that may arise during processing of tasks, such as DetectorNotFound, DetectorRequestFailed, etc. We can extend this as-needed.
  3. server::Error is a high-level error returned to the caller, enabling them to react accordingly to validation and not found errors from the orchestrator and downstream services.
    • Implements axum::IntoResponse to return 422 for validation-type errors (with a detailed message), 404 for not found errors, and 500 (with generic “unexpected error” message) for other errors.
      /// High-level errors to return to clients.
      #[derive(Debug, thiserror::Error)]
      pub enum Error {
          #[error("{0}")]
          Validation(String),
          #[error("{0}")]
          NotFound(String),
          #[error("unexpected error occured while processing request")]
          Unexpected,
      }
      
      impl From<orchestrator::Error> for Error {
          fn from(error: orchestrator::Error) -> Self {
              use orchestrator::Error::*;
              match error {
                  DetectorNotFound { .. } => Self::NotFound(error.to_string()),
                  DetectorRequestFailed { error, .. }
                  | ChunkerRequestFailed { error, .. }
                  | GenerateRequestFailed { error, .. }
                  | TokenizeRequestFailed { error, .. } => match error.status_code() {
                      StatusCode::BAD_REQUEST | StatusCode::UNPROCESSABLE_ENTITY => {
                          Self::Validation(error.to_string())
                      }
                      StatusCode::NOT_FOUND => Self::NotFound(error.to_string()),
                      _ => Self::Unexpected,
                  },
                  _ => Self::Unexpected,
              }
          }
      }
  • Propagates errors from orchestrator tasks
  • Logs orchestrator::Error to provide sufficient detail for debugging
  • Adds response debug logging statements for chunker/detector/generation client calls
  • Updates create_http_clients and create_grpc_clients to panic rather than return Result and updates Client::new methods and orchestrator::create_clients accordingly

Examples:

  1. Invalid detector_id provided in request
    Server response:
    {
        "code": 404,
        "details": "detector not found: not_a_model"
    }
    Error log:
    ERROR fms_guardrails_orchestr8::orchestrator: unary task failed request_id=deaba6cc-9d69-43e0-81dc-3c4ab73532a2 error=detector not found: not_a_model
    
  2. Cannot connect to chunker service
    Server response:
    {
        "code": 500,
        "details": "unexpected error occured while processing request"
    }
    Error log:
    ERROR fms_guardrails_orchestr8::orchestrator: unary task failed request_id=09108c5b-00ab-47d2-8195-5032e0a43a62 error=chunker request failed for chunker_id=en_chunker: error trying to connect: tcp connect error: Connection refused (os error 61)
    

@declark1 declark1 requested a review from gkumbhat May 13, 2024 16:37
@declark1 declark1 requested a review from evaline-ju as a code owner May 13, 2024 16:37
@evaline-ju evaline-ju mentioned this pull request May 15, 2024
2 tasks
declark1 added 2 commits May 21, 2024 10:42
Signed-off-by: declark1 <daniel.clark@ibm.com>
Signed-off-by: declark1 <daniel.clark@ibm.com>
…ound and return 404, drop is_validation_error helper

Signed-off-by: declark1 <daniel.clark@ibm.com>
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the error handling! A few questions

declark1 added 2 commits May 22, 2024 10:21
Signed-off-by: declark1 <daniel.clark@ibm.com>
Signed-off-by: declark1 <daniel.clark@ibm.com>
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Collaborator

@gkumbhat gkumbhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@declark1 declark1 merged commit 45e99d3 into foundation-model-stack:main May 22, 2024
1 check passed
gkumbhat pushed a commit that referenced this pull request May 22, 2024
Signed-off-by: declark1 <daniel.clark@ibm.com>
@declark1 declark1 deleted the error-handling branch May 22, 2024 23:06
@evaline-ju evaline-ju linked an issue May 23, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error handling
3 participants