output detection for chat completions #288
Conversation
Signed-off-by: Chris Santiago <resolutecoder@gmail.com>
Thanks for the contribution! Some comments, along with a couple of high-level things:
- Formatting is failing.
- Let's remove any unused/old functions, comments, and printlns.
- For example 2, the output request may not match the response. With n=3 and max_new_tokens=25, I would expect 3 choices and 25 generated tokens, unless some parameters are being passed to the chat completions model incorrectly?
- Have we confirmed that not providing any detectors in the request yields the expected passthrough response (i.e. the same fields are persisted)?
@evaline-ju here are the request and response without any detectors.
Thanks @resoluteCoder for the contributions. I left a few questions and comments.
if let ChatCompletionsResponse::Unary(ref chat_completion) = chat_completions {
    let choices = Vec::<ChatMessageInternal>::from(chat_completion);

    let output_detections = match detectors.output {
nit: since this function has gotten quite big, can we split out all of this logic (fetching the output detections) into a separate function?
Sounds good. I will break out the output detections into its own function.
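The extraction discussed above could look roughly like this. This is a minimal sketch with simplified stand-in types; the real DetectionResult and GuardrailDetection types (and the helper's final name and signature) belong to the orchestrator code, so treat every name here as an assumption.

```rust
// Simplified stand-ins for the orchestrator's types (illustrative only).
#[derive(Debug)]
enum GuardrailDetection {
    // Detection with a known starting span
    ContentAnalysisResponse { start: usize },
    // Detection without span information
    Other,
}

#[derive(Debug)]
struct DetectionResult {
    index: usize,
    results: Vec<GuardrailDetection>,
}

/// Hypothetical extracted helper: sorts detections by choice index, and
/// each detection's results by starting span (spanless results keyed to
/// the end of the list, mirroring the snippet in the review).
fn handle_output_detections(mut detections: Vec<DetectionResult>) -> Vec<DetectionResult> {
    detections.sort_by_key(|d| d.index);
    detections
        .into_iter()
        .map(|mut detection| {
            let last_idx = detection.results.len();
            detection.results.sort_by_key(|r| match r {
                GuardrailDetection::ContentAnalysisResponse { start } => *start,
                GuardrailDetection::Other => last_idx,
            });
            detection
        })
        .collect()
}

fn main() {
    let sorted = handle_output_detections(vec![
        DetectionResult {
            index: 1,
            results: vec![
                GuardrailDetection::Other,
                GuardrailDetection::ContentAnalysisResponse { start: 1 },
            ],
        },
        DetectionResult { index: 0, results: vec![] },
    ]);
    assert_eq!(sorted[0].index, 0);
    assert!(matches!(
        sorted[1].results[0],
        GuardrailDetection::ContentAnalysisResponse { start: 1 }
    ));
    println!("ok");
}
```

The main handler would then match on `detectors.output` and delegate, keeping the chat completions path short.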
Some(mut output_detections) if !output_detections.is_empty() => {
    output_detections.sort_by_key(|value| value.index);

    let detections = output_detections
        .into_iter()
        .map(|mut detection| {
            let last_idx = detection.results.len();
            // sort detection by starting span; if span is not present, move to the end of the message
            detection.results.sort_by_key(|r| match r {
                GuardrailDetection::ContentAnalysisResponse(value) => {
                    value.start
                }
                _ => last_idx,
            });
            detection
        })
        .collect::<Vec<_>>();
nit: since this is basically the same as the processing done at input time, can we move this logic into a separate function and reuse it at both input and output time?
I agree that they are similar, and we could break that out into a sort_detections function. However, when I was doing just that, I remembered that in the input portion I returned InputDetectionResult instead of DetectionResult, effectively "casting" it to the appropriate type. If I do that in the output section, I can remove the conversion I do on lines 281-287. Or, of course, I can remove the "casting" in those spots, extract the shared functionality, and then loop through and "cast" the results to their appropriate types afterward. I'm good with whichever. Regardless, I did kind of mix and match those solutions 😅
Here is what it would look like
Option 1 - more iterating with concise functionality
fn sort_detections(mut detections: Vec<DetectionResult>) -> Vec<DetectionResult> {
detections.sort_by_key(|value| value.index);
detections
.into_iter()
.map(|mut detection| {
let last_idx = detection.results.len();
// sort detection by starting span, if span is not present then move to the end of the message
detection.results.sort_by_key(|r| match r {
GuardrailDetection::ContentAnalysisResponse(value) => value.start,
_ => last_idx,
});
detection
})
.collect::<Vec<_>>()
}
detections: Some(ChatDetections {
input: detections
.into_iter()
.map(|detection_result| InputDetectionResult {
message_index: detection_result.index,
results: detection_result.results,
})
.collect(),
output: vec![],
}),
Option 2 - less iterating, repetitive functionality
let detections = input_detections
.into_iter()
.map(|mut detection| {
let last_idx = detection.results.len();
// sort detection by starting span, if span is not present then move to the end of the message
detection.results.sort_by_key(|r| match r {
GuardrailDetection::ContentAnalysisResponse(value) => value.start,
_ => last_idx,
});
InputDetectionResult {
message_index: detection.index,
results: detection.results,
}
})
.collect::<Vec<_>>();
@gkumbhat went ahead and chose option 1, and separated the output detections and the sort detection logic into their own functions.
Thanks for the contribution @resoluteCoder. While reviewing this, I ended up having a lot of nit comments around cleanups and restructuring the processing steps, including:
- simplify function names, e.g.
- message_detections() -> detect()
- detector_chunk_task() -> chunk()
- chunk() should be refactored to take a list of chunker_ids and messages, returning a map of chunker_id -> chunks rather than a map of detectors, as multiple detectors can use the same chunker
- detect() should only perform detection
- preprocessing (only applicable to input detections) should be moved out of detect()
- pass chunks to detect() instead of messages
- return detections sorted from detect() rather than subsequently calling sort_detections()
- drop handle_output_detections() and process in match block for consistency with input detections
- mutate ChatCompletion instead of building a new object
- misc other things
I experimented with these changes and more on a fork here. Since a lot of my suggested changes are applicable to both input and output detection, I won't request the changes here. We also have a broader refactor in the works to clean up and reuse code across handlers, which will replace some of this code with common code (e.g. the separate chunk() and detect() here won't be needed, making some of these comments redundant). I'll open a separate issue/PR for items that don't overlap with the refactor. This review was still very helpful for designing common code. Thanks.
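To make the suggested chunk() shape above concrete, here is a minimal sketch. The signature and the sentence-splitting body are assumptions for illustration; a real implementation would dispatch to the chunker service registered under each id rather than splitting locally.

```rust
use std::collections::HashMap;

/// Hypothetical chunk(): take chunker ids plus the text to chunk, and
/// return a chunker_id -> chunks map, so multiple detectors that share
/// a chunker reuse a single chunking pass.
fn chunk(chunker_ids: &[&str], message: &str) -> HashMap<String, Vec<String>> {
    chunker_ids
        .iter()
        .map(|&id| {
            // Stand-in chunker: naive sentence split on '.'.
            // A real implementation would call the chunker for `id`.
            let chunks: Vec<String> = message
                .split('.')
                .map(str::trim)
                .filter(|s| !s.is_empty())
                .map(String::from)
                .collect();
            (id.to_string(), chunks)
        })
        .collect()
}

fn main() {
    let chunks = chunk(&["sentence_chunker"], "Hello there. General Kenobi.");
    assert_eq!(chunks["sentence_chunker"], vec!["Hello there", "General Kenobi"]);
    println!("ok");
}
```

Detectors would then look up their configured chunker's output in the map instead of each triggering their own chunking call.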
Thanks @resoluteCoder for making the changes. I agree with some of the suggestions that @declark1 made, but if we can make those in a separate PR, that's fine too.
LGTM
Added output detection for chat completions

Design choices
- DetectionResult struct, to allow for re-use of similar logic such as message_detection.
- message_index var in filter_chat_messages, to allow the output detection choice_index to match the choices array in the response.

Example 1
input request
input response

Example 2
output request
output response
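The DetectionResult design choice above can be sketched as follows. These are hypothetical simplified types (results reduced to strings, field names assumed from the thread): one shared struct flows through the detection pipeline and is converted into the input- or output-specific result type at the edge.

```rust
// Shared result used by both input and output detection paths.
#[derive(Debug, Clone)]
struct DetectionResult {
    index: usize,
    results: Vec<String>, // stand-in for GuardrailDetection values
}

#[derive(Debug)]
struct InputDetectionResult {
    message_index: usize,
    results: Vec<String>,
}

#[derive(Debug)]
struct OutputDetectionResult {
    // choice_index matches the `choices` array in the chat completions response
    choice_index: usize,
    results: Vec<String>,
}

impl From<DetectionResult> for InputDetectionResult {
    fn from(d: DetectionResult) -> Self {
        Self { message_index: d.index, results: d.results }
    }
}

impl From<DetectionResult> for OutputDetectionResult {
    fn from(d: DetectionResult) -> Self {
        Self { choice_index: d.index, results: d.results }
    }
}

fn main() {
    let shared = DetectionResult { index: 1, results: vec!["pii".into()] };
    let output: OutputDetectionResult = shared.clone().into();
    let input: InputDetectionResult = shared.into();
    assert_eq!(output.choice_index, 1);
    assert_eq!(input.message_index, 1);
    println!("ok");
}
```

The "casting" discussed in the review thread is then just a From impl per path, so shared helpers like sort_detections can operate on DetectionResult alone.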