pre-refactor: Add orchestrator.types module with common types #326
base: main
Conversation
Signed-off-by: declark1 <44146800+declark1@users.noreply.github.com>
ba01ed7 to 24a125c
Mostly looks good - a couple questions
pub role: Option<&'a openai::Role>,
/// The text contents of the message.
pub text: Option<&'a str>,
}
Did we want to track a potential assistant `refusal`? (Otherwise, what happens on a refusal?)
We aren't doing anything with `refusal`, so I didn't see a reason to include it here, although I can add it if needed.
Yeah, `refusal` isn't being used currently, so it's fine for it not to be in this object. And whenever that gets more widely used, we might need to take different "actions" on it (speculating).
The intended use of this internal object is only for internal processing, like calling out to detectors etc., and it doesn't affect the output a user will see. So `refusal` not being in here doesn't mean it can't be in the final output to the user. Is that accurate @declark1?
Correct @gkumbhat. Since this `ChatMessage` type is just references, I'll go ahead and add the field even though we don't need it now.
I see, yes, my main concern had been whether this would still be reflected in the final result if a `refusal` were to occur on completions.
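To make the outcome of this thread concrete, here is a hedged sketch of a `ChatMessage` carrying the optional `refusal` reference. The field layout follows the snippet quoted above, but `&str` stands in for `openai::Role` to keep the sketch self-contained; this is an illustration, not the PR's exact code.

```rust
// Hypothetical sketch: ChatMessage with the optional `refusal` field
// discussed above. `&'a str` replaces `&'a openai::Role` here so the
// example compiles on its own.
#[derive(Debug, Default)]
pub struct ChatMessage<'a> {
    /// The role of the message author.
    pub role: Option<&'a str>,
    /// The text contents of the message.
    pub text: Option<&'a str>,
    /// The refusal message, if the assistant refused (unused by detectors today).
    pub refusal: Option<&'a str>,
}

fn main() {
    // The struct only borrows, so adding the field costs one more reference.
    let msg = ChatMessage {
        role: Some("assistant"),
        text: None,
        refusal: Some("I can't help with that."),
    };
    assert!(msg.text.is_none() && msg.refusal.is_some());
    println!("refusal tracked: {}", msg.refusal.is_some());
}
```

Because the type is all references, carrying the field even while unused is essentially free, which matches the "add it now" conclusion above.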
The primary issue with these components is that they were designed specifically for the *Streaming Classification With Generation* task and lack the flexibility to be extended to additional streaming use cases that require batching detections, e.g.
- A use case may require different batching logic
- A use case may need to use different containers to implement its batching logic
I'm not sure I understand the use case described here - when would separate or different containers be used for batching?
e.g. the chat completions batching logic will likely not use `BTreeMap` like we do for the `MaxProcessedIndexBatcher`. The idea here is that the implementation details of the batcher, such as how it stores state and the types it uses internally, should be flexible.
I think the use of "containers" was a bit confusing here then - more like struct / data structure usage?
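The flexibility discussed in this thread can be sketched as a trait whose implementors choose their own internal data structures. All names, signatures, and types below are illustrative assumptions, not the PR's actual definitions:

```rust
use std::collections::{BTreeMap, VecDeque};

// Illustrative stand-ins for the real types.
type InputId = u32;
type Detection = &'static str;

// Sketch of a batcher trait that leaves storage to the implementation.
trait DetectionBatcher {
    fn push(&mut self, input_id: InputId, detection: Detection);
    fn pop_batch(&mut self) -> Option<Vec<Detection>>;
}

// One batcher may keep per-input state ordered in a BTreeMap, like the
// existing MaxProcessedIndexBatcher...
#[derive(Default)]
struct OrderedBatcher {
    state: BTreeMap<InputId, Vec<Detection>>,
}

impl DetectionBatcher for OrderedBatcher {
    fn push(&mut self, input_id: InputId, detection: Detection) {
        self.state.entry(input_id).or_default().push(detection);
    }
    fn pop_batch(&mut self) -> Option<Vec<Detection>> {
        // Emit detections for the lowest pending input first.
        let id = *self.state.keys().next()?;
        self.state.remove(&id)
    }
}

// ...while another may only need FIFO order and a VecDeque.
#[derive(Default)]
struct FifoBatcher {
    state: VecDeque<Detection>,
}

impl DetectionBatcher for FifoBatcher {
    fn push(&mut self, _input_id: InputId, detection: Detection) {
        self.state.push_back(detection);
    }
    fn pop_batch(&mut self) -> Option<Vec<Detection>> {
        self.state.pop_front().map(|d| vec![d])
    }
}

fn main() {
    let mut ordered = OrderedBatcher::default();
    ordered.push(1, "hap");
    ordered.push(0, "pii");
    assert_eq!(ordered.pop_batch(), Some(vec!["pii"]));

    let mut fifo = FifoBatcher::default();
    fifo.push(0, "pii");
    assert_eq!(fifo.pop_batch(), Some(vec!["pii"]));
}
```

The point is only that the trait boundary hides whether a batcher uses a `BTreeMap`, a `VecDeque`, or something else entirely.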
/// Pushes new detections.
fn push(
    &mut self,
    input_id: InputId,
I'm forgetting a bit what "input" refers to for the id here - could we document this?
`input_id: u32` was added to support new requirements for chat completion streaming and purposely has a generic name. Its usage here isn't clear as the code where it's relevant isn't added yet in this PR.

For each chat completion chunk received, we have a `message_index` indicating the message position in the stream (which is passed through to the chunker). The completion can contain multiple choices, so we also have a secondary `choice_index`. Each choice's content is handled independently with its own chunk->detection pipeline, so we need to track both throughout the pipeline. Previously, we only needed to track `message_index` (although I believe the TGIS API is able to produce multiple generations too).

So the question becomes: how do we name this additional index that needs to be tracked but not sent through to the chunker, and therefore isn't one of the indices in `Chunk`? I didn't want to name it `choice_index` as that is specific to chat completions and this is common code. I figured "input" would work, but the term "index" is a bit overloaded throughout the code, so I went with `input_id`. For generation streaming, it will always be 0. For chat completion streaming, it will correspond to the `choice_index`, which will also often be 0 except when the user explicitly requests `n > 1`.

This is primarily needed for the batching logic in `ChatCompletionBatcher` and I have a placeholder comment there noting that it will map to `choice_index`.
I think a comment just to indicate that it refers to `choice_index` / `message_index` would be helpful.
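The indexing scheme described in this thread can be sketched roughly as follows; `Chunk` and the pipeline map here are illustrative stand-ins, not the PR's actual types:

```rust
use std::collections::HashMap;

// Illustrative: input_id keys a per-choice pipeline, while
// message_index travels inside the chunk to the chunker.
type InputId = u32;

#[derive(Debug)]
struct Chunk {
    message_index: usize, // position in the stream, passed to the chunker
    text: String,
}

fn main() {
    // Each choice's content gets its own chunk -> detection pipeline.
    let mut pipelines: HashMap<InputId, Vec<Chunk>> = HashMap::new();

    // A chat completion with n = 2 interleaves chunks for choices 0 and 1;
    // for plain generation streaming, input_id would always be 0.
    let events = [(0u32, 0usize, "Hello"), (1, 0, "Hi"), (0, 1, " world")];
    for (choice_index, message_index, text) in events {
        pipelines
            .entry(choice_index) // input_id maps to choice_index here
            .or_default()
            .push(Chunk { message_index, text: text.to_string() });
    }

    assert_eq!(pipelines[&0].len(), 2);
    assert_eq!(pipelines[&1].len(), 1);
    assert_eq!(pipelines[&0][1].message_index, 1);
}
```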
src/orchestrator/types/detection_batcher/max_processed_index.rs
No. 2 of a few small pre-refactor PRs. This PR adds a new `orchestrator.types` module, including new abstractions used in the refactored code.

- `types.chunk`: `Chunk` and `Chunks`, a collection of `Chunk`s
- `types.detection`: `Detection` and `Detections`, a collection of `Detection`s
- `types.detection_batch_stream`: `DetectionBatchStream`, a stream that wraps the `DetectionBatcher` implementations below
- `types.detection_batcher`:
  - `DetectionBatcher`, a trait used by `DetectionBatchStream`
  - `MaxProcessedIndexBatcher`, a `DetectionBatcher` implementation based on the existing "max processed index" aggregator
  - `ChatCompletionBatcher`, a `DetectionBatcher` implementation for chat completions (placeholder, not yet implemented)
  - `NoopBatcher`, a `DetectionBatcher` implementation that doesn't actually batch (no-op, for testing purposes)
- `types.chat_message`:
  - `ChatMessage`, similar to `ChatMessageInternal`, except it is only for `text` for now (keeping it simple as we don't currently support images or audio) and it holds references instead of owned values to avoid copying the original text, hence the lifetimes
  - `ChatMessageIterator`, which iterates over `ChatMessage`s
- plus several type aliases
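As a rough illustration of the no-op batcher's purpose, here is a minimal sketch assuming a simplified `DetectionBatcher` trait; the method names and signatures are guesses for illustration, not the PR's actual API:

```rust
// Sketch of the no-op batcher idea: each detection is emitted
// immediately as its own single-element batch, with no aggregation.
type Detection = &'static str;

trait DetectionBatcher {
    fn push(&mut self, detection: Detection);
    fn pop_batch(&mut self) -> Option<Vec<Detection>>;
}

#[derive(Default)]
struct NoopBatcher {
    ready: Vec<Vec<Detection>>,
}

impl DetectionBatcher for NoopBatcher {
    fn push(&mut self, detection: Detection) {
        // No aggregation: pass each detection straight through.
        self.ready.push(vec![detection]);
    }
    fn pop_batch(&mut self) -> Option<Vec<Detection>> {
        if self.ready.is_empty() {
            None
        } else {
            Some(self.ready.remove(0))
        }
    }
}

fn main() {
    let mut batcher = NoopBatcher::default();
    batcher.push("pii");
    batcher.push("hap");
    assert_eq!(batcher.pop_batch(), Some(vec!["pii"]));
    assert_eq!(batcher.pop_batch(), Some(vec!["hap"]));
    assert_eq!(batcher.pop_batch(), None);
}
```

A pass-through implementation like this makes tests of `DetectionBatchStream` deterministic, since batching policy is factored out entirely.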