pre-refactor: Add orchestrator.types module with common types #326

Open · wants to merge 13 commits into main from types

Conversation

@declark1 declark1 (Collaborator) commented Mar 4, 2025

No. 2 of a few small pre-refactor PRs. This PR adds a new orchestrator.types module, including new abstractions used in the refactored code.

types.chunk

  • Chunk
    • An internal chunk, used for both unary and streaming
  • Chunks
    • A newtype wrapping a Vec of Chunks (rough sketch below)
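
As a rough sketch of the shape (the fields shown here are illustrative assumptions, not the actual definitions; only the Chunk/Chunks names and the newtype-over-Vec idea come from this PR):

```rust
/// An internal chunk, used for both unary and streaming (fields assumed).
pub struct Chunk {
    /// Index of the message this chunk was produced from.
    pub message_index: u32,
    /// Start offset of the chunk within the original text.
    pub start: usize,
    /// End offset of the chunk within the original text.
    pub end: usize,
    /// The chunk text itself.
    pub text: String,
}

/// A newtype wrapping a Vec of Chunks.
pub struct Chunks(pub Vec<Chunk>);
```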

types.detection

  • Detection
    • An internal detection, used for all detector types
  • Detections
    • A newtype wrapping a Vec of Detections (rough sketch below)
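
The same pattern applies here; a minimal sketch (field names are assumptions), including the kind of Deref impl a newtype like this would typically provide:

```rust
use std::ops::Deref;

/// An internal detection, used for all detector types (fields assumed).
pub struct Detection {
    /// ID of the detector that produced this detection.
    pub detector_id: String,
    /// Confidence score reported by the detector.
    pub score: f64,
}

/// A newtype wrapping a Vec of Detections.
pub struct Detections(pub Vec<Detection>);

impl Deref for Detections {
    type Target = Vec<Detection>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```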

types.detection_batch_stream

  • DetectionBatchStream
    • A stream adapter that wraps multiple detection streams and produces a stream of batches using one of the pluggable DetectionBatcher implementations below.
    • More details to be added to this PR explaining the rationale and how this works; a rough shape sketch follows below.
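
A shape sketch only (the generics and fields here are assumptions; the real adapter likely differs):

```rust
use futures::stream::{SelectAll, Stream};

/// Wraps multiple detection streams and produces a stream of batches
/// using a pluggable batcher (see the DetectionBatcher trait below).
pub struct DetectionBatchStream<S, B>
where
    S: Stream + Unpin,
{
    /// All inner per-detector streams, merged into a single stream.
    streams: SelectAll<S>,
    /// Pluggable logic that decides when a batch is ready.
    batcher: B,
}
```

The Stream impl would poll the merged inner streams, push each incoming detection into the batcher, and yield a batch whenever the batcher reports one is ready.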

types.detection_batcher

  • DetectionBatcher
    • A trait to implement pluggable batching logic for DetectionBatchStream (sketched after this list)
  • MaxProcessedIndexBatcher
    • A DetectionBatcher implementation based on the existing "max processed index" aggregator
  • ChatCompletionBatcher
    • A DetectionBatcher implementation for chat completions (placeholder, not yet implemented)
  • NoopBatcher
    • A DetectionBatcher implementation that doesn't actually batch (no-op, for testing purposes)
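
A sketch of the trait, inferred from the description above and the review snippets below; only `push` taking an `input_id` is visible in this PR, so the rest of the signatures are assumptions:

```rust
/// From the review thread below: input_id is a u32 with a purposely
/// generic name.
pub type InputId = u32;

/// Pluggable batching logic for DetectionBatchStream.
pub trait DetectionBatcher {
    /// The batch type this batcher produces.
    type Batch;

    /// Pushes new detections into the batcher's internal state
    /// (parameters after input_id are assumed).
    fn push(&mut self, input_id: InputId, detections: Detections);

    /// Removes and returns a batch, if one is ready (assumed method).
    fn pop_batch(&mut self) -> Option<Self::Batch>;
}
```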

types.chat_message

  • ChatMessage
    • An internal representation of an OpenAI chat message, which can be a request message or a completion choice. This is essentially the same idea as the existing ChatMessageInternal, except it is text-only for now (keeping it simple, as we don't currently support images or audio) and it holds references instead of owned values to avoid copying the original text, hence the lifetimes.
  • ChatMessageIterator
    • An iterator over ChatMessages (see the sketch after this list)
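
A sketch of the borrowed design (the openai stub below stands in for the real client types; the field sets and the iterator's internals are assumptions):

```rust
/// Stub standing in for the real OpenAI client types (assumed shapes).
mod openai {
    pub enum Role {
        System,
        User,
        Assistant,
    }

    pub struct Message {
        pub role: Role,
        pub content: Option<String>,
    }
}

/// An internal OpenAI chat message holding references, not owned values.
pub struct ChatMessage<'a> {
    /// The role of the message author.
    pub role: Option<&'a openai::Role>,
    /// The text contents of the message.
    pub text: Option<&'a str>,
}

/// An iterator over ChatMessages, borrowing from the underlying messages.
pub struct ChatMessageIterator<'a> {
    messages: std::slice::Iter<'a, openai::Message>,
}

impl<'a> Iterator for ChatMessageIterator<'a> {
    type Item = ChatMessage<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        // Borrow role and text; no text is copied.
        self.messages.next().map(|message| ChatMessage {
            role: Some(&message.role),
            text: message.content.as_deref(),
        })
    }
}
```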

… plus several type aliases.

@declark1 declark1 marked this pull request as draft March 4, 2025 01:31
@declark1 declark1 marked this pull request as ready for review March 4, 2025 19:21
@declark1 declark1 force-pushed the types branch 4 times, most recently from ba01ed7 to 24a125c on March 5, 2025 19:08
@evaline-ju evaline-ju (Collaborator) left a comment

Mostly looks good - a couple questions

pub struct ChatMessage<'a> {
    pub role: Option<&'a openai::Role>,
    /// The text contents of the message.
    pub text: Option<&'a str>,
}
Collaborator

Did we want to track a potential assistant refusal? (Otherwise, what happens on a refusal?)

Collaborator Author

We aren't doing anything with refusal, so I didn't see a reason to include it here, although I can add it if needed.

Collaborator

yeah, refusal isn't getting used currently, so it's fine for it not to be in this object. And whenever it gets more widely used, we might need to take different "actions" on it (speculating).

The intended use of this internal object is only for internal processing, like calling out to detectors, etc., and it doesn't affect the output a user will see. So refusal not being in here doesn't mean it can't be in the final output to the user. Is that accurate @declark1?

@declark1 declark1 (Collaborator Author) commented Mar 6, 2025

Correct @gkumbhat. Since this ChatMessage type is just references, I'll go ahead and add the field even though we don't need it now.

Collaborator

I see, yes, my main concern had been whether this would still be reflected in the final result if a refusal were to occur on completions.


The primary issue with these components is that they were designed specifically for the *Streaming Classification With Generation* task and lack the flexibility to be extended to additional streaming use cases that require batching detections, e.g.
- A use case may require different batching logic
- A use case may need to use different containers to implement its batching logic
Collaborator

I'm not sure I understand the use case described here - when would separate or different containers be used for batching?

@declark1 declark1 (Collaborator Author) commented Mar 5, 2025

e.g. the chat completions batching logic will likely not use a BTreeMap as we do for the MaxProcessedIndexBatcher. The idea here is that the implementation details of the batcher, such as how it stores state and the types it uses internally, should be flexible.
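
For illustration, the internal state might look something like this (hypothetical fields, just to show the point about flexibility):

```rust
use std::collections::BTreeMap;

pub struct MaxProcessedIndexBatcher {
    /// Pending detections, ordered by chunk index (assumed layout).
    pending: BTreeMap<usize, Detections>,
    /// Highest index that every detector has fully processed (assumed).
    max_processed_index: usize,
}
```

A ChatCompletionBatcher would be free to store its state in whatever form suits choice-based batching instead.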

Collaborator

I think the use of containers was a bit confusing here then, more like struct / data structure usage?

/// Pushes new detections.
fn push(
    &mut self,
    input_id: InputId,
Collaborator

I'm forgetting a bit what 'input' refers to for the id here - could we document this?

@declark1 declark1 (Collaborator Author) commented Mar 5, 2025

input_id: u32 was added to support new requirements for chat completion streaming and purposely has a generic name. Its usage here isn't clear because the code where it's relevant isn't added yet in this PR.

For each chat completion chunk received, we have a message_index indicating the message position in the stream (which is passed through to the chunker). The completion can contain multiple choices, so we also have a secondary choice_index. Each choice's content is handled independently with its own chunk->detection pipeline, so we need to track both throughout the pipeline. Previously, we only needed to track message_index (although I believe the TGIS API is able to produce multiple generations too).

So, the question becomes: how do we name this additional index that needs to be tracked but not sent through to the chunker, and therefore isn't one of the indices in Chunk? I didn't want to name it choice_index, as that is specific to chat completions and this is common code. I figured "input" would work, but the term "index" is a bit overloaded throughout the code, so I went with input_id. For generation streaming, it will always be 0. For chat completion streaming, it will correspond to the choice_index, which will also often be 0, except when the user explicitly requests n > 1.

This is primarily needed for the batching logic in ChatCompletionBatcher and I have a placeholder comment there noting that it will map to choice_index.
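
To make the mapping concrete, a sketch (StreamKind and its variants are hypothetical names, used only for illustration):

```rust
enum StreamKind {
    Generation,
    ChatCompletion { choice_index: u32 },
}

fn input_id(kind: &StreamKind) -> u32 {
    match kind {
        // Generation streaming tracks a single generation.
        StreamKind::Generation => 0,
        // Each chat completion choice gets its own chunk->detection
        // pipeline; usually 0 unless the user requests n > 1.
        StreamKind::ChatCompletion { choice_index } => *choice_index,
    }
}
```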

Collaborator

I think a comment just to indicate that it refers to choice_index / message_index would be helpful
