Adjust topic initial regret initialization #670
Conceptually looks good. Minor comments
Nice, just noted some simplification
a674331 to 8d69e3f
78f5882 to 9d5474b
The PR description should be updated to reflect the latest intentions implemented by this PR / description of what's actually here @guilherme-brandao
@guilherme-brandao the PR template was not fully filled out
Just 1 missing param. Otherwise, matches this description
d42d128 to 58cd0d3
The latest Buf updates on your PR. Results from workflow Buf Linter / buf (pull_request).
Purpose of Changes and their Description
PROBLEM
To calculate the initial topic regret, we need to have at least one participant with enough experience to be used in the StdDev. workerRegrets are selected based on the worker experience: experienceCount > 1/alphaRegret
After the last migration, we deleted all regrets, set the initial regret of topics to 0, and also set all scores to 0. All participants were reinitialized.
A high number of workers with equal scores means high volatility in the active set and, consequently, a low chance of experienced participants in the active set. The problem gets a lot worse for topics that have long epochs.
SOLUTION
... 1/alphaRegret epochs of experience. If these conditions aren’t met, we’re in a cold start, prioritizing rapid inclusion to avoid gaps.
The essence of the change is to maintain the purpose of regret equilibrium before calculating standard deviation, aligning with the design intention of regret handling.
Link(s) to Ticket(s) or Issue(s) resolved by this PR
https://linear.app/alloralabs/issue/PROTO-2806/investigate-potential-inference-synthesis-issues-found-by-research
Are these changes tested and documented?
Unreleased section of CHANGELOG.md?