
Cross entropy stopping condition for training. #684

Closed · Tilps opened this issue Jan 17, 2019 · 2 comments

Comments

Tilps (Contributor) commented Jan 17, 2019

Rather than using a fixed 800 nodes, or using pruning to do early stopping (which, based on its selection point, favors certain distribution shapes), I wonder if we could use a distribution cross-entropy approach.
Some earlier analysis showed that the cross-entropy delta rate peaks at about 800 nodes and has diminishing returns afterwards, but that was averaged over many games. If we sample the visit distribution every x nodes, we can calculate an average cross-entropy change per node over the last x nodes, but specific to the current position, which may have a longer period of information gain than the average.

Then we can set a minimum and maximum node visit range, and in between we stop if the average cross-entropy change per node drops below a configurable value y.
We would need some simulations to find the trade-off between speed and how low a value of y we can go, but perhaps once we have draw agreements in place, a bit of the time saved per game can be spent here.
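
A minimal sketch of such a stopping rule, assuming the search exposes per-move visit counts at the root. One plausible reading of "cross entropy per node change" is the KL divergence between consecutive sampled distributions (H(p, q) minus H(p, p)) divided by the sample interval; that interpretation, and all names and defaults below, are illustrative assumptions, not lc0's implementation:

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)
    for moves that have not been visited yet."""
    return float(-np.sum(p * np.log(q + eps)))

def should_stop_early(prev_visits, curr_visits, sample_interval,
                      threshold_y, total_nodes,
                      min_nodes=200, max_nodes=800):
    """Decide whether to stop the search at the current sample point.

    prev_visits / curr_visits are per-move root visit counts taken
    sample_interval nodes apart. All parameters and defaults here
    are hypothetical, for illustration only.
    """
    if total_nodes < min_nodes:
        return False   # always spend at least min_nodes
    if total_nodes >= max_nodes:
        return True    # hard cap, as in the fixed-node scheme
    p = prev_visits / prev_visits.sum()
    q = curr_visits / curr_visits.sum()
    # H(p, q) - H(p, p) is the KL divergence of the old distribution
    # from the new one, i.e. how far the visit distribution moved
    # over the last interval.
    delta = cross_entropy(p, q) - cross_entropy(p, p)
    return delta / sample_interval < threshold_y
```

In use, the caller would snapshot the root visit counts every x playouts and pass consecutive snapshots in; the min/max bounds keep the rule from stopping on a noisy early distribution or running unbounded.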

Naphthalin (Contributor) commented

#721 was merged a while ago, this issue can be closed I guess :)

Tilps (Contributor, Author) commented May 1, 2020

yeap, done.

Tilps closed this as completed May 1, 2020