Rather than using a fixed 800 nodes, or using pruning for early stopping (whose selection point favors certain distribution shapes), I wonder if we could use an approach based on the cross entropy of the visit distribution.
Previous analysis showed that the cross entropy delta rate peaks at about 800 nodes, with diminishing returns afterwards, but that was averaged over many games. If we sample the visit distribution every x nodes, we can calculate the average cross entropy change per node over the last x nodes. That figure is specific to the current position, which may have a longer period of information gain than the average.
Then we can set a minimum and a maximum node count, and in between stop the search once the average cross entropy change per node drops below a configurable value y.
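To make the idea concrete, here is a rough sketch of the stopping rule. It is only one possible reading of the proposal: I am assuming that "cross entropy change" is measured as the KL divergence between the current visit distribution and the one sampled x nodes earlier, and the names `visit_distribution`, `kl_divergence`, `should_stop`, `min_nodes` and `max_nodes` are made up for illustration, not taken from the engine.

```python
import math


def visit_distribution(visits):
    """Normalize raw per-move visit counts into a probability distribution."""
    total = sum(visits)
    return [v / total for v in visits]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats. eps guards against log(0) for moves unvisited in q."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def should_stop(prev_visits, curr_visits, min_nodes, max_nodes, y):
    """Decide whether to stop searching the current position.

    prev_visits / curr_visits: per-move visit counts at the previous and the
    current sample point (assumed to be x nodes apart).
    y: configurable threshold on the average change per node.
    """
    nodes = sum(curr_visits)
    if nodes < min_nodes:
        return False  # always search at least min_nodes
    if nodes >= max_nodes:
        return True  # never search more than max_nodes
    prev_nodes = sum(prev_visits)
    x = nodes - prev_nodes
    if prev_nodes == 0 or x <= 0:
        return False  # no usable earlier sample to compare against
    # Assumption: KL(current || previous) stands in for "cross entropy change".
    change = kl_divergence(visit_distribution(curr_visits),
                           visit_distribution(prev_visits))
    return change / x < y
```

The search loop would call `should_stop` every x nodes with the latest two samples of the root visit counts. Whether KL divergence, plain cross entropy, or a difference of cross entropies is the right measure is something the simulations would have to settle.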
We would need to run some simulations to find the trade-off between speed and how low y can go, but once draw agreements are in place, maybe some of the time saved per game can be spent here.