Solving Hearts with Deep Learning

Overview

This repository solves Hearts (the card game) using a simplified version of Deep Counterfactual Regret Minimization, aka Deep CFR. The basic idea is:

  1. Start with a model that considers all actions to be equally advantageous. This model plays randomly, since all actions are equally likely to be chosen.
  2. Play the model against itself for thousands of games. At each decision point, compare the predicted outcome of each action to the actual outcome of taking that action; the difference is called "regret". (See the code sketch after this list.)
  3. Train a new version of the model using the comparisons generated in the previous step.
  4. Repeat from step 2 for multiple iterations.
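
Here is the loop above as a minimal, self-contained sketch. It uses PyTorch in Python purely for illustration (the repository itself is .NET code), and random tensors stand in for real Hearts states and outcomes; the layer sizes, sample counts, and the self_play stub are all assumptions, not the repository's actual values.

    import torch
    import torch.nn as nn

    STATE_SIZE, NUM_ACTIONS = 52, 13   # placeholder sizes, not the repo's

    def make_model():
        # A freshly initialized model predicts roughly equal advantages
        # for every action, so the first iteration plays nearly at random.
        return nn.Sequential(nn.Linear(STATE_SIZE, 64), nn.ReLU(),
                             nn.Linear(64, NUM_ACTIONS))

    def self_play(model, num_deals):
        # Stand-in for playing deals of Hearts. At each decision point we
        # record the state, the model's predicted outcome per action, and
        # the actual outcome; random tensors substitute for the real game.
        samples = []
        for _ in range(num_deals):
            state = torch.rand(STATE_SIZE)
            with torch.no_grad():
                predicted = model(state)
            actual = torch.rand(NUM_ACTIONS)      # would come from the game
            samples.append((state, actual - predicted))   # "regret"
        return samples

    def train(samples, epochs=50):
        # Fit a *new* model to the regrets, rather than updating the old one.
        model = make_model()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        states = torch.stack([s for s, _ in samples])
        regrets = torch.stack([r for _, r in samples])
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(states), regrets)
            loss.backward()
            opt.step()
        return model

    model = make_model()               # step 1: uniform (random) play
    for _ in range(5):                 # step 4: repeat for several iterations
        model = train(self_play(model, num_deals=1000))   # steps 2 and 3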

Building and running

  1. Build and run the training program, Hearts.Learn. This requires a high-end computer with a fast GPU and CPU, and will take several days to complete. Models are saved in the /Models directory. You can track progress via the TensorBoard directory, /runs (e.g. tensorboard --bind_all --logdir .\Hearts.Learn\bin\Release\net9.0\).
  2. Build the web server, Hearts.Web.Server.
  3. Copy one of the trained models to the web server's runtime directory (e.g. ./bin/Debug or ./bin/Release) and rename it to AdvantageModel.pt.
  4. Start the web server by building and running Hearts.Web.Harness.
  5. In the Hearts.Web/Client directory, start the client via npm install followed by npm start.
  6. Browse to http://localhost:8081/ to play the game.

You can also play the game online on my website.

Modeling Hearts

Because Hearts ends when one of the players reaches 100 points, it can sometimes benefit players to cooperate near the end of a game in order to avoid going over the limit. This model ignores that aspect of the game entirely and focuses only on the score within the current deal.
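
As an illustration of scoring a single deal, a payoff function might look like the hypothetical sketch below. The function deal_payoff and its scoring convention are assumptions for illustration, not the repository's actual code:

    def deal_payoff(points_taken: list[int], player: int) -> float:
        # points_taken[i] is the penalty count player i took this deal
        # (one point per heart, thirteen for the queen of spades).
        # Only the current deal matters: the running game score and the
        # 100-point game-over threshold are deliberately ignored.
        others = [p for i, p in enumerate(points_taken) if i != player]
        # One simple convention: how much better the player did than the
        # average opponent (fewer points is better, hence the sign).
        return sum(others) / len(others) - points_taken[player]

    deal_payoff([5, 13, 0, 8], player=2)   # ≈ 8.67: player 2 took nothing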

Differences from Deep CFR

  • Since the rules and objective of Hearts are the same for every seat, optimal strategy is the same for all players, so there is no need to train a separate model for each player. Instead, all players share the same model.
  • Because misdirection/bluffing is not a major part of Hearts, there is no need to distill the advantage models into a separate "strategy" model at the end of the run, as full Deep CFR does. Instead, the advantage model converges on a strategy after a few iterations (see the sketch below).
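
To make the second point concrete, here is a hypothetical sketch of how one shared advantage model can be turned directly into a playing strategy via regret matching, the standard CFR step of playing actions in proportion to their positive advantage. The function name, tensor shapes, and encoding convention are assumptions, not the repository's code:

    import torch

    def strategy_from_advantages(model, state, legal_mask):
        # Every seat queries the same model; the state is assumed to be
        # encoded from the current player's point of view, which is why
        # no per-player model is needed.
        with torch.no_grad():
            advantages = model(state)
        # Regret matching: weight each legal action by its positive
        # predicted advantage, then normalize to a probability distribution.
        positive = torch.clamp(advantages, min=0.0) * legal_mask
        if positive.sum() > 0:
            return positive / positive.sum()
        # If nothing looks advantageous, fall back to uniform play over
        # the legal actions.
        return legal_mask / legal_mask.sum()

With the toy model from the earlier sketch, strategy_from_advantages(model, torch.rand(52), torch.ones(13)) returns a probability distribution over the 13 placeholder actions.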