Intelligent Pool

An example of playing billiards with a sampling-based method, and of combining that method with supervised learning.

We use CMA-ES (Covariance Matrix Adaptation Evolution Strategy) to search for a good shot in a billiard problem. CMA-ES is a sampling-based method: it samples different actions (shooting direction and power in our case), evaluates the final result of each action, and keeps the best ones. It adapts the covariance matrix of its sampling distribution based on the evaluated samples, so that later samples concentrate around promising actions and the best action is found faster.
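The sample, evaluate, and adapt loop described above can be sketched in a few lines of Python. This is a simplified stand-in and not CMA-ES itself: it only refits the mean and covariance to the best samples of each generation, whereas real CMA-ES also uses evolution paths and step-size control. The function `shot_score` and its target values are hypothetical placeholders for the billiard simulator.

```python
import math
import random

def shot_score(angle, power, target_angle=0.8, target_power=0.6):
    """Hypothetical stand-in for the billiard simulator: higher is better."""
    return -((angle - target_angle) ** 2 + (power - target_power) ** 2)

def sample_gaussian(mean, cov, rng):
    """Draw one 2D sample from N(mean, cov) using a 2x2 Cholesky factor."""
    a = math.sqrt(cov[0][0])
    b = cov[0][1] / a
    c = math.sqrt(max(cov[1][1] - b * b, 1e-12))
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + a * z0, mean[1] + b * z0 + c * z1)

def fit_gaussian(points):
    """Return the mean and 2x2 covariance of a list of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # A tiny jitter keeps the covariance positive definite.
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + 1e-9
    syy = sum((p[1] - my) ** 2 for p in points) / n + 1e-9
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), [[sxx, sxy], [sxy, syy]]

def search_shot(score, mean=(0.0, 0.0), sigma=1.0,
                population=30, elite=8, iterations=40, seed=0):
    """Simplified evolution strategy over (angle, power)."""
    rng = random.Random(seed)
    cov = [[sigma ** 2, 0.0], [0.0, sigma ** 2]]
    best_action, best_score = mean, score(*mean)
    for _ in range(iterations):
        # 1. Sample candidate actions from the current distribution.
        samples = [sample_gaussian(mean, cov, rng) for _ in range(population)]
        # 2. Evaluate every action and rank them, best first.
        ranked = sorted(samples, key=lambda s: score(*s), reverse=True)
        if score(*ranked[0]) > best_score:
            best_action, best_score = ranked[0], score(*ranked[0])
        # 3. Adapt the distribution to the best samples.
        mean, cov = fit_gaussian(ranked[:elite])
    return best_action, best_score

action, value = search_shot(shot_score)
print("best (angle, power):", action, "score:", value)
```

Each generation costs `population` simulator runs, which is why this family of methods is slow when a single evaluation means simulating a full shot.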

Demo of playing billiards using CMA-ES, with a 2D heatmap of the action space

CMA-ES is slow, but its results are generally good. A neural network is fast at inference time. We can therefore use the results from CMA-ES as training data and train a neural network on them with supervised learning.
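As a rough illustration of this idea, the sketch below trains a tiny one-hidden-layer network on (state, action) pairs. Everything in it is an assumption made for illustration: `slow_optimizer` is a hypothetical placeholder that returns an analytic aiming angle instead of running CMA-ES, the state is just a ball position, and `TinyNet` is a hand-written toy rather than the network used in the repository.

```python
import math
import random

def slow_optimizer(ball_x, ball_y):
    """Hypothetical placeholder for the slow search: returns an aiming angle."""
    return math.atan2(ball_y, ball_x)

def make_dataset(n, rng):
    """Collect (state, action) pairs by querying the slow optimizer."""
    data = []
    for _ in range(n):
        x, y = rng.uniform(0.2, 1.0), rng.uniform(-1.0, 1.0)
        data.append(((x, y), slow_optimizer(x, y)))
    return data

class TinyNet:
    """One hidden tanh layer, one linear output, trained with plain SGD."""

    def __init__(self, hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, state):
        h = [math.tanh(w[0] * state[0] + w[1] * state[1] + b)
             for w, b in zip(self.w1, self.b1)]
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2, h

    def predict(self, state):
        return self.forward(state)[0]

    def train_step(self, state, target, lr):
        out, h = self.forward(state)
        err = out - target  # gradient of 0.5 * err^2 with respect to out
        for j, hj in enumerate(h):
            # Backpropagate through the hidden unit before updating w2[j].
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            self.w1[j][0] -= lr * grad_h * state[0]
            self.w1[j][1] -= lr * grad_h * state[1]
        self.b2 -= lr * err

def mean_squared_error(net, data):
    return sum((net.predict(s) - t) ** 2 for s, t in data) / len(data)

def train(net, data, epochs=100, lr=0.05, seed=0):
    rng = random.Random(seed)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for state, target in data:
            net.train_step(state, target, lr)

rng = random.Random(1)
dataset = make_dataset(200, rng)
net = TinyNet(hidden=8, rng=rng)
print("error before training:", mean_squared_error(net, dataset))
train(net, dataset)
print("error after training:", mean_squared_error(net, dataset))
```

Once trained, a single `predict` call replaces a full search, at the cost of whatever error the network still makes.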

However, the result from the neural network is not as good as expected (as shown in the next video). Even so, the output of the neural network can still be used as the initial guess for CMA-ES, which speeds up the optimization.
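The effect of a good initial guess can be sketched with a toy experiment. The code below is a minimal illustration under stated assumptions, not the repository's implementation: the optimizer is a simplified isotropic evolution strategy rather than real CMA-ES, `shot_score` is a hypothetical stand-in for the simulator, and the "network prediction" is a hard-coded value close to, but not exactly at, the best action.

```python
import random

def shot_score(angle, power):
    """Hypothetical stand-in for the simulator; the best action is (0.8, 0.6)."""
    return -((angle - 0.8) ** 2 + (power - 0.6) ** 2)

def evaluations_to_solve(initial_guess, sigma, seed, population=20, elite=5,
                         tolerance=1e-4, max_iterations=200):
    """Count simulator calls until a sample scores above -tolerance."""
    rng = random.Random(seed)
    mean = list(initial_guess)
    evaluations = 0
    for _ in range(max_iterations):
        samples = [(rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma))
                   for _ in range(population)]
        evaluations += population
        ranked = sorted(samples, key=lambda s: shot_score(*s), reverse=True)
        if shot_score(*ranked[0]) > -tolerance:
            return evaluations
        best = ranked[:elite]
        # Move the mean to the elite samples and shrink sigma to their spread.
        mean = [sum(s[i] for s in best) / elite for i in range(2)]
        spread = sum((s[0] - mean[0]) ** 2 + (s[1] - mean[1]) ** 2
                     for s in best) / (2 * elite)
        sigma = max(spread ** 0.5, 1e-6)
    return evaluations

seeds = range(20)
# Cold start: no prior knowledge, so search broadly from the origin.
cold = sum(evaluations_to_solve((0.0, 0.0), 1.0, s) for s in seeds)
# Warm start: begin at an imperfect prediction with a narrower search.
predicted_action = (0.75, 0.65)  # hypothetical neural network output
warm = sum(evaluations_to_solve(predicted_action, 0.2, s) for s in seeds)
print("average evaluations, cold start:", cold / len(seeds))
print("average evaluations, warm start:", warm / len(seeds))
```

The warm start helps because the search begins near the optimum and can use a smaller initial spread. If the prediction were far off, a spread that small could slow the search down instead, so the initial spread should reflect how much the prediction is trusted.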

Here is the demo using a simplified billiard environment:

Comparison of using CMA-ES only, the trained neural network only, and both together

We used the simplified environment because the full billiard game is, by its nature, too hard to learn. We would probably need many more samples and a deeper neural network to produce an acceptable result.

The article below explains the billiard example in more detail, including what we tried and why it is hard to train a neural network for this game.

Article: Intelligent Pool

Go to Sourcecode

This Intelligent Pool example is one of the examples in the UnityTensorflowKeras repository. Follow the link below to the repository and install it according to its instructions.

View on GitHub

The Intelligent Pool example is located in the Assets/UnityTensorflow/Examples/IntelligentPool directory.

Exercises

Xiaoxiao Ma 2018-10-15 EXAMPLE-UNITY
Games Unity