Pong
Trained agent using supervised learning

Trained on 30,000 samples recorded from my own gameplay. The result is not good enough: the agent does not know how to react to observations unlike any it has seen in the training samples.
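As a rough illustration of the supervised approach, the sketch below fits a policy to recorded (observation, action) pairs, a setup often called behavior cloning. It is not the code from the repository. The one-number observation (ball offset relative to the racket), the three-action set, the `expert_action` stand-in for the human player, and the linear softmax model are all assumptions made to keep the example short.

```python
import math

ACTIONS = ("up", "stay", "down")  # hypothetical discrete action set


def expert_action(offset):
    """Hypothetical stand-in for the human player: move toward the ball."""
    if offset > 0.1:
        return 0  # up
    if offset < -0.1:
        return 2  # down
    return 1  # stay


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_behavior_cloning(samples, epochs=200, lr=0.5):
    """Fit a linear softmax policy by gradient descent on cross-entropy."""
    w = [0.0, 0.0, 0.0]
    b = [0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = [0.0, 0.0, 0.0]
        for x, action in samples:
            probs = softmax([w[k] * x + b[k] for k in range(3)])
            for k in range(3):
                # Gradient of cross-entropy w.r.t. logit k: p_k - one_hot_k
                err = probs[k] - (1.0 if k == action else 0.0)
                grad_w[k] += err * x
                grad_b[k] += err
        for k in range(3):
            w[k] -= lr * grad_w[k] / n
            b[k] -= lr * grad_b[k] / n
    return w, b


def predict(w, b, x):
    """Return the index of the most likely action for observation x."""
    logits = [w[k] * x + b[k] for k in range(3)]
    return max(range(3), key=lambda k: logits[k])


# Synthetic "recorded gameplay": offsets on a grid, labeled by the stand-in expert.
samples = [(i / 100, expert_action(i / 100)) for i in range(-100, 101)]
w, b = train_behavior_cloning(samples)
print(ACTIONS[predict(w, b, 0.8)], ACTIONS[predict(w, b, -0.8)])
```

A policy trained this way can only imitate what appears in its samples, which is consistent with the weakness described above: states the human player never visited contribute nothing to the loss.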
Trained agent using reinforcement learning and self-play

The agent plays much better than the one trained with supervised learning. It even learns to hit the ball with the edge of the racket to create harder situations for the opponent.
This example also demonstrates how to use supervised learning to initialize the neural network used for reinforcement learning. Generally, this can speed up the training process.
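The idea of initializing the reinforcement learning network from supervised weights can be sketched as follows. This is a simplified illustration, not the repository's implementation: it uses a hypothetical one-step task instead of full Pong with self-play, a linear softmax policy instead of a neural network, the plain REINFORCE update as the RL algorithm, and hard-coded `pretrained` values standing in for weights saved after a supervised phase.

```python
import math
import random

# Hypothetical one-step "Pong-like" task: the observation is the vertical
# offset of the ball relative to the racket; actions are 0=up, 1=stay, 2=down.
OFFSETS = [i / 100 for i in range(-100, 101)]


def correct_action(offset):
    """Action that earns reward 1 in this toy task."""
    if offset > 0.1:
        return 0
    if offset < -0.1:
        return 2
    return 1


def action_probs(params, x):
    """Softmax action probabilities of a linear policy."""
    w, b = params
    logits = [w[k] * x + b[k] for k in range(3)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(params):
    """Average probability of picking the rewarded action over all offsets."""
    total = sum(action_probs(params, x)[correct_action(x)] for x in OFFSETS)
    return total / len(OFFSETS)


def reinforce(params, steps=2000, lr=0.1, seed=0):
    """Fine-tune a copy of params with the REINFORCE policy-gradient update."""
    rng = random.Random(seed)
    w, b = list(params[0]), list(params[1])
    for _ in range(steps):
        x = rng.choice(OFFSETS)
        p = action_probs((w, b), x)
        a = rng.choices(range(3), weights=p)[0]
        reward = 1.0 if a == correct_action(x) else 0.0
        for k in range(3):
            # d log pi(a|x) / d logit_k for a softmax policy
            grad_logp = (1.0 if k == a else 0.0) - p[k]
            w[k] += lr * reward * grad_logp * x
            b[k] += lr * reward * grad_logp
    return w, b


# Untrained policy: every action is equally likely.
cold_start = ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
# Hypothetical values standing in for weights saved after the supervised phase.
pretrained = ([2.0, 0.0, -2.0], [0.0, 0.0, 0.0])

# Warm start: copy the supervised weights into the RL policy before training.
warm_start = (list(pretrained[0]), list(pretrained[1]))

cold_reward = expected_reward(cold_start)
warm_reward = expected_reward(warm_start)
tuned_reward = expected_reward(reinforce(warm_start))
print(cold_reward, warm_reward, tuned_reward)
```

In this toy setting the warm-started policy begins RL with a higher expected reward than the untrained one, which is the intuition behind the possible speed-up; how much it helps in the actual Pong example depends on the quality of the supervised samples.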
Trained agent using reinforcement learning and self-play, with weights initialized from supervised learning

Compared with the agent trained with RL only, it makes fewer unnecessary movements, although it tends to keep the racket at the bottom for some unknown reason.
Go to Source Code
This Pong example is one of the examples in the UnityTensorflowKeras repository. Follow the link below to the repository and install it according to its instructions.
The Pong example is located under the Assets/UnityTensorflow/Examples/Pong directory.
For more information about this example, see Here.
Exercises
NA
Xiaoxiao Ma EXAMPLE-UNITY
Games Unity