Trained with 30,000 samples recorded from my own gameplay. The result is not good enough: the agent does not know how to react when it encounters observations unlike anything in the training samples.
The agent plays much better than the one trained with supervised learning. It even learns to hit the ball with the edge of the racket to create a harder situation for the opponent.
This example also demonstrates how to use supervised learning to initialize the neural network used for reinforcement learning. In general, this can speed up the training process.
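The idea of initializing a policy with supervised learning before reinforcement learning can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the repository's code: it assumes a toy one-step Pong-like task, a linear softmax policy standing in for the neural network, a scripted demonstrator standing in for recorded gameplay, and plain REINFORCE as the RL update. All names (Policy, pretrain_supervised, reinforce_step, and so on) are hypothetical.

```python
import numpy as np

ACTIONS = np.array([+1, 0, -1])  # index 0 = up, 1 = stay, 2 = down
STEP = 0.1                       # racket movement per action (toy value)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class Policy:
    """Linear softmax policy; a stand-in for the neural network."""
    def __init__(self, rng, obs_dim=2, n_actions=3):
        self.W = rng.normal(scale=0.01, size=(obs_dim, n_actions))
        self.b = np.zeros(n_actions)

    def probs(self, obs):
        return softmax(obs @ self.W + self.b)

def sample_obs(rng, n):
    # Each observation is (ball_y, racket_y), both in [0, 1].
    return rng.uniform(0.0, 1.0, size=(n, 2))

def demo_actions(obs):
    # Scripted demonstrator (assumption): move toward the ball.
    diff = obs[:, 0] - obs[:, 1]
    return np.where(diff > 0.05, 0, np.where(diff < -0.05, 2, 1))

def rewards(obs, actions):
    # Reward = how much the action reduced the ball-racket distance.
    diff = obs[:, 0] - obs[:, 1]
    new_diff = diff - STEP * ACTIONS[actions]
    return np.abs(diff) - np.abs(new_diff)

def expected_reward(policy, obs):
    # Exact expectation over the policy's action probabilities.
    p = policy.probs(obs)
    r = np.stack([rewards(obs, np.full(len(obs), a)) for a in range(3)], axis=1)
    return float((p * r).sum(axis=1).mean())

def pretrain_supervised(policy, obs, actions, epochs=300, lr=0.5):
    # Supervised step: gradient descent on cross-entropy to the demo actions.
    onehot = np.eye(policy.b.size)[actions]
    for _ in range(epochs):
        grad_logits = (policy.probs(obs) - onehot) / len(obs)
        policy.W -= lr * obs.T @ grad_logits
        policy.b -= lr * grad_logits.sum(axis=0)

def reinforce_step(policy, rng, batch=256, lr=1.0):
    # RL step: REINFORCE with a batch-mean baseline.
    obs = sample_obs(rng, batch)
    p = policy.probs(obs)
    actions = np.array([rng.choice(3, p=row) for row in p])
    r = rewards(obs, actions)
    adv = r - r.mean()
    onehot = np.eye(policy.b.size)[actions]
    grad_logits = (onehot - p) * adv[:, None] / batch
    policy.W += lr * obs.T @ grad_logits
    policy.b += lr * grad_logits.sum(axis=0)

rng = np.random.default_rng(0)
demo_obs = sample_obs(rng, 2000)
demo_act = demo_actions(demo_obs)

scratch = Policy(rng)                 # RL only
warm = Policy(rng)                    # supervised initialization, then RL
pretrain_supervised(warm, demo_obs, demo_act)
bc_accuracy = float((warm.probs(demo_obs).argmax(axis=1) == demo_act).mean())

eval_obs = sample_obs(rng, 5000)
scratch_before = expected_reward(scratch, eval_obs)
warm_before = expected_reward(warm, eval_obs)
for _ in range(50):
    reinforce_step(scratch, rng)
    reinforce_step(warm, rng)
scratch_after = expected_reward(scratch, eval_obs)
warm_after = expected_reward(warm, eval_obs)

print("imitation accuracy:", bc_accuracy)
print("reward before RL (scratch, warm):", scratch_before, warm_before)
print("reward after RL  (scratch, warm):", scratch_after, warm_after)
```

In this sketch both policies receive the same number of RL updates; the only difference is that the warm policy starts from weights already fitted to demonstrations, so RL begins from reasonable behavior instead of near-uniform random actions.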
Compared with the agent trained with RL only, it makes fewer unnecessary movements, although it tends to keep the racket at the bottom for some unknown reason.
This Pong example is one of the examples in the UnityTensorflowKeras repository. Follow the link below to the repository and install it according to its instructions.
The Pong example is located under the Assets/UnityTensorflow/Examples/Pong directory.
For more information about this example, see here.
Xiaoxiao Ma EXAMPLE-UNITY
Games Unity