Reinforcement learning with the REINFORCE algorithm, implemented in Keras:
# Policy Gradient
A minimal implementation of the stochastic policy gradient algorithm (REINFORCE) in Keras.
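For readers new to REINFORCE, the core computation can be sketched in a few lines. The snippet below is not taken from this repository: it is a dependency-free illustration in plain Python, and the names `discounted_returns`, `normalize`, and `reinforce_loss`, as well as the default `gamma=0.99`, are assumptions chosen for the example. It shows the discounted return $G_t = r_t + \gamma G_{t+1}$ and the surrogate loss $-\sum_t G_t \log \pi(a_t \mid s_t)$ that a policy network would minimize.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance (a common variance-reduction trick)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return [(v - mean) / (std + eps) for v in values]


def reinforce_loss(action_probs, actions, returns):
    """Surrogate loss -sum_t G_t * log pi(a_t | s_t).

    action_probs[t] is the policy's probability vector at step t,
    actions[t] is the index of the action actually taken.
    """
    return -sum(
        g * math.log(probs[a])
        for probs, a, g in zip(action_probs, actions, returns)
    )


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]            # hypothetical episode: reward only at the end
    returns = discounted_returns(rewards, gamma=0.5)
    print(returns)                        # [0.25, 0.5, 1.0]
    probs = [[0.5, 0.5], [0.8, 0.2], [0.1, 0.9]]
    print(reinforce_loss(probs, [0, 0, 1], returns))
```

In a Keras setting, the same effect is often obtained by training a softmax policy with categorical cross-entropy and passing the (optionally normalized) returns as per-sample weights; whether this repository does exactly that is not stated here.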
## Pong Agent
![pg](./assets/pg.gif)
This PG agent seems to win more frequently after about 8000 episodes. The score graph is shown below.