Deep Deterministic Policy Gradient (DDPG)

Advantage Actor-Critic (A2C)