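Reinforcement learning algorithms: this repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are still in progress.) Source code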

Uploader: 42117150 | Upload time: 2021-08-29 18:54:48 | File size: 3.92MB | File type: ZIP
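Deep Reinforcement Learning Algorithms

This repository implements classic deep reinforcement learning algorithms in PyTorch. Its goal is to give people clear code from which to learn deep reinforcement learning algorithms. More algorithms will be added in the future, and the existing code will continue to be maintained.

Current implementations:
  Deep Q-Learning Network (DQN)
    Basic DQN
    Double Q Network
    Dueling Network Architecture
  Deep Deterministic Policy Gradient (DDPG)
  Advantage Actor-Critic (A2C)
  Trust Region Policy Optimization (TRPO)
  Proximal Policy Optimization (PPO)
  Actor-Critic using Kronecker-Factored Trust Region (ACKTR)
  Soft Actor-Critic (SAC)

Updates:
  2018-10-17: Most algorithms have been improved in this update, and more experiments with plots have been added (except for DDPG). PPO now supports Atari games and MuJoCo environments. TRPO is very stable and can achieve better results.
  2019-07-15: Installing OpenAI Baselines is no longer required. Useful functions have been integrated into the rl_utils module. DDPG has also been re-implemented and supports more results. The README has been revised, and the code structure has received minor adjustments.
  201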
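The list of implementations distinguishes basic DQN from the double Q network variant. As a rough illustration of that difference, the sketch below computes the one-step bootstrap target for a single transition in plain Python. It is not taken from the repository: the function names `dqn_target` and `double_dqn_target`, their parameters, and the use of plain lists in place of PyTorch tensors are all assumptions made for this example.

```python
from typing import Sequence


def dqn_target(reward: float, done: bool,
               next_q_target: Sequence[float], gamma: float = 0.99) -> float:
    """Basic DQN target (hypothetical helper, not the repository's code).

    The target network both selects and evaluates the next action,
    so the bootstrap value is simply its maximum Q-value.
    """
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, done: bool,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99) -> float:
    """Double DQN target (hypothetical helper, not the repository's code).

    The online network selects the greedy next action, and the target
    network evaluates that action, which reduces overestimation bias.
    """
    if done:
        return reward
    # Index of the action with the highest online Q-value.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    q_online = [1.0, 3.0, 2.0]   # online network's Q-values for the next state
    q_target = [5.0, 0.5, 4.0]   # target network's Q-values for the next state
    print(dqn_target(1.0, False, q_target, gamma=0.5))
    print(double_dqn_target(1.0, False, q_online, q_target, gamma=0.5))
```

In a batched PyTorch implementation the same selection step is typically done with an argmax over the online network's output followed by a gather on the target network's output; the scalar version here is only meant to show which network plays which role.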

Resource details

Archive contents (72 files, 3.92MB):

reinforcement-learning-algorithms-master/
  figures/
    hopper.gif  1.79MB
    06_sac.png  135.70KB
    04_trpo.png  141.83KB
    01_dqn.png  233.10KB
    05_ppo.png  130.17KB
    03_a2c.png  164.86KB
    breakout.gif  451.78KB
    logo.png  12.62KB
    bipedal.gif  815.07KB
    02_ddpg.png  135.88KB
  rl_utils/
    mpi_utils/
      utils.py  1.39KB
      __init__.py  0B
      normalizer.py  2.71KB
    running_filter/
      __init__.py  0B
      running_filter.py  1.67KB
    logger/
      bench.py  5.57KB
      __init__.py  0B
      logger.py  14.46KB
      plot.py  3.87KB
    experience_replay/
      experience_replay.py  1.35KB
    __init__.py  0B
    env_wrapper/
      create_env.py  2.19KB
      atari_wrapper.py  10.09KB
      multi_envs_wrapper.py  3.98KB
      __init__.py  5.74KB
      frame_stack.py  1.13KB
    seeds/
      seeds.py  407B
  rl_algorithms/
    ddpg/
      ddpg_agent.py  8.63KB
      train.py  717B
      arguments.py  2.06KB
      utils.py  686B
      models.py  950B
      demo.py  1.48KB
      README.md  354B
    dqn_algos/
      train.py  589B
      dqn_agent.py  5.51KB
      arguments.py  2.19KB
      utils.py  1.58KB
      models.py  2.54KB
      demo.py  1.11KB
      README.md  437B
    ppo/
      train.py  757B
      arguments.py  2.22KB
      utils.py  1.34KB
      models.py  3.82KB
      demo.py  2.58KB
      README.md  754B
      ppo_agent.py  10.88KB
    a2c/
      a2c_agent.py  6.22KB
      train.py  612B
      arguments.py  1.84KB
      utils.py  749B
      models.py  1.91KB
      demo.py  1.17KB
      README.md  269B
    sac/
      train.py  450B
      arguments.py  3.01KB
      utils.py  2.77KB
      models.py  1.70KB
      sac_agent.py  10.62KB
      demo.py  1.40KB
      README.md  268B
    trpo/
      train.py  461B
      arguments.py  1.49KB
      utils.py  1.98KB
      models.py  1.34KB
      demo.py  1.35KB
      README.md  261B
      trpo_agent.py  9.08KB
  setup.py  275B
  README.md  5.96KB
  .gitignore  1.25KB
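The archive includes an experience_replay module under rl_utils, but its contents are not shown on this page. For readers unfamiliar with the idea, the following is a minimal sketch of what a uniform replay buffer for off-policy algorithms such as DQN, DDPG, or SAC might look like. The class name `ReplayBuffer`, its methods, and the transition layout are assumptions for illustration and are not the repository's actual API.

```python
import random
from collections import deque
from typing import Any, Deque, List, Tuple

# Assumed transition layout: (obs, action, reward, next_obs, done).
Transition = Tuple[Any, Any, float, Any, bool]


class ReplayBuffer:
    """Fixed-capacity uniform replay buffer (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops the oldest transition once it is full.
        self._storage: Deque[Transition] = deque(maxlen=capacity)

    def add(self, obs: Any, action: Any, reward: float,
            next_obs: Any, done: bool) -> None:
        """Store one transition, evicting the oldest one if at capacity."""
        self._storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw batch_size distinct transitions uniformly at random."""
        return random.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3)
    for step in range(5):
        buffer.add(obs=step, action=0, reward=1.0, next_obs=step + 1, done=False)
    print(len(buffer))        # only the three most recent transitions remain
    print(buffer.sample(2))   # two of those transitions, in random order
```

Sampling uniformly from past transitions breaks the temporal correlation between consecutive steps, which is the usual motivation for replay in value-based methods.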
