Popular-RL-Algorithms: PyTorch implementations of Soft Actor-Critic (SAC), Twin Delayed DDPG (TD3), Actor-Critic (AC/A2C), Proximal Policy Optimization (PPO), QT-Opt, and PointNet.

Uploader: 42131790 | Uploaded: 2023-03-10 12:07:00 | File size: 2MB | File type: ZIP
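Popular model-free reinforcement learning algorithms

PyTorch and TensorFlow 2.0 implementations of state-of-the-art model-free reinforcement learning algorithms, on both OpenAI Gym environments and a self-implemented Reacher environment. The algorithms include Soft Actor-Critic (SAC), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Actor-Critic (AC/A2C), Proximal Policy Optimization (PPO), QT-Opt (including the cross-entropy (CE) method), PointNet, Transporter, Recurrent Policy Gradient, Soft Decision Tree, and others.

Please note that this repository is more a personal collection of algorithms I implemented and tested during my research and implementation work than an official open-source library or package. Still, I think sharing it may be helpful, and I welcome useful discussion of the implementations. I have not spent much time cleaning up or structuring the code. You may notice several implementations of the same algorithm; I deliberately show all of them for reference and comparison. Also, this repository contains only PyTorch implementations. For official libraries of RL algorithms, ...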


Resource details

103 sub-files, 2MB in total. The listing below is truncated; not every file is shown.

.gitmodules (239B)
qt_opt_v3.py (12.13KB)
events.out.tfevents.1577364245.quantumiracle-G3-3579 (237.07KB)
ddpg.py (12.31KB)
_policies.py (588B)
CEM_Gaussian_test.ipynb (75.99KB)
cem.py (3.34KB)
RunJupyter.py (159B)
CEM_Categorical_test.ipynb (51.67KB)
CEM_Categorical_test-checkpoint.ipynb (51.67KB)
CEM_Gaussian_test-checkpoint.ipynb (74.59KB)
sdt_train.py (1.29KB)
sdt_train.cpython-36.pyc (967B)
SDT.cpython-36.pyc (4.41KB)
SDT.py (5.88KB)
ac.py (17.55KB)
sac_nonautoentropy.png (64.00KB)
td3_deterministic.png (90.32KB)
ac_cartpole.png (121.48KB)
td3_nondeterministic.png (83.65KB)
sac_autoentropy.png (54.95KB)
ppo_single_2 (copy).png (22.65KB)
pendulum.png (41.92KB)
ac.png (46.49KB)
ppo_single_2.png (26.17KB)
ppo_continuous3.py (7.71KB)
ppo_continuous_multiprocess.py (15.24KB)
ddpg_target_q (35.36KB)
ddpg_policy (18.60KB)
rdpg_policy (149.30KB)
rdpg_q (149.55KB)
ddpg_q (35.36KB)
rdpg_target_q (149.55KB)
plot.ipynb (34.52KB)
sac_v2_lstm.py (12.98KB)
reacher.cpython-36.pyc (4.80KB)
reacher.cpython-35.pyc (5.40KB)
sac_v2_multiprocess_multi_gpu.py (26.95KB)
ppo_continuous.py (13.84KB)
reward_compare_td3.pdf (26.99KB)
plot.ipynb (68.00KB)
sac_v2_lstm.py (12.29KB)
td3_lstm.py (12.65KB)
value_networks.py (5.96KB)
buffers.py (6.89KB)
utils.py (686B)
initialize.cpython-36.pyc (826B)
value_networks.cpython-36.pyc (5.15KB)
policy_networks.cpython-36.pyc (14.25KB)
utils.cpython-36.pyc (930B)
buffers.cpython-36.pyc (6.13KB)
optimizers.py (4.78KB)
policy_networks.py (18.91KB)
initialize.py (515B)
reward_compare_sac.pdf (16.02KB)
td3.py (16.98KB)
sac_v2.py (16.83KB)
plot-checkpoint.ipynb (68.00KB)
reward_compare_td3.pdf (26.99KB)
plot.ipynb (68.39KB)
ppo_discrete.py (4.48KB)
reacher.py (7.69KB)
LICENSE (11.09KB)
td3_lstm.py (12.97KB)
ppo_continuous2.py (13.96KB)
rdpg.py (11.67KB)
reward_compare.pdf (12.24KB)
value_networks.py (7.67KB)
buffers.py (6.89KB)
utils.py (686B)
initialize.cpython-36.pyc (826B)
value_networks.cpython-36.pyc (6.54KB)
policy_networks.cpython-36.pyc (16.95KB)
utils.cpython-36.pyc (930B)
buffers.cpython-36.pyc (6.13KB)
optimizers.py (4.78KB)
policy_networks.py (22.37KB)
initialize.py (515B)
ramble_sac.md (5.48KB)
ppo_gae_discrete.py (4.22KB)
sac_v2_multiprocess.py (21.11KB)
td3.py (17.36KB)
sac_pendulum.py (10.03KB)
requirements.txt (2.55KB)
td3_multiprocess.py (21.53KB)
ppo_continuous_tf.py (9.06KB)
checkpoint (63B)
ppo.index (1.36KB)
ppo.meta (138.25KB)
ppo.data-00000-of-00001 (32.12KB)
ppo_continuous_multiprocess2.py (15.18KB)
.gitignore (114B)
ppo_gae_continuous_not_work.py (5.99KB)
sac_v2.py (17.39KB)
sac_v2_multithread.py (17.96KB)
sac_v2_gru.py (13.24KB)
plot-checkpoint.ipynb (34.45KB)
README.md (6.63KB)
sdt_ppo_gae_discrete.py (5.56KB)
sac.py (15.60KB)
......
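
The description mentions the cross-entropy (CE) method as part of QT-Opt, and the listing includes a cem.py file. The contents of that file are not shown here, so the following is only a minimal, generic sketch of the technique, not the repository's implementation. It uses the Python standard library only; the function names, the one-dimensional toy objective, and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def cross_entropy_method(objective, mean=0.0, std=5.0, n_samples=64,
                         n_elite=8, n_iters=30, seed=0):
    """Maximize a 1-D objective by repeatedly refitting a Gaussian to elite samples."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        # Sample candidate solutions from the current Gaussian.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Keep the candidates with the highest objective values.
        elites = sorted(samples, key=objective, reverse=True)[:n_elite]
        # Refit the Gaussian to the elites; the floor keeps std positive.
        mean = statistics.mean(elites)
        std = max(statistics.pstdev(elites), 1e-6)
    return mean


def toy_objective(x):
    # Hypothetical objective with a single maximum at x = 3.
    return -(x - 3.0) ** 2


best = cross_entropy_method(toy_objective)
print(best)
```

In QT-Opt the objective being maximized would be a learned Q-value over candidate actions rather than a fixed toy function; the sample, select, and refit loop is the part this sketch is meant to show.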
