CS294_homework: My solutions to the Berkeley CS294 (Deep Reinforcement Learning) homework

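Uploaded by: 42137539 | Upload time: 2022-05-14 14:23:19 | File size: 2.08MB | File type: ZIP

CS 294-112 homework (offered Fall 2017). This is my GitHub repository for the homework (as offered in Fall 2017). I followed the course remotely, using the lecture notes and videos, and implemented the coding portions of the assignments. Below is a brief summary of what I did for each assignment.

Disclaimer: This code is for educational purposes only. Students taking the current iteration of the course should not copy it, since doing so violates academic integrity and hinders their own learning.

Dependencies: Gym 0.9.5 is used for Homework 3. Note that some of these dependencies had not yet been released when the course was offered. The starter code has also been modified to reflect changes in the OpenAI Gym documentation.

Homework 1: Up to this point the course covers the more basic, supervised-learning material. I implemented BC (behavioral cloning) and DAgger (dataset aggregation), which improved the results slightly. I also experimented with various hyperparameters.

Homework 2: I implemented the policy gradient algorithm and tested it in several environments. After tuning the hyperparameters, I found that my implementation made the agent's reward converge to the theoretical value. I also implemented GAE (generalized advantage estimation) and compared …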
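As a rough illustration of the GAE step mentioned for Homework 2, the sketch below computes advantages for a single trajectory using the standard recursion, with TD residual delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) and advantage A_t = delta_t + gamma * lambda * A_{t+1}. It is not taken from the repository: the function name `gae_advantages`, its parameters, and the default values are assumptions chosen for illustration.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float = 0.0,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Generalized advantage estimates for one trajectory (illustrative sketch).

    rewards[t] is the reward at step t, values[t] is the critic's estimate
    V(s_t), and last_value is V of the state after the final step
    (0.0 if the episode terminated there).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    # With gamma = lam = 1 and a zero baseline, GAE reduces to reward-to-go.
    print(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```

Setting `lam=0` gives the one-step TD residual (low variance, more bias), while `lam=1` gives the Monte Carlo return minus the baseline (less bias, higher variance); values in between trade the two off.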


Resource details

CS294_homework (71 files, 2.08MB)

CS294_homework-master/
  sp17_hw/
    hw2/
      HW2.ipynb (310.77KB)
      discrete_env.py (1.48KB)
      frozen_lake.py (4.29KB)
    hw3/
      atari_wrappers.py (5.17KB)
      dqn.py (12.04KB)
      run_dqn_atari.py (4.12KB)
      README (193B)
      dqn_utils.py (13.66KB)
      run_dqn_ram.py (3.69KB)
    hw4/
      main.py (12.07KB)
      logz.py (2.96KB)
      homework.md (4.14KB)
      plot_learning_curves.py (912B)
    hw1/
      run_expert.py (2.15KB)
      demo.bash (174B)
      tf_util.py (17.37KB)
      README.md (719B)
      experts/
        HalfCheetah-v1.pkl (70.89KB)
        Walker2d-v1.pkl (70.89KB)
        Hopper-v1.pkl (63.60KB)
        Reacher-v1.pkl (62.77KB)
        Ant-v1.pkl (147.78KB)
        Humanoid-v1.pkl (366.58KB)
      load_policy.py (2.45KB)
  LICENSE (1.05KB)
  hw2/
    HW2.pdf (149.28KB)
    TestNoteBook.ipynb (6.84KB)
    hw2_final.pdf (149.28KB)
    train_pg.py (20.49KB)
    logz.py (3.33KB)
    plot.py (3.33KB)
  hw3/
    atari_wrappers.py (5.17KB)
    HW3.pdf (96.38KB)
    dqn.py (14.08KB)
    run_dqn_atari.py (4.12KB)
    README (196B)
    Testing.ipynb (9.15KB)
    dqn_utils.py (13.66KB)
    run_dqn_ram.py (3.69KB)
  .gitignore (1.06KB)
  hw4/
    main.py (10.97KB)
    dynamics.py (3.97KB)
    data/
      mb_mpc_HalfCheetah-v1_25-06-2018_14-51-34/
        log.txt (0B)
      mb_mpc_HalfCheetah-v1_25-06-2018_15-28-11/
        log.txt (0B)
      mb_mpc_HalfCheetah-v1_25-06-2018_15-22-28/
        log.txt (0B)
      mb_mpc_HalfCheetah-v1_25-06-2018_15-23-41/
        log.txt (0B)
      mb_mpc_HalfCheetah-v1_25-06-2018_15-01-46/
        log.txt (0B)
    controllers.py (1.62KB)
    logz.py (3.35KB)
    cheetah_env.py (1.30KB)
    Testing.ipynb (20.73KB)
    plot.py (3.33KB)
    HW4.pdf (164.21KB)
    cost_functions.py (1.64KB)
  README.md (3.03KB)
  hw1/
    run_expert.py (2.90KB)
    DAgger.py (3.03KB)
    .DS_Store (6.00KB)
    demo.bash (153B)
    HW1.pdf (94.36KB)
    tf_util.py (17.37KB)
    model.py (2.89KB)
    README.md (719B)
    experts/
      HalfCheetah-v1.pkl (70.89KB)
      Walker2d-v1.pkl (70.89KB)
      Hopper-v1.pkl (63.60KB)
      Reacher-v1.pkl (62.77KB)
      Ant-v1.pkl (147.78KB)
      Humanoid-v1.pkl (366.58KB)
    load_policy.py (2.45KB)
    DAgger.bash (149B)
