dlsa: Distributed Least Squares Approximation (DLSA) implemented with Apache Spark

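Uploader: 42118770 | Uploaded: 2022-05-16 20:25:16 | File size: 105KB | File type: ZIP

Distributed Least Squares Approximation (dlsa), implemented with Apache Spark

Introduction

In this work, we develop a distributed least squares approximation (DLSA) method that can solve a large family of regression problems (e.g., linear regression, logistic regression, and the Cox model) on a distributed system. By approximating the local objective function with a local quadratic form, we obtain a combined estimator by taking a weighted average of the local estimators. The resulting estimator is proved to be statistically as efficient as the global estimator. Moreover, it requires only one round of communication.

We further conduct shrinkage estimation based on the DLSA estimate using an adaptive Lasso approach. The solution can be obtained easily by running the LARS algorithm on the master node. It is shown theoretically that the resulting estimator has the oracle property and is selection consistent when a newly designed distributed Bayesian information criterion (DBIC) is used. An extensive numerical study and an airline dataset further illustrate the finite-sample performance and computational efficiency.

The entire method has been implemented on a Spark system. A conceptual demo is available in the R package dlsa.

System requirements

Spark >= 2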
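The one-round combination step described in the introduction can be illustrated for the simplest case, linear regression. The following is a minimal sketch, not the code shipped in this repository: every function name here (`gram`, `xty`, `matvec`, `solve`, `local_fit`, `combine`) is hypothetical, the data are simulated, and the weight attached to each local estimator is assumed to be the local Gram matrix X_k^T X_k, which for ordinary least squares is proportional to the inverse covariance of that local estimate. It uses only the Python standard library, so it runs without Spark.

```python
import random


def gram(X):
    """Return X^T X for a matrix stored as a list of rows."""
    p = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]


def xty(X, y):
    """Return X^T y."""
    p = len(X[0])
    return [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]


def matvec(A, v):
    """Return the matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def local_fit(X, y):
    """Worker step: local OLS estimate and its weight matrix X^T X."""
    H = gram(X)
    return solve(H, xty(X, y)), H


def combine(local_results):
    """Master step: matrix-weighted average of the local estimates."""
    p = len(local_results[0][0])
    H_sum = [[0.0] * p for _ in range(p)]
    rhs = [0.0] * p
    for theta_k, H_k in local_results:
        weighted = matvec(H_k, theta_k)
        for i in range(p):
            rhs[i] += weighted[i]
            for j in range(p):
                H_sum[i][j] += H_k[i][j]
    return solve(H_sum, rhs)


# Simulated data (an assumption for illustration): intercept plus two covariates.
random.seed(0)
true_theta = [1.0, -2.0, 0.5]
X = [[1.0, random.gauss(0, 1), random.gauss(0, 1)] for _ in range(600)]
y = [sum(t * x for t, x in zip(true_theta, row)) + random.gauss(0, 0.1) for row in X]

# Three simulated "workers", each holding 200 rows.
chunks = [(X[i:i + 200], y[i:i + 200]) for i in range(0, 600, 200)]
theta_dlsa = combine([local_fit(Xk, yk) for Xk, yk in chunks])

# Reference: OLS fitted on the full data set in one place.
theta_global = solve(gram(X), xty(X, y))
```

Each worker sends only its local estimate and one p-by-p matrix to the master, which is the single round of communication. With these weights the combined estimate agrees with the full-data OLS fit up to floating-point error; for models such as logistic regression the agreement would hold only approximately, through the local quadratic approximation.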

Resource details

dlsa: Distributed Least Squares Approximation (dlsa) implemented with Apache Spark (42 files, 105KB)

dlsa-master/
  setup.py (858B)
  .gitignore (1.19KB)
  projects/
    results/
      speedtest/
        spark_speedtest.py (1.39KB)
        storagelevel.py (233B)
        logistic_LBFGS_local.py (1.23KB)
        logistic_LBFGS.py (2.26KB)
        simul_logistic_to_hdfs.py (1.54KB)
      data/
        games-expand.csv (514.53KB)
        speedtest.csv (1.54KB)
      linear_regression_dc.py (1.37KB)
      plot_coef.R (1.51KB)
      coef.csv (19.44KB)
      logistic_single.py (942B)
      utils_plot.py (785B)
      plot_coef.py (1.53KB)
    tools/
      modify_dummies.py (666B)
      line_count.py (151B)
    README.md (1.80KB)
    bash/
      run_spark_speedtest_logistic.sh (589B)
      run_spark_speedtest_standardalone.sh (629B)
      clean_airlinedata.sh (1.13KB)
      run_spark_dlsa.sh (1.38KB)
      get_airlinedata.sh (281B)
      run_spark_speedtest_yarn.sh (898B)
    logistic_spark_ml.py (1.75KB)
    logistic_single_sgd.py (5.46KB)
    logistic_dlsa.py (22.21KB)
  Makefile (239B)
  pyproject.toml (105B)
  dlsa/
    R/
    __init__.py (192B)
    utils_spark.py (2.42KB)
    lsa.py (6.45KB)
    model_eval.py (1.50KB)
    dummies.py (5.07KB)
    dlsa.py (3.46KB)
    models.py (8.72KB)
    utils.py (4.17KB)
    sdummies.py (5.59KB)
  LICENSE (34.33KB)
  .gitmodules (80B)
  README.md (2.58KB)
  TODO.md (771B)
