Hadoop Big Data Development: Case Tutorials and Hands-On Projects
2024-04-18 21:31:05 133.22MB Hadoop
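If Hadoop is already installed on your computer, you can skip this step. This guide assumes it is not installed. In that case, follow the tutorial "Hadoop Installation Tutorial: Standalone/Pseudo-Distributed Configuration, Hadoop2.6.0/Ubuntu14.04". That tutorial also covers installing Java, so following it installs both the JDK and Hadoop.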
2024-04-18 20:49:00 127KB hadoop spark
A very detailed document in four parts: ① installing VMware ② installing Ubuntu under VMware ③ installing and configuring Hadoop ④ installing and configuring Spark
2024-04-18 20:47:14 7.7MB spark hadoop vmware ubuntu
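This dataset contains customer reviews of restaurants' food service. Each record has two fields, a label and the review text: label=1 (positive review), label=0 (negative review). The reviews are processed with the jieba library (Chinese word segmentation for natural language processing) and classified, and each restaurant's food quality is analyzed so that restaurants can be presented to customers more intuitively and customers can choose more easily.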
2024-04-16 20:40:14 936KB hadoop catering-industry
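The label/review layout described for this dataset lends itself to a simple per-label token count. The following is a minimal sketch, not the dataset author's method: the function name `top_tokens_by_label`, the sample rows, and the pluggable `tokenize` argument are all assumptions made for illustration. For Chinese review text, `jieba.lcut` (installed separately) could be passed as `tokenize`; the sample here uses `str.split` on made-up English rows so it runs with the standard library only.

```python
from collections import Counter


def top_tokens_by_label(rows, tokenize, k=3):
    """Return the k most frequent tokens for each label.

    rows: iterable of (label, review) pairs; label 1 = positive, 0 = negative.
    tokenize: callable that turns a review string into a list of tokens.
    """
    counts = {0: Counter(), 1: Counter()}
    for label, review in rows:
        # Skip tokens that are empty or whitespace-only.
        counts[label].update(t for t in tokenize(review) if t.strip())
    return {label: c.most_common(k) for label, c in counts.items()}


# Hypothetical sample rows, not taken from the dataset.
SAMPLE = [
    (1, "good food good service"),
    (0, "slow service cold food"),
    (1, "good price"),
]

result = top_tokens_by_label(SAMPLE, str.split, k=1)
print(result[1])
```

Keeping the tokenizer as a parameter means the counting logic stays the same whichever segmentation library is used.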
The hadoop-eclipse development plugin. This jar supports hadoop-1.2.1. After downloading the plugin, place it in the Eclipse\plugins directory and restart Eclipse.
2024-04-11 16:23:11 5.58MB eclipse-plugin hadoop
This package contains jar files used with Hadoop for big data work; import them into your project when writing code.
2024-04-08 13:27:09 26.84MB hadoop
Analyzes a website's takeout order data on the Hadoop big data platform and presents the results as visualizations.
2024-04-03 15:36:30 10.14MB hadoop visualization
Title: Hadoop in Practice, 2nd Edition
Author: Alex Holmes
Length: 512 pages
Edition: 2
Language: English
Publisher: Manning Publications
Publication Date: 2014-10-12
ISBN-10: 1617292222
ISBN-13: 9781617292224

Summary
Hadoop in Practice, Second Edition provides over 100 tested, instantly useful techniques that will help you conquer big data, using Hadoop. This revised new edition covers changes and new features in the Hadoop core architecture, including MapReduce 2. Brand new chapters cover YARN and integrating Kafka, Impala, and Spark SQL with Hadoop. You'll also get new and updated techniques for Flume, Sqoop, and Mahout, all of which have seen major new versions recently. In short, this is the most practical, up-to-date coverage of Hadoop available anywhere. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Book
It's always a good time to upgrade your Hadoop skills! Hadoop in Practice, Second Edition provides a collection of 104 tested, instantly useful techniques for analyzing real-time streams, moving data securely, machine learning, managing large-scale clusters, and taming big data using Hadoop. This completely revised edition covers changes and new features in Hadoop core, including MapReduce 2 and YARN. You'll pick up hands-on best practices for integrating Spark, Kafka, and Impala with Hadoop, and get new and updated techniques for the latest versions of Flume, Sqoop, and Mahout. In short, this is the most practical, up-to-date coverage of Hadoop available. Readers need to know a programming language like Java and have basic familiarity with Hadoop.

What's Inside
Thoroughly updated for Hadoop 2
How to write YARN applications
Integrate real-time technologies like Storm, Impala, and Spark
Predictive analytics using Mahout and RR

About the Author
Alex Holmes works on tough big-data problems. He is a software engineer, author, speaker, and blogger specializing in large-scale Hadoop projects.

Table of Contents
Part 1: Background and fundamentals
Chapter 1: Hadoop in a heartbeat
Chapter 2: Introduction to YARN
Part 2: Data logistics
Chapter 3: Data serialization: working with text and beyond
Chapter 4: Organizing and optimizing data in HDFS
Chapter 5: Moving data into and out of Hadoop
Part 3: Big data patterns
Chapter 6: Applying MapReduce patterns to big data
Chapter 7: Utilizing data structures and algorithms at scale
Chapter 8: Tuning, debugging, and testing
Part 4: Beyond MapReduce
Chapter 9: SQL on Hadoop
Chapter 10: Writing a YARN application
Appendix: Installing Hadoop and friends
2024-04-03 06:29:08 9.46MB Hadoop
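This paper presents PDM, a parallel data analysis system built on Hadoop. The system provides a large set of parallel data analysis algorithms that use MapReduce as the computing framework. These include traditional ETL, data mining, statistics, and text analysis algorithms, as well as graph-theoretic SNA (social network analysis) algorithms. The paper describes in detail the principles and implementation of a parallel multiple linear regression algorithm and a "multi-source shortest path" algorithm; the proposed "message-passing model" effectively addresses MapReduce's difficulty in handling adjacency matrices. It then introduces typical applications on telecom data, such as "plan recommendation" implemented with parallel k-means and decision tree algorithms, and "marketing key point discovery" implemented with a parallel PageRank algorithm. Finally …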
2024-03-25 13:56:36 894KB natural-science paper
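The abstract does not spell out its message-passing model, so the following is only a sketch of the commonly used pattern for shortest paths in a map/reduce style, under stated assumptions: the graph is stored as adjacency lists rather than a matrix, edge weights are non-negative, every neighbour also appears as a key, and "multi-source" is read as one run per source. All names (`map_node`, `reduce_node`, `shortest_paths`, `multi_source`, `GRAPH`) are hypothetical, and the loop runs in a single Python process rather than on Hadoop.

```python
INF = float("inf")


def map_node(node, dist, edges):
    """Map step: re-emit the node's own distance, then message each neighbour."""
    yield node, dist
    if dist < INF:
        for neighbour, weight in edges:
            yield neighbour, dist + weight


def reduce_node(messages):
    """Reduce step: keep the smallest distance received for a node."""
    return min(messages)


def shortest_paths(graph, source):
    """Repeat map/reduce rounds until no distance changes."""
    dist = {n: INF for n in graph}
    dist[source] = 0
    while True:
        inbox = {n: [] for n in graph}
        for node, edges in graph.items():
            for target, d in map_node(node, dist[node], edges):
                inbox[target].append(d)
        new_dist = {n: reduce_node(msgs) for n, msgs in inbox.items()}
        if new_dist == dist:
            return dist
        dist = new_dist


def multi_source(graph, sources):
    """Assumed reading of 'multi-source': one independent run per source."""
    return {s: shortest_paths(graph, s) for s in sources}


# Hypothetical toy graph: node -> list of (neighbour, weight).
GRAPH = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2)],
    "c": [],
}

print(shortest_paths(GRAPH, "a"))
```

Each round only needs a node's own adjacency list and the messages addressed to it, which is why this layout fits MapReduce more naturally than a full adjacency matrix.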