Spark-The Definitive Guide Big Data Processing Made Simple
完美true pdf。
Apache Spark is a unified computing engine and a set of libraries for parallel data processing on
computer clusters. As of this writing, Spark is the most actively developed open source engine
for this task, making it a standard tool for any developer or data scientist interested in big data.
Spark supports multiple widely used programming languages (Python, Java, Scala, and R),
includes libraries for diverse tasks ranging from SQL to streaming and machine learning, and
runs anywhere from a laptop to a cluster of thousands of servers. This makes it an easy system to
start with and scale-up to big data processing or incredibly large scale.
2019-12-21 21:23:22
8.41MB
Spark
1