In Expert Hadoop® Administration, leading Hadoop administrator Sam R. Alapati brings together authoritative knowledge for creating, configuring, securing, managing, and optimizing production Hadoop clusters in any environment. Drawing on his experience with large-scale Hadoop administration, Alapati integrates action-oriented advice with carefully researched explanations of both problems and solutions. He covers an unmatched range of topics and offers an unparalleled collection of realistic examples.
Alapati demystifies complex Hadoop environments, helping you understand exactly what happens behind the scenes when you administer your cluster. You’ll gain unprecedented insight as you walk through building clusters from scratch and configuring high availability, performance, security, encryption, and other key attributes. The high-value administration skills you learn here will be indispensable no matter what Hadoop distribution you use or what Hadoop applications you run.
Understand Hadoop’s architecture from an administrator’s standpoint
Create simple and fully distributed clusters
Run MapReduce and Spark applications in a Hadoop cluster
Manage and protect Hadoop data, and configure high availability
Work with HDFS commands, file permissions, and storage management
Move data, and use YARN to allocate resources and schedule jobs
Manage job workflows with Oozie and Hue
Secure, monitor, log, and optimize Hadoop
Benchmark and troubleshoot Hadoop
Table of Contents
Part I: Introduction to Hadoop—Architecture and Hadoop Clusters
Chapter 1 Introduction to Hadoop and Its Environment
Chapter 2 An Introduction to the Architecture of Hadoop
Chapter 3 Creating and Configuring a Simple Hadoop Cluster
Chapter 4 Planning for and Creating a Fully Distributed Cluster
Part II: Hadoop Application Frameworks
Chapter 5 Running Applications in a Cluster—The MapReduce Framework (and Hive and Pig)
Chapter 6 Running Applications in a Cluster—The Spark Framework
Chapter 7 Running Spark Applications