Hadoop Cluster Deployment.
Title:
Hadoop Cluster Deployment.
Author:
Zburivsky, Danil.
ISBN:
9781783281725
Physical Description:
1 online resource (144 pages)
Contents:
Hadoop Cluster Deployment -- Table of Contents -- Hadoop Cluster Deployment -- Credits -- About the Author -- About the Reviewers -- www.PacktPub.com -- Support files, eBooks, discount offers and more -- Why Subscribe? -- Free Access for Packt account holders -- Preface -- What this book covers -- What you need for this book -- Who this book is for -- Conventions -- Reader feedback -- Customer support -- Errata -- Piracy -- Questions -- 1. Setting Up Hadoop Cluster - from Hardware to Distribution -- Choosing Hadoop cluster hardware -- Choosing the DataNode hardware -- Low storage density cluster -- High storage density cluster -- NameNode and JobTracker hardware configuration -- The NameNode hardware -- The JobTracker hardware -- Gateway and other auxiliary services -- Network considerations -- Hadoop hardware summary -- Hadoop distributions -- Hadoop versions -- Choosing Hadoop distribution -- Cloudera Hadoop distribution -- Hortonworks Hadoop distribution -- MapR -- Choosing OS for the Hadoop cluster -- Summary -- 2. Installing and Configuring Hadoop -- Configuring OS for Hadoop cluster -- Choosing and setting up the filesystem -- Setting up Java Development Kit -- Other OS settings -- Setting up the CDH repositories -- Setting up NameNode -- JournalNode, ZooKeeper, and Failover Controller -- Hadoop configuration files -- NameNode HA configuration -- JobTracker configuration -- Configuring the job scheduler -- JobQueueTaskScheduler -- FairScheduler -- CapacityTaskScheduler -- DataNode configuration -- TaskTracker configuration -- Advanced Hadoop tuning -- hdfs-site.xml -- mapred-site.xml -- core-site.xml -- Summary -- 3. Configuring the Hadoop Ecosystem -- Hosting the Hadoop ecosystem -- Sqoop -- Installing and configuring Sqoop -- Sqoop import example -- Sqoop export example -- Hive -- Hive architecture -- Installing Hive Metastore -- Installing the Hive client -- Installing Hive Server -- Impala -- Impala architecture -- Installing Impala state store -- Installing the Impala server -- Summary -- 4. Securing Hadoop Installation -- Hadoop security overview -- HDFS security -- MapReduce security -- Hadoop Service Level Authorization -- Hadoop and Kerberos -- Kerberos overview -- Kerberos in Hadoop -- Configuring Kerberos clients -- Generating Kerberos principals -- Enabling Kerberos for HDFS -- Enabling Kerberos for MapReduce -- Summary -- 5. Monitoring Hadoop Cluster -- Monitoring strategy overview -- Hadoop Metrics -- JMX Metrics -- Monitoring Hadoop with Nagios -- Monitoring HDFS -- NameNode checks -- JournalNode checks -- ZooKeeper checks -- Monitoring MapReduce -- JobTracker checks -- Monitoring Hadoop with Ganglia -- Summary -- 6. Deploying Hadoop to the Cloud -- Amazon Elastic MapReduce -- Installing the EMR command-line interface -- Choosing the Hadoop version -- Launching the EMR cluster -- Temporary EMR clusters -- Preparing input and output locations -- Using Whirr -- Installing and configuring Whirr -- Summary -- Index.
Abstract:
This book is a step-by-step tutorial filled with practical examples that show you how to build and manage a Hadoop cluster, along with its intricacies. It is ideal for database administrators, data engineers, and system administrators, and it will act as an invaluable reference if you are planning to use the Hadoop platform in your organization. You are expected to have basic Linux skills, since all the examples in this book use this operating system. Access to test hardware or virtual machines is also useful for following the examples in the book.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.