Hadoop Beginner's Guide
Title:
Hadoop Beginner's Guide
Author:
Turkington, Garry.
ISBN:
9781849517317

9781628700237
Publication Information:
Birmingham : Packt Pub., 2013.
Physical Description:
1 online resource (911 pages)
Contents:
Table of Contents; Hadoop Beginner's Guide; Hadoop Beginner's Guide; Credits; About the Author; About the Reviewers; www.PacktPub.com; Support files, eBooks, discount offers and more; Why Subscribe?; Free Access for Packt account holders; Preface; What this book covers; What you need for this book; Who this book is for; Conventions; Time for action -- heading; What just happened?; Pop quiz -- heading; Have a go hero -- heading; Reader feedback; Customer support; Downloading the example code; Errata; Piracy; Questions; 1. What It's All About; Big data processing; The value of data.

Historically for the few and not the many; Classic data processing systems; Scale-up; Early approaches to scale-out; Limiting factors; A different approach; All roads lead to scale-out; Share nothing; Expect failure; Smart software, dumb hardware; Move processing, not data; Build applications, not infrastructure; Hadoop; Thanks, Google; Thanks, Doug; Thanks, Yahoo; Parts of Hadoop; Common building blocks; HDFS; MapReduce; Better together; Common architecture; What it is and isn't good for; Cloud computing with Amazon Web Services; Too many clouds; A third way; Different types of costs.

AWS -- infrastructure on demand from Amazon; Elastic Compute Cloud (EC2); Simple Storage Service (S3); Elastic MapReduce (EMR); What this book covers; A dual approach; Summary; 2. Getting Hadoop Up and Running; Hadoop on a local Ubuntu host; Other operating systems; Time for action -- checking the prerequisites; What just happened?; Setting up Hadoop; A note on versions; Time for action -- downloading Hadoop; What just happened?; Time for action -- setting up SSH; What just happened?; Configuring and running Hadoop; Time for action -- using Hadoop to calculate Pi; What just happened?; Three modes.

Time for action -- configuring the pseudo-distributed mode; What just happened?; Configuring the base directory and formatting the filesystem; Time for action -- changing the base HDFS directory; What just happened?; Time for action -- formatting the NameNode; What just happened?; Starting and using Hadoop; Time for action -- starting Hadoop; What just happened?; Time for action -- using HDFS; What just happened?; Time for action -- WordCount, the Hello World of MapReduce; What just happened?; Have a go hero -- WordCount on a larger body of text; Monitoring Hadoop from the browser; The HDFS web UI.

The MapReduce web UI; Using Elastic MapReduce; Setting up an account in Amazon Web Services; Creating an AWS account; Signing up for the necessary services; Time for action -- WordCount on EMR using the management console; What just happened?; Have a go hero -- other EMR sample applications; Other ways of using EMR; AWS credentials; The EMR command-line tools; The AWS ecosystem; Comparison of local versus EMR Hadoop; Summary; 3. Understanding MapReduce; Key/value pairs; What it mean; Why key/value data?; Some real-world examples; MapReduce as a series of key/value transformations.

Pop quiz -- key/value pairs.
Abstract:
As a Packt Beginner's Guide, the book is packed with clear step-by-step instructions for performing the most useful tasks, getting you up and running quickly, and learning by doing. This book assumes no existing experience with Hadoop or cloud services. It assumes you have familiarity with a programming language such as Java or Ruby but gives you the needed background on the other topics.