Hadoop Real World Solutions Cookbook
Title:
Hadoop Real World Solutions Cookbook
Author:
Owens, Jonathan.
ISBN:
9781849519137

9781621989103
Publication Information:
Birmingham : Packt Pub., 2013.
Physical Description:
1 online resource (802 pages).
Series:
Community experience distilled. Quick answers to common problems

Community experience distilled.
General Note:
Using Apache Pig to sessionize web server log data.
Contents:
Table of Contents; Hadoop Real-World Solutions Cookbook; Hadoop Real-World Solutions Cookbook; Credits; About the Authors; About the Reviewers; www.packtpub.com; Support files, eBooks, discount offers and more; Why Subscribe?; Free Access for Packt account holders; Preface; What this book covers; What you need for this book; Who this book is for; Conventions; Reader feedback; Customer support; Downloading the example code; Errata; Piracy; Questions; 1. Hadoop Distributed File System -- Importing and Exporting Data; Introduction.

Importing and exporting data into HDFS using Hadoop shell commands; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Moving data efficiently between clusters using Distributed Copy; Getting ready; How to do it ... ; How it works ... ; There's more ... ; Importing data from MySQL into HDFS using Sqoop; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Exporting data from HDFS into MySQL using Sqoop; Getting ready; How to do it ... ; How it works ... ; See also; Configuring Sqoop for Microsoft SQL Server; Getting ready; How to do it ... ; How it works ...

Exporting data from HDFS into MongoDB; Getting ready; How to do it ... ; How it works ... ; Importing data from MongoDB into HDFS; Getting ready; How to do it ... ; How it works ... ; Exporting data from HDFS into MongoDB using Pig; Getting ready; How to do it ... ; How it works ... ; Using HDFS in a Greenplum external table; Getting ready; How to do it ... ; How it works ... ; There's more ... ; Using Flume to load data into HDFS; Getting ready; How to do it ... ; How it works ... ; There's more ... ; 2. HDFS; Introduction; Reading and writing data to HDFS; Getting ready; How to do it ... ; How it works ...

There's more ... ; Compressing data using LZO; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Reading and writing data to SequenceFiles; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Using Apache Avro to serialize data; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Using Apache Thrift to serialize data; Getting ready; How to do it ... ; How it works ... ; See also; Using Protocol Buffers to serialize data; Getting ready; How to do it ... ; How it works ... ; Setting the replication factor for HDFS; Getting ready.

How to do it ... ; How it works ... ; There's more ... ; See also; Setting the block size for HDFS; Getting ready; How to do it ... ; How it works ... ; 3. Extracting and Transforming Data; Introduction; Transforming Apache logs into TSV format using MapReduce; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Using Apache Pig to filter bot traffic from web server logs; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also; Using Apache Pig to sort web server log data by timestamp; Getting ready; How to do it ... ; How it works ... ; There's more ... ; See also.
Abstract:
Realistic, simple code examples to solve problems at scale with Hadoop and related technologies.