Real-Time Analytics : Techniques to Analyze and Visualize Streaming Data.
Title:
Real-Time Analytics : Techniques to Analyze and Visualize Streaming Data.
Author:
Ellis, Byron.
ISBN:
9781118838020
Edition:
1st ed.
Physical Description:
1 online resource (378 pages)
Contents:
Cover -- Title Page -- Copyright -- Contents
Chapter 1 Introduction to Streaming Data -- Sources of Streaming Data -- Operational Monitoring -- Web Analytics -- Online Advertising -- Social Media -- Mobile Data and the Internet of Things -- Why Streaming Data Is Different -- Always On, Always Flowing -- Loosely Structured -- High-Cardinality Storage -- Infrastructures and Algorithms -- Conclusion
Part I Streaming Analytics Architecture
Chapter 2 Designing Real-Time Streaming Architectures -- Real-Time Architecture Components -- Collection -- Data Flow -- Processing -- Storage -- Delivery -- Features of a Real-Time Architecture -- High Availability -- Low Latency -- Horizontal Scalability -- Languages for Real-Time Programming -- Java -- Scala and Clojure -- JavaScript -- The Go Language -- A Real-Time Architecture Checklist -- Collection -- Data Flow -- Processing -- Storage -- Delivery -- Conclusion
Chapter 3 Service Configuration and Coordination -- Motivation for Configuration and Coordination Systems -- Maintaining Distributed State -- Unreliable Network Connections -- Clock Synchronization -- Consensus in an Unreliable World -- Apache ZooKeeper -- The znode -- Watches and Notifications -- Maintaining Consistency -- Creating a ZooKeeper Cluster -- ZooKeeper's Native Java Client -- The Curator Client -- Curator Recipes -- Conclusion
Chapter 4 Data-Flow Management in Streaming Analysis -- Distributed Data Flows -- At Least Once Delivery -- The "n+1" Problem -- Apache Kafka: High-Throughput Distributed Messaging -- Design and Implementation -- Configuring a Kafka Environment -- Interacting with Kafka Brokers -- Apache Flume: Distributed Log Collection -- The Flume Agent -- Configuring the Agent -- The Flume Data Model -- Channel Selectors -- Flume Sources -- Flume Sinks -- Sink Processors -- Flume Channels -- Flume Interceptors -- Integrating Custom Flume Components -- Running Flume Agents -- Conclusion
Chapter 5 Processing Streaming Data -- Distributed Streaming Data Processing -- Coordination -- Partitions and Merges -- Transactions -- Processing Data with Storm -- Components of a Storm Cluster -- Configuring a Storm Cluster -- Distributed Clusters -- Local Clusters -- Storm Topologies -- Implementing Bolts -- Implementing and Using Spouts -- Distributed Remote Procedure Calls -- Trident: The Storm DSL -- Processing Data with Samza -- Apache YARN -- Getting Started with YARN and Samza -- Integrating Samza into the Data Flow -- Samza Jobs -- Conclusion
Chapter 6 Storing Streaming Data -- Consistent Hashing -- "NoSQL" Storage Systems -- Redis -- MongoDB -- Cassandra -- Other Storage Technologies -- Relational Databases -- Distributed In-Memory Data Grids -- Choosing a Technology -- Key-Value Stores -- Document Stores -- Distributed Hash Table Stores -- In-Memory Grids -- Relational Databases -- Warehousing -- Hadoop as ETL and Warehouse -- Lambda Architectures -- Conclusion
Part II Analysis and Visualization
Chapter 7 Delivering Streaming Metrics -- Streaming Web Applications -- Working with Node -- Managing a Node Project with NPM -- Developing Node Web Applications -- A Basic Streaming Dashboard -- Adding Streaming to Web Applications -- Visualizing Data -- HTML5 Canvas and Inline SVG -- Data-Driven Documents: D3.js -- High-Level Tools -- Mobile Streaming Applications -- Conclusion
Chapter 8 Exact Aggregation and Delivery -- Timed Counting and Summation -- Counting in Bolts -- Counting with Trident -- Counting in Samza -- Multi-Resolution Time-Series Aggregation -- Quantization Framework -- Stochastic Optimization -- Delivering Time-Series Data -- Strip Charts with D3.js -- High-Speed Canvas Charts -- Horizon Charts -- Conclusion
Chapter 9 Statistical Approximation of Streaming Data -- Numerical Libraries -- Probabilities and Distributions -- Expectation and Variance -- Statistical Distributions -- Discrete Distributions -- Continuous Distributions -- Joint Distributions -- Working with Distributions -- Inferring Parameters -- The Delta Method -- Distribution Inequalities -- Random Number Generation -- Generating Specific Distributions -- Sampling Procedures -- Sampling from a Fixed Population -- Sampling from a Streaming Population -- Biased Streaming Sampling -- Conclusion
Chapter 10 Approximating Streaming Data with Sketching -- Registers and Hash Functions -- Registers -- Hash Functions -- Working with Sets -- The Bloom Filter -- The Algorithm -- Choosing a Filter Size -- Unions and Intersections -- Cardinality Estimation -- Interesting Variations -- Distinct Value Sketches -- The Min-Count Algorithm -- The HyperLogLog Algorithm -- The Count-Min Sketch -- Point Queries -- Count-Min Sketch Implementation -- Top-K and "Heavy Hitters" -- Range and Quantile Queries -- Other Applications -- Conclusion
Chapter 11 Beyond Aggregation -- Models for Real-Time Data -- Simple Time-Series Models -- Linear Models -- Logistic Regression -- Neural Network Models -- Forecasting with Models -- Exponential Smoothing Methods -- Regression Methods -- Neural Network Methods -- Monitoring -- Outlier Detection -- Change Detection -- Real-Time Optimization -- Conclusion
Index -- EULA.
Summary:
Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis platforms. The author is among a very few leading experts in the field. He has a prestigious background in research, development, analytics, real-time visualization, and Big Data streaming and is uniquely qualified to help you explore this revolutionary field. Moving from a description of the overall analytic architecture of real-time analytics to using specific tools to obtain targeted results, Real-Time Analytics leverages open source and modern commercial tools to construct robust, efficient systems that can provide real-time analysis in a cost-effective manner. The book includes: a deep discussion of streaming data systems and architectures; instructions for analyzing, storing, and delivering streaming data; tips on aggregating data and working with sets; and information on data warehousing options and techniques. Real-Time Analytics includes in-depth case studies for website analytics, Big Data, visualizing streaming and mobile data, and mining and visualizing operational data flows. The book's "recipe" layout lets readers quickly learn and implement different techniques. All of the code examples presented in the book, along with their related data sets, are available on the companion website.
Notes:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.