Data Bonanza : Improving Knowledge Discovery in Science, Engineering, and Business.
Title:
Data Bonanza : Improving Knowledge Discovery in Science, Engineering, and Business.
Author:
Atkinson, Malcolm.
ISBN:
9781118540244
Edition:
1st ed.
Physical Description:
1 online resource (578 pages)
Series:
Wiley Series on Parallel and Distributed Computing ; v.90

Contents:
The DATA Bonanza -- Contents -- CONTRIBUTORS -- FOREWORD -- PREFACE -- THE EDITORS -- PART I STRATEGIES FOR SUCCESS IN THE DIGITAL-DATA REVOLUTION -- 1. The Digital-Data Challenge -- 1.1 The Digital Revolution -- 1.2 Changing How We Think and Behave -- 1.3 Moving Adroitly in this Fast-Changing Field -- 1.4 Digital-Data Challenges Exist Everywhere -- 1.5 Changing How We Work -- 1.6 Divide and Conquer Offers the Solution -- 1.7 Engineering Data-to-Knowledge Highways -- References -- 2. The Digital-Data Revolution -- 2.1 Data, Information, and Knowledge -- 2.2 Increasing Volumes and Diversity of Data -- 2.3 Changing the Ways We Work with Data -- References -- 3. The Data-Intensive Survival Guide -- 3.1 Introduction: Challenges and Strategy -- 3.2 Three Categories of Expert -- 3.3 The Data-Intensive Architecture -- 3.4 An Operational Data-Intensive System -- 3.5 Introducing DISPEL -- 3.6 A Simple DISPEL Example -- 3.7 Supporting Data-Intensive Experts -- 3.8 DISPEL in the Context of Contemporary Systems -- 3.9 Datascopes -- 3.10 Ramps for Incremental Engagement -- 3.11 Readers' Guide to the Rest of This Book -- References -- 4. Data-Intensive Thinking with DISPEL -- 4.1 Processing Elements -- 4.2 Connections -- 4.3 Data Streams and Structure -- 4.4 Functions -- 4.5 The Three-Level Type System -- 4.6 Registry, Libraries, and Descriptions -- 4.7 Achieving Data-Intensive Performance -- 4.8 Reliability and Control -- 4.9 The Data-to-Knowledge Highway -- References -- PART II DATA-INTENSIVE KNOWLEDGE DISCOVERY -- 5. Data-Intensive Analysis -- 5.1 Knowledge Discovery in Telco Inc. -- 5.2 Understanding Customers to Prevent Churn -- 5.3 Preventing Churn Across Multiple Companies -- 5.4 Understanding Customers by Combining Heterogeneous Public and Private Data -- 5.5 Conclusions -- References -- 6. Problem Solving in Data-Intensive Knowledge Discovery.

6.1 The Conventional Life Cycle of Knowledge Discovery -- 6.2 Knowledge Discovery Over Heterogeneous Data Sources -- 6.3 Knowledge Discovery from Private and Public, Structured and Nonstructured Data -- 6.4 Conclusions -- References -- 7. Data-Intensive Components and Usage Patterns -- 7.1 Data Source Access and Transformation Components -- 7.2 Data Integration Components -- 7.3 Data Preparation and Processing Components -- 7.4 Data-Mining Components -- 7.5 Visualization and Knowledge Delivery Components -- References -- 8. Sharing and Reuse in Knowledge Discovery -- 8.1 Strategies for Sharing and Reuse -- 8.2 Data Analysis Ontologies for Data Analysis Experts -- 8.3 Generic Ontologies for Metadata Generation -- 8.4 Domain Ontologies for Domain Experts -- 8.5 Conclusions -- References -- PART III DATA-INTENSIVE ENGINEERING -- 9. Platforms for Data-Intensive Analysis -- 9.1 The Hourglass Reprise -- 9.2 The Motivation for a Platform -- 9.3 Realization -- References -- 10. Definition of the DISPEL Language -- 10.1 A Simple Example -- 10.2 Processing Elements -- 10.3 Data Streams -- 10.4 Type System -- 10.5 Registration -- 10.6 Packaging -- 10.7 Workflow Submission -- 10.8 Examples of DISPEL -- 10.9 Summary -- References -- 11. DISPEL Development -- 11.1 The Development Landscape -- 11.2 Data-Intensive Workbenches -- 11.3 Data-Intensive Component Libraries -- 11.4 Summary -- References -- 12. DISPEL Enactment -- 12.1 Overview of DISPEL Enactment -- 12.2 DISPEL Language Processing -- 12.3 DISPEL Optimization -- 12.4 DISPEL Deployment -- 12.5 DISPEL Execution and Control -- References -- PART IV DATA-INTENSIVE APPLICATION EXPERIENCE -- 13. The Application Foundations of DISPEL -- 13.1 Characteristics of Data-Intensive Applications -- 13.2 Evaluating Application Performance -- 13.3 Reviewing the Data-Intensive Strategy.

14. Analytical Platform for Customer Relationship Management -- 14.1 Data Analysis in the Telecoms Business -- 14.2 Analytical Customer Relationship Management -- 14.3 Scenario 1: Churn Prediction -- 14.4 Scenario 2: Cross Selling -- 14.5 Exploiting the Models and Rules -- 14.6 Summary: Lessons Learned -- References -- 15. Environmental Risk Management -- 15.1 Environmental Modeling -- 15.2 Cascading Simulation Models -- 15.3 Environmental Data Sources and Their Management -- 15.4 Scenario 1: ORAVA -- 15.5 Scenario 2: RADAR -- 15.6 Scenario 3: SVP -- 15.7 New Technologies for Environmental Data Mining -- 15.8 Summary: Lessons Learned -- References -- 16. Analyzing Gene Expression Imaging Data in Developmental Biology -- 16.1 Understanding Biological Function -- 16.2 Gene Image Annotation -- 16.3 Automated Annotation of Gene Expression Images -- 16.4 Exploitation and Future Work -- 16.5 Summary -- References -- 17. Data-Intensive Seismology: Research Horizons -- 17.1 Introduction -- 17.2 Seismic Ambient Noise Processing -- 17.3 Solution Implementation -- 17.4 Evaluation -- 17.5 Further Work -- 17.6 Conclusions -- References -- PART V DATA-INTENSIVE BEACONS OF SUCCESS -- 18. Data-Intensive Methods in Astronomy -- 18.1 Introduction -- 18.2 The Virtual Observatory -- 18.3 Data-Intensive Photometric Classification of Quasars -- 18.4 Probing the Dark Universe with Weak Gravitational Lensing -- 18.5 Future Research Issues -- 18.6 Conclusions -- References -- 19. The World at One's Fingertips: Interactive Interpretation of Environmental Data -- 19.1 Introduction -- 19.2 The Current State of the Art -- 19.3 The Technical Landscape -- 19.4 Interactive Visualization -- 19.5 From Visualization to Intercomparison -- 19.6 Future Development: The Environmental Cloud -- 19.7 Conclusions -- References.

20. Data-Driven Research in the Humanities-the DARIAH Research Infrastructure -- 20.1 Introduction -- 20.2 The Tradition of Digital Humanities -- 20.3 Humanities Research Data -- 20.4 Use Case -- 20.5 Conclusion and Future Development -- References -- 21. Analysis of Large and Complex Engineering and Transport Data -- 21.1 Introduction -- 21.2 Applications and Challenges -- 21.3 The Methods Used -- 21.4 Future Developments -- 21.5 Conclusions -- References -- 22. Estimating Species Distributions-Across Space, Through Time, and with Features of the Environment -- 22.1 Introduction -- 22.2 Data Discovery, Access, and Synthesis -- 22.3 Model Development -- 22.4 Managing Computational Requirements -- 22.5 Exploring and Visualizing Model Results -- 22.6 Analysis Results -- 22.7 Conclusion -- References -- PART VI THE DATA-INTENSIVE FUTURE -- 23. Data-Intensive Trends -- 23.1 Reprise -- 23.2 Data-Intensive Applications -- References -- 24. Data-Rich Futures -- 24.1 Future Data Infrastructure -- 24.2 Future Data Economy -- 24.3 Future Data Society and Professionalism -- References -- Appendix A: Glossary -- Appendix B: DISPEL Reference Manual -- Appendix C: Component Definitions -- INDEX -- Wiley Series on Parallel and Distributed Computing.
Abstract:
Complete guidance for mastering the tools and techniques of the digital revolution.

With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasizing data-intensive thinking and interdisciplinary collaboration, The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation.

Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book:
Outlines the concepts and rationale for implementing data-intensive computing in organizations
Covers from the ground up problem-solving strategies for data analysis in a data-rich world
Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language (DISPEL)
Features in-depth case studies in customer relations, environmental hazards, seismology, and more
Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering
Includes sample program snippets throughout the text as well as additional materials on a companion website

The Data Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.