Title:

Database Technology for Life Sciences and Medicine.

Author:

Plant, Claudia.

ISBN:

9789814307710

Personal Author:

Plant, Claudia.

Physical Description:

1 online resource (388 pages)

Series:

Science, Engineering, and Biology Informatics

Contents:

Contents -- Preface -- 1. Biomedical Databases and Data Mining -- 1.1 Databases and Knowledge Discovery in Biomedicine -- 1.2 Outline of this Book -- 2. DYNASTAT: A Methodology for Modeling of Multiagent Systems -- 2.1 Introduction -- 2.2 Literature Review -- 2.2.1 Multi-agent Systems in the Medical Domain -- 2.2.2 Methodologies for Modeling Multi-agent Systems -- 2.2.3 UML Modeling of Multi-agent Systems -- 2.3 DYNASTAT Methodology -- 2.3.1 Conceptual Overview of Multi-agent Systems -- 2.3.2 Modeling Phases -- 2.4 Use of UML 2.2 in the Framework of DYNASTAT Methodology -- 2.4.1 Overview of UML 2.2 -- 2.4.2 Usage of UML for Modeling Multi-agent Systems -- 2.4.3 Selected UML Diagrams -- 2.5 UML to Model Medical Multi-agent Systems -- 2.5.1 Medical Researcher (UML Examples) -- 2.5.1.1 Modeling Information Retrieval -- 2.5.2 Modeling Data Collection and Mining Experiments -- 2.5.3 General Practitioner (UML Examples) -- 2.6 Possible Applications -- 2.7 Conclusion -- 3. SciPort: An Extensible Data Management Platform for Biomedical Research -- 3.1 Introduction -- 3.1.1 Overview -- 3.1.2 Complex Requirements -- 3.1.3 Semantic Consistency -- 3.1.4 Data Sharing -- 3.1.5 Our Contributions -- 3.2 Related Work -- 3.3 Unified Scientific Data Modeling -- 3.3.1 The Scientific Document Model -- 3.3.2 XML Based Implementation of the Data Model -- 3.3.2.1 Metadata -- 3.3.2.2 Content -- 3.3.2.3 The Benefits of XML -- 3.3.3 XML-based Schema Definition -- 3.3.4 Hierarchical Organization of Data -- 3.4 Document Authoring and Searching -- 3.4.1 Document Authoring -- 3.4.1.1 File Cabinet -- 3.4.1.2 Hierarchy and Schema Authoring -- 3.4.2 DICOM Data Anonymization -- 3.4.3 Searching Documents -- 3.4.4 Semantic Enabled Biomedical Data Management -- 3.4.4.1 Semantic Enabled Data Authoring -- 3.4.4.2 Semantic Enabled Search -- 3.5 Sharing Distributed Biomedical Data.

3.5.1 Sharing Data through a Central Server -- 3.5.2 Data Synchronization -- 3.6 The Architecture of SciPort -- 3.6.1 Open System Architecture -- 3.6.2 Rich Internet Based Application -- 3.6.3 The Architecture Components -- 3.6.4 SciPort Workflow -- 3.6.5 High Adaptability through Customization -- 3.7 Conclusion -- 4. An Integrative Framework for Anonymizing Clinical and Genomic Data -- 4.1 Introduction -- 4.1.1 Challenges -- 4.1.2 Contributions -- 4.1.3 Organization -- 4.2 Related Work -- 4.2.1 Data Integration -- 4.2.2 Privacy-preserving Data Sharing -- 4.2.2.1 Anonymization of Different Data Types -- 4.2.2.2 Anonymization of Different Data Sources -- 4.3 The DIANOVA Framework -- 4.3.1 Setting -- 4.3.1.1 Parties -- 4.3.1.2 Datasets -- 4.3.1.3 Privacy and Utility Requirements -- 4.3.2 Data Flow in DIANOVA -- 4.4 Algorithms for Realizing DIANOVA -- 4.4.1 Algorithms for the Data Integrator (DI) -- 4.4.2 Algorithms for the ANOnymizer (ANO) -- 4.4.3 Algorithms for the View Auditor (VA) -- 4.5 Extensions of DIANOVA -- 4.6 Conclusion -- Acknowledgements -- 5. Data Integration Challenges: A Systems Biology Perspective -- 5.1 Introduction -- 5.2 Modeling Biological Systems -- 5.3 Biological and Mathematical Data Sources -- 5.4 Various Data Exchange Formats in Systems Biology -- 5.5 Building an Integrative Framework to Combine Modeling and Biological Data Sources -- 5.5.1 System Architecture -- 5.5.2 Data Employed in the PDVE -- 5.5.3 Data Employed in Simulations -- 5.5.4 Integration of Protein Interaction Data Sources -- 5.5.5 Integration of Mathematical Modeling Data Sources -- 5.5.6 Database Schema Structure -- 5.5.7 Pathway Model Implementation -- 5.5.8 Exporting the Pathway Model -- 5.6 Analysis of the Developed Integrative Environment -- 5.7 Conclusion -- 6. Ontology-based Data Integration: A Case Study in Clinical Trials -- 6.1 Introduction.

6.2 System Architecture and Overview -- 6.3 The CTDM Ontology -- 6.4 Ontology-based Data Integration of Study Relevant Information -- 6.4.1 Translation between Ontologies and Relational Schemas -- 6.4.2 Mapping Creation and Query Generation -- 6.4.2.1 Mapping of Datatype Properties -- 6.4.2.2 Mapping of Object Properties Representing 1:1-Relationships -- 6.4.2.3 Mapping of Object Properties Representing 1:N-relationships -- 6.4.2.4 Mapping of Object Properties Representing n:m-relationships -- 6.4.2.5 Inheritance Relationships -- 6.5 Assembly of ETL Processes Based on Ontology Mappings -- 6.6 Case Study and Evaluation -- 6.6.1 The Heart Failure Management (HFM) Pilot Trial -- 6.6.2 Evaluation of the CTDMO -- 6.6.3 Evaluation of the Mapping Definition and Integration Module Creation -- 6.6.3.1 Evaluation of Usability -- 6.6.3.2 Evaluation of Feasibility -- 6.7 Related Work -- 6.8 Conclusion -- 7. A Data Warehouse for Ca. Glomeribacter Gigasporarum Bacterium -- 7.1 Introduction -- 7.2 State-of-the-art of Metagenomics for Genomic Comparison -- 7.2.1 GMOD and Chado Database -- 7.3 BIOBITS System Architecture -- 7.3.1 Star Schema in BIOBITS Data Mart -- 7.3.2 System Architecture -- 7.3.2.1 Services on Chado and the Star Schema -- 7.4 Software Modules to Support Researchers' Activities -- 7.4.1 Case-based Reasoning -- 7.4.1.1 Multiple Abstractions on the Genome -- 7.4.1.2 Multi-dimensional Index Structures -- 7.4.2 Clustering Modules -- 7.4.2.1 Co-clustering -- 7.4.2.2 Proximity Measures -- 7.5 Conclusion -- 8. Quality of Medical Data: A Case Study -- 8.1 Introduction -- 8.1.1 Quality and Medical Data -- 8.1.2 Features of Medical Data -- 8.1.3 A Quality Methodology for Medical Data -- 8.1.4 Dimensions of Analysis -- 8.1.4.1 Data-related Dimensions -- 8.1.4.2 Meta-data-related Dimensions -- 8.1.4.3 System-related Dimensions -- 8.2 Case Study.

8.2.1 Description and Extraction of the Data -- 8.2.2 Quality Assessment -- 8.2.2.1 Data -- 8.2.2.2 Meta-data -- 8.2.2.3 System -- 8.3 Generation of a Summary Table -- 8.4 Conclusion -- 9. Efficient EMD-based Similarity Search in Medical Image Databases -- 9.1 Introduction -- 9.1.1 Related Work -- 9.1.2 Formal Definition of the Earth Mover's Distance -- 9.1.3 Multi-Step Query Processing and the EMD -- 9.2 Dimensionality Reduction for the EMD -- 9.2.1 Dimensionality Reduction -- 9.2.2 Optimal Dimensionality Reduction -- 9.2.2.1 Optimal Cost Reduction -- 9.2.2.2 Optimal Flow Reduction -- 9.2.3 Flow-based Reduction -- 9.3 Query Processing Algorithm -- 9.4 Evaluation on Medical Data Sets -- 9.5 Conclusion -- 10. Fast Multimedia Querying for Medical Applications -- 10.1 Introduction -- 10.2 Related Work -- 10.3 Subspace Tree -- 10.3.1 Orthogonal Projection and the Mean Value -- 10.3.2 Image Pyramid - A Subspace Tree -- 10.4 Experiments -- 10.5 Conclusion -- Acknowledgments -- 11. Ensemble Feature Selection in Biomedical Applications -- 11.1 Introduction -- 11.1.1 Statistical Hypothesis Testing -- 11.1.2 Information Gain -- 11.1.3 ReliefF -- 11.1.4 Biomarker Identifier -- 11.2 Evaluation of Feature Selection Approaches -- 11.2.1 Popular Classifiers for Assessing Discriminatory Ability of Selected Features -- 11.2.2 Classifier Validation -- 11.3 Ensemble Feature Selection -- 11.3.1 Stacked Feature Ranking -- 11.4 Biomedical Example -- 11.4.1 Data Collection -- 11.4.2 Sample Preparation and Mass Spectrometry Analysis -- 11.5 Computational Approach -- 11.5.1 Experimental Setup -- 11.6 Results -- 11.6.1 Comparison of the Predictive Power of the Applied Feature Ranking Methods -- 11.6.2 Identified Breath Gas Marker Candidates -- 11.7 Discussion -- 11.8 Conclusion -- Acknowledgment -- 12. Analysis of Breast Cancer Genomic Data by Fuzzy Association Rule Mining.

12.1 Introduction -- 12.1.1 Breast Cancer -- 12.1.1.1 Tumor stage -- 12.1.1.2 HER2/neu (Human Epidermal Growth Factor Receptor 2) -- 12.1.1.3 HER2 Testing Methodologies -- 12.1.1.4 Additional Biomarkers -- 12.2 Microarrays -- 12.3 Association Rule Mining -- 12.3.1 Association Rules: Formal Definition -- 12.4 Fuzzy Association Rules -- 12.5 Dataset -- 12.5.1 HER2 Testing Methodologies -- 12.5.2 Immunohistochemical Data -- 12.5.3 Microarray Data -- 12.6 Extracting the Fuzzy Association Rules -- 12.6.1 Fuzzy Top-Down Frequent-Pattern Growth Algorithm -- 12.6.1.1 Obtaining the List of Frequent Items -- 12.6.1.2 Building the Fuzzy Frequent-pattern Tree -- 12.6.1.3 Frequent Itemset Generation -- 12.6.2 Obtaining and Processing the Fuzzy Association Rules -- 12.6.2.1 Generating the Fuzzy Association Rules from the Frequent Itemsets -- 12.6.2.2 Deficiencies of the Confidence/Support Framework -- 12.7 Results -- 12.7.1 Exploratory Analysis of 2751 Patients -- 12.7.2 Whole-genome Expression Data and Prognostic Factors -- 12.8 Conclusion -- Acknowledgements -- 13. Graph Mining on Brain Co-activation Networks -- 13.1 Introduction -- 13.2 Related Work -- 13.2.1 Graph Dataset Mining -- 13.2.2 Large Graph Mining -- 13.3 Method -- 13.3.1 Basics of Graph Theory -- 13.3.2 Construction of Brain Co-activation Networks Out of fMRI Timeseries -- 13.3.3 Performing Frequent Subgraph Mining on Brain Co-activation Networks -- 13.3.4 Evaluation of Detected Motifs -- 13.4 Experiments -- 13.5 Conclusion -- 14. Automatic Identification of Surgery Indicators -- 14.1 Introduction -- 14.1.1 Aims and Objectives -- 14.1.2 Outline -- 14.2 Background -- 14.2.1 Referral Contents -- 14.2.2 Related Work -- 14.3 Approach -- 14.3.1 Supervised Learning -- 14.3.2 Theoretical Modeling -- 14.3.3 Models for Structured Referrals -- 14.3.4 Prediction Models for the Surgical Suite.

14.3.5 Algorithm Selection.

Abstract:

This book presents innovative approaches from database researchers that support the challenging process of knowledge discovery in biomedicine. Its topics range from effective storage and organization of biomedical data, through data quality and case studies, to sophisticated data mining methods, giving an overview of the state of the art in database technology for life sciences and medicine. A valuable source of information for experts in the life sciences who want to keep up with what database technology can offer their field, this volume will also inspire students and researchers in informatics who are keen to contribute to this emerging field of interdisciplinary research.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Database management.

Genre:

Electronic books.
