Title:

Biological Data Mining and Its Applications in Healthcare.

Author:

Li, Xiao-Li.

ISBN:

9789814551014

Personal Author:

Li, Xiao-Li.

Physical Description:

1 online resource (437 pages)

Contents:

Contents -- Preface -- Part I: Sequence Analysis -- Mining the Sequence Databases for Homology Detection: Application to Recognition of Functions of Trypanosoma brucei brucei Proteins and Drug Targets -- 1. Introduction -- 2. Remote homology driven approaches for protein function annotation -- 2.1. Sequence-based approaches for remote homology detection -- 2.1.1. Iterated searches using PSI-BLAST -- 2.1.2. Multi-profiles approach to improve sensitivity -- 2.1.3. Cascade PSI-BLAST -- 2.1.4. Hidden Markov Models -- 2.1.5. Profile-profile matching algorithms -- 2.2. Assessment of significant sequence alignments -- 3. Trypanosoma brucei: A case study -- 3.1. Overview on structural and functional domain assignments in T. brucei proteome -- 3.2. Fold assignments -- 3.3. Metabolic proteins in Trypanosoma brucei -- 3.3.1. Domain composition of metabolic proteins -- 3.3.2. Predicting drug targets based on remote homology approaches -- 4. Conclusions -- Acknowledgments -- References -- Identification of Genes and their Regulatory Regions Based on Multiple Physical and Structural Properties of a DNA Sequence -- 1. Introduction -- 2. Gene prediction methods -- 2.1. Background -- 2.2. Exon prediction based on the AR model and multifeature spectral analysis -- 3. Regulatory region (promoter) prediction methods -- 3.1. Background -- 3.2. Cascade AdaBoost algorithm -- 3.3. Hierarchical promoter prediction system based on signal, context and structural properties -- 3.4. Prediction of eukaryotic core promoters based on Isomap and support vector machine -- 3.5. Computational identification of disease-related genes and regulatory regions -- 4. Summary -- Acknowledgement -- References -- Mining Genomic Sequence Data for Related Sequences Using Pairwise Statistical Significance -- 1. Introduction -- 1.1. Biological sequence -- 1.2. Homology and similarity.

1.3. Sequence alignment -- 2. Statistical significance -- 2.1. Why statistical significance? -- 2.2. P-value in statistical significance -- 2.3. Modeling statistical significance for local sequence alignment -- 2.3.1. Coin-Toss model -- 2.3.2. Assessing the statistical significance using alignment scores -- 2.4. Gumbel extreme value distribution -- 3. Pairwise statistical significance -- 3.1. The definition of pairwise statistical significance -- 3.2. Parameters fitting of pairwise statistical significance -- 3.3. Evaluation of pairwise statistical significance -- 4. HPC solutions for accelerating pairwise statistical significance estimation -- 4.1. Parallel paradigms of HPC techniques -- 4.2. Implementations -- 4.3. Summary -- Acknowledgement -- References -- Part II: Biological Network Mining -- Indexing for Similarity Queries on Biological Networks -- 1. Introduction -- 2. Preliminaries -- 2.1. Definitions -- 2.2. Problem Formulation -- 3. Feature Based Indexing -- 4. Tree Based Indexing -- 5. Reference Based Indexing -- 6. Comparison between Indexing Strategies -- 7. Summary -- References -- Theory and Method of Completion for a Boolean Regulatory Network Using Observed Data -- 1. Introduction -- 1.1. Overview of Three Main Problems -- 2. Network Completion Problem -- 3. Completing Boolean Functions -- 3.1. Algorithms for Tree-like Networks -- 4. Detecting Change of Signaling Pathway -- 5. Integer Programming-based Method -- 5.1. Data of Signaling Pathway -- 5.2. Gene Expression Data -- 5.3. Representing by Linear Inequalities -- 6. Results of Computer Experiment -- 7. Conclusion -- References -- Mining Frequent Subgraph Patterns for Classifying Biological Data -- 1. Introduction -- 2. Frequent Graph Mining -- 2.1. A Primer on Graphs -- 2.2. Mining Frequent Subgraphs -- 2.3. Mining Closed Frequent Subgraphs -- 2.4. Mining Maximal Frequent Subgraphs.

3. Algorithms for Mining Frequent Subgraphs -- 3.1. Breadth-first Algorithms -- 3.2. Depth-first Algorithms -- 4. Classifying Chemical Compounds -- 5. Discovering Protein Structural Motifs -- 5.1. Representing Proteins as Graphs -- 5.2. Mining Family-Specific Subgraphs -- 5.2.1. Mining k-coherent subgraphs -- 5.2.2. Maximal frequent patterns as fingerprints -- 6. Challenges and Future Research -- Acknowledgement -- References -- On the Integration of Prior Knowledge in the Inference of Regulatory Networks -- 1. Introduction -- 2. Retrieval of prior information -- 3. Coding of prior representation -- 4. Integration of prior knowledge in the inference process -- 4.1. Bayesian networks -- 4.1.1. Score-based algorithms -- 4.2. Feature selection methods -- 4.2.1. Predictionet -- 4.3. Gene prioritization methods -- 5. Integration of data and prior knowledge: a case study -- 5.1. Inferring the networks -- 5.2. Validation -- 5.2.1. Comparison with state-of-the-art -- 5.2.2. Stability -- 5.2.3. Prediction -- 5.2.4. Comparing networks for the two data sets -- 5.2.5. The network: edges in accordance with the prior vs new edges -- 5.3. Discussion -- 6. Conclusion -- Acknowledgement -- References -- Part III: Classification, Trend Analysis and 3D Medical Images -- Classification and its Application to Drug-Target Prediction -- 1. Classification -- 1.1. k-Nearest Neighbor (k-NN) -- 1.2. Support Vector Machine -- 1.2.1. Linear SVM -- 1.2.2. Kernel SVM -- 1.3. Bayesian classification -- 1.4. Decision trees -- 1.5. Regression models for classification -- 1.5.1. Logistic Regression -- 1.5.2. Regularized Least Squares -- 1.6. Ensemble classifier -- 2. Drug-target interaction prediction -- 2.1. Background -- 2.2. A binary classification problem -- 2.3. Bipartite graph model (BGM) -- 2.4. Bipartite local model (BLM).

2.5. Enhanced BLM with training data inferring for new drug/target candidates -- 3. Experimental study -- 3.1. Datasets -- 3.2. Approaches compared -- 3.3. Evaluation -- 3.4. Performance comparison -- 4. Summary -- References -- Characterization and Prediction of Human Protein-Protein Interactions -- 1. Introduction -- 2. Classification methods -- 2.1. Framework of classification -- 2.2. Classification algorithms -- 2.2.1. Decision tree -- 2.2.2. Artificial neural network -- 2.2.3. Bayesian model -- 2.2.4. Support vector machine -- 2.2.5. Random forest -- 2.3. Model validation and evaluation -- 3. Application to human PPI data -- 3.1. Background of human PPI and biological problem statement -- 3.2. Human PPIs databases -- 3.3. Datasets for computational study of PPIs -- 3.4. Protein features used for predicting PPIs -- 3.5. Case studies on human PPI prediction -- 3.5.1. Application of naïve Bayesian model -- 3.5.2. Application of semi-naïve Bayesian model -- 3.5.3. Application of Active Learning and Random Forest -- 4. Conclusions -- Acknowledgments -- References -- Trend Analysis -- 1. Introduction -- 2. Methods of trend analysis -- 2.1. Age-period-cohort model -- 2.1.1. Introduction -- 2.1.2. Theoretical foundation -- 2.1.3. Empirical study -- 2.1.4. Problems and solutions -- 2.1.5. Computation software -- 2.1.6. Conclusion -- 2.2. Joinpoint regression -- 2.2.1. Introduction -- 2.2.2. Mathematical model -- 2.2.3. Empirical study -- 2.2.4. Problems and solutions -- 2.2.5. Conclusion -- 2.3. Time series analysis -- 2.3.1. Introduction -- 2.3.2. Mathematical model -- 2.3.3. Empirical study -- 2.3.4. Computation software -- 2.3.5. Conclusion -- 2.4. Cox-Stuart trend test -- 2.4.1. Introduction -- 2.4.2. Mathematical model -- 2.4.3. Empirical study -- 2.4.4. Computation software -- 2.4.5. Conclusion -- 2.5. RUNS test -- 2.5.1. Introduction.

2.5.2. Mathematical model -- 2.5.3. Empirical study -- 2.5.4. Conclusion -- 2.6. Functional data analysis -- 2.6.1. Introduction -- 2.6.2. Mathematical model -- 2.6.3. Empirical study -- 2.6.4. Computation software -- 2.6.5. Conclusion -- 3. Summary -- References -- Data Acquisition and Preprocessing on Three Dimensional Medical Images -- 1. Introduction -- 2. Three Dimensional Image Acquisition Techniques -- 3. Three Dimensional Image Segmentation -- 3.1. Deformable Model -- 3.2. Three Dimensional Image Segmentation with Deformable Model -- 3.3. A Case Study in Segmentation with Active Appearance Models -- 4. Three Dimensional Image Registration -- 4.1. Image Registration Methods -- 4.2. BrainAligner: A Case Study in 3D Image Registration -- References -- Part IV: Text Mining and its Biomedical Applications -- Text Mining in Biomedicine and Healthcare -- 1. Introduction to Text Mining -- 2. Natural Language Processing Techniques Used in Text Mining -- 2.1. Introduction of Natural Language Processing -- 2.2. Lexical Analysis -- 2.3. Syntactic Analysis -- 2.3.1. Grammar -- 2.3.2. Treebank -- 2.4. Semantic Analysis -- 2.4.1. Semantic Role Labeling -- 2.4.2. Proposition Bank -- 3. Information Extraction -- 3.1. Named Entity Recognition -- 3.1.1. Challenges in Named Entity Recognition -- 3.1.2. Machine Learning -- 3.1.3. Named Entity Recognition as a Sequence Labelling Task -- 3.2. Entity Linking -- 3.2.1. Challenges in Entity Linking -- 3.2.2. Linking of Gene Mention in BioCreative Workshop -- 3.3. Relation Extraction -- 3.4. Co-reference Resolution -- 4. Case Study Using a Real Database -- 4.1. Background -- 4.2. Text Mining-based Database Curation -- 4.2.1. Stage 1: Dataset Collection -- 4.2.2. Stage 2: Metadata Generation -- 4.2.3. Stage 3: Candidate Genes Extraction -- 4.2.4. Stage 4: Manual Verification -- 4.3. T-HOD Content and Analyses.

4.4. Useful Text Mining Resources.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
