Title:

Machine Learning Approaches to Bioinformatics.

Author:

Yang, Zheng Rong.

ISBN:

9789814287319

Personal Author:

Yang, Zheng Rong.

Physical Description:

1 online resource (337 pages)

Series:

Science, Engineering, and Biology Informatics

Contents:

Contents -- Preface -- 1 Introduction -- 1.1 Brief history of bioinformatics -- 1.2 Database application in bioinformatics -- 1.3 Web tools and services for sequence homology alignment -- 1.3.1 Web tools and services for protein functional site identification -- 1.3.2 Web tools and services for other biological data -- 1.4 Pattern analysis -- 1.5 The contribution of information technology -- 1.6 Chapters -- 2 Introduction to Unsupervised Learning -- 3 Probability Density Estimation Approaches -- 3.1 Histogram approach -- 3.2 Parametric approach -- 3.3 Non-parametric approach -- 3.3.1 K-nearest neighbour approach -- 3.3.2 Kernel approach -- Summary -- 4 Dimension Reduction -- 4.1 General -- 4.2 Principal component analysis -- 4.3 An application of PCA -- 4.4 Multi-dimensional scaling -- 4.5 Application of the Sammon algorithm to gene data -- Summary -- 5 Cluster Analysis -- 5.1 Hierarchical clustering -- 5.2 K-means -- 5.3 Fuzzy C-means -- 5.4 Gaussian mixture models -- 5.5 Application of clustering algorithms to the Burkholderia pseudomallei gene expression data -- Summary -- 6 Self-organising Map -- 6.1 Vector quantization -- 6.2 SOM structure -- 6.3 SOM learning algorithm -- 6.4 Using SOM for classification -- 6.5 Bioinformatics applications of VQ and SOM -- 6.5.1 Sequence analysis -- 6.5.2 Gene expression data analysis -- 6.5.3 Metabolite data analysis -- 6.6 A case study of gene expression data analysis -- 6.7 A case study of sequence data analysis -- Summary -- 7 Introduction to Supervised Learning -- 7.1 General concepts -- 7.2 General definition -- 7.3 Model evaluation -- 7.4 Data organisation -- 7.5 Bayes rule for classification -- Summary -- 8 Linear/Quadratic Discriminant Analysis and K-nearest Neighbour -- 8.1 Linear discriminant analysis -- 8.2 Generalised discriminant analysis -- 8.3 K-nearest neighbour.

8.4 KNN for gene data analysis -- Summary -- 9 Classification and Regression Trees, Random Forest Algorithm -- 9.1 Introduction -- 9.2 Basic principle for constructing a classification tree -- 9.3 Classification and regression tree -- 9.4 CART for compound pathway involvement prediction -- 9.5 The random forest algorithm -- 9.6 RF for analyzing Burkholderia pseudomallei gene expression profiles -- Summary -- 10 Multi-layer Perceptron -- 10.1 Introduction -- 10.2 Learning theory -- 10.2.1 Parameterization of a neural network -- 10.2.2 Learning rules -- 10.3 Learning algorithms -- 10.3.1 Regression -- 10.3.2 Classification -- 10.3.3 Procedure -- 10.4 Applications to bioinformatics -- 10.4.1 Bio-chemical data analysis -- 10.4.2 Gene expression data analysis -- 10.4.3 Protein structure data analysis -- 10.4.4 Bio-marker identification -- 10.5 A case study on Burkholderia pseudomallei gene expression data -- Summary -- 11 Basis Function Approach and Vector Machines -- 11.1 Introduction -- 11.2 Radial-basis function neural network (RBFNN) -- 11.3 Bio-basis function neural network -- 11.4 Support vector machine -- 11.5 Relevance vector machine -- Summary -- 12 Hidden Markov Model -- 12.1 Markov model -- 12.2 Hidden Markov model -- 12.2.1 General definition -- 12.2.2 Handling HMM -- 12.2.3 Evaluation -- 12.2.4 Decoding -- 12.2.5 Learning -- 12.3 HMM for sequence classification -- Summary -- 13 Feature Selection -- 13.1 Built-in strategy -- 13.1.1 Lasso regression -- 13.1.2 Ridge regression -- 13.1.3 Partial least square regression (PLS) algorithm -- 13.2 Exhaustive strategy -- 13.3 Heuristic strategy - orthogonal least square approach -- 13.4 Criteria for feature selection -- 13.4.1 Correlation measure -- 13.4.2 Fisher ratio measure -- 13.4.3 Mutual information approach -- Summary -- 14 Feature Extraction (Biological Data Coding) -- 14.1 Molecular sequences.

14.2 Chemical compounds -- 14.3 General definition -- 14.4 Sequence analysis -- 14.4.1 Peptide feature extraction -- 14.4.2 Whole sequence feature extraction -- Summary -- 15 Sequence/Structural Bioinformatics Foundation - Peptide Classification -- 15.1 Nitration site prediction -- 15.2 Plant promoter region prediction -- Summary -- 16 Gene Network - Causal Network and Bayesian Networks -- 16.1 Gene regulatory network -- 16.2 Causal networks, networks, graphs -- 16.3 A brief review of the probability -- 16.4 Discrete Bayesian network -- 16.5 Inference with discrete Bayesian network -- 16.6 Learning discrete Bayesian network -- 16.7 Bayesian networks for gene regulatory networks -- 16.8 Bayesian networks for discovering peptide patterns -- 16.9 Bayesian networks for analysing Burkholderia pseudomallei gene data -- Summary -- 17 S-Systems -- 17.1 Michaelis-Menten change law -- 17.2 S-system -- 17.3 Simplification of an S-system -- 17.4 Approaches for structure identification and parameter estimation -- 17.4.1 Neural network approach -- 17.4.2 Simulated annealing approach -- 17.4.3 Evolutionary computation approach -- 17.5 Steady-state analysis of an S-system -- 17.6 Sensitivity of an S-system -- Summary -- 18 Future Directions -- 18.1 Multi-source data -- 18.2 Gene regulatory network construction -- 18.3 Building models using incomplete data -- 18.4 Biomarker detection from gene expression data -- Summary -- References -- Index.

Abstract:

This book covers a wide range of subjects in applying machine learning approaches to bioinformatics projects. It has two distinguishing features. First, it introduces the most widely used machine learning approaches in bioinformatics and discusses, with evaluations from real case studies, how they are used in individual bioinformatics projects. Second, it introduces state-of-the-art bioinformatics research methods. The book also includes R code and example data sets to help readers develop their own bioinformatics research skills. The theoretical and practical parts are well integrated, so readers can follow the existing procedures in their own research. Unlike most bioinformatics textbooks on the market, its coverage is not limited to a single subject. It brings together a broad spectrum of relevant topics in bioinformatics, including systematic data mining and computational systems biology research, offering an efficient and convenient platform for undergraduate and graduate teaching. An essential textbook for final-year undergraduates and graduate students, as well as a comprehensive handbook for new researchers, this book will also serve as a practical guide for software development in relevant bioinformatics projects.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Bioinformatics -- Case studies.

Bioinformatics.

Machine learning -- Case studies.

Machine learning.

Genre:

Electronic books.
