Title:

Analysis of Biological Data : A Soft Computing Approach.

Author:

Bandyopadhyay, Sanghamitra.

ISBN:

9789812708892

Personal Author:

Bandyopadhyay, Sanghamitra.

Physical Description:

1 online resource (352 pages)

Series:

Science, Engineering, and Biology Informatics ; v.3

Contents:

CONTENTS -- Preface -- Part I OVERVIEW -- Chapter 1 Bioinformatics: Mining the Massive Data from High Throughput Genomics Experiments Haixu Tang and Sun Kim -- 1 Introduction -- 2 Recent Development of Classical Topics -- 2.1 Sequence alignment -- 2.2 Genome sequencing and fragment assembly -- 2.3 Gene annotation -- 2.4 RNA folding -- 2.5 Motif finding -- 2.6 Protein structure prediction -- 3 Emerging Topics from New Genome Technologies -- 3.1 Comparative genomics: beyond genome comparison -- 3.2 Pathway reconstruction -- 3.3 Microarray analysis -- 3.4 Proteomics -- 3.5 Protein-protein interaction -- 4 Conclusion -- Acknowledgement -- References -- Chapter 2 An Introduction to Soft Computing Amit Konar and Swagatam Das -- 1 Classical AI and its Pitfalls -- 2 What is Soft Computing? -- 3 Fundamental Components of Soft Computing -- 3.1 Fuzzy sets and fuzzy logic -- 3.2 Neural networks -- 3.3 Genetic algorithms -- 3.4 Belief networks -- 4 Synergism in Soft Computing -- 4.1 Neuro-fuzzy synergism -- 4.2 Neuro-GA synergism -- 4.3 Fuzzy-GA synergism -- 4.4 Neuro-belief network synergism -- 4.5 GA-belief network synergism -- 4.6 Neuro-fuzzy-GA synergism -- 5 Some Emerging Areas of Soft Computing -- 5.1 Artificial life -- 5.2 Particle swarm optimization (PSO) -- 5.3 Artificial immune system -- 5.4 Rough sets and granular computing -- 5.5 Chaos theory -- 5.6 Ant colony systems (ACS) -- 6 Summary -- References -- Part II BIOLOGICAL SEQUENCE AND STRUCTURE ANALYSIS -- Chapter 3 Reconstructing Phylogenies with Memetic Algorithms and Branch-and-Bound José E. Gallardo, Carlos Cotta and Antonio J. Fernández -- 1 Introduction -- 2 A Crash Introduction to Phylogenetic Inference -- 3 Evolutionary Algorithms for the Phylogeny Problem -- 4 A BnB Algorithm for Phylogenetic Inference -- 5 A Memetic Algorithm for Phylogenetic Inference -- 6 A Hybrid Algorithm.

7 Experimental Results -- 7.1 Experimental setting -- 7.2 Sensitivity analysis on the hybrid algorithm -- 7.3 Analysis of results -- 8 Conclusions -- Acknowledgment -- References -- Chapter 4 Classification of RNA Sequences with Support Vector Machines Jason T. L. Wang and Xiaoming Wu -- 1 Introduction -- 2 Count Kernels and Marginalized Count Kernels -- 2.1 RNA sequences with known secondary structures -- 2.2 RNA sequences with unknown secondary structures -- 3 Kernel Based on Labeled Dual Graphs -- 3.1 Labeled dual graphs -- 3.2 Marginalized kernel for labeled dual graphs -- 4 A New Kernel -- 4.1 Extracting features for global structural information -- 4.2 Extracting features for local structural information -- 5 Experiments and Results -- 5.1 Data and parameters -- 5.2 Results -- 6 Conclusion -- Acknowledgment -- References -- Chapter 5 Beyond String Algorithms: Protein Sequence Analysis using Wavelet Transforms Arun Krishnan and Kuo-Bin Li -- 1 Introduction -- 1.1 String algorithms -- 1.2 Sequence analysis -- 1.3 Wavelet transform -- 2 Motif Searching -- 2.1 Introduction -- 2.2 Methods -- 2.3 Results -- 2.4 Allergenicity prediction -- 3 Transmembrane Helix Region (HTM) Prediction -- 4 Hydrophobic Cores -- 5 Protein Repeat Motifs -- 6 Sequence Comparison -- 7 Prediction of Protein Secondary Structures -- 8 Disease Related Studies -- 9 Other Functional Prediction -- 10 Conclusion -- References -- Chapter 6 Filtering Protein Surface Motifs Using Negative Instances of Active Sites Candidates Nripendra L. Shrestha and Takenao Ohkawa -- 1 Introduction -- 2 Protein Structural Data and Surface Motifs -- 2.1 Protein structural data -- 2.2 Protein molecular surface data -- 2.3 Functions of a protein and structural motifs -- 2.3.1. Functions of a protein -- 2.3.2. Active sites -- 2.3.3. Protein structural motifs -- 3 Overview of SUMOMO.

3.1 Surface motif extraction -- 3.1.1. Unit surfaces construction -- 3.1.2. Sorting of vector pairs using buckets -- 3.1.3. Merging of candidate motifs -- 3.2 Filtering using similarity between local surfaces -- 3.2.1. Similarity focused on adjacent motif sets -- 3.2.2. Significance of surface motifs -- 3.3 Problems with SUMOMO -- 4 Filtering Surface Motifs using Negative Instances of Protein Active Sites Candidates -- 4.1 Survey on the features to distinguish real active sites from the active sites candidates -- 4.1.1. Dissimilarity -- 4.1.2. Procedure -- 4.1.3. Results -- 4.2 Ranking active sites candidates -- 5 Evaluations -- 6 Conclusions and Future Works -- References -- Chapter 7 Distill: A Machine Learning Approach to Ab Initio Protein Structure Prediction Gianluca Pollastri, Davide Baú and Alessandro Vullo -- 1 Introduction -- 2 Structural Features -- 2.1 One-dimensional structural features -- 2.1.1. Secondary structure -- 2.1.2. Solvent accessibility -- 2.1.3. Contact density -- 2.2 Two-dimensional structural features -- 2.2.1. Contact maps -- 2.2.2. Coarse topologies -- 3 Review of Statistical Learning Methods Applied -- 3.1 RNNs for undirected graphs -- 3.2 1D DAG-RNN -- 3.2.1. Ensembling 1D DAG-RNNs -- 3.3 2D DAG-RNN -- 4 Predictive Architecture -- 4.1 Data set generation -- 4.2 Training protocols -- 4.3 One-dimensional feature predictors -- 4.3.1. Porter -- 4.3.2. Pale Ale -- 4.3.3. Brown Ale -- 4.4 Two-dimensional feature predictors -- 4.4.1. XXStout -- 4.4.2. XStout -- 5 Modeling Protein Backbones -- 5.1 Protein representation -- 5.2 Constraints-based pseudo energy -- 5.3 Optimization algorithm -- 6 Reconstruction Results -- 7 Conclusions -- Acknowledgment -- References -- Chapter 8 In Silico Design of Ligands using Properties of Target Active Sites Sanghamitra Bandyopadhyay, Santanu Santra, Ujjwal Maulik and Heinz Muehlenbein.

1 Introduction -- 2 Relevance of Genetic Algorithm for Drug Design -- 3 Basic Issues -- 3.1 Core formation -- 3.2 Chromosome representation -- 3.3 Fitness computation -- 4 Main Algorithm -- 5 Experimental Results -- 6 Discussion -- References -- Part III GENE EXPRESSION AND MICROARRAY DATA ANALYSIS -- Chapter 9 Inferring Regulations in a Genomic Network from Gene Expression Profiles Nasimul Noman and Hitoshi Iba -- 1 Introduction -- 2 Modeling Gene Regulatory Networks by S-system -- 2.1 Canonical model description -- 2.2 Genetic network inference problem by S-system -- 2.3 Decoupled S-system model -- 2.4 Fitness function for skeletal network structure -- 3 Inference Method -- 3.1 Trigonometric Differential Evolution (TDE) -- 3.2 Proposed algorithm -- 3.2.1. Mutation phase -- 3.3 Local search procedure -- 4 Simulated Experiment -- 4.1 Experiment 1: inferring small scale network in noise free environment -- 4.1.1. Experimental setup -- 4.1.2. Result -- 4.2 Experiment 2: inferring small scale network in noisy environment -- 4.2.1. Experimental setup -- 4.2.2. Results -- 4.3 Experiment 3: inferring medium scale network in noisy environment -- 4.3.1. Experimental setup -- 4.3.2. Results -- 5 Analysis of Real Gene Expression Data -- 5.1 Experimental data set -- 5.1.1. Results -- 6 Discussion -- 7 Conclusion -- References -- Chapter 10 A Reliable Classification of Gene Clusters for Cancer Samples Using a Hybrid Multi-Objective Evolutionary Procedure Kalyanmoy Deb, A. Raji Reddy and Shamik Chaudhuri -- 1 Introduction -- 2 Class Prediction Procedure -- 2.1 Two-class classification -- 2.2 Multi-class classification -- 3 Evolutionary Gene Selection Procedure -- 3.1 The optimization problem -- 3.2 A multi-objective evolutionary algorithm -- 3.3 A multi-modal NSGA-II -- 3.4 Genetic operators and modified domination operator.

3.5 NSGA-II search using a fixed classifier size -- 3.6 Overall procedure -- 4 Simulation Results -- 4.1 Complete leukemia study -- 4.2 Diffuse large B-cell lymphoma dataset -- 4.3 Colon cancer dataset -- 4.4 NCI60 multi-class tumor dataset -- 5 Conclusions -- References -- Chapter 11 Feature Selection for Cancer Classification using Ant Colony Optimization and Support Vector Machines A. Gupta, V. K. Jayaraman and B. D. Kulkarni -- 1 Introduction -- 2 Ant Colony Optimization -- 3 Support Vector Machines -- 4 Proposed Ant Algorithm -- 4.1 State transition rules -- 4.2 Evaluation procedure -- 4.3 Global updating rule -- 4.4 Local updating rule -- 5 Algorithm Outline -- 6 Experiments -- 6.1 Datasets -- 6.1.1. Colon cancer dataset -- 6.1.2. Brain cancer dataset -- 6.1.3. Leukemia dataset -- 6.2 Preprocessing -- 6.3 Experimental setup -- 7 Results and Discussion -- 8 Conclusions -- Acknowledgments -- References -- Chapter 12 Sophisticated Methods for Cancer Classification using Microarray Data Sung-Bae Cho and Han-Saem Park -- 1 Introduction -- 2 Backgrounds -- 2.1 DNA microarray -- 2.1.1. DNA microarray -- 2.1.2. Oligonucleotide microarray -- 2.2 Feature selection methods -- 2.3 Base classifiers -- 2.4 Classifier ensemble methods -- 3 Sophisticated Methods for Cancer Classification -- 3.1 Ensemble with negatively correlated features -- 3.2 Combinatorial ensemble -- 3.3 Searching optimal ensemble with GA -- 4 Experiments -- 4.1 Datasets -- 4.2 Ensemble with negative correlated features -- 4.3 Combinatorial ensemble -- 4.4 Optimal ensemble with GA -- 5 Conclusions -- Acknowledgement -- References -- Chapter 13 Multiobjective Evolutionary Approach to Fuzzy Clustering of Microarray Data Anirban Mukhopadhyay, Ujjwal Maulik and Sanghamitra Bandyopadhyay -- 1 Introduction -- 2 Structure of Gene Expression Data Sets -- 3 Cluster Analysis -- 3.1 K-means.

3.2 K-medoids.

Abstract:

Bioinformatics, a field devoted to the interpretation and analysis of biological data using computational techniques, has evolved tremendously in recent years due to the explosive growth of biological information generated by the scientific community. Soft computing is a consortium of methodologies that work synergistically and provide, in one form or another, flexible information processing capabilities for handling real-life ambiguous situations. Several research articles dealing with the application of soft computing tools to bioinformatics have been published in the recent past; however, they are scattered across different journals, conference proceedings and technical reports, causing inconvenience to readers, students and researchers. This book, unique in its nature, is aimed at providing a treatise in a unified framework, with both theoretical and experimental results, describing the basic principles of soft computing and demonstrating the various ways in which they can be used for analyzing biological data in an efficient manner. Interesting research articles from eminent scientists around the world are brought together in a systematic way such that the reader will be able to understand the issues and challenges in this domain, the existing ways of tackling them, recent trends, and future directions. This book is the first of its kind to bring together two important research areas, soft computing and bioinformatics, in order to demonstrate how the tools and techniques in the former can be used for efficiently solving several problems in the latter.

Contents: Overview: Bioinformatics: Mining the Massive Data from High Throughput Genomics Experiments (H Tang & S Kim); An Introduction to Soft Computing (A Konar & S Das); Biological Sequence and Structure Analysis: Reconstructing Phylogenies with Memetic Algorithms and Branch-and-Bound (J E Gallardo et al.); Classification of RNA Sequences with Support Vector Machines (J T L Wang & X Wu); Beyond String Algorithms: Protein Sequence Analysis Using Wavelet Transforms (A Krishnan & K-B Li); Filtering Protein Surface Motifs Using Negative Instances of Active Sites Candidates (N L Shrestha & T Ohkawa); Distill: A Machine Learning Approach to Ab Initio Protein Structure Prediction (G Pollastri et al.); In Silico Design of Ligands Using Properties of Target Active Sites (S Bandyopadhyay et al.); Gene Expression and Microarray Data Analysis: Inferring Regulations in a Genomic Network from Gene Expression Profiles (N Noman & H Iba); A Reliable Classification of Gene Clusters for Cancer Samples Using a Hybrid Multi-Objective Evolutionary Procedure (K Deb et al.); Feature Selection for Cancer Classification Using Ant Colony Optimization and Support Vector Machines (A Gupta et al.); Sophisticated Methods for Cancer Classification Using Microarray Data (S-B Cho & H-S Park); Multiobjective Evolutionary Approach to Fuzzy Clustering of Microarray Data (A Mukhopadhyay et al.). Readership: Graduate students and researchers in computer science, bioinformatics, computational and molecular biology, artificial intelligence, data mining, machine learning, electrical engineering, system science; researchers in pharmaceutical industries.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Bioinformatics.

Electronic books. -- local.

