Research interests: Bioinformatics and Systems Biology

Genome Analysis - pattern recognition, comparative genomics, data mining, developing specialized databases
o Development of Comprehensive Gene Database (In collaboration with CCMB, Hyderabad)
o Identifying Tandem Repeats and Segmental Duplications in Genomic Sequences
o An Integrated Data Mining Tool for Function Analysis of SNPs (In collaboration with CCMB, Hyderabad)
o Identifying Genomic Islands and Pathogenicity Islands – developed a web-based tool, GIIPro
o Constructing Genetic Linkage Maps using STR markers (in collaboration with IIL, Hyderabad)
Proteomics - pattern recognition, graph theory, developing repeats database
o Identifying Repeats in Protein Sequences - Algorithm and Database Development
o Graph Theory Approach for Analyzing Protein Structures
Systems Biology
o Dynamical Systems Modeling of Biological Systems

Teaching:
PG courses: Biostatistics, Introduction to Bioinformatics, Advanced Bioinformatics, Genome Analysis, Projects in Computational Biology
UG courses: Elements of Bioinformatics.



Identifying Repeats in Protein Sequences

Introduction
Identifying tandem repeats in the proteome of any organism is important not only for understanding the structure and function of the proteins but also for analyzing the association of abnormal expansion of repeat regions with disorders. We have developed an efficient tool for identifying Peptide Periodic Repeats (PEPPER) in protein sequences. The tool identifies tandem n-mer repeats and single amino acid repeats and reports the consensus repeat pattern, the complete repeat region, the score and alignment of the consensus with the repeat region along with percentage mismatch and insertions/deletions. Presently we are also looking into identifying protein motifs.
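As an illustration of what a tandem n-mer or single amino acid repeat looks like computationally, the sketch below scans a protein sequence for exact tandem repeats. It is not the PEPPER algorithm: it allows no mismatches or insertions/deletions and reports no score or alignment. The function name find_tandem_repeats and the parameters min_unit, max_unit and min_copies are hypothetical choices made for this example.

```python
def find_tandem_repeats(seq, min_unit=1, max_unit=10, min_copies=3):
    """Return (start, unit, copies) for exact tandem repeats in seq.

    Illustrative only: exact matching, greedy left-to-right scan.
    """
    hits = []
    i, n = 0, len(seq)
    while i < n:
        best = None  # (unit, copies) covering the longest span at position i
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:  # ran off the end of the sequence
                break
            copies = 1
            # Count how many consecutive copies of the unit follow position i.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            span = copies * k
            if copies >= min_copies and (best is None or span > len(best[0]) * best[1]):
                best = (unit, copies)
        if best:
            hits.append((i, best[0], best[1]))
            i += len(best[0]) * best[1]  # skip past the reported repeat region
        else:
            i += 1
    return hits


if __name__ == "__main__":
    # Toy sequence with a poly-Q run and an AGP tripeptide repeat.
    print(find_tandem_repeats("MKQQQQQQAGPAGPAGPLL"))
```

On the toy sequence, the scan reports a single amino acid repeat (Q, six copies) and a 3-mer repeat (AGP, three copies). A practical tool would additionally tolerate mismatches and indels when extending a repeat.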

Related Publications:
1. PEPPER – A Tool for Identifying PEPtide PEriodic Repeats, accepted in the International Conference in Bioinformatics, Dec 18-20, 2006, New Delhi.
2. Protein Tandem Repeat DataBase (PTRDB), P. Krishna Manjari, V. Kiran Kumar, Rima Kumari and Nita Parekh, accepted in International Conference on Bioinformatics & Drug Discovery, Dec 20 – 22, 2007, Hyderabad.

Associated people:
K. Kasturi Kiran (M.Tech 2004), Radhika B. (M.Tech 2004), P. Krishna Manjari (MTech 2006-08), V. Kiran Kumar (MTech 2006-08), Nita Parekh

An Integrated Tool for SNP Function Analysis

Introduction
Single nucleotide polymorphisms (SNPs) are commonly used for association studies to find genes responsible for complex genetic diseases. The complex diseases may involve many genes and hundreds of alleles but only a small portion of them are functional polymorphisms that contribute to disease phenotypes. Assessment of the risk requires access to a variety of heterogeneous biological databases and analytical tools. A web server is being developed to facilitate the functional analysis of SNPs by mining data from various resources and providing a detailed report for the query.

Associated people:
Kasturi Nadella (MSIT, 2002), Ajeet Pandey (MTech 2005), Anshu Bharadwaj (PhD student, CCMB), Shrish Tiwari (Sct., CCMB), P. Krishna Manjari (MTech 2006-08), V. Kiran Kumar (MTech 2006-08), Nita Parekh

Identifying Genomic Islands and Pathogenicity Islands

Introduction
In recent years many different genomic islands have been discovered in a variety of pathogenic as well as non-pathogenic bacteria. Because they promote genetic variability, genomic islands play an important role in microbial evolution. Pathogenicity islands (PAIs) are a subset of GIs and represent distinct genetic elements encoding virulence factors of pathogenic bacteria. A gene in a genome is defined as putative alien (pA) if its codon usage difference from the average gene exceeds a high threshold and codon usage differences from ribosomal protein genes, chaperone genes and protein synthesis processing factors are also high. pA gene clusters in bacterial genomes are relevant for detecting genomic islands (GIs), including pathogenicity islands (PAIs). We have developed a tool using four approaches to identify GIs and PAIs: G+C genome variation (the standard method); genomic signature divergences (dinucleotide bias); extremes of codon bias; and anomalies of amino acid usage.
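Two of the four approaches named above, G+C variation and dinucleotide signature divergence, can be sketched with a sliding window. This is a minimal illustration and not the GIIPro implementation: the window size, step, z-score cutoff and all function names are assumptions, and the dinucleotide relative abundance is computed on a single strand for brevity (published genomic signature measures usually combine both strands).

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_outlier_windows(genome, window=1000, step=500, z_cut=2.0):
    """Return (start, gc) for windows whose G+C content deviates
    from the mean over all windows by more than z_cut standard deviations."""
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_content(genome[s:s + window]) for s in starts]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [(s, v) for s, v in zip(starts, values) if abs(v - mu) / sigma > z_cut]


def dinucleotide_rho(seq):
    """Relative abundance rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides."""
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n1, n2 = len(seq), len(seq) - 1
    rho = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n1, mono[y] / n1
        rho[x + y] = (di[x + y] / n2) / (fx * fy) if fx and fy else 0.0
    return rho


def signature_difference(window_seq, genome):
    """Average absolute difference in rho between a window and the genome."""
    a, b = dinucleotide_rho(window_seq), dinucleotide_rho(genome)
    return sum(abs(a[k] - b[k]) for k in a) / 16


if __name__ == "__main__":
    # Synthetic genome: AT-rich background with a GC-rich insert at 4000.
    background = "ATGCATAT" * 1250
    island = "GCGCGGCC" * 125
    genome = background[:4000] + island + background[4000:]
    print(gc_outlier_windows(genome))
    print(signature_difference(genome[4000:5000], genome))
```

Windows flagged by either measure are only candidates; codon bias and amino acid usage, the other two approaches, would be evaluated on annotated genes rather than on raw windows.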

Related Publication:
1. Genomic Islands Identification in Prokaryotic Genomes (GIIPro), Ruchi Jain and Nita Parekh, accepted in International Conference on Bioinformatics & Drug Discovery, Dec 20 – 22, 2007, Hyderabad.

Associated people:
Senthil Kumar (PhD student), Rishi Arvind (MTech 2004), Hemanth Sanna Reddy (MTech 2005), Ruchi Jain (MS by Research, 2007-08), Sandeep Ramineni (Project student), Nita Parekh

Model Protein Structures Using Graph Theory Approach

Introduction
In this project we propose to use graph theory methods to understand protein structure, folding and function. Graph theory is a branch of discrete mathematics that is used in the study of various real-world networks and their properties. Chemical molecules, being sets of atoms or groups of atoms (vertices) connected by covalent bonds (edges), have also been extensively investigated using graph theory. The structure of biopolymers like proteins is governed to a large extent by non-covalent interactions, and graph theory is being used to gain insights into the structures of proteins. Analysis of the topological details of proteins with known structures, such as clustering of specific types of amino acids important for structure, folding and function, is of great value as a large number of protein structures are now available. Identification of amino acid clusters and hubs in such protein structure graphs provides interesting insights into the structure, stability, folding and function of proteins.
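One common way to build a protein structure graph is to treat each residue as a vertex and connect two residues when representative atoms (for example C-alpha atoms) lie within a distance cutoff. The sketch below follows that idea and flags high-degree residues as hubs. The 7.0 angstrom cutoff, the exclusion of immediate sequence neighbours, the degree threshold and the function names are all assumptions made for illustration, not the settings used in this project.

```python
from math import dist


def contact_graph(coords, cutoff=7.0, min_separation=2):
    """Build an adjacency map from residue coordinates.

    coords: list of (x, y, z) tuples, one per residue, in sequence order.
    Residues closer than min_separation along the chain are not linked,
    so that trivial backbone neighbours do not dominate the graph.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def hubs(adj, min_degree=4):
    """Return residue indices whose degree is at least min_degree."""
    return sorted(i for i, nbrs in adj.items() if len(nbrs) >= min_degree)


if __name__ == "__main__":
    # Toy coordinates: residue 0 sits at the centre of four others.
    coords = [(0, 0, 0), (50, 50, 50), (5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0)]
    graph = contact_graph(coords)
    print(hubs(graph))
```

With real data, the coordinates would come from a structure file, and clusters of specific amino acid types could be found by restricting the vertex set before building the graph.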

Associated people:
Ramesh Nerella (MTech 2005), Ruchi Jain (MS by Research, 2006-08), Nita Parekh

Development of Comprehensive Gene Database

Introduction:
An important pattern recognition problem in biological sequences is gene prediction: identifying the regions that code for proteins. Some of the important questions in gene prediction are: what are the conserved patterns or motifs in the exonic and intronic regions of eukaryotic genes, how are splice sites recognized, and which promoters and regulatory sequences are found in the vicinity of genic regions? Developing a specialized database of genes would greatly facilitate this analysis. We are developing a Comprehensive Gene Database (CGD) of mammals by integrating information from various NCBI resources.

Related Publications:
1. Gene Prediction in silico, at the National Seminar on Bioinformatics and Functional Genomics, conducted by Bioinformatics Centre, Pondicherry University, Feb 15 – 17, 2005.
2. Computational Issues in Gene Prediction, at 40th National Convention of Computer Society India, hosted by CSI Hyderabad Chapter, Nov 9 - 12, 2005.
3. Tool to find Absolute Location of Genes in Human Genome, presented at the National Seminar on Systems Approach to Bioinformatics, conducted by Bioinformatics Centre, Pondicherry University, Feb 18 - 20, 2004. (Report no: IIIT/TR/2004/31)

Associated people:
G. Madhukar Reddy (MSIT 2002), Ch. Jagan Mohan Reddy (MSIT 2002), Sai Deepthi (MSIT 2002), Kasturi Nadella (MSIT 2002), B. Subramanyam Sarath (M.Tech 2004), Ramesh Narella (M.Tech 2005), Shrish Tiwari (Sct., CCMB), Nita Parekh

Dynamical Systems Modeling of Biological Systems

Introduction
Networks of coupled dynamical systems have been used to model biological oscillators, excitable media, neural networks, genetic control networks and many other self-organizing systems. In general, the connection topology is assumed to be either completely regular (e.g., a diffusively coupled system) or completely random. However, most biological networks lie somewhere between these two extremes. We would like to explore some simple models of networks that can be tuned through this middle ground: regular networks rewired to introduce increasing amounts of disorder. These systems, called small-world networks, can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. From the perspective of nonlinear dynamics, it would be interesting to understand how a network of interacting dynamical systems (be they neurons, chemical concentrations, or species populations) behaves collectively, given their individual dynamics and coupling architecture.
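The rewiring idea described above can be sketched in a few lines: start from a ring lattice, rewire each edge with probability p, and compare the mean clustering coefficient and the characteristic path length. This is a minimal standard-library illustration of the small-world construction; the function names, the network size and the seed are assumptions, and no node dynamics are attached to the network here.

```python
import random
from collections import deque


def small_world(n, k, p, seed=0):
    """Ring lattice of n nodes, each linked to its k nearest neighbours
    (k even), with every lattice edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                # Pick a new endpoint that avoids self-loops and duplicate edges.
                choices = [v for v in range(n) if v != i and v not in adj[i]]
                if choices:
                    new = rng.choice(choices)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def mean_clustering(adj):
    """Average over nodes of (links among neighbours) / (possible links)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for src in adj:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs


if __name__ == "__main__":
    for p in (0.0, 0.1, 1.0):
        g = small_world(200, 6, p, seed=1)
        print(p, round(mean_clustering(g), 3), round(mean_path_length(g), 3))
```

For small p the path length drops sharply while clustering stays close to the lattice value, which is the small-world regime. Coupling a dynamical system to each node would be the next step in studying collective behaviour on such a network.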

Related Publications:
1. Controllability of Spatially Extended Systems Using the Pinning Approach, Nita Parekh and S. Sinha, Physica A 318, 200-212 (2003).
2. Controlling Dynamics in Spatially Extended Systems, Nita Parekh and S. Sinha, Phys. Rev. E. 65, 036227-1 to 9 (2002).
3. Global and Local Control of Spatiotemporal Chaos in Coupled Map Lattices, Nita Parekh, S. Parthasarathy and S. Sinha, Phys. Rev. Lett. 81, 1401 (1998).

Associated people:
Sunaina K. (MSIT 2002), Rishi Arvind (M.Tech 2004), Snehansu Ghosh (M.Tech 2004), Nita Parekh, Somdatta Sinha (CCMB)