Research interests: Bioinformatics and Systems Biology
• Genome Analysis - pattern recognition, comparative
genomics, data mining, developing specialized databases
o Development of Comprehensive Gene Database (In collaboration with CCMB, Hyderabad)
o Identifying Tandem Repeats and Segmental Duplications in Genomic Sequences
o An Integrated Data Mining Tool for Function Analysis of SNPs (In collaboration
with CCMB, Hyderabad)
o Identifying Genomic Islands and Pathogenicity Islands – developed a
web-based tool, GIIPro
o Constructing Genetic Linkage Maps using STR markers (in collaboration with
IIL, Hyderabad)
• Proteomics - pattern recognition, graph theory, developing
repeats database
o Identifying Repeats in Protein Sequences - Algorithm and Database Development
o Graph Theory Approach for Analyzing Protein Structures
• Systems Biology
o Dynamical Systems Modeling of Biological Systems
Teaching:
• PG courses: Biostatistics, Introduction to Bioinformatics,
Advanced Bioinformatics, Genome Analysis, Projects in Computational Biology
• UG courses: Elements of Bioinformatics.
Identifying Repeats in Protein Sequences
Introduction
Identifying tandem repeats in the proteome of any organism is important not
only for understanding the structure and function of the proteins but also for
analyzing the association of abnormal expansion of repeat regions with disorders.
We have developed an efficient tool for identifying Peptide Periodic Repeats
(PEPPER) in protein sequences. The tool identifies tandem n-mer repeats and
single amino acid repeats and reports the consensus repeat pattern, the complete
repeat region, the score and alignment of the consensus with the repeat region
along with percentage mismatch and insertions/deletions. Presently we are also
looking into identifying protein motifs.
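As an illustration of the simplest case described above, the following is a minimal sketch of exact tandem repeat detection for a fixed repeat-unit length. It is not the PEPPER algorithm: it ignores mismatches, insertions/deletions, scoring and consensus building, and the function name and parameters are assumptions made for this example only.

```python
def find_tandem_repeats(seq, unit_len, min_copies=2):
    """Find exact tandem repeats of a fixed unit length in a protein sequence.

    Returns a list of (start_index, unit, copies) tuples.
    Hypothetical helper for illustration; exact matching only.
    """
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        j = i + unit_len
        # Extend the repeat region while the next block equals the unit.
        while seq[j:j + unit_len] == unit:
            copies += 1
            j += unit_len
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i = j  # continue after the repeat region
        else:
            i += 1
    return hits


# Single amino acid repeat (unit length 1) and a 2-mer repeat.
print(find_tandem_repeats("MKQQQQQQAP", unit_len=1, min_copies=4))
print(find_tandem_repeats("AAGSGSGSTT", unit_len=2, min_copies=3))
```

With `unit_len=1` the same routine covers single amino acid repeats; a tool that tolerates mismatches would replace the equality test with an alignment score against a consensus.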
Related Publication:
1. PEPPER – A Tool for Identifying PEPtide
PEriodic Repeats, accepted at the International
Conference in Bioinformatics, Dec 18-20, 2006, New Delhi.
2. Protein Tandem Repeat DataBase (PTRDB), P. Krishna Manjari,
V. Kiran Kumar, Rima Kumari and Nita Parekh, accepted in International Conference
on Bioinformatics & Drug Discovery, Dec 20 – 22, 2007, Hyderabad.
Associated people:
K. Kasturi Kiran (M.Tech 2004), Radhika B. (M.Tech 2004), P. Krishna Manjari
(MTech 2006-08), V. Kiran Kumar (MTech 2006-08), Nita Parekh
An Integrated Tool for SNP Function Analysis
Introduction
Single nucleotide polymorphisms (SNPs) are commonly used for association studies
to find genes responsible for complex genetic diseases. Complex diseases
may involve many genes and hundreds of alleles, but only a small fraction of these
are functional polymorphisms that contribute to disease phenotypes. Assessing
the risk requires access to a variety of heterogeneous biological databases
and analytical tools. A web server is being developed to facilitate the functional
analysis of SNPs by mining data from various resources and providing a detailed
report for the query.
Associated people:
Kasturi Nadella (MSIT, 2002), Ajeet Pandey (MTech 2005), Anshu Bharadwaj (PhD
Student, CCMB), Shrish Tiwari (Sct., CCMB), P. Krishna Manjari (MTech 2006-08),
V. Kiran Kumar (MTech 2006-08), Nita Parekh
Identifying Genomic Islands and Pathogenicity Islands
Introduction
In recent years many different genomic islands have been discovered in a variety
of pathogenic as well as non-pathogenic bacteria. Because they promote genetic
variability, genomic islands play an important role in microbial evolution.
Pathogenicity islands (PAIs) are a subset of GIs and represent distinct genetic
elements encoding virulence factors of pathogenic bacteria. A gene in a genome
is defined as putative alien (pA) if its codon usage difference from the average
gene exceeds a high threshold and codon usage differences from ribosomal protein
genes, chaperone genes and protein synthesis processing factors are also high.
pA gene clusters in bacterial genomes are relevant for detecting genomic islands
(GIs), including pathogenicity islands (PAIs). We have developed a tool using
four approaches to identify GIs and PAIs: G+C genome variation (the standard
method); genomic signature divergences (dinucleotide bias); extremes of codon
bias; and anomalies of amino acid usage.
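The first of the four approaches, G+C variation along the genome, can be sketched as a sliding-window scan. This is a hedged illustration, not the GIIPro implementation: the window size, step, the z-score cutoff and the function names are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_outlier_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Return (start, gc) for windows whose G+C content deviates strongly
    from the mean over all windows. Thresholding by z-score is an assumption
    made for this sketch."""
    starts = list(range(0, len(genome) - window + 1, step))
    values = [gc_fraction(genome[s:s + window]) for s in starts]
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition, nothing to flag
    return [(s, v) for s, v in zip(starts, values)
            if abs(v - mu) / sigma > z_cutoff]


# Toy genome: AT-rich background with one GC-rich block at position 50.
toy = "ATATATATAT" * 5 + "GCGCGCGCGC" + "ATATATATAT" * 4
print(gc_outlier_windows(toy, window=10, step=10))
```

Windows flagged this way are only candidates; the other three measures (dinucleotide bias, codon bias, amino acid usage) would be computed per window or per gene in a similar fashion and combined.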
Related Publication:
1. Genomic Islands Identification in Prokaryotic Genomes (GIIPro),
Ruchi Jain and Nita Parekh, accepted in International Conference on Bioinformatics
& Drug Discovery, Dec 20 – 22, 2007, Hyderabad.
Associated people:
Senthil Kumar (PhD student), Rishi Arvind (MTech 2004), Hemanth Sanna Reddy
(MTech 2005), Ruchi Jain (MS by Research, 2007-08), Sandeep Ramineni (Project
student), Nita Parekh
Modeling Protein Structures Using a Graph Theory Approach
Introduction
In this project we propose to use graph theory methods to understand protein
structure, folding and function. Graph theory is a branch of discrete mathematics
that is used in the study of various real-world networks and their properties.
Chemical molecules, being sets of atoms or groups of atoms (vertices) connected
by covalent bonds (edges), have also been extensively investigated using graph theory.
The structure of biopolymers like proteins is governed to a large extent by
non-covalent interactions, and graph theory is being used to gain insights into
the structures of proteins. Analysis of the topological details of proteins
with known structures, such as clustering of specific types of amino acids important
for structure, folding and function, is of great value now that a large number of protein
structures are available. Identification of amino acid clusters and hubs
in such protein structure graphs provides interesting insights into the structure,
stability, folding and function of proteins.
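A minimal sketch of such a protein structure graph is given below, assuming one representative point per residue (for example the C-alpha atom) and a simple distance cutoff to define a non-covalent contact. The 7.0 Å cutoff, the degree threshold for a hub, and the function names are assumptions for illustration, not values taken from this project.

```python
from math import dist


def contact_graph(coords, cutoff=7.0):
    """Build a residue contact graph.

    coords: list of (x, y, z) tuples, one per residue.
    Two residues are linked if their distance is <= cutoff.
    Returns an adjacency dict {residue_index: set_of_neighbours}.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def hubs(adj, min_degree=4):
    """Residues with at least min_degree contacts (hypothetical hub definition)."""
    return sorted(i for i, nbrs in adj.items() if len(nbrs) >= min_degree)


# Toy coordinates: residue 0 sits in the middle of four others.
toy = [(0, 0, 0), (5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0)]
graph = contact_graph(toy)
print(graph)
print(hubs(graph))
```

In practice the coordinates would be read from a PDB file, and edges could be weighted by interaction strength between side chains rather than by a single distance cutoff.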
Associated people:
Ramesh Nerella (MTech 2005), Ruchi Jain (Ms by Research, 2006-08), Nita Parekh
Development of Comprehensive Gene Database
Introduction
An important pattern recognition problem in biological sequences is gene prediction,
that is, identifying the regions that code for proteins. Important questions in gene
prediction include: what are the conserved patterns or motifs in exonic and intronic
regions of eukaryotic genes, how are splice sites recognized, and which promoters and
regulatory sequences are found in the vicinity of genic regions. A specialized
database of genes would greatly facilitate this analysis. We are developing a
Comprehensive Gene Database (CGD) of mammals by integrating information from
various NCBI resources.
Related Publication:
1. Gene Prediction in silico at National Seminar on Bioinformatics and Functional
Genomics, conducted by Bioinformatics Centre, Pondicherry University, Feb 15
– 17, 2005.
2. Computational Issues in Gene Prediction, at 40th National Convention of Computer
Society India, hosted by CSI Hyderabad Chapter, Nov 9 - 12, 2005.
3. Tool to find Absolute Location of Genes in Human Genome, presented at the
National Seminar on Systems Approach to Bioinformatics, conducted by Bioinformatics
Centre, Pondicherry University, Feb 18 - 20, 2004. (Report no: IIIT/TR/2004/31)
Associated people:
G. Madhukar Reddy (MSIT 2002), Ch. Jagan Mohan Reddy (MSIT 2002), Sai Deepthi
(MSIT 2002), Kasturi Nadella (MSIT 2002), B. Subramanyam Sarath (M.Tech 2004),
Ramesh Narella (M.Tech 2005), Shrish Tiwari (Sct., CCMB), Nita Parekh
Dynamical Systems Modeling of Biological Systems
Introduction
Networks of coupled dynamical systems have been used to model biological oscillators,
excitable media, neural networks, genetic control networks and many other self-organizing
systems. In general, the connection topology is assumed to be either completely
regular (e.g., diffusively-coupled system) or completely random. However, most
biological networks lie somewhere between these two extremes. We would like
to explore some simple models of networks that can be tuned through this middle
ground – regular networks re-wired to introduce increasing amounts of
disorder. These systems, called small-world networks, can be highly clustered,
like regular lattices, yet have small characteristic path lengths, like random
graphs. From the perspective of nonlinear dynamics, it would be interesting
to understand how a network of interacting dynamical systems, be they
neurons, chemical concentrations, or species populations, behaves collectively,
given their individual dynamics and coupling architecture.
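The rewiring construction described above can be sketched with the standard library alone. The code below builds a ring lattice, rewires each edge with probability p in the style of the Watts-Strogatz model, and measures the two quantities mentioned: the clustering coefficient and the characteristic path length. It covers only the network topology, not the dynamics on the nodes, and all function names and parameter values are assumptions for illustration.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular ring: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """Redirect each original edge with probability p to a random new endpoint.
    Modifies adj in place; the number of edges is preserved."""
    n = len(adj)
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in edges:
        if rng.random() < p:
            choices = [m for m in range(n) if m != i and m not in adj[i]]
            if choices:
                new = rng.choice(choices)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj


def clustering(adj):
    """Average local clustering coefficient."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        seen = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs


lattice = ring_lattice(20, 4)
small_world = rewire(ring_lattice(20, 4), p=0.1, rng=random.Random(0))
print(clustering(lattice), mean_path_length(lattice))
print(clustering(small_world), mean_path_length(small_world))
```

Sweeping p from 0 to 1 moves the network from the regular lattice to an essentially random graph; coupling a dynamical system at each node (for example a map or an oscillator) along the edges of `adj` would then allow the collective behavior to be studied as a function of p.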
Related Publication:
1. Controllability of Spatially Extended Systems Using the Pinning Approach,
Nita Parekh and S. Sinha, Physica A 318, 200-212 (2003).
2. Controlling Dynamics in Spatially Extended Systems, Nita Parekh and S. Sinha,
Phys. Rev. E. 65, 036227-1 to 9 (2002).
3. Global and Local Control of Spatiotemporal Chaos in Coupled Map Lattices,
Nita Parekh, S. Parthasarathy and S. Sinha, Phys. Rev. Lett. 81, 1401 (1998).
Associated people:
Sunaina K. (MSIT 2002), Rishi Arvind (M.Tech 2004), Snehansu Ghosh (M.Tech 2004),
Nita Parekh, Somdatta Sinha (CCMB)