Searching for genes and reconstructing the past: Messy solutions to good
problems and good solutions to messy problems
Junhyong Kim
Department of Ecology and Evolutionary Biology
Yale University
Problems in computational biology and bioinformatics require flexible
approaches using a variety of quantitative tools and biological
intuition. At times the problems are well defined and we need to use
whatever tools are on hand, however inelegant they may be, to solve
them. I demonstrate an example in which I use a combination of moving
window profiles and non-parametric discriminant analysis to isolate
olfactory receptor genes from the Drosophila genome database. Other, more
mature problems in computational biology require more sophisticated
treatment. Estimation of evolutionary trees (tree-graph representations
of genealogical relationships) is a complex statistical problem. Here I
show how the problem can be viewed as a geometry problem in a vector
space of joint probability distributions. This geometric view offers a
more intuitive approach to the evolutionary tree estimation problem and
yields new theorems and algorithms.
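
As a rough illustration of the geometric view, and not the method presented in the talk, the sketch below treats the simplest possible case: two aligned DNA sequences. It assumes the Jukes-Cantor substitution model, which the abstract does not name. All function names are hypothetical. Each branch length t maps to a point in the 16-dimensional space of joint site-pattern probabilities, the observed pattern frequencies form another point in the same space, and estimation becomes a search for the model point closest to the data. Squared Euclidean distance and a grid search are used only for simplicity.

```python
import math
from itertools import product

BASES = "ACGT"
PATTERNS = list(product(BASES, repeat=2))  # the 16 two-taxon site patterns


def jc69_pattern_vector(t):
    """Joint site-pattern distribution for two taxa under Jukes-Cantor
    (an assumed model) with branch length t in expected substitutions per site."""
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay   # P(second base equals the first)
    p_diff = 0.25 - 0.25 * decay   # P(second base is one specific other base)
    # The first base is drawn from the uniform stationary distribution (1/4 each).
    return [0.25 * (p_same if a == b else p_diff) for a, b in PATTERNS]


def observed_pattern_vector(seq1, seq2):
    """Observed site-pattern frequencies for two aligned, gap-free sequences
    over the alphabet A, C, G, T."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    counts = {pattern: 0 for pattern in PATTERNS}
    for pair in zip(seq1, seq2):
        counts[pair] += 1
    return [counts[pattern] / len(seq1) for pattern in PATTERNS]


def squared_distance(u, v):
    """Squared Euclidean distance between two points in pattern space."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def closest_branch_length(observed, t_max=5.0, grid_size=2000):
    """Grid search for the branch length whose model point lies nearest
    to the observed point (an illustrative choice of distance and search)."""
    grid = [t_max * k / grid_size for k in range(1, grid_size + 1)]
    return min(grid, key=lambda t: squared_distance(observed, jc69_pattern_vector(t)))
```

As t varies, jc69_pattern_vector traces a one-dimensional curve inside the probability simplex. With more taxa, each tree topology yields its own family of such points, and comparing trees amounts to comparing how close each family comes to the observed frequencies.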