Department of Statistics
Oklahoma State University
A Powerful Statistical Method for Identifying Genes under Positive Selection from the Human Genome
Genes that produce characteristics more favorable in a particular environment, and that therefore become more abundant in the next generation, are said to be under positive selection. Genes under positive selection tend to evolve rapidly and are believed to play important roles in functions such as immune defense. Methods available for testing neutrality based on the site-frequency spectrum (SFS), a vector summarized from single nucleotide polymorphism (SNP) data, usually assume independence among the components of the SFS. However, correlations due to linkage or selection may significantly reduce the power of hypothesis tests. Here we present a new method, the Poisson pairwise difference method, for detecting signals of positive selection from an arbitrarily correlated SFS. We show through both simulation and real data analysis that the performance of this method is quite stable at any level of correlation among the elements of the SFS. We demonstrate that the log-likelihood ratio test derived from this approach has excellent power to detect positive selection. We applied this method to 162 genes from SeattleSNPs (23 European Americans and 24 African Americans) and identified 32 genes that showed significant evidence of positive selection. This result is about 75% consistent with earlier studies in the literature.
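For readers unfamiliar with the SFS, the following is a minimal sketch of how an unfolded site-frequency spectrum can be tabulated from SNP data. It is illustrative only and is not the Poisson pairwise difference method described in the abstract. The function name `site_frequency_spectrum`, the 0/1 encoding (1 = derived allele), and the toy data are assumptions made for this example.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS for a list of equal-length 0/1 haplotypes.

    Assumed encoding (hypothetical, for illustration): each row is one
    sampled chromosome, each column is one SNP site, 1 = derived allele.
    Entry i-1 of the result is the number of sites at which the derived
    allele appears in exactly i of the n sampled chromosomes, i = 1..n-1.
    """
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    # Count derived alleles per site (column sums), then tally those counts.
    derived_counts = Counter(
        sum(row[j] for row in haplotypes) for j in range(num_sites)
    )
    return [derived_counts.get(i, 0) for i in range(1, n)]


# Toy data: 4 chromosomes, 4 SNP sites (made up for this sketch).
toy = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
]
sfs = site_frequency_spectrum(toy)
print(sfs)
```

Because sites on the same chromosome share genealogical history, the entries of this vector are generally correlated, which is the dependence the abstract refers to.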
Refreshments at 3:45pm in Snedecor 2101.