Detection of Rare and Faint Signals on High-Dimensional Count Distribution
Date: Wednesday, June 25
Time: 8:00 am -- 9:00 am
Place: 2113 Snedecor Hall
Speaker: Yumou Qiu
Abstract:
The problem of detecting rare and faint signals in high-dimensional count data arises in many scientific studies, for example in the analysis of differentially expressed genes across biological conditions based on RNA-seq data. In this paper, we consider the signal detection problem under generalized linear models (GLMs) and their extensions, which include the linear model as a special case. Based on the maximum likelihood estimators (MLEs), a thresholding statistic with a single threshold level is proposed to test for the existence of differentially expressed response variables under alternatives with rare and faint signals. A Cramér-type moderate deviation result for multi-dimensional MLEs with non-identically distributed data is derived, which is a prerequisite for studying the properties of the thresholding test statistic. For the case of linear regression, the detection boundary is obtained, and the proposed thresholding test is shown to attain this boundary. A multi-threshold test is then constructed by maximizing the standardized thresholding statistics over a set of thresholds. Extensions to generalized linear mixed models are made, where Gaussian quadrature and data cloning are used to approximate the MLEs of such models. Numerical simulations and a case study on maize RNA-seq data are conducted to confirm and demonstrate the proposed testing approaches.
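To make the idea of a thresholding statistic and its multi-threshold extension concrete, here is a minimal sketch in Python. It is not the speaker's construction. It assumes each response variable has already been reduced to a standardized estimate z_j (for instance an MLE divided by its standard error) that is independent and approximately N(0, 1) under the null. The function names, the use of z_j^2 as the summand, and the threshold grid are all illustrative assumptions.

```python
import math


def _upper_tail(t):
    """P(Z > t) for a standard normal Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def _phi(t):
    """Standard normal density at t."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def null_moments(s):
    """Mean and variance of Z^2 * 1{Z^2 > s} for Z ~ N(0, 1), s >= 0."""
    t = math.sqrt(s)
    m1 = 2.0 * (t * _phi(t) + _upper_tail(t))
    m2 = 2.0 * ((t ** 3 + 3.0 * t) * _phi(t) + 3.0 * _upper_tail(t))
    return m1, m2 - m1 ** 2


def standardized_threshold_stat(z, s):
    """Sum of z_j^2 over coordinates with z_j^2 > s, centered and scaled
    by its mean and standard deviation under the assumed N(0, 1) null."""
    p = len(z)
    t_stat = sum(v * v for v in z if v * v > s)
    mean, var = null_moments(s)
    return (t_stat - p * mean) / math.sqrt(p * var)


def multi_threshold_stat(z, thresholds):
    """Maximum of the standardized thresholding statistics over a
    user-supplied set of thresholds (the grid is an assumption here)."""
    return max(standardized_threshold_stat(z, s) for s in thresholds)
```

Under these assumptions, a large positive value of either statistic is evidence against the null. A higher threshold discards more noise coordinates, which helps when signals are rare, while maximizing over several thresholds avoids committing to a single level in advance. Calibrating the maximized statistic (its null distribution) is a separate step not covered by this sketch.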