PhD Seminar: Lingnan Yuan
Title: Improving RNA-Seq Transcript Quantification
Abstract: RNA-Seq is a deep sequencing technique used to analyze the transcriptome, the collection of messenger RNA (mRNA) molecules in a cell or population of cells. Many existing tools use the expectation-maximization (EM) algorithm for transcript quantification, and we aim to improve the performance of these tools in several respects. In the first part, we incorporate three EM acceleration methods (Anderson acceleration, SQUAREM, and quasi-Newton methods) into Salmon, one of the most popular transcript quantification tools. The accelerated EM algorithms speed up the original, slow EM algorithm without loss of accuracy, and their performance is consistent across different initial points and levels of dataset complexity. In the second part, we focus on the sparsity of bulk and, especially, single-cell RNA-Seq data. A penalty function is added to the complete-data log-likelihood to encourage both shrinkage and parsimony in the estimated transcript abundances. Using precision-recall curves, we compare the penalized EM algorithm with the original EM algorithm in its ability to identify truly absent transcripts and to capture expressed ones.