PhD Seminar: Zhili Qiao, Some Clustering Methods for Omics Data
Speaker: Zhili Qiao, PhD Candidate, Department of Statistics, Iowa State University
Title: Some Clustering Methods for Omics Data
Abstract: The advent of high-throughput sequencing technologies has significantly facilitated omics research over the past two decades, generating a vast volume of data. Unsupervised learning techniques such as cluster analysis serve as powerful tools for uncovering hidden patterns and associations in omics data. However, the intricate natural structure and unique characteristics of omics data often make generic clustering methods inefficient or unsuitable.
To address these issues, we have developed several clustering and biclustering algorithms specifically tailored for omics data, each addressing a different aspect of the problem. In this talk, we propose an over-clustering and merging framework for clustering non-convex, non-spherical biological data. The idea originates from the fact that K-means and its extensions perform poorly when the true feature classes are highly imbalanced. We develop a novel K-means merging algorithm that uses a similarity measure based on log-concavity. Operating under minimal local distributional assumptions, our algorithm is strongly consistent, converges rapidly, and outperforms competing methods on a variety of simulated and real benchmark datasets.
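
For readers unfamiliar with the over-clustering and merging idea, the following is a minimal illustrative sketch, not the algorithm presented in the talk. It runs plain K-means with deliberately too many clusters and then merges clusters whose closest members lie within a small distance of each other. That nearest-gap rule is an assumed stand-in for the log-concavity-based similarity measure, which the abstract does not specify. The function names, the toy data, and the `max_gap` parameter are all hypothetical.

```python
from itertools import combinations
from math import dist


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with a deterministic, evenly spaced initialization."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center (ties go to the lower index).
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each center as the mean of its current members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def merge_clusters(points, labels, max_gap):
    """Merge clusters whose closest members are within max_gap of each other.

    Assumption: this nearest-gap rule is only a placeholder for the
    log-concavity-based similarity described in the abstract.
    """
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    parent = {l: l for l in groups}

    def find(l):
        # Follow parent links to the representative of the merged group.
        while parent[l] != l:
            l = parent[l]
        return l

    for a, b in combinations(groups, 2):
        gap = min(dist(p, q) for p in groups[a] for q in groups[b])
        if gap <= max_gap:
            parent[find(a)] = find(b)
    return [find(l) for l in labels]


# Toy data: two parallel, elongated rows of points (non-spherical clusters).
points = [(x, 0) for x in range(20)] + [(x, 6) for x in range(20)]
over = kmeans(points, k=8)  # deliberately over-cluster
merged = merge_clusters(points, over, max_gap=1.5)
```

On this toy data, the eight small K-means clusters each stay within a single row, and the merging step joins them back into one group per row.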