Seminars: Dept Seminar


A Separability Index for Clustering and Classification Problems

Date: Monday, September 28
Time: 4:10 pm -- 5:00 pm
Place: 3105 Snedecor
Speaker: Anna Peterson, Department of Statistics, ISU

Abstract:

We propose a separation index that captures the degree of difficulty in a clustering or classification problem where each observation is generated from one of K different p-variate Gaussian distributions. This index is motivated by the intuition that an observation from a Gaussian distribution should, in general, be closer to its own mean than to the mean of a different Gaussian distribution. We develop an algorithm that simulates data with a specified value of the index. Such data, generated at varying values of the index, can be used to compare clustering algorithms. We explore several theoretical properties of the index and compare the performance of well-known clustering techniques on simulated data generated using our algorithm.
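
The intuition behind the index can be illustrated with a small simulation. The sketch below is not the speaker's index or simulation algorithm; it only estimates how often a draw from one Gaussian falls closer to its own mean than to another mean. It assumes identity covariance and Euclidean distance, and the function name closer_to_own_mean_rate is hypothetical.

```python
import math
import random


def closer_to_own_mean_rate(mean_a, mean_b, n=2000, seed=0):
    """Estimate the fraction of draws from N(mean_a, I) that lie closer
    (in Euclidean distance) to mean_a than to mean_b.

    Assumption: identity covariance, chosen only to keep the sketch simple.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # Draw one p-variate observation with independent unit-variance coordinates.
        x = [rng.gauss(m, 1.0) for m in mean_a]
        if math.dist(x, mean_a) < math.dist(x, mean_b):
            hits += 1
    return hits / n


# Two means one unit apart (heavy overlap) versus four units apart (well separated).
near = closer_to_own_mean_rate([0.0, 0.0], [1.0, 0.0])
far = closer_to_own_mean_rate([0.0, 0.0], [4.0, 0.0])
print(f"means 1 apart: {near:.3f}   means 4 apart: {far:.3f}")
```

Under these assumptions the rate should rise as the means move apart, which matches the sense in which well-separated Gaussians make an easier clustering problem.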