Dept. Seminar - Karen Nielsen
Karen Nielsen
University of Michigan
Statistical Tools for Exploring and Testing Features of Waveform Data
As mobile fitness trackers, wearable technologies, physiological sensors, and smartphones grow in popularity, so does the need for analytic tools for the waveform data that result from time-intensive longitudinal designs. Existing methods reduce these complex data to a single number and then treat that number as an observed value, ignoring both the context it came from and the unstated assumptions that the summary process imposes on the shape of the data. This may lead to incorrect conclusions or to a failure to answer the intended research question.
Using brain Event-Related Potentials (ERPs) as an example, I show how analytic work on distributional theory and simulation studies together allow for in-depth exploration of competing metrics for component amplitude. Metrics are discussed and assessed in terms of false positives, false negatives, and coverage probability for simulated ERP studies. I then build on this work by treating the simulation data-generating model as an analysis framework that can incorporate the substantive expectations researchers have about the shape of individuals' waveforms. By combining multilevel mixed effects models with basis functions traditionally used in functional data analysis, I can test hypotheses about landmark characteristics of the waveform across all levels of the data collection hierarchy: temporal patterns, patterns across channels, individual differences, and differences across experimental conditions. This flexible framework can be implemented using existing statistical software and promotes reproducible research by removing some of the more subjective steps from the analytic pipeline.
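As a rough illustration of how a simulation study can compare amplitude metrics, the sketch below generates noisy trials and summarizes the averaged waveform in two ways. Everything in it is an assumption made for illustration and is not taken from the talk: the Gaussian-shaped component, the white-noise model, the 0.2 to 0.4 s window, and the choice of peak amplitude and mean amplitude as the two competing metrics.

```python
import numpy as np

def simulate_trials(n_trials, amplitude, noise_sd, rng, n_samples=200):
    """Hypothetical data-generating model: Gaussian-shaped component plus white noise."""
    t = np.linspace(0.0, 0.8, n_samples)  # seconds after stimulus onset
    component = amplitude * np.exp(-0.5 * ((t - 0.3) / 0.05) ** 2)
    noise = rng.normal(0.0, noise_sd, size=(n_trials, n_samples))
    return t, component + noise  # component is broadcast across trials

def peak_amplitude(t, avg, window=(0.2, 0.4)):
    """Largest value of the averaged waveform inside the window."""
    mask = (t >= window[0]) & (t <= window[1])
    return avg[mask].max()

def mean_amplitude(t, avg, window=(0.2, 0.4)):
    """Average value of the averaged waveform inside the window."""
    mask = (t >= window[0]) & (t <= window[1])
    return avg[mask].mean()

def compare_metrics(n_sims=500, n_trials=30, amplitude=0.0, noise_sd=5.0, seed=0):
    """Return the average (metric - true amplitude) for each metric over n_sims studies."""
    rng = np.random.default_rng(seed)
    peaks = np.empty(n_sims)
    means = np.empty(n_sims)
    for i in range(n_sims):
        t, trials = simulate_trials(n_trials, amplitude, noise_sd, rng)
        avg = trials.mean(axis=0)  # average waveform across trials
        peaks[i] = peak_amplitude(t, avg)
        means[i] = mean_amplitude(t, avg)
    # Bias is only meaningful here for amplitude=0.0, where the true value of
    # both metrics is zero; for other amplitudes the two targets differ.
    return peaks.mean() - amplitude, means.mean() - amplitude

if __name__ == "__main__":
    peak_bias, mean_bias = compare_metrics()
    print(f"peak metric bias: {peak_bias:.3f}")
    print(f"mean metric bias: {mean_bias:.3f}")
```

With no true component (amplitude=0.0), the peak metric tends to sit well above zero because it picks out the largest noise value in the window, while the mean metric averages out close to zero. The same loop could be extended to tally false positives, false negatives, or interval coverage.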
An additional example using GPS data underscores how exploring further aspects of a curve fitted to data can provide new insight. Lifespace is the geographic area occupied by a person. I propose deviation from typical path as a new dimension of lifespace, one that can be quantified as the errors produced when fitting a principal curve to location data. This measure captures finer detail than existing lifespace measures, does not rely on self-report, and can be used to explore both intra- and inter-individual variability.
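The idea of measuring deviation as fitting error can be sketched in a deliberately simplified form. The example below does not fit a principal curve; it substitutes the first principal component line, which is the straight-line special case, and reports each point's orthogonal distance from it. The function name and the assumption that coordinates are already projected to a planar system (for example, meters east and north) are hypothetical choices for illustration.

```python
import numpy as np

def deviation_from_typical_line(points):
    """Orthogonal distance of each 2-D point from the first principal component line.

    Simplified stand-in for a principal curve: the "typical path" here is the
    straight line through the centroid along the direction of greatest variance.
    Assumes points are planar coordinates, not raw latitude/longitude.
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    # Split each point into its part along the line and its part off the line.
    along = centered @ direction
    off_line = centered - np.outer(along, direction)
    return np.linalg.norm(off_line, axis=1)

if __name__ == "__main__":
    # Hypothetical locations: mostly along one road, with two side trips.
    locations = [(-2, 0), (-1, 0), (0, 1), (0, -1), (1, 0), (2, 0)]
    print(deviation_from_typical_line(locations))
```

Summaries of these distances, such as their mean or spread within a person over time or across people, would be one way to look at intra- and inter-individual variability. A true principal curve would bend to follow the data and so yield smaller errors along curved routes.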