Seminar, Hengrui Cai, Towards Trustworthy Machine Learning: A Causal Lens on Learning Non-Spuriousness

Sep 9, 2024, 11:00 AM to 11:50 AM

Speaker: Hengrui Cai, Assistant Professor, University of California, Irvine

Title: Towards Trustworthy Machine Learning: A Causal Lens on Learning Non-Spuriousness

Abstract: The causal revolution has spurred interest in understanding complex relationships across various fields. Most existing methods aim to discover causal relationships among variables within a complex, large-scale system. However, in practice, only a small number of variables are relevant to the outcomes of interest. As a result, causal estimation using the full causal representation, especially with limited data, can produce many falsely discovered, spurious variables that are highly correlated with the target outcome but have no causal impact on it. We propose learning a class of necessary and sufficient causal graphs that contain only causally relevant variables, utilizing probabilities of causation. The framework is further extended to natural language processing models to disentangle the 'black box' by identifying true rationales when two or more snippets are highly inter-correlated and thus contribute similarly to prediction accuracy. We leverage two causal desiderata, non-spuriousness and efficiency, establishing their theoretical identification as the main component of learning necessary and sufficient rationales in language models. The superior performance of our proposed methods is demonstrated on real-world review and medical datasets through extensive experiments.