DATE & TIME:   Monday, April 12, 2004, 4:10 p.m.

LOCATION:   319 Snedecor Hall

SPEAKER:   Jerry Reiter, Duke University, Institute of Statistics and
                   Decision Sciences, Durham, NC

 

TITLE: Disclosure Limitation Via Multiply-Imputed, Synthetic Data Sets

 

ABSTRACT:

Several statistical agencies use, or are considering the use of, multiple
imputation to limit the risk of disclosing respondents' identities or
sensitive attributes in public-use data files.  For example, agencies can
release fully synthetic data sets, comprising random samples of units from
the sampling frame with simulated values of survey variables.
Alternatively, agencies can release partially synthetic data sets,
comprising the units originally surveyed with some collected values, such
as sensitive values at high risk of disclosure or values of key
identifiers, replaced with multiple imputations.  In this talk, I describe
these synthetic data approaches.  I also discuss methods, based on the
concepts of multiple imputation for missing data, for obtaining valid
inferences from the synthetic data sets.

COFFEE: 3:45 p.m., 104 Snedecor Hall