Seminar Notice
Statistical Laboratory
Iowa State University

DATE AND TIME: Monday, September 8, 1997, 4:10 p.m.
PLACE: 319 Snedecor Hall
SPEAKER: Sanjib Basu
         Division of Statistics
         Northern Illinois University
TITLE: Estimation of Undetected Errors

ABSTRACT

An unknown number, N, of errors (defects, faults) exist in a piece of software, and k reviewers with unknown competencies inspect it in parallel to find them. Given the list of errors detected by each inspector, the problem is to estimate the number of undetected errors still remaining in the software. This is the well-known Binomial N estimation problem, but here we have parallel information from k reviewers. A similar problem arises in capture-recapture sampling, where we want to estimate the size of a closed biological population based on multiple trappings.

Our first approach to the undetected error problem will be through a series of interesting but simple probability calculations. These calculations establish that detection of errors is harder when the errors are uneven (non-identical) than when they are identical. However, it is better to have a few good and a few bad reviewers (i.e., non-identical reviewers) than all average reviewers. Also, after a while, increasing the number of reviewers may not result in any additional detection.

Our second approach will be more statistical in nature: we propose Bayesian estimation methods for N. We address the following typical features of software review in our estimation procedure: (i) reviewers often differ in their detection skills; (ii) discussion among the reviewers often induces non-independence; and (iii) some errors are easier to detect, others are harder. Several interesting phenomena occur. In the non-independent reviewer model, we start with 2k parameters, but it turns out that it is enough to specify prior distributions for only TWO of the 2k parameters.
For the heterogeneous errors case, we develop a flexible Bayesian model and demonstrate how Gibbs sampling can be used to routinely calculate the Bayesian estimator for N and other quantities of interest. We develop another Bayesian model that can simultaneously incorporate heterogeneity among the errors and among the reviewers. These models are illustrated on simulated data and on data from an actual Bell Laboratories review of the AT&T 5ESS telephone switch.

COFFEE: 3:45 p.m., 104 Snedecor Hall
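
The two qualitative claims in the abstract about uneven errors and non-identical reviewers can be checked with a small calculation. The sketch below is only an illustration under assumptions that are not taken from the talk: reviewers act independently, every detection probability is known, and the function names and numeric values are hypothetical. It is not the speaker's own calculation.

```python
from math import prod


def miss_probability(reviewer_probs):
    """Chance that one error escapes every reviewer.

    Assumes (illustration only) independent reviewers, where reviewer i
    detects the error with probability reviewer_probs[i].
    """
    return prod(1.0 - p for p in reviewer_probs)


def expected_missed_fraction(error_probs, k):
    """Average chance that an error escapes k identical, independent reviewers.

    Assumes (illustration only) that error j is detected by any single
    reviewer with probability error_probs[j].
    """
    return sum((1.0 - p) ** k for p in error_probs) / len(error_probs)


# Reviewers with the same average skill (0.5), identical errors.
average_team = [0.5, 0.5, 0.5, 0.5]
mixed_team = [0.9, 0.9, 0.1, 0.1]  # a few good and a few bad reviewers
miss_average = miss_probability(average_team)
miss_mixed = miss_probability(mixed_team)

# Errors with the same average detectability (0.5), four identical reviewers.
identical_errors = [0.5, 0.5]
uneven_errors = [0.9, 0.1]  # one easy error, one hard error
missed_identical = expected_missed_fraction(identical_errors, k=4)
missed_uneven = expected_missed_fraction(uneven_errors, k=4)

print(f"all-average team misses an error with prob. {miss_average:.4f}")
print(f"mixed team misses an error with prob.       {miss_mixed:.4f}")
print(f"identical errors, expected fraction missed: {missed_identical:.4f}")
print(f"uneven errors, expected fraction missed:    {missed_uneven:.4f}")
```

With these made-up numbers the mixed team misses an error far less often than the all-average team (roughly 0.008 versus 0.0625), while uneven errors leave a much larger expected fraction undetected than identical ones (roughly 0.33 versus 0.0625). Both directions agree with the abstract's statements.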