2024 Statistics REU: Application Deadline is March 22

Apply to become an undergraduate intern for the Iowa State University Department of Statistics 2024 REU summer program. 


2024 Research Experience for Undergraduate Students Program Information

Program Dates: June 3-August 1, 2024

Application Deadline: March 22, 2024

Apply Online: https://bit.ly/3Szgcdh 

Compensation: All admitted students receive a weekly stipend of $550 plus an allowance for travel, housing, and meals.

Contact: Anthony Greiter, learning and development specialist, agreiter@iastate.edu

The Iowa State University Department of Statistics is accepting applications for its summer research experience for undergraduate students (REU). The 2024 program will provide students with the opportunity to conduct hands-on research using their computational and statistical skills.

During the nine-week immersive internship program, students will work closely with peer mentors, faculty members, and graduate students on current research projects. They will gain valuable research experience, leadership skills, and a deeper understanding of statistics.

The Statistics REU program will begin on June 3 and continue through August 1. Students are expected to participate in program activities for approximately 40 hours each week and join in team meetings and other scheduled events. Students will present their research findings at the end-of-program poster session.


Application Process

Students from underrepresented groups are strongly encouraged to apply. Preference will be given to students who have completed their sophomore year in an undergraduate degree program. Students from all higher education institutions may apply.


Applications for the 2024 Statistics REU will be accepted until all available spaces have been filled. To guarantee an application is reviewed, please apply by March 22 at https://bit.ly/3Szgcdh. If students are selected for the program, they will receive notification by April 5.


2024 Research Projects

REU students may elect to work on one of the following projects:


Project 1: Comparing Small Area Estimators

Large-scale surveys play a crucial role in federal statistical systems around the world. These surveys are used to collect information related to diverse application areas, including health, education, economics, crime, and agriculture. Surveys are often designed with the aim of producing estimates for large regions, such as a nation. In these large regions, sample sizes are large enough that the estimates are considered reliable. However, estimates for large regions are often insufficient for data users who require estimates for small regions, such as states, for research or policy purposes.

A classic example of this problem occurs in the context of the Bureau of Justice Statistics’ National Crime Victimization Survey. This survey is designed to produce reliable estimates of national-level crime victimization rates. Data users, however, have requested estimates for states and even metropolitan statistical areas. A challenge arises because the sample sizes in these small estimation domains are often small. Therefore, standard survey estimators are unreliable, and other estimation techniques are needed.

A widely adopted solution to this problem is to use a technique called “small area estimation.” In small area estimation, statistical models are used to obtain estimates for domains with small sample sizes. The main class of statistical models used for small area estimation is random effects models. These models incorporate extra information through model covariates and through the assumption that the random effects for different areas share common distributional properties. In this summer program, the students will first learn about small area estimation models. They will then compare alternative small area models using real and simulated data.
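For readers curious about what a small area estimator can look like, the following is a minimal sketch in Python and is not the project's actual method. It assumes a simple area-level random effects setup in which the regression line and the model variance are known (in practice both would be estimated), and all names and numbers in it are hypothetical. Each area's direct survey estimate is shrunk toward a model-based (synthetic) value, with more shrinkage for areas whose sampling variance is large.

```python
import random


def composite_estimates(direct, sampling_var, synthetic, model_var):
    """Shrink each direct estimate toward its model-based (synthetic) value.

    The weight on the direct estimate is model_var / (model_var + sampling_var),
    so noisy areas (large sampling variance) lean more on the model.
    """
    out = []
    for y, d, s in zip(direct, sampling_var, synthetic):
        gamma = model_var / (model_var + d)
        out.append(gamma * y + (1.0 - gamma) * s)
    return out


def simulate(n_areas=200, model_var=1.0, seed=0):
    """Simulate areas and return (MSE of direct, MSE of composite) estimates."""
    rng = random.Random(seed)
    truth, direct, svar, synth = [], [], [], []
    for _ in range(n_areas):
        x = rng.uniform(0.0, 10.0)            # hypothetical area-level covariate
        s = 2.0 + 0.5 * x                     # assumed known regression line
        theta = s + rng.gauss(0.0, model_var ** 0.5)  # true area mean
        d = rng.choice([0.5, 2.0, 8.0])       # sampling variance of the area
        y = theta + rng.gauss(0.0, d ** 0.5)  # direct survey estimate
        truth.append(theta)
        direct.append(y)
        svar.append(d)
        synth.append(s)
    comp = composite_estimates(direct, svar, synth, model_var)

    def mse(estimates):
        return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / n_areas

    return mse(direct), mse(comp)


if __name__ == "__main__":
    mse_direct, mse_comp = simulate()
    print(f"MSE of direct estimates:    {mse_direct:.3f}")
    print(f"MSE of composite estimates: {mse_comp:.3f}")
```

In this simulated setting the composite estimates should have a smaller mean squared error than the direct estimates, which is the kind of comparison students would carry out with more realistic models and data.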


Project 2: Detecting the Cause of Disease When it is a Fast-Moving Pathogen

Highly variable viruses, like SARS-CoV-2 (the virus that causes COVID-19), evolve so fast that they "outrun" diagnostic tests designed to detect current variants of the virus. An alternative approach is to simply sequence the virus (determine the sequence of A, C, G, and T in its genome) and diagnose a pathogen when the match to current strains is good enough. There are multiple available research tasks, from estimating the infecting pathogen's genome to testing whether the estimate is an instance of the pathogen of concern. Students will use statistics, computing, and deep learning (in one case) to conduct the research.


Project 3: Modeling Tropical Cyclone Tracks and Intensity Using Historical Storm Data

Tropical cyclones are among the most destructive natural phenomena and can cause severe damage to life and property. Much of our understanding of tropical cyclones is derived from data. IBTrACS (International Best Track Archive for Climate Stewardship, https://www.ncdc.noaa.gov/ibtracs/) provides global tropical cyclone best track data with several climate variables such as maximum sustained wind speed and minimum central pressure. These variables have a spatial resolution of about 10 km and a temporal resolution of 6 hours. The main objective of this project is to use spatial statistics methods to model hurricane tracks and intensity using the IBTrACS data. The preliminary plan is to perform exploratory data analysis to visualize hurricane tracks and intensity using the statistical software R and the packages ggplot2 and sf. Students will then use space-time process models such as point processes to estimate hurricane tracks and intensity. Finally, students will perform a validation study using recent storms such as Hurricane Franklin (2023).


Project 4: Estimating the Probability of True “Hits” when Searching Databases of Bullet and Cartridge Case Images

A crime has been committed, and firearms examiners have recovered spent bullets and cartridge cases from the scene. To generate investigative leads, examiners compare images of those bullets and cartridge cases to thousands of images that other examiners have uploaded to a database. The hope is to get “a hit,” that is, to find one or more images in the database that look like the crime scene samples. The only database available at this time is called NIBIN and is proprietary, as is the algorithm that produces a similarity score between the query sample and the images in the database. Consequently, there is no way to know how often a search fails to find a real match or the probability that a search results in false hits.

Using our images and algorithms developed by researchers at Iowa State University, we will assemble a small database and mimic what examiners do when they query NIBIN. To explore the probability of false positives, we will carry out searches when true matches are not included in the database. Students will learn to use R for the statistical calculations and to operate instruments that produce high-resolution images of the surface of bullets and cartridge cases.
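As a rough illustration of the false-positive question, the sketch below simulates database searches in which no true match is present, so that any reported hit is a false hit. It is written in Python for brevity (the project itself will use R) and does not use the NIBIN or Iowa State algorithms. The similarity-score distribution (a Beta(2, 5) draw for each non-matching image), the hit threshold, and the function name are all hypothetical assumptions made for this example.

```python
import random


def false_hit_rate(db_size, threshold, n_searches=1000, seed=1):
    """Estimate how often a search of a database with NO true match
    still reports a hit, i.e. the top similarity score meets the threshold.

    Non-match scores are drawn from Beta(2, 5); this is an assumed,
    purely illustrative distribution, not one taken from real data.
    """
    rng = random.Random(seed)
    false_hits = 0
    for _ in range(n_searches):
        # Score the query against every (non-matching) image in the database.
        top_score = max(rng.betavariate(2, 5) for _ in range(db_size))
        if top_score >= threshold:
            false_hits += 1
    return false_hits / n_searches


if __name__ == "__main__":
    for size in (50, 500):
        rate = false_hit_rate(db_size=size, threshold=0.8)
        print(f"database size {size}: estimated false hit rate {rate:.3f}")
```

Under these assumptions, larger databases produce false hits more often at a fixed threshold, because there are more chances for a non-matching image to score highly.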