• Chico State Chemistry Faculty: Erik Wasinger, Lisa Kendhammer and Tiffany Anderson
  • Data Owner: Tom Rosenow, Institutional Research
  • SLC Coordinator: Dawn Frank

Data Science Student Team (DSST)

  • Joseph Shifman: Oversight
  • Rica Rebusit: Taskmaster
  • Faith Fatchen: Liaison
  • Skip Moses: Time keeper


It is the top mission of the university to retain and graduate students who are well prepared for careers or graduate school. There has been a major shift in the student body with regards to demographics and preparation in the past 10 or so years. Now more than ever the faces and experiences of the faculty do not match that of their students. Chico State has been struggling with a low 4-year graduation rate, and a high first year drop out rate for first time freshmen. Large CSU system wide initiatives such as the Graduation Initiative 2025 (GI2025) have been formed to research what works and implement innovative changes to remove barriers that impede student success and be more strategic about the way campus serves students.

One approach being taken by Chico State is to implement a student learning program called Supplemental Instruction (SI). This program that has shown to be effective in published research but has yet to be evaluated on our campus.

Project Goals

Assist the clients in answering the following question:

What impact does Supplemental Instruction have on student success at Chico State?

This question will be explored through the full data science lifecycle from problem definition, data collection and exploration, visualization, modeling (both for understand relationships and to make predictions), creating recommendations and incorporating feedback.

Timeline / Milestones

Details regarding deliverables to be added by DSST

Timeline / Milestones

Details regarding deliverables to be added by DSST

In Progress

  • Clean up the github (readme.r, data_management.rmd, take out scratch files)
  • Contact Erik to clarify who to invite to stakeholders meeting, and clarify how much CEM stuff we should include.
  • Start pulling in poster figs and remarks into a powerpoint.


  • The poster draft is due next Tuesday, 04/26/2022
  • Getting these models completed need to be done ASAP
    • Examine a basic model of the form grades ~ attended SI at least once.
    • Logistic Regression model using different equity gaps to model probability of one year retention.
    • Explore a course level model explaining DWF rates and SI.
  • We need a table for comparing our sample to the population (2016-present)
  • Create a .rmd file generating all our pictures
    • Create some visualizations of the distribution of grades faceted on equity gaps
    • Create similar plot for
  • Exploration of CEM with S.I as a treatment and grade as outcome.
  • Prepare some slides for next share out 04/19/2022
  • Start working on Abstract (Due 04/18/2022)
  • The client will give a 15-20 minute project introduction at 9:30am on 2/10 in Butte 203
  • The Data Science Student Team (DSST) will complete Data Security and CITI training by 2/18/22
  • DSST will research the problem and generate a list of data sources needed to address the question.
  • Read/Summarize exact matching paper and 5-10 most recent works. Complete by March 10th.
  • Clean Student Profile, Student Program and Course Description data sets. Update code books.
  • Acquire SI data set.
  • Created grades for SI courses, clean student profile data, and filtered student profile data to students who have taken at least one SI course for EDA.
  • Aggregate the course level data (Doesn’t seem to be a fruitful approach. On hold for now).
  • Look at number of extra semesters taken to graduate. Add additional variables (Transfer student, first gen, major program.) Need to address how to handle pandemic/online semesters here. Pandemic seems to limit this approach.
  • Select a poster template by 03/27/2022.

Long term goals.

  • Completed!

Project Updates

Weekly updates done by DSST via PR


Today, we met for the first time. Roles were assigned, and we discussed what our first steps will be.

  • Next Steps:

    1. Complete security training.
    2. Download data.
    3. Start exploring and building a codebook.
  • Remarks:

    • Keep separate scratch files.
    • PR will be flexible.
    • Respond on discord by end of day.
    • Add data sets to gitignore file before commiting.


  • Milestones

  • Data codebooks complete

  • Started reviewing literature

  • Next steps

  • Exploratory data analysis

  • Discuss literature as group

  • Literature summary


  • Met with Erik to discuss first steps.

    • Still waiting on SI visit data.

    • Look at Fresno State paper by Hongtao Yue et al.

    • Look into clinical trial methodologies for techniques.

    • Explore GPA data and quantify some of the equity gaps.

    • Data Still needed:

      • SI visit level data
      • Student date (or) year of birth
      • Flag if student is Pell eligible (Probably can’t get)
      • Flag if student participated in upward bound
      • Flag if student lived in residence hall
      • Flag if student is enrolled in honors program
      • Flag if student is enrolled in EOP
      • Student permanent address zip (or) county