A collection of projects completed for STAT courses.
In the Spring 2021 semester, the final project for my Python in Data Science Programming
course (STAT 430) required students to produce various visualizations using MLB's Statcast
baseball data. For my project, I used the pybaseball
package to retrieve individual pitch records for a given range of seasons (in my case, 2016-2020) and plotly
to build an app that takes user-specified criteria and outputs the corresponding visualizations.
In the Fall 2020 semester, a few classmates and I in STAT 425 (Applied Regression and Design) explored robust regression methods and how they could improve our ability to draw conclusions from data sets that violate homoscedasticity and contain high-leverage observations. We used a wage data set of 2013 Los Angeles public workers, with position-based, department-based, salary-based, and benefits-based statistics, to determine the influence of outliers on a model's predictive capabilities. We concluded that robust regression models were superior to their linear regression counterparts in the situations we were presented with (see the violations above); however, because robust regression is a relatively new method in a relatively new field of science, I would recommend taking our paper with a grain of salt, as statisticians and others continue to explore its proper applications.
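To illustrate the general idea of robust regression (this is not the code or data from our paper), here is a minimal sketch that compares an ordinary least-squares line with a Huber M-estimate fitted by iteratively reweighted least squares. The data are synthetic and made up for this example, the function names are hypothetical, and the tuning constant 1.345 and the scale estimate (median absolute residual divided by 0.6745) are common defaults that I am assuming here rather than choices taken from the project.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def huber_line_fit(x, y, k=1.345, iterations=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    # Start from the ordinary least-squares fit (all weights equal to 1).
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(iterations):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate: median absolute residual / 0.6745.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break  # at least half the points are fitted exactly
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]
        a, b = weighted_line_fit(x, y, weights)
    return a, b


# Synthetic data: y = 2 + 3x plus small noise, with one gross outlier
# placed at the largest x value (a high-leverage observation).
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, 0.0]
y = [2 + 3 * xi + e for xi, e in zip(x, noise)]
y[-1] += 40  # contaminate the last observation

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_line_fit(x, y)

print(f"OLS slope:   {ols_slope:.3f}")
print(f"Huber slope: {huber_slope:.3f}  (true slope is 3)")
```

In this toy setup the single contaminated point pulls the least-squares slope well away from 3, while the reweighting step steadily reduces that point's influence. Huber weighting alone is not guaranteed to handle every high-leverage configuration, so this should be read as a demonstration of the mechanism rather than a general recommendation.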
This project was a collaborative effort between a few classmates and me in my STAT 385 course (Statistical Programming). We were tasked with providing COVID-19 information that a typical member of the University of Illinois community might want to see. Although the resulting app is not the most visually appealing (we were given a relatively short timeline), we all enjoyed designing a user interface that aggregates and outputs relevant information for a user to interact with.