A collection of projects completed as part of Udacity's Data Analyst Nanodegree

justinrgarrard/NanodegreePortfolio

Nanodegree Portfolio

A collection of projects completed as part of Udacity's Data Analyst Nanodegree.

  1. Jupyter Data Analysis
  2. R Exploratory Data Analysis
  3. Tableau Data Visualization
  4. SQL Data Wrangling
  5. Inferential Statistics
  6. Scikit-Learn Machine Learning

Jupyter Data Analysis

Baseball Statistics

This project used Sean Lahman's Major League Baseball data set to investigate whether the overall skill level of professional baseball players had improved. The inquiry was limited to the 1955 through 2017 seasons and emphasized batter ability (measured with On-Base plus Slugging) and pitcher ability (measured with Fielding Independent Pitching).

Highlights

  • No trend, positive or negative, was observed in player ability.

  • Environmental factors (such as changes to the strike zone) accounted for much more of the variability in the statistics than player ability did.

  • Uses Python, matplotlib, pandas, and numpy.
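
The two ability measures named above can be sketched in plain Python. This is a minimal illustration using the standard public formulas for On-Base plus Slugging (OPS) and Fielding Independent Pitching (FIP), not necessarily the exact code used in the project. The function names, parameter names, and the default FIP constant of 3.10 are assumptions made for the example; the real constant varies by season and league.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = ab + bb + hbp + sf
    return (h + bb + hbp) / denominator if denominator else 0.0


def slugging_percentage(h, doubles, triples, hr, ab):
    """SLG = total bases / at-bats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab if ab else 0.0


def ops(h, doubles, triples, hr, bb, hbp, ab, sf):
    """On-Base plus Slugging: the sum of OBP and SLG."""
    return (on_base_percentage(h, bb, hbp, ab, sf)
            + slugging_percentage(h, doubles, triples, hr, ab))


def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding Independent Pitching.

    FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant.
    The default constant is a placeholder assumption, not a value
    taken from the project.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


if __name__ == "__main__":
    # Hypothetical season lines, invented purely for illustration.
    print(round(ops(h=150, doubles=30, triples=5, hr=20,
                    bb=50, hbp=5, ab=500, sf=5), 3))
    print(round(fip(hr=20, bb=50, hbp=5, so=200, ip=200), 3))
```

Both measures depend only on outcomes the batter or pitcher largely controls, which is presumably why they suit a comparison across eras.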

R Exploratory Data Analysis

U.S. College Statistics

This project investigated a few key variables from College Scorecard, a dataset created by the U.S. Department of Education to evaluate universities across the nation. An emphasis was placed on four-year universities with variables related to admissions, finances, and location.

Highlights

  • There appears to be a noticeable trend relating tuition and five-year completion rates.

  • There is also a distinct correlation between funding type (public, non-profit, for-profit) and completion rate.

  • Uses R and ggplot2.

Tableau Data Visualization

U.S. College Statistics

This project focused specifically on for-profit universities. Unlike the R data exploration project (which used the same dataset), this project analyzed data across many years.

https://public.tableau.com/views/NanodegreeDataVisProjectII/For-ProfitUniversityConcerns?:embed=y&:display_count=yes

Highlights

  • A Tableau Story which details some of the concerns surrounding for-profit universities.

  • Multiple interactive charts that can be filtered by year.

SQL Data Wrangling

OpenStreetMap Southwest Idaho

This project attempted to clean and organize a set of geographical data for Southwest Idaho.

Highlights

  • Conversions between XML, CSV, and SQL data.

  • SQL queries and simple regular expressions.
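
The XML to CSV to SQL flow described above could look roughly like the following sketch, which uses only the Python standard library. The sample XML, the abbreviation mapping, the table name `node_tags`, and all function names are hypothetical and chosen for illustration; the project's actual schema and cleaning rules are not described in this README.

```python
import csv
import io
import re
import sqlite3
import xml.etree.ElementTree as ET

# Tiny made-up OSM-style sample; not taken from the project's data.
SAMPLE_OSM = """<osm>
  <node id="1" lat="43.615" lon="-116.202">
    <tag k="addr:street" v="W Main St."/>
  </node>
  <node id="2" lat="43.618" lon="-116.215">
    <tag k="addr:street" v="N 9th Ave"/>
  </node>
</osm>"""

# Hypothetical cleaning rule: expand abbreviated street types.
ABBREVIATIONS = {"St": "Street", "Ave": "Avenue"}
STREET_TYPE_RE = re.compile(r"\b(\S+?)\.?$")  # last word, optional period


def clean_street(name):
    """Expand a trailing abbreviation such as 'St.' to 'Street'."""
    match = STREET_TYPE_RE.search(name)
    if match and match.group(1) in ABBREVIATIONS:
        return name[:match.start()] + ABBREVIATIONS[match.group(1)]
    return name


def osm_to_rows(xml_text):
    """Flatten each node's <tag> children into dict rows."""
    rows = []
    for node in ET.fromstring(xml_text).iter("node"):
        for tag in node.iter("tag"):
            value = tag.get("v")
            if tag.get("k") == "addr:street":
                value = clean_street(value)
            rows.append({"node_id": node.get("id"),
                         "key": tag.get("k"),
                         "value": value})
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["node_id", "key", "value"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_sqlite(csv_text):
    """Load the CSV text into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_tags (node_id INTEGER, key TEXT, value TEXT)")
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO node_tags VALUES (:node_id, :key, :value)", reader)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = csv_to_sqlite(rows_to_csv(osm_to_rows(SAMPLE_OSM)))
    for row in conn.execute(
            "SELECT node_id, value FROM node_tags ORDER BY node_id"):
        print(row)
```

Cleaning values before they reach the CSV stage keeps the SQL side simple: the database only ever sees normalized data, so queries do not need to account for spelling variants.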

Inferential Statistics

Provided Stroop Effect Data

This project made use of descriptive and inferential statistics to analyze the significance of the Stroop Effect for a given set of data.

Highlights

  • Formal report of statistical significance written in LaTeX.

  • Histograms generated with RStudio.

  • Data analyzed with Google Spreadsheets.
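
As a rough sketch of the inferential step, the following computes a paired (dependent-samples) t statistic in Python. It assumes the common Stroop design in which each participant is timed under both a congruent and an incongruent condition; the README does not state which test the report used, and the original analysis was done in Google Spreadsheets, so the function name and the sample values here are illustrative only.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the per-participant
    differences second - first and stdev is the sample standard deviation.
    """
    if len(first) != len(second):
        raise ValueError("samples must have the same length")
    if len(first) < 2:
        raise ValueError("at least two pairs are required")
    differences = [b - a for a, b in zip(first, second)]
    sd_diff = statistics.stdev(differences)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(len(differences))
    return statistics.mean(differences) / standard_error, len(differences) - 1


if __name__ == "__main__":
    # Made-up completion times in seconds, not the project's data.
    congruent = [12.1, 14.3, 10.8, 13.5, 11.9]
    incongruent = [19.4, 21.0, 17.2, 22.8, 18.5]
    t_value, dof = paired_t_statistic(congruent, incongruent)
    print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The resulting t value would then be compared against the critical value for the chosen significance level and degrees of freedom.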

Scikit-Learn Machine Learning

Enron Data

This project scanned a pool of Enron email data for patterns, then built a classifier to identify persons likely to have been involved in illicit activities.

Highlights

  • Multiple algorithms used with parameter tuning.

  • Charts illustrating the efficacy of particular features.

  • A writeup detailing the forms of assessment used (accuracy, precision, recall, F1).
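
The four forms of assessment listed above can be computed from the counts of true and false positives and negatives. The sketch below is written in plain Python for clarity, with a hypothetical function name and made-up labels; the project itself used scikit-learn, whose own metric functions would normally be preferred.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (0/1)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a ratio is undefined (no predicted or actual
    # positives); this convention is a choice made for the example.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels: 1 marks a flagged person, 0 everyone else.
    y_true = [1] + [0] * 9
    always_negative = [0] * 10
    print(classification_metrics(y_true, always_negative))
```

The usage example shows why accuracy alone can mislead when the positive class is rare: a classifier that never flags anyone still scores 0.9 accuracy here while its recall is 0.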
