This repository highlights the projects I did as a student of the Applied Data Science Lab at WorldQuant University.

Rohit-Rannavre/Applied-Data-Science-Lab-2023

⭐ Applied Data Science Lab [2023]


The Applied Data Science Lab, offered by WorldQuant University, is an immersive online program that teaches practical skills for tackling complex, real-world problems. Over 16 weeks, participants complete a series of end-to-end data science projects and build proficiency in data wrangling, analysis, model building and communication through hands-on work.

During the program, I completed eight projects, each designed to reinforce fundamental data science concepts. Each project is summarized briefly below.


Analyzed a dataset comprising 21,000 properties to ascertain whether real estate prices are predominantly influenced by property size or location. The process involved importing and cleaning data from a CSV file, creating data visualizations and exploring the correlation between the two variables.


Developed a linear regression model to predict apartment prices in Argentina. Built a data pipeline to handle missing values and encode categorical features, then improved model performance by reducing overfitting.
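A pipeline of this kind might look roughly like the following scikit-learn sketch. It is illustrative only: the column names and data are invented, and ridge regularization is just one assumed way of reducing overfitting, not necessarily the one used in the project.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical apartment data with one missing surface value.
df = pd.DataFrame({
    "surface_m2": [40.0, 55.0, None, 80.0, 120.0, 65.0],
    "neighborhood": ["A", "B", "A", "C", "B", "C"],
    "price_usd": [80_000, 105_000, 95_000, 150_000, 210_000, 125_000],
})
X = df[["surface_m2", "neighborhood"]]
y = df["price_usd"]

# Impute missing numeric values with the mean; one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="mean"), ["surface_m2"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
])

# Ridge adds an L2 penalty on the coefficients (assumed regularization choice).
model = make_pipeline(preprocess, Ridge(alpha=1.0))
model.fit(X, y)
predictions = model.predict(X)
```

Keeping the imputer and encoder inside the pipeline means the same transformations are applied at training and prediction time.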


Built an ARMA time-series model for forecasting particulate matter levels in Kenya. Extracted data from a MongoDB database using pymongo and fine-tuned model performance through hyperparameter adjustments.


Constructed logistic regression and decision tree models for predicting earthquake damage to buildings. Extracted relevant data from a SQLite database and identified biases within the dataset that may contribute to discriminatory outcomes.
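The extraction step and a simple bias check could be sketched with the standard-library `sqlite3` module as below. The table names, column names, rows, and the "damage grade above 3 counts as severe" threshold are all hypothetical.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE building_structure (
    building_id INTEGER PRIMARY KEY, age INTEGER, foundation_type TEXT
);
CREATE TABLE building_damage (building_id INTEGER, damage_grade INTEGER);
INSERT INTO building_structure VALUES
    (1, 10, 'mud'), (2, 25, 'mud'), (3, 5, 'concrete'),
    (4, 40, 'mud'), (5, 15, 'concrete'), (6, 30, 'concrete');
INSERT INTO building_damage VALUES (1, 5), (2, 4), (3, 1), (4, 2), (5, 2), (6, 4);
""")

# Join features with a binary target (assumed threshold: grade > 3).
rows = conn.execute("""
    SELECT s.building_id, s.age, s.foundation_type,
           d.damage_grade > 3 AS severe_damage
    FROM building_structure AS s
    JOIN building_damage AS d ON d.building_id = s.building_id
""").fetchall()

# Share of severe damage per group: a first look at possible dataset bias.
rates = dict(conn.execute("""
    SELECT s.foundation_type, AVG(d.damage_grade > 3)
    FROM building_structure AS s
    JOIN building_damage AS d ON d.building_id = s.building_id
    GROUP BY s.foundation_type
"""))
conn.close()
print(rates)
```

Large gaps in the target rate between groups are worth examining before training, since a model can learn and reproduce them.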


Developed random forest and gradient boosting models for predicting the likelihood of a company going bankrupt. Navigated the Linux command line, addressed data imbalance through resampling techniques and evaluated the impact of performance metrics such as precision and recall.
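To make the resampling and metric ideas concrete, here is a minimal standard-library sketch of random oversampling and of precision and recall. The helper names `random_oversample` and `precision_recall` and all data are hypothetical; real projects would more likely use a dedicated library.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up imbalanced data: 8 solvent companies (0) and 2 bankrupt ones (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [2.5], [3.0]]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)

# Made-up predictions for the metric calculation.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
```

Resampling should be applied to the training split only, so that evaluation still reflects the original class balance.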


Constructed a k-means model to cluster US consumers into distinct groups, employed principal component analysis (PCA) for data visualization and finally designed an interactive dashboard using Plotly Dash.
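The clustering and projection steps might be sketched as follows with scikit-learn. The data is synthetic, the choice of three clusters is an assumption, and the Plotly Dash dashboard is left out.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic "consumer" features: three well-separated groups in four dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [8, 8, 0, 0], [0, 8, 8, 8]], dtype=float)
X = np.vstack([center + rng.normal(0.0, 1.0, size=(50, 4)) for center in centers])

# Scale features so no single column dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Cluster, then project to two principal components for plotting.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)
X_2d = PCA(n_components=2, random_state=42).fit_transform(X_scaled)
```

`X_2d` and `labels` together are what a scatter plot of the clusters would consume.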


Performed a chi-square test to assess the impact of email communication on program enrollment at WQU. Developed custom Python classes for an Extract, Transform, Load (ETL) process and designed an interactive data application following a three-tiered design pattern.
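For a 2x2 table, the chi-square test of independence can be computed by hand, as in this sketch. The counts are invented, the helper name `chi_square_2x2` is hypothetical, and no continuity correction is applied (some libraries apply one by default for 2x2 tables, so results may differ slightly).

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed_count - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Rows: received email / no email. Columns: enrolled / did not enroll (made-up counts).
observed = [[30, 470], [20, 480]]
statistic, p_value = chi_square_2x2(observed)
print(f"chi2 = {statistic:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level would mean the data does not show a detectable effect of the email on enrollment.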


Developed a GARCH time-series model to forecast asset volatility. Retrieved stock data via an API, cleaned and stored it in a SQLite database, and built an API to serve model predictions.
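The sketch below shows the store, load and forecast flow under strong assumptions: returns are made up, the table name `returns` is hypothetical, and the GARCH(1,1) parameters `omega`, `alpha` and `beta` are fixed by hand rather than estimated (in practice they would be fitted by maximum likelihood). It uses the recursion sigma²(t+1) = omega + alpha·e(t)² + beta·sigma²(t), where e(t) is the demeaned return.

```python
import sqlite3

# Store hypothetical daily returns (in percent) in an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE returns (date TEXT PRIMARY KEY, ret REAL)")
sample = [
    ("2023-01-02", 0.5), ("2023-01-03", -1.2), ("2023-01-04", 0.3),
    ("2023-01-05", 2.1), ("2023-01-06", -0.7), ("2023-01-09", 0.1),
    ("2023-01-10", -1.8), ("2023-01-11", 0.9),
]
conn.executemany("INSERT INTO returns VALUES (?, ?)", sample)
returns = [r for (r,) in conn.execute("SELECT ret FROM returns ORDER BY date")]
conn.close()


def garch_variance_forecast(returns, omega, alpha, beta, horizon=5):
    """Forecast conditional variance with a GARCH(1,1) recursion (given parameters)."""
    mean = sum(returns) / len(returns)
    residuals = [r - mean for r in returns]
    # Initialize with the sample variance, then filter through the observed data.
    variance = sum(e * e for e in residuals) / len(residuals)
    for e in residuals:
        variance = omega + alpha * e * e + beta * variance
    # `variance` is now the one-step-ahead forecast; later steps use E[e^2] = sigma^2.
    forecasts = [variance]
    for _ in range(horizon - 1):
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


forecasts = garch_variance_forecast(returns, omega=0.05, alpha=0.10, beta=0.85)
volatility = [v ** 0.5 for v in forecasts]
```

With `alpha + beta < 1`, the multi-step forecasts converge toward the long-run variance `omega / (1 - alpha - beta)`.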



NOTE: The project code is not uploaded because of copyright restrictions.


Let's connect:

LinkedIn Gmail
