Welcome to the Data Insider project! This repository documents my journey through the second challenge of the Alura LATAM Data Science challenges, focusing on data visualization.
In this challenge, I examined the behavior of large corporations worldwide using the Forbes 2000 editions from 2015 onward, together with the Fortune Global 500 editions from the same period.
The primary objective was to produce a detailed report to objectively understand the behavior of these corporations. This involved merging datasets, analyzing discrepancies, and providing insights through data visualization techniques.
All necessary data through 2022 was provided, and additional datasets were obtained from data.world.
One of the main challenges was merging the Forbes 2000 datasets with the Fortune Global 500 datasets. To address this, I used the `fuzzywuzzy` library to locate companies appearing on the Fortune list that did not correspond to the Forbes list.
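The name-matching idea can be illustrated with a minimal sketch. It is not the code from this repository: it swaps in the standard library's `difflib` for `fuzzywuzzy` so that it runs without extra installs, and the suffix list, the 0.85 threshold, and the helper names (`normalize`, `best_match`, `split_matches`) are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "co", "ltd", "plc"}


def normalize(name):
    """Lowercase, drop punctuation, and remove common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", "", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def best_match(name, candidates):
    """Return the candidate most similar to `name` and its score (0 to 1)."""
    target = normalize(name)
    scored = [
        (SequenceMatcher(None, target, normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, candidate = max(scored)
    return candidate, score


def split_matches(fortune_names, forbes_names, threshold=0.85):
    """Separate Fortune names into matched pairs and unmatched leftovers."""
    matched, unmatched = {}, []
    for name in fortune_names:
        candidate, score = best_match(name, forbes_names)
        if score >= threshold:
            matched[name] = candidate
        else:
            unmatched.append(name)
    return matched, unmatched


if __name__ == "__main__":
    fortune = ["Walmart Inc.", "Toyota Motor Corp.", "Saudi Aramco"]
    forbes = ["Walmart", "Toyota Motor", "Apple"]
    matched, unmatched = split_matches(fortune, forbes)
    print(matched)
    print(unmatched)
```

With `fuzzywuzzy` installed, `process.extractOne(name, candidates)` plays the role of `best_match`, returning the closest candidate together with a score from 0 to 100. Whatever the scorer, the names that fall below the threshold are the ones worth reviewing by hand.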
After preparing the datasets, I documented analyses and insights based on the provided questionnaires, creating one graph for each question. I also implemented racing bar charts and an interactive bubble plot to enrich the visualizations.
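A racing bar chart is essentially a sequence of ranked frames, one per year. The sketch below shows only that data-shaping step, under the assumption that the input is a list of `(year, company, value)` tuples; the function name `racing_frames` and the placeholder records are hypothetical, and the actual charts in this repository may have been built differently.

```python
from collections import defaultdict


def racing_frames(records, top_n=3):
    """Group (year, company, value) records into one ranked frame per year.

    Each frame holds the top_n companies of that year, largest value first.
    """
    by_year = defaultdict(list)
    for year, company, value in records:
        by_year[year].append((company, value))

    frames = {}
    for year in sorted(by_year):
        ranked = sorted(by_year[year], key=lambda item: item[1], reverse=True)
        frames[year] = ranked[:top_n]
    return frames


if __name__ == "__main__":
    # Placeholder records, not real Forbes or Fortune figures.
    records = [
        (2015, "Company A", 10), (2015, "Company B", 30), (2015, "Company C", 20),
        (2016, "Company A", 50), (2016, "Company B", 40), (2016, "Company C", 5),
    ]
    for year, frame in racing_frames(records, top_n=2).items():
        print(year, frame)
```

Each frame can then be drawn as a horizontal bar chart, and animating the frames in year order produces the racing effect.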
Beyond the challenge requirements, I included three additional sections:
- Analysis by Author: Personal analysis leading to a presentation included in this repository.
- Interactive Bubble Plot: An engaging visualization created using the `bubbly` library.
- Further Insights: Additional analysis and insights on the datasets.
The challenge is divided into the following eight parts:
- Setting up the Environment
- Data Collection
- Data Wrangling
- Table Summarization and Exploratory Analysis
- Data Query
- Insights
- Bubble Plot
- Further Insights - 2