Exploratory Data Analysis (EDA) is a crucial process in the financial industry, particularly in lending. This project involves analyzing patterns in customer attributes and loan data to make informed loan approval decisions.
- General Information
- Technologies Used
- Data Cleaning and Preparation
- Analysis
- Conclusions
- Acknowledgements
- Contact
- Project Objective: The goal of this analysis is to develop insights that can be used to predict whether a new loan applicant is likely to default.
- Importance of EDA in Lending: By leveraging data, financial institutions can gain valuable insights into customer behaviour, loan performance, and market trends, enabling them to make informed decisions.
- Industry Challenges: The lending industry is constantly evolving and faces numerous challenges such as increasing default rates and changing customer preferences. Understanding these challenges is essential for successful loan approval processes.
- Company Context: A consumer finance company is looking for patterns in customer and loan attributes that are associated with loan defaults. Identifying these patterns can improve loan approval decisions and mitigate risks.
- Python: version 3.12.0
- NumPy: version 1.26.1
- Pandas: version 2.1.2
- Plotly: version 5.18.0
- Matplotlib: version 3.x
- Jupyter: version 7.0.6
- Git: version 2.42.1
- Anaconda: latest version
- Data Collection: Gathered loan and customer data from reliable sources.
- Data Cleaning: Removed duplicates, handled missing values, and corrected inconsistencies.
- Data Transformation: Converted categorical variables, normalized numerical data, and created new features where necessary.
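The cleaning and transformation steps above can be sketched with pandas. This is a minimal illustration, not the project's actual pipeline: the column names (`loan_amnt`, `term`, `int_rate`, `annual_inc`, `grade`), the sample rows, and the choices of median imputation and min-max scaling are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def clean_loans(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and transform a raw loan table (hypothetical column names)."""
    # Remove exact duplicate rows
    out = df.drop_duplicates().copy()

    # Correct inconsistencies: " 36 months" -> 36, "10.5%" -> 10.5
    out["term_months"] = out["term"].str.extract(r"(\d+)", expand=False).astype(int)
    out["int_rate"] = out["int_rate"].str.rstrip("%").astype(float)

    # Handle missing values: fill income gaps with the column median (assumed strategy)
    out["annual_inc"] = out["annual_inc"].fillna(out["annual_inc"].median())

    # Convert a categorical variable to the pandas category dtype
    out["grade"] = out["grade"].astype("category")

    # Create a new feature: loan amount relative to annual income
    out["loan_to_income"] = out["loan_amnt"] / out["annual_inc"]

    # Normalize a numerical column with min-max scaling (assumed method)
    lo, hi = out["loan_amnt"].min(), out["loan_amnt"].max()
    out["loan_amnt_scaled"] = (out["loan_amnt"] - lo) / (hi - lo) if hi > lo else 0.0

    return out.drop(columns=["term"]).reset_index(drop=True)


# Made-up sample data, for illustration only
raw = pd.DataFrame({
    "loan_amnt": [5000, 5000, 12000, 20000],
    "term": [" 36 months", " 36 months", " 60 months", " 36 months"],
    "int_rate": ["10.5%", "10.5%", "15.0%", "12.0%"],
    "annual_inc": [50000.0, 50000.0, np.nan, 80000.0],
    "grade": ["B", "B", "D", "C"],
})

clean = clean_loans(raw)
print(clean)
```

Median imputation is used here because income is typically right-skewed; a different strategy may suit other columns better.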
- Univariate Analysis: Examined individual variables to understand their distributions and identify outliers.
- Segmented Univariate Analysis: Analyzed data segments to understand the behaviour of different customer groups.
- Bivariate Analysis: Explored relationships between pairs of variables to uncover correlations and candidate drivers of default. Correlation alone does not establish causation.
- Visualizations: Created visual representations of data insights using Plotly and Matplotlib to better understand and communicate findings.
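The three analysis types above can be illustrated with a few pandas operations. This is a hedged sketch on made-up data: the column names (`grade`, `int_rate`, `annual_inc`, `defaulted`) and the 1.5 × IQR outlier rule are assumptions for the example, not details confirmed by this project.

```python
import pandas as pd

# Made-up sample data, for illustration only; defaulted: 1 = default, 0 = repaid
loans = pd.DataFrame({
    "grade": ["A", "A", "B", "B", "C", "C", "C", "D"],
    "int_rate": [7.0, 7.5, 10.0, 11.0, 14.0, 14.5, 15.0, 18.0],
    "annual_inc": [90000, 85000, 60000, 65000, 40000, 45000, 42000, 400000],
    "defaulted": [0, 0, 0, 1, 1, 0, 1, 1],
})

# Univariate: summarize one variable and flag outliers with the 1.5 * IQR rule
income_summary = loans["annual_inc"].describe()
q1, q3 = loans["annual_inc"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = loans[
    (loans["annual_inc"] < q1 - 1.5 * iqr) | (loans["annual_inc"] > q3 + 1.5 * iqr)
]

# Segmented univariate: default rate within each customer segment (loan grade)
default_rate_by_grade = loans.groupby("grade")["defaulted"].mean()

# Bivariate: Pearson correlation between interest rate and the default flag
rate_default_corr = loans["int_rate"].corr(loans["defaulted"])

print(income_summary)
print(outliers)
print(default_rate_by_grade)
print(rate_default_corr)
```

Tables such as `default_rate_by_grade` are natural inputs for bar charts in Plotly or Matplotlib.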
- Data Cleaning: Produced a cleaned dataset with duplicates removed, missing values handled, and inconsistencies corrected.
- Insights from Univariate Analysis: Identified key trends and outliers in individual variables.
- Insights from Segmented Univariate Analysis: Gained understanding of how different customer segments behave.
- Insights from Bivariate Analysis: Discovered relationships between variables and identified factors driving loan defaults.
- Visualizations: Provided visual summaries of findings to facilitate decision-making.
- Resources Utilized:
Created by @SandeepGitGuy and @NishanthAV - feel free to contact us!