This case study focuses on risk analytics in banking, where the goal is to reduce the risk of losing money when lending to customers. Lenders find it hard to assess applicants with little or no credit history, and some applicants take advantage of this by defaulting on their loans.
Imagine you work for a company that lends money to urban customers. Your task is to use Exploratory Data Analysis (EDA) to study loan application data. This helps ensure that capable applicants aren't wrongly rejected and risky ones are identified.
When deciding to approve a loan, the company faces two risks:
- If a reliable applicant is rejected, the company loses business.
- If a risky applicant is approved, the company may lose money.
Your job is to predict whether a person is eligible for a loan based on factors such as income and credit history.
Approach
Data Understanding and Preprocessing:
- Load the dataset.
- Explore the features and types of data.
- Handle missing data.
- Convert categorical data into numbers.
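
The preprocessing steps above can be sketched with pandas as follows. This is a minimal illustration, not the case study's actual pipeline: the column names (`income`, `credit_history`, `loan_approved`) and the tiny inline table are hypothetical stand-ins, and in practice the data would be loaded from a file (for example with `pd.read_csv`).

```python
import pandas as pd

# Hypothetical toy data; the real dataset's columns and values will differ.
df = pd.DataFrame({
    "income": [45000.0, None, 82000.0, 30000.0, 61000.0],
    "credit_history": ["good", "none", "good", None, "poor"],
    "loan_approved": ["Y", "N", "Y", "N", "Y"],
})

# Explore the features and their types, and count missing values per column.
print(df.dtypes)
print(df.isna().sum())

# Handle missing data: median for the numeric column,
# an explicit "unknown" label for the categorical one.
df["income"] = df["income"].fillna(df["income"].median())
df["credit_history"] = df["credit_history"].fillna("unknown")

# Convert categorical data into numbers:
# a 0/1 target and one-hot columns for credit history.
df["loan_approved"] = df["loan_approved"].map({"N": 0, "Y": 1})
encoded = pd.get_dummies(df, columns=["credit_history"], dtype=int)
print(encoded)
```

Median imputation and one-hot encoding are only one reasonable choice here; the right strategy depends on how much data is missing and on the models used later.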
Exploratory Data Analysis (EDA):
- Analyze statistics like mean and median.
- Visualize the data with charts and draw conclusions.
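
As a rough sketch of the EDA step, the snippet below computes the mean and median of one numeric feature and the approval rate per group. The data and column names are again hypothetical placeholders for the real loan application dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the loan application dataset.
df = pd.DataFrame({
    "income": [45000, 52000, 82000, 30000, 61000, 28000],
    "credit_history": ["good", "none", "good", "none", "good", "none"],
    "loan_approved": [1, 0, 1, 0, 1, 1],
})

# Central tendency of a numeric feature.
mean_income = df["income"].mean()
median_income = df["income"].median()
print(f"mean income: {mean_income:.2f}, median income: {median_income:.2f}")

# Share of approved applications within each credit-history group.
approval_rate = df.groupby("credit_history")["loan_approved"].mean()
print(approval_rate)

# With matplotlib installed, the same Series can be charted, e.g.:
# approval_rate.plot(kind="bar")
```

Comparing the mean with the median is a quick way to spot skew in features such as income, and group-wise rates like the one above are often the first thing worth charting.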
Model Development:
- Split data into training and testing sets.
- Train and test several machine learning models, such as Logistic Regression and Decision Trees.
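
A minimal sketch of this step with scikit-learn is shown below. It assumes the features are already numeric, and it uses synthetic data from `make_classification` purely as a placeholder for the preprocessed loan data; the hyperparameters are illustrative, not tuned.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder for the preprocessed loan data (assumption).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Hold out 20% of the rows for testing, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Fit each model on the training set and score it on the held-out set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```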
Model Selection and Tuning:
- Compare models using cross-validation.
- Choose the best model and fine-tune it for better performance.
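
One way to sketch model comparison and tuning is with `cross_val_score` and `GridSearchCV` from scikit-learn. The data is synthetic and the parameter grids are illustrative assumptions, not recommended settings; in a real project this would run on the training split only, so the test set stays untouched.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder for the training data (assumption).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Compare candidates by mean 5-fold cross-validated accuracy.
cv_means = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
best_name = max(cv_means, key=cv_means.get)
print(cv_means, "->", best_name)

# Illustrative hyperparameter grids, one per candidate.
param_grids = {
    "logistic_regression": {"C": [0.01, 0.1, 1.0, 10.0]},
    "decision_tree": {"max_depth": [2, 4, 6, 8]},
}

# Fine-tune only the best-scoring candidate.
search = GridSearchCV(
    candidates[best_name], param_grids[best_name], cv=5, scoring="accuracy"
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```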
Model Evaluation:
- Test the final model’s performance.
- Analyze key metrics to understand how well it works.
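
To make the key metrics concrete, the sketch below counts confusion-matrix cells in plain Python and derives accuracy, precision, and recall. It assumes label 1 means "eligible" and label 0 means "not eligible", and the label lists are made up for illustration. Under that assumption, a false positive is a risky applicant who was approved, and a false negative is a reliable applicant who was rejected, which are the two business risks described earlier.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating 1 as 'eligible' and 0 as 'not eligible'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # risky applicant approved
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # reliable applicant rejected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Made-up labels for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the approved applicants, how many were truly eligible
recall = tp / (tp + fn)     # of the eligible applicants, how many were approved

print(f"tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Because the two error types cost the lender different amounts, accuracy alone can be misleading; looking at precision and recall separately shows which risk the model is trading off.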
Conclusion: Summarize the findings and results.