This project is based on the Kaggle competition Playground Series - Season 4, Episode 5, which focuses on predicting flood probability. The model is implemented using TensorFlow and Keras, with an emphasis on fine-tuning for performance.
- train.csv: The training dataset containing features and flood probability labels.
- test.csv: The test dataset used for generating predictions.
- sample_submission.csv: A sample submission file provided by Kaggle to format the prediction results.
- main.ipynb: The Jupyter Notebook that handles preprocessing, training, evaluation, and creation of the submission file.
- submission.csv: The final submission file generated by the model.
The goal of this project is to predict the probability of flooding based on various environmental and infrastructural factors. The model uses a Sequential neural network with dense layers, trained on the provided dataset. The project involves data preprocessing, model training, and evaluation.
- Combined training and test datasets for consistent preprocessing.
- Created new features Ratio1 and Ratio2 derived from the existing features.
- Scaled the data using RobustScaler to handle outliers and ensure a uniform scale across features.
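The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the formulas for Ratio1 and Ratio2 are not given here, so the definitions below are placeholders, and the function name `preprocess`, the demo column names, and the assumption that any `id` column has already been dropped are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler


def preprocess(train: pd.DataFrame, test: pd.DataFrame, target: str):
    """Combine train and test, add ratio features, and scale with RobustScaler."""
    y = train[target].to_numpy()
    n_train = len(train)

    # Stack both sets so feature engineering and scaling are applied identically.
    combined = pd.concat([train.drop(columns=[target]), test], ignore_index=True)
    base_cols = list(combined.columns)

    # ASSUMPTION: placeholder definitions; the notebook's real Ratio1/Ratio2 may differ.
    half = len(base_cols) // 2
    first_sum = combined[base_cols[:half]].sum(axis=1)
    second_sum = combined[base_cols[half:]].sum(axis=1)
    combined["Ratio1"] = first_sum / (second_sum + 1.0)
    combined["Ratio2"] = second_sum / (first_sum + 1.0)

    # RobustScaler centers on the median and scales by the interquartile range.
    scaled = RobustScaler().fit_transform(combined)
    return scaled[:n_train], y, scaled[n_train:]


# Small synthetic stand-in for train.csv and test.csv.
rng = np.random.default_rng(0)
cols = ["f1", "f2", "f3", "f4"]
train_demo = pd.DataFrame(rng.integers(0, 10, size=(20, 4)).astype(float), columns=cols)
train_demo["FloodProbability"] = rng.random(20)
test_demo = pd.DataFrame(rng.integers(0, 10, size=(8, 4)).astype(float), columns=cols)

X_train, y_train, X_test = preprocess(train_demo, test_demo, target="FloodProbability")
print(X_train.shape, X_test.shape)  # (20, 6) (8, 6)
```

Splitting the scaled array back at `n_train` keeps the training and test rows aligned with their original order.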
- Dense Layer 1: 64 units, ReLU activation.
- Dense Layer 2: 32 units, ReLU activation.
- Output Layer: 1 unit, linear activation for regression.
The model is compiled with the Adam optimizer and a mean squared error loss function. A custom R^2 score metric is also used to evaluate model performance.
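A sketch of the architecture and compile step described above might look like this. The layer sizes, activations, optimizer, and loss come from the description; the names `r2_score` and `build_model`, and the exact form of the custom metric (including the small epsilon), are assumptions rather than the notebook's own code.

```python
import tensorflow as tf
from tensorflow import keras


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot, computed per batch."""
    # Flatten both tensors so (n,) labels and (n, 1) predictions do not broadcast.
    y_pred = tf.reshape(y_pred, [-1])
    y_true = tf.reshape(tf.cast(y_true, y_pred.dtype), [-1])
    ss_res = tf.reduce_sum(tf.square(y_true - y_pred))
    ss_tot = tf.reduce_sum(tf.square(y_true - tf.reduce_mean(y_true)))
    # ASSUMPTION: a small epsilon guards against batches with constant targets.
    return 1.0 - ss_res / (ss_tot + 1e-7)


def build_model(n_features: int) -> keras.Model:
    """Sequential regressor: Dense(64, relu) -> Dense(32, relu) -> Dense(1, linear)."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mean_squared_error", metrics=[r2_score])
    return model
```

Because Keras averages a function metric over batches, the R^2 value reported during training is a mean of per-batch scores, which can differ slightly from R^2 computed once over the whole dataset.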
- Batch Size: 1024
- Epochs: 100
- The model is trained on the full training set and evaluated on that same set, because the notebook does not hold out a validation split.
The model's performance is evaluated using the R^2 score and loss on the training set. The best model based on these metrics is saved and used for generating predictions on the test set.
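One way to train with the stated batch size and epoch count while keeping the best model is a `ModelCheckpoint` callback. This is only a sketch: how the notebook actually selects and saves the best model is not specified, so monitoring the training loss, the file name `best_model.keras`, the helper name `train_and_keep_best`, and the tiny stand-in model and data are all assumptions.

```python
import numpy as np
from tensorflow import keras


def train_and_keep_best(model, X, y, checkpoint_path="best_model.keras",
                        epochs=100, batch_size=1024):
    """Fit on the full training set and save the model from the lowest-loss epoch."""
    # ASSUMPTION: the training loss is monitored, since no validation split exists.
    checkpoint = keras.callbacks.ModelCheckpoint(
        checkpoint_path, monitor="loss", mode="min", save_best_only=True
    )
    return model.fit(
        X, y, epochs=epochs, batch_size=batch_size, callbacks=[checkpoint], verbose=0
    )


# Tiny synthetic stand-in for the real features and labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(256, 4)).astype("float32")
y_demo = rng.random(256).astype("float32")

demo_model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(1)])
demo_model.compile(optimizer="adam", loss="mean_squared_error")

# Two epochs keep the demo fast; the defaults match the settings listed above.
history = train_and_keep_best(demo_model, X_demo, y_demo, epochs=2)
print(min(history.history["loss"]))
```

The saved file can later be restored with `keras.models.load_model`; a model compiled with a custom metric function would need that function passed through `custom_objects`.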
To run this project, ensure you have Python installed along with the required libraries listed in requirements.txt:
pip install -r requirements.txt
- Run the Training Notebook: Open and run main.ipynb. This will train the model and save the predictions in submission.csv.
- Submit Predictions: Upload submission.csv to the Kaggle competition to evaluate the model's performance.
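Producing the submission file from model predictions could look roughly like the sketch below. The helper name `make_submission` is hypothetical, and the column names `id` and `FloodProbability` are assumed from the competition's sample submission; check them against sample_submission.csv before use.

```python
import numpy as np
import pandas as pd


def make_submission(sample: pd.DataFrame, predictions, target_col: str) -> pd.DataFrame:
    """Return a copy of the sample submission with the target column replaced."""
    # Flatten the (n, 1) array that a Keras model typically returns from predict().
    preds = np.asarray(predictions, dtype=float).reshape(-1)
    if len(preds) != len(sample):
        raise ValueError("prediction count does not match the sample submission")
    submission = sample.copy()
    submission[target_col] = preds
    return submission


# Stand-ins for pd.read_csv("sample_submission.csv") and model.predict(X_test).
sample = pd.DataFrame({"id": [10, 11, 12], "FloodProbability": [0.5, 0.5, 0.5]})
submission = make_submission(sample, [[0.41], [0.57], [0.63]], "FloodProbability")

# Writing the upload file would then be:
# submission.to_csv("submission.csv", index=False)
```

Reusing the sample file as a template keeps the row order and column layout that Kaggle expects.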
This project was created by Hüseyin Battal and is intended for educational purposes as part of the Kaggle competition.
GitHub: https://github.com/huseyinbattal3469
LinkedIn: https://www.linkedin.com/in/huseyin-battal/