Image-Enhancement

Date: 2024-08-10

This repository is a small personal project that implements SRCNN (Super-Resolution Convolutional Neural Network) for image enhancement, built to understand how SRCNN works. It is implemented with TensorFlow and Keras. The model is trained on DIV2K, a high-quality dataset for image super-resolution. The dataset can be downloaded from here.

Requirements

Python 3.10.11
TensorFlow 2.17.0
Keras 3.4.1
Numpy 1.24.3
Matplotlib 3.9.1
Scikit-Image 0.24.0
Scikit-Learn 1.5.1
Pillow 10.4.0

Design

The SRCNN model is made up of three layers:

Convolutional Layer 1: 9x9 kernel size, 64 filters, ReLU activation, padding = 'same', kernel_initializer = 'glorot_uniform'
Convolutional Layer 2: 3x3 kernel size, 32 filters, ReLU activation, padding = 'same', kernel_initializer = 'glorot_uniform'
Convolutional Layer 3: 5x5 kernel size, 3 filters, Linear activation, padding = 'same', kernel_initializer = 'glorot_uniform'
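
The three layers above could be expressed in Keras roughly as follows. This is a minimal sketch, not the repository's actual code: the names `conv_params`, `total_params`, and `build_srcnn`, and the `(None, None, 3)` input shape (RGB input of any size), are assumptions. TensorFlow is imported lazily so that the parameter-count helpers run on their own.

```python
def conv_params(kernel_size: int, in_channels: int, filters: int) -> int:
    """Weights plus biases of one 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * filters + filters


# (kernel size, filters, activation) for each layer, as listed above.
LAYERS = [(9, 64, "relu"), (3, 32, "relu"), (5, 3, "linear")]


def total_params(in_channels: int = 3) -> int:
    """Sum the trainable parameters of the three-layer stack."""
    total = 0
    for kernel_size, filters, _ in LAYERS:
        total += conv_params(kernel_size, in_channels, filters)
        in_channels = filters  # the next layer consumes this layer's output
    return total


def build_srcnn():
    """Build the model in Keras (requires TensorFlow to be installed)."""
    from tensorflow import keras  # lazy import, see the note above

    model = keras.Sequential(name="srcnn")
    model.add(keras.Input(shape=(None, None, 3)))  # assumed: RGB, any size
    for kernel_size, filters, activation in LAYERS:
        model.add(
            keras.layers.Conv2D(
                filters,
                kernel_size,
                activation=activation,
                padding="same",
                kernel_initializer="glorot_uniform",
            )
        )
    return model
```

Because every layer uses padding = 'same', the output keeps the height and width of the input. `total_params()` gives 36,483 trainable parameters (15,616 + 18,464 + 2,403), which `build_srcnn().count_params()` should match.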

The model is trained on the DIV2K dataset, split into 560 images for training, 120 for validation, and 120 for testing. Each image is cropped to 512x512 pixels. The low-resolution input is produced by downsampling the crop by a factor of 3 and then scaling it back to 512x512 pixels using bicubic interpolation.

The model is trained with a custom loss function that combines the mean squared error (MSE), the structural similarity index (SSIM), and a VGG19 perceptual loss. The loss function is defined as follows:

loss = 0.8 * sigmoid(MSE) + 0.1 * (1 - (1 + SSIM)/2) + 0.1 * sigmoid(VGG19_loss)

The weights balance the trade-off between pixel-wise error (MSE), structural similarity (SSIM), and the VGG19 perceptual loss. This custom loss function is not optimal and can be improved.
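
To make the weighting concrete, the sketch below evaluates the formula for scalar values that have already been computed. It only illustrates how the three terms combine. It is not the TensorFlow loss used in training, and the function names `sigmoid` and `combined_loss` are assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def combined_loss(mse: float, ssim: float, vgg_loss: float) -> float:
    """Weighted sum from the formula above, for already-computed scalars."""
    mse_term = sigmoid(mse)               # in [0.5, 1) because mse >= 0
    ssim_term = 1.0 - (1.0 + ssim) / 2.0  # maps SSIM in [-1, 1] onto [1, 0]
    vgg_term = sigmoid(vgg_loss)          # in [0.5, 1) because vgg_loss >= 0
    return 0.8 * mse_term + 0.1 * ssim_term + 0.1 * vgg_term
```

One consequence of passing non-negative errors through a sigmoid is that the loss never reaches zero: a perfect reconstruction (MSE = 0, SSIM = 1, VGG19 loss = 0) still scores 0.8 * 0.5 + 0 + 0.1 * 0.5 = 0.45. The sigmoid also flattens out for large inputs, which may be one reason the loss can be improved.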

The model is trained using the Adam optimizer with a learning rate of 1e-4 and a batch size of 2. Training runs for up to 100 epochs and stops early if the validation loss does not improve for 10 consecutive epochs.
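
The schedule above can be illustrated without any deep learning library. In Keras, the stopping rule roughly corresponds to `keras.callbacks.EarlyStopping(monitor="val_loss", patience=10)`. The plain-Python sketch below mimics that rule to show when training would halt. The helper names `steps_per_epoch` and `epochs_run` are hypothetical.

```python
import math


def steps_per_epoch(num_images: int = 560, batch_size: int = 2) -> int:
    """Number of optimizer updates in one pass over the training set."""
    return math.ceil(num_images / batch_size)


def epochs_run(val_losses, patience: int = 10, max_epochs: int = 100) -> int:
    """Count the epochs trained before early stopping would trigger."""
    best = math.inf
    waited = 0  # consecutive epochs without a new best validation loss
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return min(len(val_losses), max_epochs)
```

With 560 training images and a batch size of 2, each epoch performs 280 updates. If the validation loss last improves at epoch 2, training stops after epoch 12.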

Results

The model is evaluated using the test dataset. The results are shown below:

Original Image:

Bicubic Interpolation:

SRCNN:

SRCNN's Overall Evaluation Metrics:

Average PSNR: 30.1528 dB
Average SSIM: 0.8926
Average MSE: 107.8839
Range of PSNR: 18.9426 - 49.5030 dB
Range of SSIM: 0.5610 - 0.9918
Range of MSE: 0.7291 - 829.5155

Bicubic Interpolation's Overall Evaluation Metrics:

Average PSNR: 34.2904 dB
Average SSIM: 0.8825
Average MSE: 93.8065
Range of PSNR: 18.7908 - 58.9477 dB
Range of SSIM: 0.4983 - 0.9982
Range of MSE: 0.0829 - 859.0077
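
PSNR and MSE in the tables above are tied together by PSNR = 10 * log10(MAX^2 / MSE). The sketch below assumes a peak value MAX of 255 (8-bit pixels). That value is an assumption, but it reproduces the reported ranges. The function name `psnr_from_mse` is hypothetical.

```python
import math


def psnr_from_mse(mse: float, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


# Averaging per-image PSNR is not the same as converting the average MSE.
example_mses = [0.7291, 829.5155]  # SRCNN's smallest and largest MSE above
mean_of_psnrs = sum(psnr_from_mse(m) for m in example_mses) / len(example_mses)
psnr_of_mean = psnr_from_mse(sum(example_mses) / len(example_mses))
```

SRCNN's smallest MSE (0.7291) maps to about 49.50 dB and its largest (829.5155) to about 18.94 dB, matching the reported PSNR range. Because the logarithm is applied per image before averaging, the average PSNR (30.1528) is higher than the roughly 27.8 dB obtained by converting the average MSE (107.8839) directly.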

Although bicubic interpolation scores better on average PSNR and average MSE, and SRCNN is ahead only slightly on average SSIM, the SRCNN model produces better visual results than bicubic interpolation, as shown in the images above.

AchiraLaovong/Image-Enhancement
