BingQuanChua/COVID-19-Msia-Cases-And-Vaccination

Malaysia COVID-19 Cases and Vaccination

An in-depth extension of our previous Assignment.

📚Datasets

Data taken as of 6-10-2021 (cut off date)

Data sources:

  1. COVID-19 Open Data from the Ministry of Health (MoH)
    https://github.com/MoH-Malaysia/covid19-public

  2. Vaccination Data from COVID-19 Immunisation Task Force (CITF)
    https://github.com/CITF-Malaysia/citf-public
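As a rough illustration of applying the cut-off date when reading one of these datasets, here is a minimal sketch in plain Python. The column names (`date`, `cases_new`), the sample rows, and the reading of "6-10-2021" as 6 October 2021 are all assumptions made for the example, not details confirmed by this README.

```python
import csv
import io
from datetime import date

# Assumption: the cut-off "6-10-2021" means 6 October 2021 (day-month-year).
CUTOFF = date(2021, 10, 6)

# Made-up rows for illustration only; the column names are assumptions too.
SAMPLE_CSV = """date,cases_new
2021-10-05,10
2021-10-06,12
2021-10-07,9
"""


def load_daily_cases(csv_text, cutoff=CUTOFF):
    """Parse CSV text into (date, new_cases) pairs, dropping rows after the cut-off."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        day = date.fromisoformat(record["date"])
        if day <= cutoff:
            rows.append((day, int(record["cases_new"])))
    return rows


daily = load_daily_cases(SAMPLE_CSV)
print(daily)
```

With a real file, the same function could be fed the text of a CSV downloaded from one of the repositories above, provided the column names match.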

📖Table of Contents

A list of questions that we have come up with.

Exploratory Data Analysis

  • Analyse which population groups are more vulnerable to COVID-19 in Malaysia.
  • Analyse how COVID-19 cases vary over time at different granularities.
  • Is the time-series dataset stationary?
  • What are the vaccination and registration rates per state in Malaysia?
  • What are the types and total number of side effects for each type of vaccine?
  • Which type of vaccine has been given to the most people?
  • Which states are recovering? Which states show a decrease in the number of COVID-19 cases?
  • When is the time of the day with most MySejahtera check-ins?
  • Which dates have the highest number of check-ins? How do they correlate with the number of cases and deaths on those days?
  • Which is more dangerous: the rate of serious vaccine side effects, or the COVID-19 death rate among people who have not been vaccinated?
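Two of the questions above (time granularity and stationarity) can be sketched in a few lines of plain Python. This is only an informal illustration on made-up numbers: it aggregates a daily series by ISO week, then compares the mean of the first and second half of the series before and after first differencing. It is not a formal stationarity test, and the README does not say which test the notebooks actually use.

```python
from datetime import date, timedelta


def weekly_totals(daily):
    """Sum (date, value) pairs by ISO (year, week)."""
    totals = {}
    for day, value in daily:
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        totals[key] = totals.get(key, 0) + value
    return totals


def first_difference(values):
    """Return the day-over-day changes of a series."""
    return [b - a for a, b in zip(values, values[1:])]


def half_means(values):
    """Mean of the first half and of the second half of a series."""
    mid = len(values) // 2
    return sum(values[:mid]) / mid, sum(values[mid:]) / (len(values) - mid)


# Made-up series with a steady upward trend, starting on a Monday.
start = date(2021, 9, 6)
daily = [(start + timedelta(days=i), 10 + 2 * i) for i in range(14)]
values = [v for _, v in daily]

print(weekly_totals(daily))                  # daily counts rolled up to weeks
print(half_means(values))                    # means drift apart: trend present
print(half_means(first_difference(values)))  # means agree after differencing
```

A large gap between the two half-means hints at a trend, which differencing removes in this toy example; a real analysis would need a proper statistical test.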

Clustering Analysis

  • How well does each state handle COVID-19 cases based on past COVID-19 cases and deaths records?
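To make the idea of grouping states by their case and death records concrete, here is a minimal k-means sketch in plain Python. The choice of k-means, the two features, the state labels, and all numbers are assumptions for illustration; the README does not state which clustering algorithm or features the notebook uses.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means, initialised deterministically with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical states with made-up (cases per 1,000, deaths per 1,000) values.
states = ["State A", "State B", "State C", "State D"]
features = [(1.0, 0.1), (1.2, 0.2), (8.0, 0.9), (8.5, 1.0)]

labels, centroids = kmeans(features, k=2)
for name, label in zip(states, labels):
    print(name, "-> cluster", label)
```

In practice the features would usually be scaled to comparable ranges first, since k-means is sensitive to the units of each column.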

Regression and Classification

  • Using previous COVID-19 records, is it possible to build a model that predicts or classifies the number of cases for the upcoming day or week?
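One common way to frame this question is to turn the daily series into a supervised dataset of lagged values. The sketch below shows that framing in plain Python; the window size, the class thresholds, and the numbers are hypothetical, and the actual features and models used in the notebook are not described in this README.

```python
def make_supervised(values, n_lags):
    """Build (X, y) where each row of X holds the n_lags values before y."""
    X, y = [], []
    for i in range(n_lags, len(values)):
        X.append(values[i - n_lags:i])
        y.append(values[i])
    return X, y


def to_class(value, thresholds=(100, 1000)):
    """Bin a case count into a label; the thresholds are hypothetical."""
    low, high = thresholds
    if value < low:
        return "low"
    if value < high:
        return "medium"
    return "high"


# Made-up daily case counts.
cases = [5, 8, 13, 21, 34, 55]

X, y = make_supervised(cases, n_lags=3)
print(X)  # lagged inputs for a regression model
print(y)  # next-day targets
print([to_class(v, thresholds=(20, 50)) for v in y])  # targets as classes
```

The numeric targets in `y` suit a regression model, while the binned labels turn the same rows into a classification problem.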

Time-series Regression

  • How can the Malaysian government predict the number of daily new cases accurately based on past data in order to deploy appropriate movement control measures?
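Whatever forecasting model is chosen, its accuracy is easier to judge against a simple baseline evaluated in time order. The following sketch, on made-up numbers, scores a naive "repeat the value from `lag` days ago" forecast with walk-forward validation and mean absolute error. It is a baseline for comparison under these assumptions, not the model used in the notebook.

```python
def naive_forecast(history, lag=1):
    """Predict the next value as the value observed `lag` steps ago."""
    return history[-lag]


def walk_forward_mae(values, n_test, lag=1):
    """Mean absolute error of the naive forecast over the last n_test points.

    Each prediction only sees data that precedes the point being predicted.
    """
    errors = []
    for t in range(len(values) - n_test, len(values)):
        prediction = naive_forecast(values[:t], lag)
        errors.append(abs(values[t] - prediction))
    return sum(errors) / len(errors)


# Made-up daily new-case counts.
cases = [10, 12, 15, 14, 18, 20]

mae = walk_forward_mae(cases, n_test=3, lag=1)
print(round(mae, 3))
```

Setting `lag=7` would give a weekly seasonal baseline, provided the series holds at least seven values before the test window.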

📒Jupyter Notebooks

A guide to reading our Jupyter Notebooks.

Reading the datasets, basic data cleaning and simple EDA for each dataset category:

EDA_Epidemic
EDA_Vaccination_and_Registration
EDA_MySejahtera

A deeper exploration of the datasets, guided by questions, to gain a better understanding and draw findings:

EDA_Questions

Data Mining with Clustering Analysis, Regression, Classification and Time-Series Regression:

DM_Clustering
DM_Regression_and_Classification
DM_Time-Series_Regression

🌱Deployment

Our results are deployed on Heroku as a Streamlit web app.

Check out our project on Heroku! Using light mode is recommended.

Screenshots:

Navigation

Clustering Analysis
