Personal machine learning project that builds a diabetes prediction model and serves it through a web application.
Web app: https://share.streamlit.io/group4day2019/ml_internship2021/main/merged.py
About Data
This dataset contains sign and symptom data of newly diabetic or would-be diabetic patients. It was collected using direct questionnaires from patients of Sylhet Diabetes Hospital in Sylhet, Bangladesh, and approved by a doctor.
Features of the dataset: Diabetes_data_upload.csv
The dataset consists of the 16 features listed below and one target variable named class.
- Age: Age in years, ranging from 20 to 65
- Gender: Male / Female
- Polyuria: Yes / No
- Polydipsia: Yes / No
- Sudden weight loss: Yes / No
- Weakness: Yes / No
- Polyphagia: Yes / No
- Genital thrush: Yes / No
- Visual blurring: Yes / No
- Itching: Yes / No
- Irritability: Yes / No
- Delayed healing: Yes / No
- Partial paresis: Yes / No
- Muscle stiffness: Yes / No
- Alopecia: Yes / No
- Obesity: Yes / No
- Class (target): Positive / Negative
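Because almost every column is a two-valued text label, a model needs the labels turned into numbers first. The sketch below shows one possible encoding using only the standard library. The column names, the two sample rows, and the choice of which label maps to 1 are assumptions made for illustration; they are not taken from Diabetes_data_upload.csv or from this project's code.

```python
import csv
import io

# Two made-up rows in the layout described above (not real patient records).
# Only four of the features are shown to keep the example short.
SAMPLE_CSV = """Age,Gender,Polyuria,Polydipsia,class
40,Male,No,Yes,Positive
58,Female,No,No,Negative
"""

# Assumed mapping from text labels to integers; the direction is arbitrary.
BINARY_MAP = {
    "yes": 1, "no": 0,
    "male": 1, "female": 0,
    "positive": 1, "negative": 0,
}


def encode_row(row):
    """Convert one CSV row (a dict of strings) into a dict of integers."""
    encoded = {}
    for name, value in row.items():
        if name == "Age":
            # Age is the only numeric column.
            encoded[name] = int(value)
        else:
            # Normalise case and whitespace so "yes" and "Yes " both match.
            encoded[name] = BINARY_MAP[value.strip().lower()]
    return encoded


def load_encoded(handle):
    """Read every row from an open CSV file object and encode it."""
    return [encode_row(row) for row in csv.DictReader(handle)]


rows = load_encoded(io.StringIO(SAMPLE_CSV))
print(rows[0])
```

To try this on the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for `open("Diabetes_data_upload.csv", newline="")`, provided the file's header and labels match the assumptions above.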
Relevant Papers:
Likelihood Prediction of Diabetes at Early Stage Using Data Mining Techniques. Authors: M. M. Faniqul Islam, Rahatara Ferdousi, Sadikur Rahman, Humayra Yasmin Bushra.
Citation Request:
Islam, MM Faniqul, et al. 'Likelihood prediction of diabetes at early stage using data mining techniques.' Computer Vision and Machine Intelligence in Medical Image Analysis. Springer, Singapore, 2020. 113-125.
Diabetes.csv
Source: https://www.kaggle.com/uciml/pima-indians-diabetes-database
Data description: The Pima Indians Diabetes dataset consists of several medical measurements and one binary dependent variable (Outcome). All patients are women of Pima Indian heritage aged at least 21. The dataset is described as follows:
- 9 columns: 8 independent parameters and 1 outcome parameter.
- 768 uniquely identified observations: 268 positive for diabetes (1) and 500 negative for diabetes (0).
The 9 columns are:
- Pregnancies: Number of times pregnant
- Glucose: Oral glucose tolerance test result
- Blood pressure: Diastolic blood pressure (mmHg)
- Skin thickness: Triceps skin fold thickness (mm)
- Insulin: 2-hour serum insulin (mu U/ml)
- BMI: Body mass index
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age in years
- Outcome: 1 indicates a person with diabetes, 0 otherwise
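The class counts quoted above imply an imbalanced dataset, which is worth keeping in mind when reading accuracy figures. As a rough sketch, the snippet below works out the share of positive cases and the accuracy of a trivial model that always predicts the majority class. The majority-class baseline is added here for illustration and is not something this project reports.

```python
from collections import Counter

# Class counts quoted in the dataset description: 0 = negative, 1 = positive.
counts = Counter({0: 500, 1: 268})

total = sum(counts.values())
positive_rate = counts[1] / total

# Accuracy of always predicting the most common class.
majority_baseline = max(counts.values()) / total

print(f"{total} rows, {positive_rate:.1%} positive, "
      f"baseline accuracy {majority_baseline:.1%}")
# prints: 768 rows, 34.9% positive, baseline accuracy 65.1%
```

Any classifier trained on Diabetes.csv would need to beat roughly 65% accuracy to be doing better than this trivial baseline.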