sdg-classification-bert (sdgBERT App)

This repository accompanies two web applications powered by sdgBERT, a fine-tuned BERT model for classifying text according to the United Nations Sustainable Development Goals (SDGs). The manually labeled data used to fine-tune sdgBERT was obtained from the OSDG Community Dataset, publicly available at https://zenodo.org/record/5550238#.Y93vry9ByF4. The OSDG dataset includes text from diverse fields; hence, the sdgBERT model and the web apps are generic and can be used to predict the SDG of most texts. Note that sdgBERT predicts SDG 1 to SDG 16 only; it does not cover SDG 17 or an "Other" category for non-SDG text.

sdgBERT Repository: You can access the sdgBERT model repository at: https://huggingface.co/sadickam/sdgBERT

The two apps support SDG 1 to SDG 16, shown in the image below. Source: https://www.un.org/development/desa/disabilities/about-us/sustainable-development-goals-sdgs-and-disability.html

App 1: SDG Text Classifier App

This app can be accessed from:

(Hugging Face Space): https://sadickam-sdg-text-classifier-app.hf.space

Key functions of App 1:

Single text prediction: copy/paste or type in a text box
Multiple text prediction: upload a CSV file. Note: the column containing the texts to be predicted must be titled "text_inputs". The app generates an output CSV file that you can download. This file includes all the original columns in the uploaded CSV, a column for predicted SDGs, and a column for prediction probability scores. Any text in text_inputs that is longer than the maximum model sequence length of 512 word pieces (approximately 300 to 400 words) is automatically truncated.
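
A minimal sketch of preparing a CSV file for the multiple text prediction upload, using only the Python standard library. The only requirement taken from this README is the "text_inputs" column name; the `doc_id` column, the sample texts, and the `make_upload_csv` helper are hypothetical and only illustrate that extra columns can sit alongside the required one.

```python
import csv
import io

REQUIRED_COLUMN = "text_inputs"  # column name App 1 expects for the texts


def make_upload_csv(rows):
    """Serialize a list of dicts to CSV text after checking the required column."""
    if not rows:
        raise ValueError("no rows to write")
    fieldnames = list(rows[0].keys())
    if REQUIRED_COLUMN not in fieldnames:
        raise ValueError(f"missing required column: {REQUIRED_COLUMN}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows; "doc_id" is an illustrative extra column.
rows = [
    {"doc_id": "1", "text_inputs": "Expanding access to clean drinking water in rural areas."},
    {"doc_id": "2", "text_inputs": "Reducing childhood malnutrition through school meals."},
]
csv_text = make_upload_csv(rows)
```

To upload the result, write `csv_text` to a file, for example with `open("inputs.csv", "w", newline="")`, and select that file in the app.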

App 2: Document SDG App

This app can be accessed from:

(Hugging Face Space): https://sadickam-document-sdg-app-cpu.hf.space

This app allows users to analyze PDF documents to check their alignment with the United Nations Sustainable Development Goals (SDGs). When a PDF is uploaded, the app processes the text to identify and classify content corresponding to the first 16 UN SDGs. The analysis can be conducted at the page level or the sentence level, and users can specify the range of PDF pages to be analyzed. This page range option can be used to exclude tables of contents, references, appendices, etc.
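
The page range and the page-level versus sentence-level choice can be illustrated with a small sketch. This is not the app's actual implementation: it assumes the PDF text has already been extracted into a list of page strings by some PDF library, and the `select_units` helper and its naive regex sentence splitter are hypothetical.

```python
import re


def select_units(pages, first_page, last_page, level="page"):
    """Return the text units to classify.

    pages: list of page texts, one string per PDF page.
    first_page, last_page: 1-based inclusive page range.
    level: "page" for one unit per page, "sentence" for one unit per sentence.
    """
    if not 1 <= first_page <= last_page <= len(pages):
        raise ValueError("page range out of bounds")
    selected = pages[first_page - 1:last_page]
    if level == "page":
        return [page.strip() for page in selected if page.strip()]
    if level == "sentence":
        units = []
        for page in selected:
            # Naive split: break after ., ! or ? when followed by whitespace.
            parts = re.split(r"(?<=[.!?])\s+", page)
            units.extend(part.strip() for part in parts if part.strip())
        return units
    raise ValueError("level must be 'page' or 'sentence'")


# Hypothetical three-page document; only page 2 holds body text.
pages = ["Table of contents", "Clean water matters. Sanitation too!", "References"]
sentence_units = select_units(pages, 2, 2, level="sentence")
page_units = select_units(pages, 2, 2, level="page")
```

Restricting the range to page 2 skips the table of contents and the references, which mirrors the exclusion use case described above.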

Use fine-tuned BERT Transformer model directly

If you would like to directly use the fine-tuned BERT model, you can easily achieve that using the code below:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sadickam/sdgBERT")

model = AutoModelForSequenceClassification.from_pretrained("sadickam/sdgBERT")
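
Once the tokenizer and model are loaded, a prediction can be obtained roughly as sketched below. This assumes `torch` is installed alongside `transformers`; the `predict_sdg` and `top_label` helpers are hypothetical names, and the label strings come from `model.config.id2label`, whose exact values for this checkpoint are not stated in this README. Truncation to 512 tokens matches the maximum sequence length mentioned for App 1.

```python
def top_label(probabilities, id2label):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return id2label[best], probabilities[best]


def predict_sdg(text, model_name="sadickam/sdgBERT"):
    """Classify one text and return (label, probability)."""
    # Imported here so top_label stays usable without these packages installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Truncate long inputs to the 512-token limit.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probabilities = torch.softmax(logits, dim=-1)[0].tolist()
    return top_label(probabilities, model.config.id2label)
```

Calling `predict_sdg("Some text about clean energy access.")` downloads the model on first use and returns the predicted label with its probability score.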

Or just clone the model repo from Hugging Face using the code below:

git lfs install
git clone https://huggingface.co/sadickam/sdg-classification-bert

# if you want to clone without large files – just their pointers
# prepend your git clone with the following env var:
GIT_LFS_SKIP_SMUDGE=1

OSDG online tool

OSDG has an online tool for SDG classification of text. I encourage you to check it out at https://www.osdg.ai/ or visit their GitHub page at https://github.com/osdg-ai/osdg-data to learn more about their tool.

To do

Add model evaluation metrics
Citation information

