Fys-stk3155_Project3

A classification project that uses the "Fake news classification" dataset by Bhavik Jikadara to tell fake news apart from credible news.

We perform classification experiments on detecting fake versus credible news articles using three distinct models:

Logistic Regression
Neural Network classifier
Decision Tree.

By comparing their performances and examining the most influential features, we gain insight into how textual patterns inform classification decisions.

Contributors:

Elaha Ahmadi, Herman Scheele & Theodor Jaarvik

Instructions

If you run into problems with file paths, remove the "../" prefix from the paths, as path handling has differed between contributors. Alternatively, use the branch 'absolute_path_configured' by running "git checkout absolute_path_configured" in your terminal.

Start by running the file "Run_All.py". This creates the final dataset, builds and saves all the models, and performs one run of our model testing file.
Then run the file "Model_Testing.py" if you want to test the models more than once. No retraining is needed because the models are saved.
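The run order above could also be scripted. The following is a minimal sketch, assuming both files live in the same folder and can be started with the current Python interpreter; the helper names `build_commands` and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(script_dir, extra_test_runs=0):
    """List the commands in the order described above, without running them."""
    script_dir = Path(script_dir).resolve()
    # Run_All.py builds the dataset, trains and saves the models, and tests once.
    commands = [[sys.executable, str(script_dir / "Run_All.py")]]
    # Model_Testing.py reuses the saved models, so it can be repeated without retraining.
    for _ in range(extra_test_runs):
        commands.append([sys.executable, str(script_dir / "Model_Testing.py")])
    return commands


def run_pipeline(script_dir, extra_test_runs=0):
    """Execute the commands one after another, stopping at the first failure."""
    script_dir = Path(script_dir).resolve()
    for command in build_commands(script_dir, extra_test_runs):
        # Assumption: the scripts expect to be started from their own folder,
        # so that relative paths such as "../" resolve from there.
        subprocess.run(command, cwd=script_dir, check=True)
```

Calling `run_pipeline("path/to/scripts", extra_test_runs=2)` would run the full build once and the testing file two more times, where `path/to/scripts` stands for wherever the two files live in your checkout.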

Setup and imports

The packages and frameworks needed to run this project are:

Language: Python 3.9
IPython

Python Libraries:

pandas (imported as pd)
numpy (imported as np)
matplotlib.pyplot (imported as plt)
seaborn (imported as sns)
scikit-learn (imported as sklearn)
keras (imported as keras)
joblib (imported as joblib)
shap (imported as shap)

Scikit-learn Modules:

model_selection (for train_test_split, cross_val_score)
metrics (for accuracy_score, classification_report, roc_auc_score, roc_curve)
linear_model (for LogisticRegression)
tree (for DecisionTreeClassifier, plot_tree)
ensemble (for RandomForestClassifier, BaggingClassifier, GradientBoostingClassifier)
feature_extraction.text (for TfidfVectorizer)

Keras Modules:

models (for Sequential)
layers (for Dense)
optimizers (for SGD, Adam)
utils (for to_categorical)

Other:

os (for file path manipulation)
re (for regular expressions)
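Before running anything, it can help to check that the third-party packages listed above are installed. The sketch below uses only the standard library; the list `REQUIRED_MODULES` and the helper `missing_modules` are illustrative names, and the entries are the import names from the list above rather than a pinned requirements file.

```python
from importlib.util import find_spec

# Import names taken from the list above; "sklearn" is the import name of scikit-learn.
REQUIRED_MODULES = [
    "IPython", "pandas", "numpy", "matplotlib", "seaborn",
    "sklearn", "keras", "joblib", "shap",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the modules from the list that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are installed.")
```

The check only looks the modules up and does not import them, so it stays fast even for heavy packages such as keras.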

Other Notes and Known Issues

Running the file "Run_All.py" takes a while (approximately 5 minutes), because it trains the Neural Network model.
Running the file 'Model_Testing.py' produces warnings, although as far as I am aware they do not affect the results.
(This might not affect you.) I have had issues with file paths, so I am using relative paths; other contributors have used absolute paths and have not had any issues. If you run into problems, use the branch 'absolute_path_configured' by running "git checkout absolute_path_configured" in your terminal.
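One common way to sidestep differences between relative and absolute paths is to build every path from the location of the script itself, so the result does not depend on the folder the script was started from. The sketch below shows the idea with `pathlib`. It is not how the repository currently handles paths, and `path_from_script` and the example file name are hypothetical.

```python
from pathlib import Path


def path_from_script(script_file, *parts):
    """Build an absolute path relative to the folder that contains script_file."""
    script_dir = Path(script_file).resolve().parent
    return script_dir.joinpath(*parts).resolve()


# Inside a script one would typically pass __file__, for example:
#   data_file = path_from_script(__file__, "..", "data", "some_file.csv")
# "some_file.csv" is a placeholder, not a file name taken from the project.
```

Because the path is anchored to the script file, the same code gives the same result whether it is started from the repository root, from the script's folder, or from an IDE.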

Acknowledgements

Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
Chollet, F. (2015). Keras. https://keras.io
Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems (pp. 4765-4774).
Waskom, M. L. (2021). seaborn: Statistical data visualization. Journal of Open Source Software, 6(60), 3021.
Joblib: Python Parallel Computing, https://joblib.readthedocs.io/, 2023.
