"Using Baselines for Algorithm Audits" submitted to European Data and Computational Journalism Conference, 2017
By Jennifer A Stark and Nick Diakopoulos
Data were collected automatically once per day by a web scraper based on this project. Images were downloaded, and related information (the web link, collection datetime, the search term, e.g. Hillary Clinton or Donald Trump) was stored in a MySQL database hosted on our AWS space, then filtered and downloaded as a CSV.
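As an illustration of this collection step, here is a minimal sketch, assuming a hypothetical table name (`search_images`), column layout, and connection details; the project's actual scraper and schema may differ.

```python
import datetime

import pymysql
import requests

# Connection details are placeholders for illustration only.
conn = pymysql.connect(host="localhost", user="user",
                       password="password", database="audit")

def store_image(url, search_term):
    """Download one result image and record its metadata in MySQL."""
    image_bytes = requests.get(url, timeout=30).content
    with open(url.rsplit("/", 1)[-1], "wb") as f:
        f.write(image_bytes)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO search_images (url, search_term, collected_at) "
            "VALUES (%s, %s, %s)",
            (url, search_term, datetime.datetime.utcnow()),
        )
    conn.commit()
```

The accumulated table can then be filtered and exported with pandas, e.g. `pandas.read_sql("SELECT * FROM search_images", conn).to_csv("images.csv", index=False)`.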
Baseline image processing can be found in the BASELINE directory, while data processing for image box images (those found on the main Google search results page) is in the IMAGE_BOX directory.
Analysis is divided into two parts: analyzing the images themselves for sentiment using the Microsoft APIs, and analyzing the sources of the images (e.g. Business Insider, Breitbart, Salon). The main news source analysis can be found in Statistics.ipynb in the main directory.
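For the image sentiment step, the sketch below shows one way to score an image with the Microsoft Emotion API (part of Cognitive Services at the time of this work). The regional endpoint and API version are assumptions, and `SUBSCRIPTION_KEY` is a placeholder, not the project's actual configuration.

```python
import requests

# Endpoint region (westus) and version (v1.0) are assumptions.
EMOTION_URL = "https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize"
SUBSCRIPTION_KEY = "your-key-here"  # placeholder

def emotion_scores(image_path):
    """Return per-face emotion scores (anger, happiness, etc.) for one image."""
    with open(image_path, "rb") as f:
        data = f.read()
    headers = {
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Content-Type": "application/octet-stream",
    }
    response = requests.post(EMOTION_URL, headers=headers, data=data)
    response.raise_for_status()
    # One element per detected face, e.g.
    # [{"faceRectangle": {...}, "scores": {"happiness": 0.98, ...}}]
    return response.json()
```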
Dependencies:

- Python 3
- ipython notebook / Jupyter
- pandas
- numpy
- matplotlib
- json
- shelve
- PIL (Pillow)
- imagehash
- argparse
- GoogleScraper
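The PIL and imagehash dependencies suggest perceptual hashing of the downloaded images, presumably to detect near-duplicate results; the following is a minimal sketch of that technique under this assumption, not necessarily the project's exact procedure.

```python
from PIL import Image
import imagehash

def deduplicate(paths, max_distance=5):
    """Keep one representative per group of near-identical images."""
    kept_hashes = []  # hashes of images kept so far
    unique = []
    for path in paths:
        h = imagehash.average_hash(Image.open(path))
        # ImageHash subtraction gives the Hamming distance between hashes;
        # a small distance means the images are near-duplicates.
        if all(h - prev > max_distance for prev in kept_hashes):
            kept_hashes.append(h)
            unique.append(path)
    return unique
```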
This project was funded by a grant from the Tow Center for Digital Journalism to study computational and data journalism in the context of algorithmic accountability reporting.
Email Jennifer A. Stark at starkja@umd.edu.