moral-decision-dataset

Overview

This repository contains code files and documentation pertaining to the Moral Decision Dataset (MDD). The MDD describes real-world cases, associated parameters, and case-based moral decisions. To enhance the usability of this resource, this overview provides additional information on how the dataset was created and a tutorial based on using the dataset for moral decision determination tasks.

# Table of Contents

1. Introduction
2. Methodology
3. Dataset Description - MDD
   - 3.1 Dataset Features
   - 3.2 Moral Decision - Y
4. Ethics Scoring Algorithm - ESA
   - 4.1 Modules
   - 4.2 Moral Judgement
   - 4.3 Context-Sensitive Thresholding
5. Resource Specifications
6. Repository Details
7. Setting Up
8. Tutorial
9. A Cautionary Tale!
10. Resource Maintenance

# Introduction

The ubiquity of autonomous systems in critical decision-making roles, with significant impacts on society and its functioning, makes it imperative to provide them with moral cognitive abilities. To facilitate this effort, we have curated a Moral Decision Dataset (MDD) that captures everyday scenarios in which a question of morality is raised, along with the parameters that inform the moral decision and the decision itself. MDD is created with an LLM-aided methodology: seed data from online sources is preprocessed, and its features are extracted, summarized, and augmented using state-of-the-art LLMs. This work also provides a brief overview of how language models may be used to curate and develop datasets from sparse and highly abstract data. To demonstrate the validity and robustness of the dataset, we also present an Ethics Scoring Algorithm (ESA) that reuses the parameters defined in the dataset to calculate ethical scores for isolated actions. Furthermore, the ESA introduces the novel concept of context-sensitive thresholding to discretize grey areas in an effort to resolve ethical dilemmas. This work aims to facilitate moral reasoning in AI systems deployed in various sections of society through a clearly outlined methodology, modular development, and generalized applicability.

This project makes the following contributions:

- A methodology to develop and curate a dataset for sparse, abstract, and subjective data using language models.
- A Moral Decision Dataset that captures scenarios and associated parameters that aid the moral decision.
- A knowledge graph (KG) that extends the MDD.
- An Ethics Scoring Algorithm that provides ethical judgment based on ethics theory and available contextual information.
- A method to quantify case-specific grey areas using context-sensitive thresholding.

# Methodology

## Domain Understanding

The project started with extensive research, including:

- Insights from legal professionals.
- Analysis of real-world cases on forums like Reddit and Quora.
- Understanding variability in legal interpretations across jurisdictions.

## Data Collection

Raw legal data was extracted from Reddit subreddits across multiple countries (India, UK, Canada, etc.) using a custom Python script. Each record has the following format:

- Title
- Case Text
- Upvote Ratio
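The record format above can be sketched as a small data class. This is only an illustration, not the repository's collection script: the `RawCase` class, the `to_raw_case` helper, and the assumption that each scraped post arrives as a dict with `title`, `selftext`, and `upvote_ratio` keys (the field names used in Reddit's post JSON) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RawCase:
    """One scraped post, mirroring the Title / Case Text / Upvote Ratio format."""
    title: str
    case_text: str
    upvote_ratio: float


def to_raw_case(post: dict) -> RawCase:
    """Convert a scraped post (assumed to be a plain dict) into a RawCase.

    The key names are assumptions for this sketch; adapt them to whatever
    the actual collection script emits.
    """
    ratio = float(post["upvote_ratio"])
    if not 0.0 <= ratio <= 1.0:
        raise ValueError(f"upvote ratio out of range: {ratio}")
    return RawCase(
        title=post["title"].strip(),
        case_text=post["selftext"].strip(),
        upvote_ratio=ratio,
    )
```

Validating the ratio at load time keeps malformed rows out of the later LLM steps, where they would be harder to trace.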

## Feature Extraction

Key features extracted using LLMs:

- Active Agent
- Passive Agent
- Action Done by Active Agent
- Domain
- Ethical Issues
- Consequences (severity, utility, duration)
- Moral Intentions
- Ethical Principles Upheld/Violated
- Relationship Between Agents
- Moral Decision
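One way to hold the features listed above is a typed record that is filled from the LLM's response. The sketch below assumes the model is prompted to answer with a JSON object; the `CaseFeatures` class, its field names, and the `parse_features` helper are hypothetical and may not match the columns produced by `feature_extraction.py`.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class CaseFeatures:
    """Hypothetical container for the ten extracted features."""
    active_agent: str
    passive_agent: str
    action: str
    domain: str
    ethical_issues: list
    consequences: dict        # assumed keys: severity, utility, duration
    moral_intentions: str
    ethical_principles: dict  # assumed keys: upheld, violated
    relationship: str
    moral_decision: str


def parse_features(llm_output: str) -> CaseFeatures:
    """Parse a JSON string returned by the LLM and check that no feature is missing."""
    data = json.loads(llm_output)
    expected = [f.name for f in fields(CaseFeatures)]
    missing = [name for name in expected if name not in data]
    if missing:
        raise ValueError(f"missing features: {missing}")
    # Ignore any extra keys the model may have added.
    return CaseFeatures(**{name: data[name] for name in expected})
```

Rejecting incomplete responses early makes it easy to re-prompt the model for just the failed cases.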

## Summarization

Cases were summarized using a predefined template for accuracy:

The did to which led to . The had <good/bad/neutral> moral intention, however, the violated which caused .

## Augmentation

Data augmentation techniques were used to generate multiple instances of legal cases by varying:

- Context
- Agents
- Ethical issues
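The three axes of variation above can be enumerated before any text is generated. The sketch below only builds the combinations that an LLM prompt could then be asked to rewrite; it does not reproduce the logic of `augmentation.py`, and the `augmentation_variants` function and its dict keys are hypothetical.

```python
from itertools import product


def augmentation_variants(base_case, contexts, agent_pairs, ethical_issues):
    """Return one variant spec per (context, agent pair, ethical issue) combination.

    base_case is copied for each variant, so the original dict is left unchanged.
    """
    variants = []
    for context, (active, passive), issue in product(contexts, agent_pairs, ethical_issues):
        variant = dict(base_case)
        variant.update(
            {
                "context": context,
                "active_agent": active,
                "passive_agent": passive,
                "ethical_issue": issue,
            }
        )
        variants.append(variant)
    return variants
```

For example, two contexts, two agent pairs, and one ethical issue yield four variant specs for a single seed case.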

## Validation

Results from Llama-3 were validated using Gemma LLM with additional feedback. This ensured accuracy and consistency.
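The generate-then-validate pattern described above can be sketched with two interchangeable callables, one standing in for the generating model and one for the validating model. Everything here is an assumption made for illustration: the `validate_outputs` function, the `rating`/`feedback` keys, and the `min_rating` cut-off are hypothetical and are not taken from `evaluation.py`.

```python
def validate_outputs(cases, generate, judge, min_rating=3):
    """Run a generator model on each case and have a second model rate the result.

    generate(case) -> str                      (e.g. a wrapper around Llama-3)
    judge(case, output) -> {"rating": int,     (e.g. a wrapper around Gemma)
                            "feedback": str}
    The rating scale and threshold are assumptions for this sketch.
    """
    results = []
    for case in cases:
        output = generate(case)
        verdict = judge(case, output)
        results.append(
            {
                "case": case,
                "output": output,
                "rating": verdict["rating"],
                "feedback": verdict["feedback"],
                "accepted": verdict["rating"] >= min_rating,
            }
        )
    return results
```

Passing the models in as plain functions keeps the loop testable with stubs and independent of any particular API client.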

# Setup Instructions

## Prerequisites

- Python 3.8+
- Kaggle API
- Together.ai API Key

## Dependencies

Install the required libraries (json and re are part of the Python standard library and do not need to be installed):

pip install pandas spacy tqdm togetherai

## Usage

### Feature Extraction

Run the feature extraction script:

python feature_extraction.py

### Summarization

Summarize case texts:

python summarization.py

### Evaluation

Validate and rate summaries and features:

python evaluation.py

### Augmentation

Generate augmented cases:

python augmentation.py

## Outputs

- Extracted Features: feature_extraction.csv
- Summaries: summary.csv
- Evaluated Data: evaluated_data.json
- Augmented Cases: augmented_cases.json
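Once the scripts have run, the four output files can be read back for inspection. The sketch below uses only the standard library and makes no assumption about column names or JSON structure; the `load_outputs` helper and the idea that all files sit in one directory are hypothetical.

```python
import csv
import json
from pathlib import Path

CSV_OUTPUTS = ("feature_extraction.csv", "summary.csv")
JSON_OUTPUTS = ("evaluated_data.json", "augmented_cases.json")


def load_outputs(directory="."):
    """Load whichever pipeline outputs exist in `directory`, keyed by file name.

    CSV files become lists of row dicts; JSON files are returned as parsed.
    Missing files are skipped rather than treated as errors.
    """
    base = Path(directory)
    outputs = {}
    for name in CSV_OUTPUTS:
        path = base / name
        if path.exists():
            with path.open(newline="", encoding="utf-8") as handle:
                outputs[name] = list(csv.DictReader(handle))
    for name in JSON_OUTPUTS:
        path = base / name
        if path.exists():
            with path.open(encoding="utf-8") as handle:
                outputs[name] = json.load(handle)
    return outputs
```

Calling `load_outputs()` from the directory where the scripts were run returns a dict containing only the files produced so far, which is convenient when the pipeline is executed step by step.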
