
Factscore replication and CommunityLM Analysis #85

Merged (64 commits, Jan 17, 2025)

Commits:
- `149dcb6` Sentiment analysis for FactScore (AakritiKinra, Dec 12, 2024)
- `493b70b` Create factscorer.py (AakritiKinra, Dec 13, 2024)
- `d29d480` Create abstain_detection.py (AakritiKinra, Dec 13, 2024)
- `e978eb9` Add files via upload (AakritiKinra, Dec 13, 2024)
- `2ac9bf6` Create openai_lm.py (AakritiKinra, Dec 13, 2024)
- `c3e55d6` CommunityLM Factuality Analysis (AakritiKinra, Dec 13, 2024)
- `2131f35` Create README.md (AakritiKinra, Dec 13, 2024)
- `e54ccb9` Create requirements.txt (AakritiKinra, Dec 13, 2024)
- `cf109d1` Rename community_lm_factuality_analysis.ipynb to community_lm_factual… (AakritiKinra, Dec 13, 2024)
- `d23814d` Rename community_lm_factuality_responses.ipynb to community_lm_factua… (AakritiKinra, Dec 13, 2024)
- `d48c46a` Code reorganization (AakritiKinra, Dec 13, 2024)
- `a73c272` Folder rename (AakritiKinra, Dec 13, 2024)
- `3f09d1e` Update README.md (AakritiKinra, Dec 13, 2024)
- `0f94e39` Update community_lm_factuality_responses.ipynb (AakritiKinra, Dec 13, 2024)
- `1f81838` Update community_lm_factuality_analysis.ipynb (AakritiKinra, Dec 13, 2024)
- `d720dfb` Update README.md (AakritiKinra, Dec 13, 2024)
- `901d588` Update README.md (AakritiKinra, Dec 13, 2024)
- `756d990` Update community_lm_utils.py (AakritiKinra, Dec 13, 2024)
- `c900084` Update community_lm_utils.py (AakritiKinra, Dec 13, 2024)
- `798bb27` Create lm.py (AakritiKinra, Dec 13, 2024)
- `cec0bd0` Added hallucination eval (AakritiKinra, Dec 13, 2024)
- `281ffc3` Delete examples/factscore_eval/factscore directory (AakritiKinra, Dec 13, 2024)
- `d21dadf` Update community_lm_factuality_responses.ipynb (AakritiKinra, Dec 17, 2024)
- `06b96b2` Update community_lm_factuality_responses.ipynb (AakritiKinra, Dec 17, 2024)
- `ac155cb` Delete examples/factscore_eval/community_lm_factuality_analysis.ipynb (AakritiKinra, Dec 17, 2024)
- `8dfefac` Added replication and CommunityLM Analysis (AakritiKinra, Dec 17, 2024)
- `873b2a3` Added fixes for ruff and mypy (AakritiKinra, Dec 18, 2024)
- `67f1bb4` ruff changes (AakritiKinra, Dec 18, 2024)
- `e44927a` Update community_lm_utils.py (AakritiKinra, Dec 18, 2024)
- `5fd9304` mypy changes (AakritiKinra, Dec 18, 2024)
- `efefd37` mypy changes (AakritiKinra, Dec 19, 2024)
- `5a8f74f` Update clm.py (AakritiKinra, Dec 19, 2024)
- `200cf07` Update README.md (AakritiKinra, Dec 19, 2024)
- `eef83c1` Update retrieval.py (AakritiKinra, Dec 29, 2024)
- `c709e39` Update retrieval.py (AakritiKinra, Dec 29, 2024)
- `44011f4` Update retrieval.py (AakritiKinra, Dec 29, 2024)
- `96de358` Update retrieval.py (AakritiKinra, Dec 29, 2024)
- `0d15386` Update atomic_facts.py (AakritiKinra, Dec 31, 2024)
- `f68b051` Update atomic_facts.py (AakritiKinra, Dec 31, 2024)
- `560b280` Update atomic_facts.py (AakritiKinra, Dec 31, 2024)
- `b7eac00` Update atomic_facts.py (AakritiKinra, Dec 31, 2024)
- `62fe55b` Update atomic_facts.py (AakritiKinra, Jan 1, 2025)
- `47f6c32` Update atomic_facts.py (AakritiKinra, Jan 1, 2025)
- `a81ad84` Update atomic_facts.py (AakritiKinra, Jan 1, 2025)
- `a07dac5` Update clm.py (AakritiKinra, Jan 1, 2025)
- `d19bf86` Update clm.py (AakritiKinra, Jan 1, 2025)
- `f5ac6cd` Update clm.py (AakritiKinra, Jan 1, 2025)
- `1bf8944` Update clm.py (AakritiKinra, Jan 1, 2025)
- `876f09d` Update npm.py (AakritiKinra, Jan 1, 2025)
- `1b0e604` Update npm.py (AakritiKinra, Jan 1, 2025)
- `ee8beb2` Update npm.py (AakritiKinra, Jan 1, 2025)
- `18e454b` Update npm.py (AakritiKinra, Jan 1, 2025)
- `582a524` Update npm.py (AakritiKinra, Jan 1, 2025)
- `cac4479` Update utils.py (AakritiKinra, Jan 9, 2025)
- `e00001c` Update factscorer.py (AakritiKinra, Jan 9, 2025)
- `17dca2b` Update factscorer.py (AakritiKinra, Jan 10, 2025)
- `27fc44e` Update openai_lm.py (AakritiKinra, Jan 10, 2025)
- `6e2c07e` Update openai_lm.py (AakritiKinra, Jan 10, 2025)
- `cd1259e` Update openai_lm.py (AakritiKinra, Jan 10, 2025)
- `697dd6f` Update openai_lm.py (AakritiKinra, Jan 10, 2025)
- `626dac7` Update lm.py (AakritiKinra, Jan 10, 2025)
- `27175a7` Update lm.py (AakritiKinra, Jan 10, 2025)
- `be9518b` Update openai_lm.py (AakritiKinra, Jan 10, 2025)
- `01556de` Update utils.py (AakritiKinra, Jan 10, 2025)
65 changes: 65 additions & 0 deletions examples/community_lm/community_lm_utils.py
@@ -2,10 +2,12 @@

import os
import tqdm
import csv
from pathlib import Path
from community_lm_constants import anes_df
import pandas as pd
import numpy as np
from typing import cast

from llments.lm.lm import LanguageModel
from llments.lm.rag import RAGLanguageModel
@@ -162,3 +164,66 @@ def compute_group_stance(

    df = pd.DataFrame(rows, columns=columns)
    df.to_csv(output_filename)


def compute_group_stance_factscore(
    evaluator: SentimentEvaluator,
    input_filename: str,
) -> dict[str, float]:
    """Calculate group sentiment toward the Democratic and Republican parties.

    Args:
        evaluator: The sentiment evaluator.
        input_filename (str): The input CSV filename.

    Returns:
        dict: A dictionary with keys 'democratic' and 'republican' containing their respective sentiments.
    """
    democratic_responses = []
    republican_responses = []

    try:
        with open(input_filename, mode='r', newline='', encoding='utf-8') as csvfile:
            reader = csv.DictReader(csvfile)

            for row in reader:
                party = row['Party'].strip().lower()
                response = row['Response'].strip()
                if not response:
                    continue  # Skip empty responses
                if party == 'democrats':
                    democratic_responses.append(response)
                elif party == 'republicans':
                    republican_responses.append(response)
                else:
                    print(f"Warning: Unknown party '{party}' in row: {row}")

        # Helper to evaluate sentiment for one party's responses
        def evaluate_sentiments(
            responses: list[str],
            party_name: str,
        ) -> float:
            """Calculate sentiment for the given responses and party.

            Args:
                responses (list[str]): A list containing synthetic tweets for all politicians of a given party.
                party_name (str): The party for which the sentiment is calculated.

            Returns:
                float: Group sentiment toward politicians of the given party.
            """
            sentiment_vals = evaluator.evaluate_batch(responses, minibatch_size=len(responses))
            group_sentiment = np.mean(sentiment_vals) * 100
            return cast(float, group_sentiment)

        # Calculate sentiment for each group
        sentiments = {}
        sentiments['democratic'] = evaluate_sentiments(democratic_responses, 'democratic')
        sentiments['republican'] = evaluate_sentiments(republican_responses, 'republican')
        return sentiments

    except FileNotFoundError:
        print(f"Error: The file '{input_filename}' does not exist.")
        return {}
    except Exception as e:
        print(f"An error occurred: {e}")
        return {}
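The parsing-and-scoring pattern of the new helper can be sketched as a self-contained example. The `StubEvaluator` below is a hypothetical stand-in for `SentimentEvaluator` (not the real class), and the in-memory CSV is fabricated illustration data in the expected `Party`/`Response` layout:

```python
import csv
import io

import numpy as np


class StubEvaluator:
    """Hypothetical stand-in for SentimentEvaluator: scores each response in [0, 1]."""

    def evaluate_batch(self, responses: list[str], minibatch_size: int) -> list[float]:
        return [0.75 if "great" in r else 0.25 for r in responses]


# Fabricated illustration data, mirroring the CSV layout the helper expects.
csv_text = (
    "Party,Response\n"
    "Democrats,The candidate is great\n"
    "Republicans,The candidate is bad\n"
)

democratic: list[str] = []
republican: list[str] = []
for row in csv.DictReader(io.StringIO(csv_text)):
    party = row["Party"].strip().lower()
    response = row["Response"].strip()
    if not response:
        continue  # skip empty responses, as the helper does
    if party == "democrats":
        democratic.append(response)
    elif party == "republicans":
        republican.append(response)

# Mean sentiment per group, scaled to a percentage, as in evaluate_sentiments.
evaluator = StubEvaluator()
sentiments = {
    "democratic": float(np.mean(evaluator.evaluate_batch(democratic, minibatch_size=len(democratic)))) * 100,
    "republican": float(np.mean(evaluator.evaluate_batch(republican, minibatch_size=len(republican)))) * 100,
}
print(sentiments)  # {'democratic': 75.0, 'republican': 25.0}
```

With a real `SentimentEvaluator` the scores come from a model rather than a keyword check, but the grouping and aggregation are the same.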
83 changes: 83 additions & 0 deletions examples/factscore_eval/README.md
@@ -0,0 +1,83 @@
# FActScore

This is a replication of the experiments from
[FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form
Text Generation](https://aclanthology.org/2023.emnlp-main.741) (Min et al., EMNLP
2023).

## Dependencies

To better align with the original implementation from the paper,
we recommend using the dependency versions pinned in the `requirements.txt` file.
Install them by running:

```bash
pip install -r requirements.txt
```

## Configuration

### OpenAI API Key

To access and use OpenAI's services (such as GPT models),
you must obtain an API key from OpenAI.
After acquiring your API key, store it in a text file
such as `key.txt` and pass it when creating an instance
of the `FactScorer` class.
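A minimal sketch of this setup follows; the commented-out module path and constructor parameter name are assumptions for illustration, not a documented API:

```python
# Hypothetical sketch: store the key in key.txt, then hand its path to FactScorer.
with open("key.txt", "w", encoding="utf-8") as f:
    f.write("sk-...your-key-here...")

# from factscorer import FactScorer      # module path assumed
# fs = FactScorer(key_path="key.txt")    # parameter name assumed
```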

### Data Preparation

Before running the code to generate the CommunityLM responses,
make sure you have created the following directory to store the data:

```bash
mkdir -p factscore_data
```

This directory serves as the data directory for the FactScore analysis,
and all CSV files must be placed inside it.
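The same setup can be done from Python if you prefer (a convenience sketch; only the `factscore_data` directory name comes from the command above):

```python
from pathlib import Path

# Equivalent to `mkdir -p factscore_data`, then list the CSVs it contains.
data_dir = Path("factscore_data")
data_dir.mkdir(parents=True, exist_ok=True)
csv_files = sorted(data_dir.glob("*.csv"))
print(f"{len(csv_files)} CSV file(s) found in {data_dir}/")
```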

## Reference

Some of this code and data were derived from the
[FActScore repo](https://github.com/shmsw25/FActScore).

If you use this example, we would appreciate it if you acknowledged
[LLMents](https://github.com/neulab/llments) and the original paper.

```bibtex
@misc{llments,
    title = "{LLMents}: A Toolkit for Language Model Experiments",
    author = "Graham Neubig and
        Aakriti Kinra and
        Mihir Bansal and
        Qingyang Liu and
        Rohan Modi and
        Xinran Wan",
    year = "2024",
    howpublished = "https://github.com/neulab/llments",
}
```

```bibtex
@inproceedings{min-etal-2023-factscore,
title = "{FA}ct{S}core: Fine-grained Atomic Evaluation of Factual
Precision in Long Form Text Generation",
author = "Min, Sewon and
Krishna, Kalpesh and
Lyu, Xinxi and
Lewis, Mike and
Yih, Wen-tau and
      Koh, Pang Wei and
Iyyer, Mohit and
Zettlemoyer, Luke and
Hajishirzi, Hannaneh",
booktitle = "Proceedings of the 2023 Conference on Empirical
Methods in Natural Language Processing",
year = "2023",
publisher = "Association for Computational Linguistics",
doi = "10.18653/v1/2023.emnlp-main.741",
}
```