Code for the NLP2023 paper 事前学習済み言語モデルによるエンティティの概念化 (Conceptualization of Entities with Pre-trained Language Models).
Use this Docker repository to build the environment.
The dataset for our experiments is available at url. Download "reproduction_data_NLP2023.zip" from Google Drive into your data directory and unzip it there.
The repository's result directory already contains the experimental results, so the visualizations are easy to reproduce. Follow visualization_reproduction.ipynb to reproduce them.
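For orientation, here is a minimal, illustrative sketch of the kind of visualization the notebook produces: a 2-D projection of entity embeddings colored by cluster. The file names, array formats, and the choice of PCA are assumptions made for this example, not the notebook's actual settings.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

embeddings = np.load("result/embeddings.npy")    # hypothetical path: (n_entities, hidden_size)
cluster_ids = np.load("result/cluster_ids.npy")  # hypothetical path: (n_entities,)

# Project the high-dimensional embeddings to 2-D for plotting.
points = PCA(n_components=2).fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1], c=cluster_ids, s=5, cmap="tab20")
plt.title("Entity embeddings projected to 2-D (illustrative)")
plt.savefig("embedding_clusters.png", dpi=200)
```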
You can extract the embeddings from BERT by running the following command:
bash ./shell_file/get_embeddings_formBERT.sh
※ Please make sure that data_dir_path in get_embeddings_formBERT.sh points to the directory where you extracted the 'data.zip' file.
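If you want to experiment outside the shell script, the following is a minimal sketch of extracting contextual embeddings from BERT with Hugging Face Transformers. The model checkpoint, pooling strategy, and example sentence are assumptions made for illustration; the script above encodes the settings actually used in the paper.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; the repo may use a different one

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

sentences = ["Barack Obama was born in Hawaii."]  # hypothetical input sentence

with torch.no_grad():
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**inputs)
    # Mean-pool the last hidden states over non-padding tokens (one possible pooling choice).
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

print(embeddings.shape)  # (num_sentences, hidden_size), e.g. (1, 768)
```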
The condensation ratio measures how well separated each cluster is. Run the command below to compute it and obtain the results:
bash ./shell_file/cal_condensation_ratio.sh
※ Please make sure that data_dir_path in cal_condensation_ratio.sh points to the directory where you extracted the 'data.zip' file.
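For intuition, the sketch below computes a simple per-cluster separation measure (mean intra-cluster distance divided by mean distance to other clusters) from saved embeddings and cluster assignments. This stand-in metric and the file paths are assumptions for illustration only; the actual condensation ratio is computed by cal_condensation_ratio.sh.

```python
import numpy as np
from scipy.spatial.distance import cdist

embeddings = np.load("result/embeddings.npy")    # hypothetical path
cluster_ids = np.load("result/cluster_ids.npy")  # hypothetical path

for c in np.unique(cluster_ids):
    inside = embeddings[cluster_ids == c]
    outside = embeddings[cluster_ids != c]
    intra = cdist(inside, inside).mean()   # average pairwise distance within the cluster
    inter = cdist(inside, outside).mean()  # average distance to points in other clusters
    # Smaller values indicate a more compact, better separated cluster
    # (illustrative measure, not the paper's definition).
    print(f"cluster {c}: intra/inter = {intra / inter:.3f}")
```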