PINYON

PINYON implements a community detection algorithm that takes into account the context and meaning of the entities mentioned in a post. This lets PINYON accurately identify semantically related posts across various contexts.

Using PINYON

Pre-processing

Before using PINYON, we need to pre-process the corpus of social media posts (tweets). The TweetsCOV19 (May 2020) dataset can be downloaded using this link.

After downloading the tweets dataset, we need to execute the three scripts in the tweets_process directory.

Once all the scripts have finished executing, we need to obtain the tweets' original text. This can be done using Hydrator.
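Hydrator takes a plain-text file with one tweet ID per line. The sketch below shows one way to produce such a file. It assumes the processed output is a tab-separated file with the tweet ID in the first column (the layout of the TweetsCOV19 release); the function name and the file paths are hypothetical and should be adapted to whatever the tweets_process scripts actually write.

```python
import csv


def extract_tweet_ids(tsv_path, ids_path, id_column=0):
    """Write the unique tweet IDs found in `tsv_path` to `ids_path`, one per line.

    Assumes a tab-separated file whose `id_column` holds the numeric tweet ID.
    Rows with a non-numeric value in that column (for example a header row)
    are skipped. Returns the number of IDs written.
    """
    seen = set()
    count = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(ids_path, "w", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) <= id_column:
                continue  # blank or truncated line
            tweet_id = row[id_column].strip()
            if tweet_id.isdigit() and tweet_id not in seen:
                seen.add(tweet_id)
                dst.write(tweet_id + "\n")
                count += 1
    return count
```

Calling `extract_tweet_ids("tweets.tsv", "tweet_ids.txt")` (both names are placeholders) would then give a file that can be added to Hydrator as a new dataset.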

Embeddings download

The embeddings for each knowledge graph (KG) need to be downloaded and placed in the embedding/data/ directory:

DBpedia

Wikidata

UMLS
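Before running the approach, it can help to confirm that something has been placed in embedding/data/ for each KG. The following is a minimal sketch: it assumes that each downloaded file or folder has the KG name somewhere in its name, which is a guess made for illustration, since the expected file names are not stated here. The helper names are hypothetical.

```python
from pathlib import Path

KGS = ("dbpedia", "wikidata", "umls")


def match_embeddings(data_dir, kgs=KGS):
    """Map each KG name to the entries in `data_dir` whose name contains it.

    Matching on the entry name is an assumption made for this sketch; the
    repository may expect specific file names instead.
    """
    root = Path(data_dir)
    entries = sorted(p.name for p in root.iterdir()) if root.is_dir() else []
    return {kg: [name for name in entries if kg in name.lower()] for kg in kgs}


def missing_embeddings(data_dir, kgs=KGS):
    """Return the KGs for which no matching entry was found in `data_dir`."""
    matches = match_embeddings(data_dir, kgs)
    return [kg for kg in kgs if not matches[kg]]


if __name__ == "__main__":
    missing = missing_embeddings("embedding/data")
    if missing:
        print("No embeddings found for:", ", ".join(missing))
    else:
        print("Found an entry for every KG.")
```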

The PINYON SCD Approach

Now that we have all the necessary data, we can run the PINYON approach against each of the three KGs (UMLS, Wikidata, and DBpedia). For example, to run the approach against UMLS, use the following command:

python3 run_umls.py
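To run all three KGs in one go, a small driver along the following lines could be used. Only run_umls.py is named above; run_wikidata.py and run_dbpedia.py are hypothetical names that follow the same pattern, so the sketch skips any script that does not exist rather than assuming it does.

```python
import subprocess
import sys
from pathlib import Path

# Only run_umls.py is confirmed; the other two names are guesses that
# follow the same naming pattern and may need to be changed.
SCRIPTS = {
    "UMLS": "run_umls.py",
    "Wikidata": "run_wikidata.py",  # hypothetical name
    "DBpedia": "run_dbpedia.py",    # hypothetical name
}


def plan_runs(repo_dir=".", scripts=SCRIPTS):
    """Return (kg, command) pairs for the scripts that exist in `repo_dir`."""
    plan = []
    for kg, script in scripts.items():
        path = Path(repo_dir) / script
        if path.is_file():
            plan.append((kg, [sys.executable, str(path.resolve())]))
    return plan


def run_all(repo_dir=".", scripts=SCRIPTS):
    """Run each available script from `repo_dir`, stopping at the first failure."""
    for kg, command in plan_runs(repo_dir, scripts):
        print(f"Running PINYON against {kg}")
        subprocess.run(command, check=True, cwd=repo_dir)
```

Calling `run_all()` from the repository root runs whichever of the listed scripts are present, in the order given in `SCRIPTS`.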