Challenge-ALTeGraD

Molecule Retrieval with Natural Language Queries

Overview

The objective of this project is to explore and implement machine learning techniques for retrieving molecules, represented as graphs, from natural language queries. In this challenge, each text query comes with a list of candidate molecules given as graphs, with no additional reference or textual information about the molecules. The task is to identify and retrieve the molecule that corresponds to the query. We aim to develop a model that performs this retrieval well.
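To make the retrieval step concrete, here is a minimal, framework-free sketch. It assumes the query and every candidate molecule have already been mapped to embedding vectors of the same size, and it ranks candidates by cosine similarity. The function names and the toy vectors are hypothetical and are not taken from this repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_molecules(query_embedding, molecule_embeddings):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_embedding, m) for m in molecule_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy embeddings, made up for illustration only.
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.1, 1.9],  # nearly parallel to the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
ranking = rank_molecules(query, candidates)
print(ranking)  # candidate 1 comes first, candidate 0 last
```

In this sketch the retrieved molecule is simply the first index of the ranking.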

Content

main.py

To run the training pipeline:

python main.py --load_config=config/train_config.yaml

Model.py

Contains the implementation of the model, which includes the text encoder and graph encoder.

dataloader.py

Loads the data.

loss.py

Includes different loss functions.
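This README does not list which losses loss.py implements. As an illustration only, one loss often used for text-to-graph retrieval is a symmetric contrastive (cross-entropy) loss over a text-by-graph similarity matrix. The sketch below is written in plain Python to show the arithmetic; the function names are hypothetical, and it is an assumption that a loss of this kind is among those in the file.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(similarity):
    """Symmetric cross-entropy over a square similarity matrix.

    similarity[i][j] is the score between text i and graph j;
    matching pairs are assumed to sit on the diagonal.
    """
    n = len(similarity)
    # Text-to-graph direction: each row should peak at its diagonal entry.
    text_to_graph = -sum(log_softmax(similarity[i])[i] for i in range(n)) / n
    # Graph-to-text direction: same computation on the transposed matrix.
    columns = [list(col) for col in zip(*similarity)]
    graph_to_text = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return text_to_graph + graph_to_text


aligned = [[5.0, 0.0], [0.0, 5.0]]     # matching pairs score highest
misaligned = [[0.0, 5.0], [5.0, 0.0]]  # wrong pairs score highest
print(contrastive_loss(aligned) < contrastive_loss(misaligned))
```

The loss is low when each text scores highest against its own graph and high when it scores highest against a different graph.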

pretrain_graph_model.py

To pretrain the graph encoder model, run the training pipeline:

python pretrain_graph_model.py --load_config=config/pretrain_graph_model.yaml

pretrain_text_model.py

To pretrain the text encoder model, run the training pipeline:

python pretrain_text_model.py

save_graph_names.py

Stores graph names for training, validation and test sets to be used for graph encoder pre-training.

view_functions.py

Implements different data augmentation strategies for pretraining the graph encoder.
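As a rough illustration of what a graph augmentation can look like, the sketch below drops each edge independently with a fixed probability to produce two perturbed "views" of the same graph. This is a generic technique shown under assumptions: the function name, the edge-list representation and the parameters are hypothetical, and the README does not state which strategies view_functions.py actually provides.

```python
import random


def drop_edges(edges, drop_prob, seed=None):
    """Return a copy of the edge list with each edge removed
    independently with probability drop_prob."""
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [edge for edge in edges if rng.random() >= drop_prob]


# A 4-node ring used as a toy molecule graph (hypothetical example).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
view_a = drop_edges(edges, drop_prob=0.25, seed=0)
view_b = drop_edges(edges, drop_prob=0.25, seed=1)
print(view_a, view_b)
```

In contrastive pretraining, two such views of the same graph are typically treated as a positive pair.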

Note

Some code sections related to pretraining the graph encoder (view_functions.py, and some functions and classes in dataloader.py, loss.py and pretrain_graph_model.py) are sourced from this repository: https://github.com/paridhimaheshwari2708/GraphSSL.

Dependencies

torch
torch_geometric
transformers
