GitHub - d-f/binding-affinity: Protein-Ligand binding affinity prediction with Protein LLMs and Graph Attention Networks.

process_dataset.py processes the PDBBind dataset through the deepchem library. It embeds the sequence of amino acid abbreviations with a protein LLM, converts the rdkit Molecule objects into torch_geometric graphs, and saves the embeddings, graphs, and binding affinities, each in a separate directory.
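
As a rough illustration of the molecule-to-graph step, the sketch below builds the connectivity part of a graph from a list of bonds. It is not the repository's code: the function name `bonds_to_edge_index` and the plain-list input are assumptions made for this example. It follows the torch_geometric convention of a 2 x num_edges `edge_index` in which an undirected bond is stored once per direction; in the real script the atom index pairs would presumably come from the rdkit Molecule and the result would be wrapped in a tensor.

```python
from typing import List, Tuple


def bonds_to_edge_index(bonds: List[Tuple[int, int]]) -> List[List[int]]:
    """Build a 2 x (2 * num_bonds) edge list from undirected bonds.

    Hypothetical helper: each bond (begin, end) is stored in both
    directions, matching the usual torch_geometric edge_index layout.
    """
    sources: List[int] = []
    targets: List[int] = []
    for begin, end in bonds:
        # Add the edge begin -> end and its reverse end -> begin.
        sources.extend([begin, end])
        targets.extend([end, begin])
    return [sources, targets]


# Heavy atoms of ethanol, indexed C0 - C1 - O2 (illustrative input).
edge_index = bonds_to_edge_index([(0, 1), (1, 2)])
print(edge_index)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
```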

determine_bond_types.py determines which bond types are present within the dataset for normalization purposes.
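
A minimal sketch of what such a scan might look like, assuming each molecule has already been reduced to a list of bond type names (for example the string form of rdkit's `bond.GetBondType()`). The helper names `collect_bond_types` and `one_hot` are hypothetical, and one-hot encoding is only one plausible use of the resulting vocabulary; the README does not say how the bond types are normalized.

```python
from typing import Iterable, List


def collect_bond_types(molecules: Iterable[Iterable[str]]) -> List[str]:
    """Return the sorted set of bond type names seen across all molecules."""
    seen = set()
    for bond_types in molecules:
        seen.update(bond_types)
    return sorted(seen)


def one_hot(bond_type: str, vocabulary: List[str]) -> List[int]:
    """Encode one bond type as a one-hot vector over the vocabulary."""
    return [1 if bond_type == known else 0 for known in vocabulary]


# Illustrative data: two molecules described only by their bond types.
dataset = [
    ["SINGLE", "DOUBLE"],
    ["SINGLE", "AROMATIC", "AROMATIC"],
]
vocabulary = collect_bond_types(dataset)
print(vocabulary)                     # ['AROMATIC', 'DOUBLE', 'SINGLE']
print(one_hot("DOUBLE", vocabulary))  # [0, 1, 0]
```

Fixing the vocabulary once over the whole dataset keeps the edge feature width identical for every graph.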

train_models.py trains either a pure GAT or a combination of a GAT and a Transformer, depending on the value of the model_type parameter. If a pure GAT is used, the protein LLM embedding is concatenated to the atomic features when creating the ligand molecule graph, and the graph is used for whole-graph regression. If a combination of Transformer and GAT is used, the GAT embeds the ligand graph, and the Transformer predicts the binding affinity from the protein embedding and the embedded ligand graph.
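
The pure GAT option can be pictured with the sketch below, which appends the same protein embedding to every atom's feature vector. This is an assumption-laden illustration in plain Python lists, not the repository's implementation: the function name `concat_protein_embedding` and the toy dimensions are made up, and real code would more likely do the equivalent with tensors (for instance via `torch.cat` along the feature dimension).

```python
from typing import List, Sequence


def concat_protein_embedding(
    atom_features: Sequence[Sequence[float]],
    protein_embedding: Sequence[float],
) -> List[List[float]]:
    """Append the protein embedding to each atom's feature vector.

    Hypothetical helper: every node receives the same protein context,
    so the node feature width grows by len(protein_embedding).
    """
    return [list(atom) + list(protein_embedding) for atom in atom_features]


# Toy example: 3 atoms with 2 features each, and a 4-dimensional embedding.
atoms = [[6.0, 0.0], [6.0, 1.0], [8.0, 0.0]]
protein = [0.1, 0.2, 0.3, 0.4]
node_features = concat_protein_embedding(atoms, protein)
print(len(node_features), len(node_features[0]))  # 3 6
```

Because the embedding is repeated on every node, the graph alone carries both ligand and protein information, which is what lets a single GAT perform whole-graph regression in this setup.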

Files (14 commits):
README.md
determine_bond_types.py
process_dataset.py
train_models.py
