HPO terms retrieved with Gene.get and Disease.get don't match those in hpo.jax.org #26

Closed
aruta321 opened this issue Feb 20, 2025 · 3 comments

@aruta321

Hello,

I tried to retrieve the HPO terms for a list of genes and diseases from an OMIM database. They don't seem to correspond directly to the HPO terms in the HPO database, though. For example, CARD9 has 20 associated HPO terms, but PyHPO returns 93 HPO terms with this code snippet:
gene_term = HPOSet.from_queries(Gene.get(gene).hpo)

Many of these are somewhat related to the HPO terms in the set above.

This looks like it's retrieving all terms and their parents to the root node. Is there a way to only retrieve the terms in the database without their parents? Thanks for your help.

@anergictcell
Owner

Hi, you are right: pyhpo does link all parent terms to the corresponding genes as well. This was intentional on my part at the time, but it turned out not to be the correct approach. So far, I haven't changed the behavior in pyhpo, for the sake of backwards compatibility. I'm considering changing it at some point in the future, but have no timeline as of now.

I did fix it, however, in the hpo3 library, which works almost identically to pyhpo. It uses a Rust backend (but can be used entirely from Python), so it's much faster, and it lacks only a few features. Unless you rely on a feature unique to pyhpo, I recommend using hpo3 instead anyway.

Your code would be only slightly different:

pip install hpo3

from pyhpo import Ontology, Gene

Ontology()
gene_term = Gene.get("CARD9").hpo_set()

len(gene_term)
# ==> 20
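
To illustrate the difference discussed in this thread, here is a minimal, library-free sketch of how linking parent terms inflates an annotation set. The toy ontology, the term IDs, and the helper names (`ancestors`, `propagate`, `most_specific`) are all made up for illustration; none of this is pyhpo or hpo3 code.

```python
# Toy ontology: term -> set of direct parents. All IDs are hypothetical.
PARENTS = {
    "T:root": set(),
    "T:immune": {"T:root"},
    "T:infection": {"T:immune"},
    "T:fungal_infection": {"T:infection"},
    "T:skin": {"T:root"},
    "T:rash": {"T:skin"},
}


def ancestors(term):
    """Return every ancestor of `term` up to the root, excluding `term`."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def propagate(direct):
    """Directly annotated terms plus all of their ancestors."""
    result = set(direct)
    for term in direct:
        result |= ancestors(term)
    return result


def most_specific(terms):
    """Drop every term that is an ancestor of another term in the set."""
    inherited = set()
    for term in terms:
        inherited |= ancestors(term)
    return set(terms) - inherited


# Two direct annotations grow to six terms once parents are linked.
direct = {"T:fungal_infection", "T:rash"}
propagated = propagate(direct)
print(len(direct), len(propagated))
```

Removing ancestors with `most_specific` recovers the direct annotations in this toy case, but that is not guaranteed in general: if a gene is directly annotated with both a term and one of its ancestors, the ancestor would be dropped as well. Getting the annotations without parent linking from the library itself avoids that ambiguity.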

@aruta321
Author

I think it may be useful to make this distinction in the documentation until it has been updated. Having these two tools functioning in different ways could lead to some confusion.

@anergictcell
Owner

I just fixed this issue with #28 in version 4.0
