Missing annotations #4

Open
mozack opened this issue Jul 12, 2023 · 4 comments

@mozack

mozack commented Jul 12, 2023

Hi,

Thank you for this fantastic resource!

The CAT genes index does not appear to have annotation entries for 3 samples:
HG002
HG005
NA19240

https://github.com/human-pangenomics/HPP_Year1_Assemblies/blob/main/annotation_index/Year1_assemblies_v2_genbank_CAT_genes.index

Are the gene annotations for these 3 samples available elsewhere?

Thanks!

@wwliao

wwliao commented Jul 12, 2023

The CAT pipeline was dependent on the Minigraph-Cactus graph, resulting in its applicability to only 44 samples (HG002, HG005, NA19240 were set aside to facilitate their use in benchmarking). Conversely, the Ensembl pipeline should include gene annotations for all 47 samples. The link to access the Ensembl gene annotations is: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=submissions/8E6C4ACC-FEA9-4DD8-94A3-B92234206F95--Y1_ENSEMBL_V1/

@mhaukness-ucsc, could you please check if the above link is the version used in the HPRC marker paper?

@juklucas, in your opinion, should we consider providing an index file for the Ensembl gene annotations as well?
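
For anyone who wants to confirm which samples an index covers, the check described above can be sketched roughly as follows. This is a minimal sketch, not part of the repository: it assumes the index is tab-separated with a header row and a column named `sample`, and the function names and toy file paths are hypothetical.

```python
import csv
import io


def samples_in_index(index_text, sample_column="sample"):
    """Return the set of sample names found in a tab-separated index.

    Assumption: the index has a header row and a column called `sample`.
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    return {row[sample_column] for row in reader if row.get(sample_column)}


def missing_samples(expected, index_text, sample_column="sample"):
    """Return the expected samples that have no row in the index, sorted."""
    return sorted(set(expected) - samples_in_index(index_text, sample_column))


# Toy index contents; the annotation paths are placeholders, not real locations.
toy_cat_index = (
    "sample\tannotation\n"
    "HG00438\texample/HG00438.gff3.gz\n"
    "HG00621\texample/HG00621.gff3.gz\n"
)
expected = ["HG002", "HG005", "NA19240", "HG00438", "HG00621"]

print(missing_samples(expected, toy_cat_index))
# prints ['HG002', 'HG005', 'NA19240']
```

With the real files, `index_text` would be the contents of the CAT genes index and `expected` the sample column of the assembly index; adjust `sample_column` if the header uses a different name.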

@mozack
Author

mozack commented Jul 13, 2023

Thanks so much! I see the Ensembl annotations and will try them out.

@diekhans

The above link should be correct for CAT for comparisons to the marker paper results; however, Ensembl should be used for new analyses.

@sdu-lcy

sdu-lcy commented Nov 20, 2024

> The CAT pipeline was dependent on the Minigraph-Cactus graph, resulting in its applicability to only 44 samples (HG002, HG005, NA19240 were set aside to facilitate their use in benchmarking). Conversely, the Ensembl pipeline should include gene annotations for all 47 samples. The link to access the Ensembl gene annotations is: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=submissions/8E6C4ACC-FEA9-4DD8-94A3-B92234206F95--Y1_ENSEMBL_V1/
>
> @mhaukness-ucsc, could you please check if the above link is the version used in the HPRC marker paper?
>
> @juklucas, in your opinion, should we consider providing an index file for the Ensembl gene annotations as well?

Do the coordinates of Ensembl gene annotations in this link match the assembly version in assembly_index/Year1_assemblies_v2_genbank.index?
