14.2.2 beta diversity #658

Open
antagomir opened this issue Feb 3, 2025 · 0 comments

Comments

@antagomir
Member

Section 14.2.2 contains this text, followed by a code example: "Before applying rarefaction, selecting the most variable features can help minimize variation caused by random subsampling. These features have the highest read counts, while rare features tend to increase sampling variation. This approach facilitates comparison of results between non-rarefied and rarefied distance calculations later on."

That example selects the 5 most variable taxa before estimating beta diversity.

Subsetting like this can speed up analyses in some cases, but in general I think beta diversity should be estimated from all available features. I have received reports from users who follow the top-5 taxa selection directly as the default procedure. It shouldn't be a general default. The text could be more explicit about that, or alternatively we could remove the top-n subsetting?
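
To illustrate the concern, here is a minimal sketch in plain Python (not the book's R code and not any library's API). The count table is made-up toy data, and the helper names `bray_curtis`, `top_variable_features` and `pairwise` are hypothetical. It compares Bray-Curtis dissimilarities computed from all taxa with those computed from only the top-n most variable taxa.

```python
from statistics import pvariance

# Toy count table (hypothetical data): keys are samples, values are taxon counts.
counts = {
    "sample_a": [120, 30, 0, 5, 8, 0, 2],
    "sample_b": [110, 28, 1, 0, 0, 9, 7],
    "sample_c": [10, 90, 40, 6, 1, 0, 3],
}


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count vectors."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


def top_variable_features(table, n):
    """Column indices of the n taxa with the highest variance across samples."""
    rows = list(table.values())
    n_features = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n])


def pairwise(table, features=None):
    """Bray-Curtis for every sample pair, optionally restricted to some taxa."""
    names = sorted(table)
    result = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            x, y = table[first], table[second]
            if features is not None:
                x = [x[j] for j in features]
                y = [y[j] for j in features]
            result[(first, second)] = bray_curtis(x, y)
    return result


selected = top_variable_features(counts, 2)
all_taxa = pairwise(counts)
top_taxa = pairwise(counts, selected)

for pair in all_taxa:
    print(pair, round(all_taxa[pair], 3), round(top_taxa[pair], 3))
```

In this toy table, the rare taxa that the subsetting drops account for most of the difference between `sample_a` and `sample_b`, so the top-n result understates their dissimilarity. That is the kind of effect users may not expect if they treat the subsetting as a default step.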
