Merge changes from main #82

Merged
merged 6 commits into from
Nov 11, 2024

Conversation

GavinHuttley
Collaborator

@GavinHuttley GavinHuttley commented Nov 11, 2024

Summary by Sourcery

Enhance the README with a comprehensive description of the 'diverse_seq' tool's capabilities and update the 'ruff' dependency version in the build configuration. Add 'statsmodels' to the documentation dependencies.

Enhancements:

  • Update the README to provide a more detailed description of the 'diverse_seq' tool, highlighting its alignment-free algorithms and efficiency in phylogenetic workflows.

Build:

  • Update the 'ruff' dependency version from 0.7.2 to 0.7.3 in the 'pyproject.toml' file.
  • Add 'statsmodels' to the documentation dependencies in the 'pyproject.toml' file.

GavinHuttley and others added 6 commits November 11, 2024 12:10
  • [CHANGED] added a dependency to doc optional extras
  • REL: bumped release to 2024.11.8a3
  • Bumps [ruff](https://github.com/astral-sh/ruff) from 0.7.2 to 0.7.3.
      - [Release notes](https://github.com/astral-sh/ruff/releases)
      - [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
      - [Commits](astral-sh/ruff@0.7.2...0.7.3)

      ---
      updated-dependencies:
      - dependency-name: ruff
        dependency-type: direct:production
        update-type: version-update:semver-patch
      ...

      Signed-off-by: dependabot[bot] <support@github.com>
  • DOC: update intro and add link to preprint

sourcery-ai bot commented Nov 11, 2024

Reviewer's Guide by Sourcery

This PR updates the project's documentation, dependencies, and version number. The main changes include a significant revision of the README.md description to better reflect the project's capabilities and purpose, along with minor version updates for dependencies and the package itself.

No diagrams generated as the changes look simple and do not need a visual representation.

File-Level Changes

Change: Updated project description and capabilities in documentation
Details:
  • Revised project tagline to emphasize alignment-free algorithms and phylogenetic workflows
  • Added detailed performance metrics and use cases
  • Included reference to preprint paper
  • Enhanced description of computational efficiency and scalability
Files: README.md

Change: Updated package dependencies and version
Details:
  • Upgraded ruff from version 0.7.2 to 0.7.3 in both test and dev dependencies
  • Added statsmodels to doc dependencies
  • Bumped package version from 2024.11.8a2 to 2024.11.8a3
Files: pyproject.toml, src/diverse_seq/__init__.py


@GavinHuttley GavinHuttley merged commit cfffa51 into JOSS Nov 11, 2024
30 checks passed

@sourcery-ai sourcery-ai bot left a comment

Hey @GavinHuttley - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟡 Documentation: 1 issue found


@@ -50,7 +50,7 @@ test = [
     "pytest",
     "pytest-cov",
     "pytest-xdist",
-    "ruff==0.7.2",
+    "ruff==0.7.3",

suggestion: Consider consolidating the duplicated Ruff dependency specification

The Ruff dependency is currently specified with the same version in both test and dev dependencies. Consider moving it to a shared dependency list or using dependency inheritance to avoid potential maintenance issues.
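
As an interim guard against the two pins drifting apart, a small consistency check could be run in CI. The following is a minimal Python sketch under stated assumptions: the `extras` dictionary is a hypothetical stand-in for the `[project.optional-dependencies]` table (only the `test` entries are taken from the diff in this PR; the `dev` list is invented for illustration), and `pinned_versions` and `pins_agree` are hypothetical helpers, not part of any existing tool.

```python
import re

# Hypothetical stand-in for [project.optional-dependencies] in pyproject.toml.
# The "test" entries mirror the diff in this PR; the "dev" entries are assumed.
extras = {
    "test": ["pytest", "pytest-cov", "pytest-xdist", "ruff==0.7.3"],
    "dev": ["ruff==0.7.3"],
}

# Matches exact pins of the form "name==version" (markers after ";" ignored).
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;]+)")


def pinned_versions(extras, package):
    """Map each extras group to the exact version it pins for `package`."""
    pins = {}
    for group, requirements in extras.items():
        for requirement in requirements:
            match = PIN.match(requirement)
            if match and match.group(1).lower() == package.lower():
                pins[group] = match.group(2)
    return pins


def pins_agree(extras, package):
    """Return True when every group pinning `package` uses the same version."""
    return len(set(pinned_versions(extras, package).values())) <= 1
```

A check such as `pins_agree(extras, "ruff")` would fail as soon as one group is bumped without the other, which is the maintenance issue the suggestion describes.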

-`diverse_seq` provides tools for selecting a representative subset of sequences from a larger collection. It is an alignment-free method which scales linearly with the number of sequences. It identifies the subset of sequences that maximize diversity as measured using Jensen-Shannon divergence. `diverse_seq` provides a command-line tool (`dvs`) and plugins to the Cogent3 app system (prefixed by `dvs_`) allowing users to embed code in their own scripts. The command-line tools can be run in parallel.
+`diverse-seq` implements computationally efficient alignment-free algorithms that enable efficient prototyping for phylogenetic workflows. It can accelerate parameter selection searches for sequence alignment and phylogeny estimation by identifying a subset of sequences that are representative of the diversity in a collection. We show that selecting representative sequences with an entropy measure of *k*-mer frequencies correspond well to sampling via conventional genetic distances. The computational performance is linear with respect to the number of sequences and can be run in parallel. Applied to a collection of 10.5k whole microbial genomes on a laptop took ~8 minutes to prepare the data and 4 minutes to select 100 representatives. `diverse-seq` can further boost the performance of phylogenetic estimation by providing a seed phylogeny that can be further refined by a more sophisticated algorithm. For ~1k whole microbial genomes on a laptop, it takes ~1.8 minutes to estimate a bifurcating tree from mash distances.
+
+You can read more about the methods implemented in `diverse_seq` in the preprint [here](https://biorxiv.org/cgi/content/short/2024.11.10.622877v1).

issue (documentation): Inconsistent package name formatting between diverse-seq and diverse_seq

Consider using a consistent name throughout the documentation to avoid confusion. If the different forms are intentional (e.g., project name vs package name), please clarify this.
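
For readers unfamiliar with the measure named in the README text quoted in this PR, the following is a minimal Python sketch of Jensen-Shannon divergence computed over *k*-mer frequencies. It is an illustration under assumptions, not the `diverse_seq` implementation: the function names `kmer_freqs`, `entropy`, and `jsd` are hypothetical, sequences are plain strings, and all sequences are weighted equally.

```python
import math
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequencies of the overlapping k-mers in `seq`."""
    counts = Counter(seq[i : i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def entropy(freqs):
    """Shannon entropy, in bits, of a mapping of k-mer -> frequency."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def jsd(freq_vectors):
    """Jensen-Shannon divergence of equally weighted distributions.

    Computed as the entropy of the mean distribution minus the mean of
    the individual entropies.
    """
    n = len(freq_vectors)
    mean = Counter()
    for freqs in freq_vectors:
        for kmer, p in freqs.items():
            mean[kmer] += p / n
    return entropy(mean) - sum(entropy(f) for f in freq_vectors) / n
```

With this sketch, `jsd([kmer_freqs("AAAA", 2), kmer_freqs("CCCC", 2)])` evaluates to 1.0 bit because the two sequences share no 2-mers, while two identical sequences give a divergence of zero. A set of sequences with a larger value is more diverse in its *k*-mer content.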

@coveralls

Pull Request Test Coverage Report for Build 11785962632

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 91.892%

Totals
Change from base Build 11785938693: 0.0%
Covered Lines: 1190
Relevant Lines: 1295

