Follow software guidelines #2

Open
9 of 43 tasks
mkatsanto opened this issue Sep 16, 2022 · 0 comments


mkatsanto commented Sep 16, 2022

  • Python and either Snakemake or Nextflow SHOULD be used for any new software projects, unless there are valid reasons not to use them
  • A second person MUST be assigned as reviewer to any new software development project and the reviewer MUST have at least "maintainer" permissions on the software
  • All software MUST be appropriately licensed
    • A permissive (free and open-source) license SHOULD be used (e.g., Apache License 2.0 or MIT), but under special circumstances (e.g., derivative work of existing software that requires a particular license, interest in patenting the software), a restrictive license (copyright or copyleft) MAY be used
  • All software MUST be versioned
    • Semantic versioning SHOULD be used, but if other versioning schemes are more appropriate for a given software, they MAY be used, as long as they are documented, consistent and unambiguous
    • A changelog SHOULD be kept (it can be generated automatically, especially if Conventional Commits are adopted; see below)
    • All releases SHOULD be tagged via Git
  • All software MUST be version controlled via Git
    • Code MUST be pushed to GitHub (pre-release code MAY be pushed to GitLab instead)
    • Projects SHOULD adopt the "GitHub flow" for merging new code (complex projects MAY adopt more sophisticated flows)
    • Code changes SHOULD be reviewed at least by the assigned reviewer
    • Proposed code changes SHOULD be short and focused (do not mix unrelated code changes in the same commit, feature branch, pull/merge request)
    • Commit messages and/or pull/merge request titles SHOULD be short and descriptive (consider adopting Conventional Commits)
    • Proposed code changes SHOULD be described in some detail in pull/merge requests
    • The following code merging strategies SHOULD be considered:
      • Protect the default branch from pushing code directly
      • Adopt a linear commit history by squashing and rebasing all commits in a pull/merge request (this forgoes the requirement to write semantic commit messages for feature branches)
  • All software MUST be tested
    • Functional "end-to-end"/"integration" tests MUST be provided along with appropriate test files and a mechanism to decide whether tests passed or failed; test resources shipped with the repository MUST be kept small
    • Where appropriate, unit tests SHOULD be provided, and they SHOULD cover every code statement
    • Tests requiring large data files or compute resources exceeding those available via CI MUST be set up on the login node (details will follow)
  • A Continuous Integration pipeline triggering tests MUST be set up for each project and it MUST run all tests shipped along with the repository
    • On GitHub, GitHub Actions SHOULD be used
    • On GitLab, GitLab CI/CD SHOULD be used with the available group runners (container-based and shell-based) on our dedicated testing machine
  • All software MUST be appropriately documented
    • Each project MUST contain a README.md file in its root directory; the file MUST contain at least the following items:
      • A short synopsis of the project
      • Basic usage
      • Build/installation/deployment instructions
      • Where appropriate: extended description (e.g., details on algorithms)
      • Where appropriate: extended usage (e.g., complex use cases requiring specific configuration)
      • A link to the adopted license
      • A link to or description of the versioning scheme
      • A link to an issue/bug tracker
      • Contact information (MAY use zavolab-biozentrum@unibas.ch)
    • Extensive documentation MAY be migrated to a dedicated service (e.g., Read the Docs)
    • The documentation SHOULD include information on how interested parties may contribute to development
    • All code units (modules, classes, methods, functions) SHOULD be documented (e.g., in Python, docstrings SHOULD be used to describe the inputs and outputs of a function and any errors the function raises explicitly)
  • All dependencies MUST be appropriately embedded in each project
    • Workflows MUST provide dependency information via containers and/or Conda/Mamba
    • For pure Python code, dependencies SHOULD be managed with either pip or Poetry; dependencies MUST be listed in a dedicated file in a format that the package manager can interpret
    • For workflows and projects mixing Python code with executables, dependencies SHOULD be managed with Conda/Mamba
    • Tools SHOULD provide instructions to create a container image of the software (but note the comment on Bioconda and Biocontainers in the publishing section below)
  • All software projects SHOULD be published via appropriate channels to increase adoption and ease/improve installation, reusability, findability and longevity
    • Snapshots of software described in a journal publication SHOULD be uploaded to Zenodo to mint a DOI
    • Tools SHOULD be submitted to Bioconda (or another appropriate Conda channel); this will also automatically create a corresponding Docker image file on Biocontainers, and so maintaining an instructions file for creating images will not be necessary in most cases
    • Workflows SHOULD be submitted to one or more appropriate workflow registries (e.g., Workflow Hub, Dockstore) or provided/tagged in such a way that they will be automatically picked up by crawlers (e.g., Snakemake Workflow Catalog)
    • Publishing SHOULD be automated by adding the appropriate commands/directives to the CI configuration
  • All code (including documentation) MUST be consistently formatted
    • Linters (for Python, e.g., Flake8 and/or Pylint) SHOULD be used and included in the CI workflow
    • Code formatters (for Python, e.g., Black) MAY be used and enforced (e.g., via CI) to further increase code consistency
  • The following philosophies/design patterns SHOULD be adopted:
    • Keep functions small and focused
    • Keep code changes small and focused (do not mix unrelated code changes in the same commit)
    • DRY (Don't Repeat Yourself): Do not copy/paste code; encapsulate the code and import it in multiple places instead
    • YAGNI (You Ain't Gonna Need It): Implement what the software really needs to do; do not anticipate future use cases
    • Provide type information even if optional (e.g., in Python: use type hints and lint with mypy), especially for non-trivial software
  • All published software SHOULD be appropriately advertised; consider
    • posting on social media (Twitter, LinkedIn)
    • reaching out to the UniBas or SIB communications teams
    • preparing and publicly uploading a poster and/or presentation
    • presenting at meetings, seminars & conferences
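
As a rough illustration of several of the points above (small and focused functions, type hints, docstrings that describe inputs, outputs and explicitly raised errors, and a unit test), here is a minimal sketch in Python. The names `Version`, `parse_version` and `test_parse_version` are hypothetical and not part of any existing project. The pattern covers only the `MAJOR.MINOR.PATCH` core of semantic versioning and ignores pre-release and build metadata.

```python
"""Sketch only: a small, typed, documented helper for version strings."""
import re
from typing import NamedTuple

# Assumption: only the MAJOR.MINOR.PATCH core is handled; pre-release
# tags and build metadata are deliberately out of scope for this sketch.
_VERSION_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


class Version(NamedTuple):
    """Hypothetical container; tuples compare field by field."""

    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a ``MAJOR.MINOR.PATCH`` version string.

    Args:
        text: Version string, e.g. ``"1.4.2"``.

    Returns:
        The parsed version as a ``Version`` tuple.

    Raises:
        ValueError: If ``text`` is not of the form ``MAJOR.MINOR.PATCH``.
    """
    match = _VERSION_CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def test_parse_version() -> None:
    """Unit test written with plain asserts, so a runner can collect it."""
    assert parse_version("1.4.2") == Version(1, 4, 2)
    # Numeric comparison, not string comparison: 1.10.0 is newer than 1.9.3.
    assert parse_version("1.10.0") > parse_version("1.9.3")
    for bad in ("1.4", "v1.4.2", "01.4.2", "1.4.2\n"):
        try:
            parse_version(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")


if __name__ == "__main__":
    test_parse_version()
```

A test runner such as pytest would typically pick up `test_parse_version` by its name, and a type checker such as mypy can verify the annotations; both could then run as steps in the CI pipeline alongside the linters.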