
v3.0.3

@taoliu taoliu released this 20 Feb 17:57
8a833cf

Changes for MACS (3.0.3)

Features added

  1. MACS3 now supports the FRAG format for single-cell ATAC-seq in
    callpeak and pileup. The FRAG format is used by 10x Genomics to store
    alignments from the single-cell ATAC-seq pipeline cellranger-atac or
    the multi-omics pipeline cellranger-arc. The format is essentially
    BEDPE with two additional columns: the barcode and the count of
    fragments aligned to the same location with the same barcode. FRAG
    support in other tools, including hmmratac, is coming soon.

    If you specify FRAG as your input format:

    • You can supply a barcode list for a subset of cells with --barcodes;
      callpeak will then identify peaks, and pileup will build a pileup
      track, using only the fragments from that subset of cells.
    • Duplicates are not removed, since all fragments are assumed to be
      valid. Optionally, use --max-count to set the maximum count.
  2. We transitioned our pyx code to py code, adopting a 'pure
    Python style' with PEP-484 type annotations. This change has made our
    source code more compatible with Python programming tools such as
    flake8. During this process, we also cleaned up the code and
    eliminated unnecessary dependencies. We intend to continue improving
    our code quality in the future.

  3. We have modified the handling of 'blacklist' regions in the
    hmmratac tool. This change impacts both the Expectation-Maximization
    (EM) step that estimates fragment length distributions, and the Hidden
    Markov Model (HMM) step that learns and predicts nucleosome states. We
    now exclude aligned fragments located in the 'blacklist' regions
    before both steps. We implemented the exclude functions in both
    PETrackI and PETrackII to support this feature. For more detailed
    information and the reasoning behind it, refer to issue #680.

  4. We have tested MACS3 with NumPy>=2. MACS3 now runs on both NumPy
    version 1 and version 2.
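
To make the FRAG layout from item 1 concrete, here is a minimal sketch of reading fragment lines in Python. It is not the MACS3 parser. It assumes five tab-separated columns (chromosome, start, end, barcode, count), that lines starting with "#" are header comments, and that a maximum count clamps larger counts down to the cap. The names `Fragment` and `read_fragments` are hypothetical and exist only for this illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Set


class Fragment(NamedTuple):
    chrom: str
    start: int
    end: int
    barcode: str
    count: int


def read_fragments(
    lines: Iterable[str],
    barcodes: Optional[Set[str]] = None,
    max_count: Optional[int] = None,
) -> Iterator[Fragment]:
    """Yield fragments, optionally restricted to a barcode subset.

    Assumption: counts above max_count are clamped to max_count.
    """
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) '#' header comments.
        if not line or line.startswith("#"):
            continue
        chrom, start, end, barcode, count = line.split("\t")[:5]
        # Keep only fragments from the requested cells, if a subset is given.
        if barcodes is not None and barcode not in barcodes:
            continue
        n = int(count)
        if max_count is not None:
            n = min(n, max_count)
        yield Fragment(chrom, int(start), int(end), barcode, n)


example = [
    "# hypothetical header line",
    "chr1\t100\t250\tAAAC-1\t3",
    "chr1\t300\t420\tTTTG-1\t1",
]
subset = list(read_fragments(example, barcodes={"AAAC-1"}, max_count=2))
```

With the example input, `subset` holds a single fragment for barcode AAAC-1 whose count has been reduced from 3 to 2.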

Bugs fixed

  1. The hmmratac option --keep-duplicate previously had the
    opposite effect of what its name and description suggested. Therefore,
    it was renamed to --remove-dup to more accurately describe the
    actual behavior. Duplicate fragments will not be removed by hmmratac
    unless this option is explicitly set.

  2. hmmratac: the wrong class name was used while saving digested
    signals in BedGraph files. Also fixed multiple other issues related
    to output filenames. #682

  3. Fixed issues on big-endian systems in the Parser.py code. Enabled
    big-endian support in the BAM.py code for accessing certain alignment
    records that overlap with given genomic coordinates using BAM/BAI
    files.

  4. predictd and filterdup: the wrong variable name was used while
    reading multiple pe/frag files.
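
As background for the big-endian fix in item 3: the BAM specification stores multi-byte integers in little-endian order, so a reader that relies on the host's native byte order gets wrong values on big-endian machines. The sketch below is not the MACS3 code. It only illustrates decoding with an explicit little-endian format, and `parse_ref_and_pos` is a hypothetical helper that reads the two int32 fields (refID, pos) that follow block_size in an alignment record.

```python
import struct
from typing import Tuple


def parse_ref_and_pos(record: bytes) -> Tuple[int, int]:
    """Decode refID and pos from the start of a BAM alignment record.

    The '<' prefix forces little-endian decoding, so the result is the
    same on little-endian and big-endian hosts.
    """
    ref_id, pos = struct.unpack_from("<ii", record, 0)
    return ref_id, pos


# Build a fake record prefix the way BAM stores it: little-endian int32s.
raw = struct.pack("<ii", 2, 1000)

decoded = parse_ref_and_pos(raw)

# Decoding the same bytes as big-endian gives unrelated numbers, which is
# the kind of error an endianness-unaware reader makes on such hosts.
misread = struct.unpack_from(">ii", raw, 0)
```

Here `decoded` is `(2, 1000)`, while `misread` differs because the bytes are interpreted in the wrong order.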

Doc

  1. Added an explanation of the filtering criteria for SAM/BAM/BAMPE files.

PRs

  • Feat/macs3/reformat pyproject by @taoliu in #662
  • Feat/macs3/python style cython (1st) by @taoliu in #664
  • Feat/macs3/fragmentfile by @taoliu in #668
  • Expose the "peaks" field in BroadPeakIO by @kaizhang in #678
  • FRAG format support and bdg filename type fixed by @taoliu in #685
  • Change the way to exclude regions in hmmratac and fix the incorrect keep-duplicate option by @taoliu in #689
  • Feat/macs3/fragsupport by @taoliu in #690
  • numpy 2 support/prep for macs3.0.3 by @taoliu in #691

New Contributors

Full Changelog: v3.0.2...v3.0.3