Changes for MACS (3.0.3)
Features added
- MACS3 now supports the FRAG format for single-cell ATAC-seq in
  callpeak and pileup. FRAG format is used by 10x Genomics to store
  alignments from the single-cell ATAC-seq pipeline cellranger-atac or
  the multi-omics pipeline cellranger-arc. The format is essentially
  BEDPE with two additional columns: the barcode and the count of
  fragments aligned to the same location with the same barcode.
  Support for FRAG in other tools, as well as in hmmratac, is coming
  soon.

  If you specify FRAG as your input format:
  - You can use a barcode list for a subset of cells with --barcodes;
    callpeak will then identify peaks and pileup will build a pileup
    track for the fragments of this subset of cells.
  - Duplicates will not be removed, as we assume all fragments are
    valid. Optionally, --max-count can be applied to set the maximum
    count.
- We transitioned our pyx code to py code, adopting a 'pure Python
  style' with PEP-484 type annotations. This change has made our
  source code more compatible with Python programming tools such as
  flake8. During this process, we performed further code cleaning and
  eliminated unnecessary dependencies. We intend to continue improving
  our code quality in the future.
- We have modified the handling of 'blacklist' regions in the hmmratac
  tool. This change impacts both the Expectation-Maximization (EM)
  step that estimates fragment length distributions, and the Hidden
  Markov Model (HMM) step that learns and predicts nucleosome states.
  We now exclude aligned fragments located in the 'blacklist' regions
  before both steps. We implemented the exclude functions in both
  PETrackI and PETrackII to support this feature. For more detailed
  information and the reasoning behind it, refer to issue #680.
- We have tested MACS3 with NumPy>=2. MACS3 can now run on both NumPy
  version 1 and version 2.
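The FRAG handling described above (BEDPE-like coordinates plus a barcode and a count column, an optional barcode subset, and an optional maximum count) can be illustrated with a minimal Python sketch. This is not the MACS3 parser: the names `Fragment` and `read_frag` are hypothetical, and treating the maximum count as a cap on each record's count, as well as skipping lines that start with `#`, are assumptions made for illustration.

```python
import io
from typing import Iterator, NamedTuple, Optional, Set, TextIO


class Fragment(NamedTuple):
    chrom: str
    start: int
    end: int
    barcode: str
    count: int


def read_frag(handle: TextIO,
              barcodes: Optional[Set[str]] = None,
              max_count: Optional[int] = None) -> Iterator[Fragment]:
    """Yield fragments from tab-separated FRAG-style text (hypothetical helper)."""
    for line in handle:
        # Assumption: blank lines and '#' header lines carry no fragments.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, barcode, count = line.rstrip("\n").split("\t")[:5]
        # Keep only the requested subset of cells, if one was given.
        if barcodes is not None and barcode not in barcodes:
            continue
        n = int(count)
        # Assumption: the maximum count caps the count of each record.
        if max_count is not None:
            n = min(n, max_count)
        yield Fragment(chrom, int(start), int(end), barcode, n)


# Made-up records, for illustration only.
demo = io.StringIO(
    "chr1\t100\t250\tAAAC-1\t3\n"
    "chr1\t100\t250\tTTTG-1\t1\n"
    "chr2\t500\t720\tAAAC-1\t1\n"
)
frags = list(read_frag(demo, barcodes={"AAAC-1"}, max_count=1))
for frag in frags:
    print(frag)
```

Only the two records with the selected barcode survive, and the first one has its count reduced from 3 to 1 by the cap.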
Bugs fixed
- The hmmratac option --keep-duplicate previously had the opposite
  effect of what its name and description suggested. It has therefore
  been renamed to --remove-dup to describe the actual behavior more
  accurately. Duplicate fragments will not be removed by hmmratac
  unless this option is explicitly set.
- hmmratac: the wrong class name was used while saving digested
  signals in BedGraph files. Fixed multiple other issues related to
  output filenames. #682
- Fixed issues on big-endian systems in the Parser.py code. Enabled
  big-endian support in the BAM.py code for accessing certain
  alignment records that overlap with given genomic coordinates using
  BAM/BAI files.
- predictd and filterdup: the wrong variable name was used while
  reading multiple pe/frag files.
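As background for the big-endian fix above: the BAM format stores multi-byte integers in little-endian order, so a reader that relies on the host's native byte order misreads them on big-endian machines. The following minimal sketch shows the general technique of decoding with an explicit byte order; it is not the code in Parser.py or BAM.py, and `read_int32_le` is a hypothetical helper name.

```python
import struct


def read_int32_le(buf: bytes, offset: int = 0) -> int:
    """Decode a signed 32-bit integer stored in little-endian order.

    The '<' prefix fixes the byte order, so the result is the same on
    little-endian and big-endian hosts.
    """
    return struct.unpack_from("<i", buf, offset)[0]


# Four bytes as they would appear in a little-endian binary file.
raw = struct.pack("<i", 1000)

value = read_int32_le(raw)
# Decoding the same bytes as big-endian gives a different number,
# which is what a native-order read would do on a big-endian host.
misread = struct.unpack(">i", raw)[0]
print(value, misread)
```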
Doc
- Explanation of the filtering criteria for SAM/BAM/BAMPE files.
PRs
- Feat/macs3/reformat pyproject by @taoliu in #662
- Feat/macs3/python style cython (1st) by @taoliu in #664
- Feat/macs3/fragmentfile by @taoliu in #668
- Expose the "peaks" field in BroadPeakIO by @kaizhang in #678
- FRAG format support and bdg filename type fixed by @taoliu in #685
- Change the way to exclude regions in hmmratac and fix the incorrect keep-duplicate option by @taoliu in #689
- Feat/macs3/fragsupport by @taoliu in #690
- numpy 2 support/prep for macs3.0.3 by @taoliu in #691
New Contributors
Full Changelog: v3.0.2...v3.0.3