You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I mapped my Hi-C library using bwa-mem2, and then ran your hic_qc.py script. I got this error:
tomas@edna:/media/tomas/Data2/hi_c/hi_c1/mapped$ hic_qc.py -b A524_mapped.bam -o A524_hic_qc
[hic_qc - 2025-02-16 09:38:59,572] parsing the first 1000000 read pairs in bam file A524_mapped.bam to QC Hi-C library quality
Traceback (most recent call last):
File "/home/tomas/bin/hic_qc/hic_qc.py", line 1147, in
QC.parse_bam(args.bam_file, max_read_pairs=args.num_reads)
File "/home/tomas/bin/hic_qc/hic_qc.py", line 234, in parse_bam
self.extract_header_info(bam_fh.header)
File "/home/tomas/bin/hic_qc/hic_qc.py", line 306, in extract_header_info
self.bwa_command = re.search(r'(bwa-mem2 )[^//]*', self.bwa_command_line).group()
AttributeError: 'NoneType' object has no attribute 'group'
It turns out that the hic_qc.py script already expects that the bam file could have been generated with bwa-mem2 (see Commit c0a178c), however, it looks only for "bwa-mem2" and does not recognize a specific implementation (avx2, sse41, etc) which is actually executed. The lines in the hic-qc.py script that check if the bam file was generated by bwa or bwa-mem2 are below.
# TODO: add more robust logic to different BAM headers, and/or comment the assumptions made in this code
if 'PG' in header and 'bwa' in header['PG'][0]['CL']:
self.bwa_command_line = header['PG'][0]['CL']
if 'bwa-mem2' in self.bwa_command_line:
self.bwa_command = re.search(r'(bwa-mem2 )[^//]*', self.bwa_command_line).group()
else:
self.bwa_command = re.search(r'(bwa )[^//]*', self.bwa_command_line).group()
When one executes bwa-mem2 one can either run "bwa-mem2" which then finds the appropriate implementation of bwa-mem2 for the computer on which bwa-mem2 is being executed (e.g. bwa-mem2.avx2), or one can directly execute the desired version (e.g. bwa-mem2.avx2). What is registered in the bam header is then "bwa-mem2 -command options" or "bwa-mem2.avx2 -command options", respectively, and the hic_qc.py script only recognizes "bwa-mem2 -command options", thus the error in hic_qc.py.
The solution to make hic_qc.py script to recognize "bwa-mem2 -command options" or "bwa-mem2.avx2 -command options" in the bam header is to change line 308 of hic_qc.py
from: self.bwa_command = re.search(r'(bwa-mem2 )[^//]', self.bwa_command_line).group()
to: self.bwa_command = re.search(r'(bwa-mem2)[^//]', self.bwa_command_line).group()
(remove the space behind 'bwa-mem2')
This will make hic_qc.py robust, and recognize all implementations of bwa-mem2.
By the way, I mapped my library with both bwa and bwa-mem2, and the QC results are identical. The only difference is that mapping with bwa-mem2 is about 2.5x faster than with bwa.
All the best,
Tomas
The text was updated successfully, but these errors were encountered:
Folks, @shawnpg @bnelsj ,
I mapped my Hi-C library using bwa-mem2, and then ran your hic_qc.py script. I got this error:
tomas@edna:/media/tomas/Data2/hi_c/hi_c1/mapped$ hic_qc.py -b A524_mapped.bam -o A524_hic_qc
[hic_qc - 2025-02-16 09:38:59,572] parsing the first 1000000 read pairs in bam file A524_mapped.bam to QC Hi-C library quality
Traceback (most recent call last):
File "/home/tomas/bin/hic_qc/hic_qc.py", line 1147, in
QC.parse_bam(args.bam_file, max_read_pairs=args.num_reads)
File "/home/tomas/bin/hic_qc/hic_qc.py", line 234, in parse_bam
self.extract_header_info(bam_fh.header)
File "/home/tomas/bin/hic_qc/hic_qc.py", line 306, in extract_header_info
self.bwa_command = re.search(r'(bwa-mem2 )[^//]*', self.bwa_command_line).group()
AttributeError: 'NoneType' object has no attribute 'group'
It turns out that the hic_qc.py script already expects that the bam file could have been generated with bwa-mem2 (see Commit c0a178c), however, it looks only for "bwa-mem2" and does not recognize a specific implementation (avx2, sse41, etc) which is actually executed. The lines in the hic-qc.py script that check if the bam file was generated by bwa or bwa-mem2 are below.
When one executes bwa-mem2 one can either run "bwa-mem2" which then finds the appropriate implementation of bwa-mem2 for the computer on which bwa-mem2 is being executed (e.g. bwa-mem2.avx2), or one can directly execute the desired version (e.g. bwa-mem2.avx2). What is registered in the bam header is then "bwa-mem2 -command options" or "bwa-mem2.avx2 -command options", respectively, and the hic_qc.py script only recognizes "bwa-mem2 -command options", thus the error in hic_qc.py.
The solution to make hic_qc.py script to recognize "bwa-mem2 -command options" or "bwa-mem2.avx2 -command options" in the bam header is to change line 308 of hic_qc.py
from: self.bwa_command = re.search(r'(bwa-mem2 )[^//]', self.bwa_command_line).group()
to: self.bwa_command = re.search(r'(bwa-mem2)[^//]', self.bwa_command_line).group()
(remove the space behind 'bwa-mem2')
This will make hic_qc.py robust, and recognize all implementations of bwa-mem2.
By the way, I mapped my library with both bwa and bwa-mem2, and the QC results are identical. The only difference is that mapping with bwa-mem2 is about 2.5x faster than with bwa.
All the best,
Tomas
The text was updated successfully, but these errors were encountered: