Pseudo-PR: First release review! #45

Closed
wants to merge 200 commits into from
Changes from 31 commits
b3b945f
Starte don some conifg options
Hammarn May 29, 2017
4b6039a
Added STAR-fusion process
Hammarn Jun 13, 2017
6aab765
added FusionInspector process
Hammarn Jun 14, 2017
d71485b
Started on the fusionCatcher process
Hammarn Aug 22, 2017
0d56f3a
bugfixing FusionCatcher process
Hammarn Aug 22, 2017
31d4452
Merge pull request #1 from Hammarn/master
Hammarn Aug 22, 2017
a9a64f5
Untested code tidying.
ewels Aug 22, 2017
3574a5a
Merge pull request #2 from ewels/master
Hammarn Aug 23, 2017
aacfcb5
New logo
ewels Aug 23, 2017
c68fa51
fixed syntax and got a successful start
Hammarn Aug 23, 2017
cbf8f62
Merge pull request #3 from ewels/master
Hammarn Aug 23, 2017
d68437c
Merge branch 'master' of github.com:SciLifeLab/NGI-RNAfusion
Hammarn Aug 23, 2017
c154a46
Added tags and other minor tweaks
Hammarn Sep 4, 2017
71e9e88
sorted out tag naming
Hammarn Sep 5, 2017
ce0570b
started on Docker support
Hammarn Sep 5, 2017
9a34bda
started on Docker support
Hammarn Sep 5, 2017
b98263a
Merge branch 'master' of github.com:Hammarn/NGI-RNAseq-Fusiondetect
Hammarn Sep 5, 2017
0929f7d
sorting out fusioninspector parameters
Hammarn Sep 5, 2017
a5ba131
Merge branch 'master' of github.com:Hammarn/NGI-RNAseq-Fusiondetect
Hammarn Sep 5, 2017
7f5b38e
Dockerfile
Hammarn Sep 5, 2017
1a8ea83
added fusioninspector to config
Hammarn Sep 5, 2017
374ba68
tweaking
Hammarn Sep 8, 2017
eeb5c7e
fixed markdown syntax
Hammarn Sep 14, 2017
0b60656
Fixing up output file naming
Hammarn Sep 22, 2017
fb9632f
Merge branch 'master' of github.com:Hammarn/NGI-RNAfusion
Hammarn Sep 22, 2017
7fd5016
spelling correction
Hammarn Sep 27, 2017
1bf1451
added in the . in time requirement
Hammarn Oct 2, 2017
efb69b2
removed superfluous params.outdirs
Hammarn Oct 2, 2017
3af6eb8
bumped up the FusionCatcher time requirement
Hammarn Oct 9, 2017
79a1aff
bumped up the FusionCatcher time requirement
Hammarn Oct 9, 2017
c773913
Merge branch 'master' of github.com:Hammarn/NGI-RNAfusion
Hammarn Oct 11, 2017
4dbd5fa
added fusion_genes_compare.py script process
Hammarn Oct 11, 2017
505731f
finished up requirements for FusionInspector
Hammarn Oct 13, 2017
488ecc3
updates
Hammarn Nov 16, 2017
19f83a1
commiting current state fo dockerfile
Hammarn Dec 18, 2017
7be16bb
Merge branch 'master' of github.com:Hammarn/NGI-RNAseq-Fusiondetect
Hammarn Dec 18, 2017
c72ca09
Dockerfile cleanup
Hammarn Dec 18, 2017
1a5c13a
added in missing line continuation
Hammarn Dec 19, 2017
ea5545f
Bring config files in line with other pipelines
Hammarn Dec 19, 2017
e3f8d27
added base config
Hammarn Dec 19, 2017
92c1f1f
Removed extra dockerfile
Hammarn Jan 12, 2018
b409063
config updats for rackham
Hammarn Jan 12, 2018
5d2320a
removed extra params
Hammarn Jan 12, 2018
88972d0
Merge pull request #4 from Hammarn/master
ewels Jan 22, 2018
2ecf628
Started on support for single end
Hammarn Jan 22, 2018
27c871d
Merge branch 'master' of github.com:Hammarn/NGI-RNAseq-Fusiondetect
Hammarn Jan 22, 2018
d41fb46
moved around proccess order to make more sense
Hammarn Jan 23, 2018
26c04c9
added in missing commands
Hammarn Jan 23, 2018
dd401ea
bugfixing
Hammarn Jan 25, 2018
18c21d4
generalized the inputfile grouping
Hammarn Jan 30, 2018
deec0e3
Decreased the size of the image by removing intermediate files
Hammarn Feb 2, 2018
1379382
docker parasms compatibility
Hammarn Feb 5, 2018
e53667e
reference inputs fixing
Hammarn Feb 6, 2018
beaf093
sorted out reference parameters
Hammarn Feb 6, 2018
0700f96
small Readme update
Hammarn Feb 6, 2018
fbe94df
Merge pull request #5 from Hammarn/master
Hammarn Mar 1, 2018
2cf2eab
Removed debugging
Hammarn Mar 1, 2018
65ce7ca
reworked star fusion references checking
Hammarn Mar 1, 2018
033b200
indentation fix
Hammarn Mar 1, 2018
0175771
Merge pull request #6 from Hammarn/master
ewels Mar 1, 2018
a72a58f
Upgrade suggestions
matq007 Sep 12, 2018
a8833f8
Fixed mistakes in config and main.nf
matq007 Sep 13, 2018
b32e9af
Actually works
matq007 Sep 13, 2018
abced98
Updated Dockerfile with conda support
matq007 Sep 13, 2018
5a32a51
Transition to singularity
matq007 Sep 13, 2018
02d5635
Update project to run with singularity
matq007 Sep 13, 2018
24e31e5
Working singularity
matq007 Sep 13, 2018
8fd8874
Test data
matq007 Sep 13, 2018
4973bc6
Merge pull request #11 from matq007/ngi-upgrade
Sep 14, 2018
f6dc1ec
Initial template commit
matq007 Sep 14, 2018
7a711f7
Merged vanilla TEMPLATE branch into master
matq007 Sep 17, 2018
19659d7
Removed test data
matq007 Sep 17, 2018
afc0ad0
Updated Dockerfile with conda
matq007 Sep 17, 2018
30db48c
Compatable Dockerfile with Singularity
matq007 Sep 18, 2018
93a9ef5
Implemented STAR-Fusion
matq007 Sep 19, 2018
f6480d4
Implemented Fusion Catcher
matq007 Sep 19, 2018
54fd934
Rearranged Dockerfile
matq007 Sep 21, 2018
0580035
Cleanup
matq007 Sep 21, 2018
542b44f
Added GRCh38 genome to the list
matq007 Sep 25, 2018
11e9b33
Fixed GRCh38 path
matq007 Sep 25, 2018
b0971d2
Utility for extracting found fusion genes by various tools
matq007 Oct 2, 2018
464da9a
Added Fusion-Inspector
matq007 Oct 2, 2018
f2045d1
Added Fusion-Inspector version number and cleanup
matq007 Oct 2, 2018
87c3238
Updated transformer usage
matq007 Oct 3, 2018
f0a3d5c
Removed IGV folder from Fusion-Inspector
matq007 Oct 3, 2018
5d28c7d
Fixed writting when no fusions are found
matq007 Oct 4, 2018
3638b8a
Changed summary structure
matq007 Oct 5, 2018
0e5f058
Generated pretty MultiQC section
matq007 Oct 5, 2018
84ab12f
Updated pipeline structure
matq007 Oct 5, 2018
0883769
Merge pull request #1 from matq007/master
ewels Oct 5, 2018
8656fe5
Merge pull request #2 from nf-core/dev
ewels Oct 5, 2018
a81eb8d
remove branding and add logo
maxulysse Oct 5, 2018
6e442cf
Merge branch 'dev' into master
maxulysse Oct 5, 2018
99ddb74
Merge pull request #8 from MaxUlysse/master
ewels Oct 5, 2018
1c2270a
Merge pull request #9 from nf-core/dev
ewels Oct 5, 2018
8ed0f3a
Update README.md
maxulysse Oct 5, 2018
cf795dc
Merge pull request #10 from nf-core/MaxUlysse-patch-1
ewels Oct 5, 2018
8e8ea66
Merge pull request #11 from nf-core/dev
ewels Oct 5, 2018
c0e1755
Finished transformer for STAR-Fusion (#16)
matq007 Oct 8, 2018
f49372d
Merge pull request #20 from matq007/issue-16
ewels Oct 8, 2018
c76e774
Lint positive #15 (#21)
matq007 Oct 17, 2018
6e2bbc6
Testing new setup (#26)
matq007 Nov 1, 2018
28a0152
Implemented ericscript
matq007 Nov 2, 2018
b591cd8
Finished Eriscript
matq007 Nov 5, 2018
b381df0
Merge pull request #27 from matq007/issue-4
maxulysse Nov 6, 2018
6504c82
Implemented Pizzly
matq007 Nov 6, 2018
c4bff6b
Merge pull request #28 from matq007/issue-22
maxulysse Nov 7, 2018
e7f50eb
Implemented Squid
matq007 Nov 8, 2018
88ceac9
Fixed withName in docker.config
matq007 Nov 8, 2018
9a0e584
Merge pull request #30 from matq007/issue-29
maxulysse Nov 8, 2018
df71ebb
Removed chimerascan
matq007 Nov 10, 2018
c771649
Removed tophat and tophat-fusion
matq007 Nov 10, 2018
851470e
Cleanup
matq007 Nov 11, 2018
46552a8
Merge pull request #32 from matq007/dev
maxulysse Nov 12, 2018
8f74c15
Boilerplate
matq007 Nov 15, 2018
e171753
Swapped gene fusions are not the same
matq007 Nov 20, 2018
317a87a
Fixed libcrypto issue
matq007 Nov 20, 2018
f0d72c2
Moved files around on Uppmax
matq007 Nov 20, 2018
8bae793
Merge pull request #33 from matq007/dev
maxulysse Nov 22, 2018
372da6e
Merge branch 'dev' of https://github.com/nf-core/rnafusion into issue-23
matq007 Nov 22, 2018
6097c3a
Implementing annotation for squid
matq007 Nov 26, 2018
9ddf472
Implemented squid with annotation
matq007 Nov 26, 2018
27acf21
Removed pandas package from dependencies
matq007 Nov 26, 2018
17af5c1
Merge pull request #35 from matq007/issue-23
maxulysse Nov 27, 2018
b934dd7
Implemented custom summary report with some rich visualization (#36)
matq007 Dec 18, 2018
dc54275
Passing builds (#37)
matq007 Dec 18, 2018
d38838c
Minor fixes before release (#38)
matq007 Jan 5, 2019
2adc2c6
Updated testing parameter (#39)
matq007 Jan 7, 2019
a0ca11f
Added documentation (#40)
matq007 Jan 7, 2019
0456599
Doc improvement and munin configuration (#41)
matq007 Jan 8, 2019
2aa4a1e
Setup for munin (#42)
matq007 Jan 10, 2019
09d08ed
Bumping version to 1.0 (#43)
matq007 Jan 10, 2019
1e5842d
Merge pull request #44 from nf-core/dev
ewels Jan 10, 2019
f4ed4a3
Fixed CPU parameters for ericscript and pizzly (#46)
matq007 Jan 11, 2019
1ae39ba
Merge pull request #47 from nf-core/dev
maxulysse Jan 11, 2019
1cd1f0a
Added author and year to MIT LICENCE
matq007 Jan 14, 2019
726d763
Refactored configurations
matq007 Jan 14, 2019
82714f1
Removing shub:// and replacing it with docker://
matq007 Jan 14, 2019
2e45506
Updated commit tag for gdown.pl
matq007 Jan 14, 2019
5b6a16b
Cleanup of variables
matq007 Jan 14, 2019
f2af8d9
Refactoring singleEnd condition and updated documentation
matq007 Jan 14, 2019
df4be03
Increased time-limit for tools
matq007 Jan 14, 2019
dc89b7a
Improved when condition for singleEnd
matq007 Jan 14, 2019
67d9aff
Condition bug on squid
matq007 Jan 14, 2019
26ad04e
Updated README and usage docs
matq007 Jan 14, 2019
88ac535
Updated time-limits
matq007 Jan 15, 2019
1f88cf3
Fix: smaller report summary
matq007 Jan 15, 2019
fc665a4
Removed old render function
matq007 Jan 15, 2019
2c9659a
Updated singularity image download script with docs
matq007 Jan 15, 2019
40486f9
Added destination parameter to download-references.sh
matq007 Jan 15, 2019
bd52c4c
Renamed NGI-RNAfusion to nfcore/rnafusion in all Dockerfiles
matq007 Jan 16, 2019
cec4f9e
Updated README
matq007 Jan 16, 2019
b9d8329
Added tool cutoff parameter
matq007 Jan 17, 2019
7dde707
Increased max_cpus in base.config, checking for max_cpus in processes…
matq007 Jan 17, 2019
5ef2416
Removing private uppmax-devel.config
matq007 Jan 17, 2019
2d2c0ba
Updated munin configurations
matq007 Jan 17, 2019
e2f79f7
Mentioned nf-core/configs in adding your own configuration
matq007 Jan 17, 2019
ea69dbb
Renamed tools.md to references.md, updated README
matq007 Jan 17, 2019
4883f13
Updated troubleshooting
matq007 Jan 17, 2019
39a756c
Added ASCII nfcore logo into help
matq007 Jan 17, 2019
a86b947
Added tool_cutoff param and updated description
matq007 Jan 18, 2019
326162f
Updated configurations
matq007 Jan 18, 2019
d79e094
Updated test profile to check just for syntax
matq007 Jan 18, 2019
a3e9ad6
Merged summary-report into main container
matq007 Jan 18, 2019
f8fd983
Removing testing Dockerfiles
matq007 Jan 18, 2019
b8d4d04
Huge update on main.nf
matq007 Jan 18, 2019
d6de5c8
Supports only one single/paired-end read, no batch mode for now
matq007 Jan 18, 2019
0afc92f
Parsing filtered output from pizzly
matq007 Jan 20, 2019
b9bf6f2
Added FusionGDB
matq007 Jan 20, 2019
3f68aba
Updated output documentation
matq007 Jan 20, 2019
e507a39
Updated output documentation
matq007 Jan 20, 2019
5c2b7d9
Fixed linting on markdown
matq007 Jan 21, 2019
81e1e73
Updated testing
matq007 Jan 23, 2019
3c79378
Following PEP-8 guidelines
matq007 Jan 23, 2019
76e6fcd
Fixed coloring on ditribution chart
matq007 Jan 23, 2019
a2c9041
Processing filtered fusions from Ericscript
matq007 Jan 23, 2019
90afd3b
Fixed travis
matq007 Jan 23, 2019
95611de
Fixing travis again
matq007 Jan 23, 2019
6d7dc98
Fixing Travis again
matq007 Jan 23, 2019
42071e7
Fixing Travis again
matq007 Jan 23, 2019
7e4a4f4
Fixing Travis again
matq007 Jan 23, 2019
78455b4
Fixed indent on help command
matq007 Jan 23, 2019
b08ad41
Testing markdownlint
matq007 Jan 23, 2019
4f3f0c3
Add sudo permissions to apt-get
matq007 Jan 23, 2019
42c8ac9
Updated CHANGELOG.md
matq007 Jan 24, 2019
fed5fd8
Markdownlint check the whole project not only `docs` folder
matq007 Jan 24, 2019
69fec2e
Added docs to the python code
matq007 Jan 29, 2019
dcce924
Added pylint checking of python scripts
matq007 Jan 29, 2019
96840f9
Added MIT badge
matq007 Jan 31, 2019
d562e9d
Transformer is not throwing error any more just prints error messages
matq007 Jan 31, 2019
a4c35d9
Added process for building STAR index
matq007 Jan 31, 2019
95e5432
Added star_index into test configuration
matq007 Jan 31, 2019
3a6f5c6
Using markdownlint configuration and structure from nf-core/tools sol…
matq007 Jan 31, 2019
3769174
Cleanup
matq007 Jan 31, 2019
10c6922
Updated documentation for usage
matq007 Jan 31, 2019
946a8fa
Added new GUI configuration for parameters
matq007 Jan 31, 2019
c698d9a
Specified params for reads in STAR-Fusion
matq007 Feb 4, 2019
96116b4
Using fusioncatcher from conda hcc channel
matq007 Feb 5, 2019
e198b82
Deleted Singularity file
matq007 Feb 5, 2019
9e5eaea
Add Singularity file back
matq007 Feb 5, 2019
9 changes: 7 additions & 2 deletions .travis.yml
@@ -23,10 +23,15 @@ install:
   - sudo ln -s /tmp/nextflow/nextflow /usr/local/bin/nextflow
   # Install nf-core/tools
   - pip install nf-core
-  # Reset
+  # Reset
   - mkdir ${TRAVIS_BUILD_DIR}/tests && cd ${TRAVIS_BUILD_DIR}

 script:
+  # Create and download test data
+  - |
+    touch tests/genome.fa tests/genes.gtf
+    wget http://github.com/nf-core/test-datasets/raw/rnafusion/testdata/human/reads_1.fq.gz -O tests/reads_1.fq.gz
+    wget http://github.com/nf-core/test-datasets/raw/rnafusion/testdata/human/reads_2.fq.gz -O tests/reads_2.fq.gz
   # Lint the pipeline code
   - nf-core lint ${TRAVIS_BUILD_DIR}
-  - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker --genome GRCh38
+  - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker
20 changes: 18 additions & 2 deletions Dockerfile
@@ -2,8 +2,24 @@
 FROM nfcore/base

 LABEL authors="rickard.hammaren@scilifelab.se, phil.ewels@scilifelab.se, martin.proks@scilifelab.se" \
-    description="Docker image containing all requirements for NGI-RNAfusion pipeline"
+    description="Docker image containing all requirements for nfcore/rnafusion pipeline"

 COPY environment.yml /
 RUN conda env create -f /environment.yml && conda clean -a
-ENV PATH /opt/conda/envs/nf-core-rnafusion-1.0/bin:$PATH
+ENV PATH /opt/conda/envs/nf-core-rnafusion-1.0/bin:$PATH
+
+WORKDIR /script-db
+# Download FusionGDB
+RUN apt-get update && apt-get install -y wget \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/TCGA_ChiTaRS_combined_fusion_information_on_hg19.txt -O TCGA_ChiTaRS_combined_fusion_information_on_hg19.txt \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/TCGA_ChiTaRS_combined_fusion_ORF_analyzed_gencode_h19v19.txt -O TCGA_ChiTaRS_combined_fusion_ORF_analyzed_gencode_h19v19.txt \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/uniprot_gsymbol.txt -O uniprot_gsymbol.txt \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/fusion_uniprot_related_drugs.txt -O fusion_uniprot_related_drugs.txt \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/fusion_ppi.txt -O fusion_ppi.txt \
+    && wget --no-check-certificate https://ccsm.uth.edu/FusionGDB/tables/fgene_disease_associations.txt -O fgene_disease_associations.txt \
+    && ln -s /opt/conda/envs/nf-core-rnafusion-1.0/bin/python3 /bin/python3
+
+COPY ./FusionGDB.sql .
+RUN sqlite3 fusions.db < FusionGDB.sql
+
+WORKDIR /
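Review note: the six `wget` calls in the Dockerfile hunk differ only in the table file name. As an illustration only (not part of this PR), the same list of download commands can be generated from the base URL. The names `FUSIONGDB_TABLES` and `table_url` are hypothetical, and the sketch only prints the commands without fetching anything; the `--no-check-certificate` flag used in the Dockerfile is left out here for brevity.

```python
# Base URL and table names are copied from the Dockerfile hunk above.
BASE_URL = 'https://ccsm.uth.edu/FusionGDB/tables'
FUSIONGDB_TABLES = [
    'TCGA_ChiTaRS_combined_fusion_information_on_hg19.txt',
    'TCGA_ChiTaRS_combined_fusion_ORF_analyzed_gencode_h19v19.txt',
    'uniprot_gsymbol.txt',
    'fusion_uniprot_related_drugs.txt',
    'fusion_ppi.txt',
    'fgene_disease_associations.txt',
]


def table_url(name):
    """Build the download URL for one FusionGDB table (hypothetical helper)."""
    return f'{BASE_URL}/{name}'


for name in FUSIONGDB_TABLES:
    # Print the equivalent download command; nothing is fetched here
    print(f'wget {table_url(name)} -O {name}')
```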
File renamed without changes.
35 changes: 25 additions & 10 deletions README.md
@@ -8,33 +8,48 @@
![Singularity Container available](
https://img.shields.io/badge/singularity-available-7E4C74.svg)

### Introduction

**nfcore/rnafusion** uses RNA-seq data to detect fusions genes.

The workflow processes raw single-read (limited tools available) or paired-end data from FastQ input ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), detect fusion genes ([STAR-Fusion](https://github.com/STAR-Fusion/STAR-Fusion), [Fusioncatcher](https://github.com/ndaniel/fusioncatcher), [Ericscript](https://sites.google.com/site/bioericscript/), [Pizzly](https://github.com/pmelsted/pizzly), [Squid](https://github.com/Kingsford-Group/squid)), visualizes the fusions ([FusionInspector](https://github.com/FusionInspector/FusionInspector)), performs quality-control on the results ([MultiQC](http://multiqc.info)) and finally generates custom summary report.
The workflow processes RNA-sequencing data from FastQ files. It runs quality control on the raw data ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), detects fusion genes ([STAR-Fusion](https://github.com/STAR-Fusion/STAR-Fusion), [Fusioncatcher](https://github.com/ndaniel/fusioncatcher), [Ericscript](https://sites.google.com/site/bioericscript/), [Pizzly](https://github.com/pmelsted/pizzly), [Squid](https://github.com/Kingsford-Group/squid)), gathers information ([FusionGDB](https://ccsm.uth.edu/FusionGDB/index.html)), visualizes the fusions ([FusionInspector](https://github.com/FusionInspector/FusionInspector)), performs quality-control on the results ([MultiQC](http://multiqc.info)) and finally generates custom summary report.

![Final summary report](docs/images/example-summary-report.png)

**Note: Make sure to read the installation guide before running the pipeline.**
The pipeline works with both single-end and paired-end data, though not all fusion detection tools work with single-end data (Ericscript, Pizzly, Squid and FusionInspector).

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker / singularity containers making installation trivial and results highly reproducible.

| Tool | Single-end reads | CPU (recommended) | RAM (recommended) |
| --------------- |:----------------:|:-----------------:|:-----------------:|
| [Star-Fusion](https://github.com/STAR-Fusion/STAR-Fusion/wiki) | Yes | >=16 cores | ~30GB |
| [Fusioncatcher](https://github.com/ndaniel/fusioncatcher/blob/master/doc/manual.md) | Yes | >=16 cores | ~60GB |
| [Ericscript](https://sites.google.com/site/bioericscript/getting-started) | **No** | >=16 cores | ~30GB |
| [Pizzly](https://github.com/pmelsted/pizzly) | **No** | >=16 cores | ~30GB |
| [Squid](https://github.com/Kingsford-Group/squid) | **No** | >=16 cores | ~30GB |
| [FusionInspector](https://github.com/FusionInspector/FusionInspector/wiki) | **No** | >=16 cores | ~30GB |

> **TL;DR:** Make sure to download all required references for each tool. More details can be found in section [tools](docs/tools.md).

```bash
nextflow run nf-core/rnafusion --reads '*_R{1,2}.fastq.gz' --genome GRCh38 -profile docker --star_fusion --fusioncatcher --ericscript --pizzly --squid --fusion_inspector
```

### Documentation
For available parameters or help run:

```bash
nextflow run nf-core/rnafusion --help
```

## Documentation

The nf-core/rnafusion pipeline comes with documentation about the pipeline, found in the `docs/` directory:

1. [Installation](docs/installation.md)
2. Pipeline configuration
* [Tools](docs/tools.md)
* [Download references for tools](docs/references.md)
* [Local installation](docs/configuration/local.md)
* [Swedish UPPMAX cluster](docs/configuration/uppmax.md)
* [Adding your own system](docs/configuration/adding_your_own.md)
3. [Running the pipeline](docs/usage.md)
4. [Output and how to interpret the results](docs/output.md)
5. [Troubleshooting](docs/troubleshooting.md)

### Final summary output

![Final summary report](docs/images/example-summary-report.png)
Use predefined configuration for desired Institution cluster provided at [nfcore/config](https://github.com/nf-core/configs) repository.
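Review note: the single-end support table in the README hunk can be read as a simple lookup. The following is a minimal sketch, assuming hypothetical names (`SINGLE_END_SUPPORT`, `runnable_tools`) that do not exist in the pipeline, of how one might decide which requested tools can run for a given read layout. The tool keys follow the command-line flags shown in the README example.

```python
# Single-end support per tool, transcribed from the README table above.
SINGLE_END_SUPPORT = {
    'star_fusion': True,
    'fusioncatcher': True,
    'ericscript': False,
    'pizzly': False,
    'squid': False,
    'fusion_inspector': False,
}


def runnable_tools(requested, single_end):
    """Return the requested tools that can run for the given read layout."""
    if not single_end:
        # Assumption: every listed tool supports paired-end data
        return list(requested)
    # Unknown tools are treated as unsupported for single-end reads
    return [tool for tool in requested if SINGLE_END_SUPPORT.get(tool, False)]


print(runnable_tools(['star_fusion', 'pizzly', 'squid'], single_end=True))
```

With single-end input, only STAR-Fusion and Fusioncatcher would remain in the returned list.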
65 changes: 45 additions & 20 deletions bin/create_mqc_section.py
@@ -1,15 +1,16 @@
#!/usr/bin/env python3
from collections import OrderedDict
from yaml import dump
import argparse
import yaml
import sys
import os
import yaml
from yaml import dump


OUTPUT = 'fusion_genes_config_mqc.yaml'
TEMPLATE = OrderedDict([
('id', 'fusion_genes'),
('s', 'Fusion genes'),
('section_name', 'Fusion genes'),
('description', 'Number of fusion genes found by various tools'),
('plot_type', 'bargraph'),
('pconfig', {
@@ -24,22 +25,24 @@ def findings(p_yaml, p_sample_name):
result = {}

if p_yaml is None:
return
return None

# Counts per tool
for tool, fusions in p_yaml.items():
result[tool] = len(fusions) if fusions is not None else 0

# If only one tool was found, there is no need to make intercept
if len(result) == 1:
template['data'] = { p_sample_name: result }
return OrderedDict(template)
template['data'] = {p_sample_name: result}
else:
# Intersect
result['together'] = len(
set.intersection(*map(set, [fusions for _, fusions in p_yaml.items()]))
)

# Group results
template['data'] = {p_sample_name: result}

# Intersect
result['together'] = len(set.intersection(*map(set, [fusions for _, fusions in p_yaml.items()])))

# Group results
template['data'] = { p_sample_name: result }
return OrderedDict(template)

def summary(p_input, p_sample_name):
@@ -49,22 +52,44 @@ def summary(p_input, p_sample_name):
with open(p_input, 'r') as stream, open(OUTPUT, 'w') as out_file:
yaml_data = yaml.safe_load(stream)
# Conversion to nice yaml file
yaml.add_representer(OrderedDict, lambda dumper, data: dumper.represent_mapping('tag:yaml.org,2002:map', data.items()))
yaml.add_representer(
OrderedDict,
lambda dumper, data:
dumper.represent_mapping('tag:yaml.org,2002:map', data.items())
)
# Find and store
out_file.write(dump(findings(yaml_data, p_sample_name), default_flow_style=False, allow_unicode=True))
out_file.write(
dump(
findings(yaml_data, p_sample_name),
default_flow_style=False,
allow_unicode=True
)
)
stream.close()
out_file.close()
except IOError as error:
sys.exit(error)
except yaml.YAMLError as error:
sys.exit(error)
except Exception as error:
sys.exit(error)

def main():
parser = argparse.ArgumentParser(
description='Tool for generating Fusion MultiQC section'
)
parser.add_argument(
'-i', '--input',
help='Input file',
type=str,
required=True
)
parser.add_argument(
'-s', '--sample',
help='Sample name',
type=str,
required=True
)
args = parser.parse_args()
summary(args.input, args.sample)

if __name__ == "__main__":
parser = argparse.ArgumentParser(description="""Utility for generating data structure for MultiQC""")
parser.add_argument('-i', '--input', nargs='?', help='Input file', type=str, required=True)
parser.add_argument('-s', '--sample', nargs='?', help='Sample name', type=str, required=True)
args = parser.parse_args()
summary(args.input, args.sample)
main()
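Review note: in the `findings` hunk, the per-tool count guards against a tool whose entry is `None`, but the intersection passes every value straight to `set`, which appears to raise `TypeError` when an entry is `None`. A minimal sketch of the same counting logic with that guard applied follows. The function name `count_fusions` and the sample data are hypothetical, and this is not the code merged in this PR.

```python
def count_fusions(per_tool):
    """Count fusions per tool and those reported by every tool.

    `per_tool` maps a tool name to a list of fusion names, or to None
    when the tool found nothing (mirroring the parsed YAML input).
    """
    if per_tool is None:
        return None

    # Counts per tool; a missing list counts as zero fusions
    result = {tool: len(fusions or []) for tool, fusions in per_tool.items()}

    # With a single tool there is nothing to intersect
    if len(per_tool) > 1:
        fusion_sets = [set(fusions or []) for fusions in per_tool.values()]
        result['together'] = len(set.intersection(*fusion_sets))

    return result


# Hypothetical sample input: one tool reports nothing
example = {
    'star_fusion': ['BCR--ABL1', 'EML4--ALK'],
    'fusioncatcher': ['BCR--ABL1'],
    'squid': None,
}
print(count_fusions(example))
```

Treating `None` as an empty set means a tool that found nothing forces the `together` count to zero, which matches the strict reading of "found by every tool". Whether that is the desired behaviour for the report is a design choice for the authors.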