Workflow to generate the SQLite files for the MeSH.db, MeSH.AOR.db, MeSH.PCR.db, and MeSH.XXX.eg.db-type packages.
- Bash: GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
- Snakemake: 6.0.5
- Singularity: 3.5.3
- NCBI API Key: For details, see "A General Introduction to the E-utilities".
- USEARCH: This workflow needs the ublast command. Download and install it from the USEARCH download page.
- config.yaml:
  - UBLAST_PATH: Set to the path where you downloaded USEARCH
  - THIS_YEAR: Update when the year changes
  - METADATA_VERSION: Increment on each update (v001 -> v002 -> ...)
  - MESH_VERSION: Update as needed (check the latest NLM MeSH release)
  - BIOC_VERSION: Set to the next version of Bioconductor
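Before starting, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the workflow itself: the function name check_prereqs and its optional path argument are hypothetical, and it only checks that the tools and the API key are present, not that their versions match the ones listed.

```shell
# Print one line per missing prerequisite; return non-zero if any is missing.
# $1 (optional, hypothetical): the path you set as UBLAST_PATH in config.yaml.
check_prereqs() {
    missing=0
    for cmd in bash snakemake singularity; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd"
            missing=1
        fi
    done
    # The download step expects the NCBI API key in the environment.
    if [ -z "${NCBI_API_KEY:-}" ]; then
        echo "missing environment variable: NCBI_API_KEY"
        missing=1
    fi
    # Only checked when a path is passed in.
    if [ -n "${1:-}" ] && [ ! -e "$1" ]; then
        echo "path not found: $1"
        missing=1
    fi
    return "$missing"
}
```

Call it as check_prereqs, or as check_prereqs followed by your USEARCH path, and fix whatever it reports before running the workflows.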
The workflow consists of seven Snakemake workflows, run in the order shown below.
On a local machine:
export NCBI_API_KEY=ABCDE12345 # Your API Key
snakemake -s workflow/download.smk -j 4 --use-singularity
snakemake -s workflow/ublast.smk -j 4 --use-singularity
snakemake -s workflow/preprocess.smk -j 4 --use-singularity
snakemake -s workflow/categorize.smk -j 4 --use-singularity
snakemake -s workflow/sqlite.smk -j 4 --use-singularity
snakemake -s workflow/metadata.smk -j 4 --use-singularity
snakemake -s workflow/plot.smk -j 4 --use-singularity
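The seven local commands above differ only in the workflow name, so they can be wrapped in a loop that stops at the first failing step. This is a sketch under assumptions: the function name run_workflows and the SNAKEMAKE override variable are hypothetical conveniences, not something the repository provides.

```shell
# Workflow names in the order of the commands above.
STEPS="download ublast preprocess categorize sqlite metadata plot"

# $1: value for -j (defaults to 4, as in the commands above).
run_workflows() {
    njobs="${1:-4}"
    for step in $STEPS; do
        echo "== $step =="
        # SNAKEMAKE is a hypothetical hook for substituting another executable.
        "${SNAKEMAKE:-snakemake}" -s "workflow/${step}.smk" -j "$njobs" --use-singularity || {
            echo "step failed: $step" >&2
            return 1
        }
    done
}
```

After exporting NCBI_API_KEY, run_workflows 4 reproduces the sequence above. Stopping on the first error avoids running later steps against incomplete outputs.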
In a parallel environment (GridEngine):
export NCBI_API_KEY=ABCDE12345
snakemake -s workflow/download.smk -j 4 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/ublast.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/categorize.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
In a parallel environment (Slurm):
export NCBI_API_KEY=ABCDE12345
snakemake -s workflow/download.smk -j 4 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/ublast.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/categorize.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
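The GridEngine and Slurm command sets above differ only in the string passed to --cluster, and within each set only the download step uses -j 4 instead of -j 96. A sketch that factors this out is shown here. The names run_on_cluster, jobs_for, and the SNAKEMAKE override variable are hypothetical, and the submit strings are site-specific, so copy yours from the commands above.

```shell
# Workflow names in the order of the commands above.
STEPS="download ublast preprocess categorize sqlite metadata plot"

# Value for -j per step: the commands above use 4 for download, 96 otherwise.
jobs_for() {
    case "$1" in
        download) echo 4 ;;
        *)        echo 96 ;;
    esac
}

# $1: submit command to pass to --cluster (qsub or sbatch with its options).
run_on_cluster() {
    submit="$1"
    for step in $STEPS; do
        # SNAKEMAKE is a hypothetical hook for substituting another executable.
        "${SNAKEMAKE:-snakemake}" -s "workflow/${step}.smk" -j "$(jobs_for "$step")" \
            --cluster "$submit" --latency-wait 600 --use-singularity || return 1
    done
}

# Example calls, commented out so that sourcing this file runs nothing:
# run_on_cluster "qsub -l nc=4 -p -50 -r yes -q node.q"
# run_on_cluster "sbatch -n 4 --nice=50 --requeue -p node03-06"
```

Keeping the submit string in one place means a change of queue or partition name is made once rather than in seven commands.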
Copyright (c) 2021 Koki Tsuyuzaki and RIKEN Bioinformatics Research Unit. Released under the Artistic License 2.0.
- Koki Tsuyuzaki
- Manabu Ishii
- Itoshi Nikaido