Analysis for Manuscript #1

Open
16 of 19 tasks
rx32940 opened this issue May 15, 2020 · 2 comments

rx32940 commented May 15, 2020

  • kneaddata: sequence quality check, removal of contaminant and host reads
  • kraken2 mini database (most up-to-date version, minikraken_8GB_20200312, includes the human genome)
    • absolute bar plot
  • kraken2 w/standard database
    • absolute bar plot
  • kraken2 w/custom database (?)
    • phylum
    • genus
    • species
  • CLARK
    • phylum
    • genus
    • species
  • CLARK-S
    • phylum
    • genus
    • species
  • PCA (with results from custom databases)
  • relative abundance

rx32940 commented May 15, 2020

new kraken2 analysis outputs in:

/scratch/rx32940/kraken2_052020

kneaddata

  • script for kneaddata is in kraken2_pipeline.sh
  • reference genomes for R. rattus (GCF_011064425.1) and R. norvegicus (GCF_000001895.5) from RefSeq
    • reference database (bowtie2-build):
/scratch/rx32940/kraken2_052020/kneaddata/ref_db
  • paired-end output sequences after trimming and host cleaning (based on each sample's specific host):
/scratch/rx32940/kraken2_052020/kneaddata/hostclean_seq
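
A minimal sketch of how the per-sample host choice could be looked up before host cleaning. The `ref_db` directory is the one listed above; the index prefix names, the sample IDs, and the `sample_hosts` table are hypothetical placeholders, not the names used in kraken2_pipeline.sh.

```python
from pathlib import PurePosixPath

# Directory taken from this comment; everything below it is an assumed layout.
REF_DB = PurePosixPath("/scratch/rx32940/kraken2_052020/kneaddata/ref_db")

# Hypothetical bowtie2 index prefixes, one per host reference genome.
HOST_INDEX = {
    "R.rattus": REF_DB / "R_rattus_GCF_011064425.1",
    "R.norvegicus": REF_DB / "R_norvegicus_GCF_000001895.5",
}


def reference_for(sample_id, sample_hosts):
    """Return the bowtie2 index prefix to use when host-cleaning one sample."""
    host = sample_hosts[sample_id]
    try:
        return HOST_INDEX[host]
    except KeyError:
        raise ValueError(f"no reference index configured for host {host!r}") from None


# Hypothetical sample-to-host table; the real one would come from sample metadata.
sample_hosts = {"sampleA": "R.rattus", "sampleB": "R.norvegicus"}

for sample in sample_hosts:
    print(sample, reference_for(sample, sample_hosts))
```

The returned prefix is what would be handed to kneaddata as the reference database for that sample.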

kraken2 databases

kraken2 database found in:

/scratch/rx32940/kraken2_052020/kraken2/kraken2_db
  • build standard database
    • Do I need to build a custom database?
  • download minikraken_8GB_202003.tgz pre-built database from JHU FTP link (wget)
    • human genome included
    • most up-to-date database

kraken2 results

Kraken2 results found in:

/scratch/rx32940/kraken2_052020/kraken2/(mini_output/standard_output)
  • download kreport files for absolute read analysis
  • local absolute read data stored in:
/Users/rx32940/Dropbox/5.Rachel-projects/Metagenomic_Analysis/final_analysis/kraken2/(minikraken/standard)/absolute
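
A rough sketch of pulling absolute read counts out of a kreport file, assuming the standard six-column, tab-separated Kraken2 report (percent, clade reads, direct reads, rank code, taxid, indented name). The function names and the toy report text are made up for illustration; this is not the Rmarkdown code linked below.

```python
def parse_kreport(text, rank_code="S"):
    """Return {taxon name: clade read count} for one rank code of a Kraken2 report."""
    counts = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            continue  # skip blank lines and anything not in the six-column layout
        _pct, clade_reads, _direct, rank, _taxid, name = fields
        if rank == rank_code:
            counts[name.strip()] = int(clade_reads)
    return counts


def relative_abundance(counts):
    """Convert absolute counts to fractions of the reads assigned at that rank."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Toy report with invented read counts, only to show the layout.
toy_report = "\n".join([
    " 10.00\t10\t10\tU\t0\tunclassified",
    " 90.00\t90\t0\tR\t1\troot",
    " 60.00\t60\t0\tG\t561\t      Escherichia",
    " 60.00\t60\t60\tS\t562\t        Escherichia coli",
    " 30.00\t30\t0\tG\t590\t      Salmonella",
    " 30.00\t30\t30\tS\t28901\t        Salmonella enterica",
])

species = parse_kreport(toy_report, "S")
print(species)
print(relative_abundance(species))
```

Passing `"G"` or `"P"` as `rank_code` gives the genus or phylum table from the same report. The fractions here are relative to reads classified at the chosen rank, not to all reads in the sample.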

Rmarkdown script for barplot visualization with minikraken db

Rmarkdown script for barplot visualization with standard kraken2 db

Rmarkdown script for barplot visualization with custom kraken2 db


rx32940 commented Jul 24, 2020

new clark analysis outputs in:

/project/lslab/Rachel/Metagenomics/clark_0613

Final input for CLARK analysis:

clark_0613/hostclean_seq/{$SAMPLE_ID}_1_kneaddata_paired_{1/2}.fastq.paired.fq
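
For reference, the naming pattern above can be expanded per sample like this. A small sketch only: the helper name and the sample ID `S01` are placeholders.

```python
from pathlib import PurePosixPath

# Input directory from the pattern above, relative to the clark_0613 parent directory.
INPUT_DIR = PurePosixPath("clark_0613/hostclean_seq")


def clark_inputs(sample_id):
    """Return the (mate 1, mate 2) host-cleaned FASTQ paths for one sample."""
    return tuple(
        INPUT_DIR / f"{sample_id}_1_kneaddata_paired_{mate}.fastq.paired.fq"
        for mate in (1, 2)
    )


# "S01" is a placeholder sample ID.
for path in clark_inputs("S01"):
    print(path)
```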

CLARK database

  • CLARK standard database matching KRAKEN2 standard database (with UniVec_Core.fasta and no Rattus reference sequences)
  • CLARK Custom database matching KRAKEN2 custom database built in the last thread
  • CLARK-S Spaced database built based on the CLARK custom database
    Directory for all three databases: clark_0613/database/standard
    - CLARK output includes lineage information for each taxon, so targets only need to be set at the species level to save time
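
The idea behind running at species level only can be sketched as a roll-up: once each species row carries its lineage, genus and phylum totals are sums over species counts. The row layout below (`species`, `genus`, `phylum`, `count` keys) and the counts are assumptions for illustration, not CLARK's actual output columns.

```python
from collections import defaultdict


def roll_up(species_rows, rank):
    """Sum species-level counts into totals for a higher rank ("genus" or "phylum")."""
    totals = defaultdict(int)
    for row in species_rows:
        totals[row[rank]] += row["count"]
    return dict(totals)


# Toy species-level table with invented counts; lineage fields are assumed to be
# already split out of the abundance output.
rows = [
    {"species": "Escherichia coli", "genus": "Escherichia",
     "phylum": "Proteobacteria", "count": 60},
    {"species": "Salmonella enterica", "genus": "Salmonella",
     "phylum": "Proteobacteria", "count": 30},
    {"species": "Bacillus subtilis", "genus": "Bacillus",
     "phylum": "Firmicutes", "count": 10},
]

print(roll_up(rows, "genus"))
print(roll_up(rows, "phylum"))
```

One caveat with this approach: reads that could only be placed above species level never appear in the species table, so rolled-up totals may be lower than a direct genus- or phylum-level run.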

CLARK output

standard database: clark_0613/output_genus(phylum)
custom database: clark_0613/output_species_rat
spaced database: output_species_rat_spaced

script for abundance visualization

Rmarkdown script for barplot visualization with standard db
Rmarkdown script for barplot visualization with custom db
Rmarkdown script for barplot visualization with spaced db
