Introduction

ACE Queue

Getting Data Into Galaxy

You can import the following histories by clicking on the links. Sharing histories this way makes both the data AND the analysis transparent, reproducible, and shareable.

  1. Input data https://usegalaxy.eu/u/poorani/h/input-data-ace-winter-2021mar
    • 004-2_*.fastq - original sample data from tutorial
    • 004-2_subs_*.fastq - subsampled to 25k reads
  2. Full history https://usegalaxy.eu/u/poorani/h/full-workflow-ace-winter-2021
    • full history up to jbrowse
    • uses the original sample data (not subsampled)
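The notes do not record how the 25k-read subsample was made. As a rough sketch only, here is one way paired FASTQ files could be subsampled in Python so that mates stay together. The function names, the seed, and the in-memory approach are all assumptions for illustration, not the method used for the shared history.

```python
import random


def read_fastq(lines):
    """Group an iterable of FASTQ lines into 4-line records."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def subsample_pairs(forward, reverse, n, seed=0):
    """Return n randomly chosen read pairs, keeping forward/reverse mates together.

    Hypothetical helper: loads all pairs into memory, which is fine for a
    small tutorial sample but not for very large files.
    """
    pairs = list(zip(read_fastq(forward), read_fastq(reverse)))
    if n >= len(pairs):
        return pairs
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    keep = sorted(rng.sample(range(len(pairs)), n))  # preserve original order
    return [pairs[i] for i in keep]
```

With open file handles for the two original FASTQ files, something like `subsample_pairs(fwd_handle, rev_handle, 25000)` would give a subsample of the same size as the one described above.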

Tutorial

  1. TB-Profiler profile Tool: with the following parameters
  2. When snippy is run with GenBank format input, it prepends GENE_ to gene names in the VCF annotation. This causes a problem for TB Variant Report, so we need to edit the output with sed.

  3. TB Variant Report Tool: with the following parameters

    • "Input SnpEff annotated M.tuberculosis VCF(s)": Text transformation on data XX. Make sure you use the transformed TB Variant Filter data that you just made.
    • "TBProfiler Drug Resistance Report (Optional)": TB-Profiler Profile on data XX: Results.json
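The GENE_ edit mentioned in the steps above amounts to a global find-and-replace. The exact sed program is not recorded in these notes; assuming it is a plain substitution along the lines of `s/GENE_//g`, a minimal Python equivalent might look like the sketch below. The function name is hypothetical, and leaving header lines untouched is a choice made here, not something the notes specify.

```python
def strip_gene_prefix(vcf_lines):
    """Yield VCF lines with every 'GENE_' occurrence removed from data lines.

    Header lines (starting with '#') are passed through unchanged; this is
    an assumption of the sketch, a plain sed substitution would touch them too.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
        else:
            yield line.replace("GENE_", "")
```

Outside Galaxy this could be applied line by line to a downloaded VCF before passing it on; inside Galaxy the Text transformation step does the same job without leaving the history.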

Analysis Notes

  1. Always choose the option to output a log file!
  2. Click on the name of a tool's output, then on the (i) details icon.
    • This gives parameters and run details of the job.
    • Scroll down and see what resources (CPU, memory) were requested and granted. The resource request is not controlled by the user currently. Could analyses be made more efficient by requesting different resources?
  3. Look at the formula for FastQC: https://github.com/galaxyproject/tools-iuc/tree/master/tools/fastqc
  4. Switch to MultiQC version 1.9
  5. Make a dataset pair for input to Trimmomatic
  6. Used the MiniKraken database for kraken2 for speed. It works well enough when the goal is just to find contaminants.
  7. Make sure to use snippy version 4.5.0
  8. String together snippy step and the TB variant filter step using a Galaxy Workflow.
    • The Snippy and TB Variant Filter workflow I made in the class can be imported into your own workspace!
    • See progress of running workflows (and record of finished ones): User menu at the top > Workflow Invocations
  9. Under Interactive Tools on left, try out some of the visualizations like bam.iobio and vcf.iobio
    • You can also open an RStudio instance. When you're done, don't forget to close it (User > Active InteractiveTools > select the tool and click Stop) and delete it from your history.
 

A work by Poorani Subramanian