Junkdnalab/hgg_2022

Intro to Computational Genomics

Course Abstract

The goal of this course is to give students an overview of the fields of bioinformatics and computational genomics, and space to acquire the basic computer literacy needed to analyze next-generation sequencing experiments. We will cover fundamental concepts in the analysis and interpretation of genomics data, from bulk to single-cell technologies, and explore in depth the structural properties of key data types and their associated file formats. While we acknowledge that 6 weeks is far too short a time for this course to be considered comprehensive, our goal is to provide students with the basic skills to advocate for and analyze their own data.

Course Format

The course takes place over 6 weeks, with 1.5-hour lectures on Tuesdays and Thursdays from 1:00 PM to 2:30 PM, for a total of 12 lectures and workshops. Most lectures are accompanied by readings, light homework, or both. Assignments must be completed on time for full credit, no exceptions. There is no final exam.

Special considerations for COVID-19

Due to social distancing protocols, the course will take place online on MS Teams. Links will be provided by email invite.

Expectations: Attendance, Homework & Grading

The course is worth 120 points. Late assignments will receive a maximum of 50% of full credit and are accepted up to one lecture after the original due date. When homework is not assigned, points will be given for attendance. Readings are indicated on the syllabus. Each question set is worth 10 points and will be graded on completeness. Links to lecture slides and readings are provided in the syllabus below.

Attendance is required. Absences may be excused by the graduate school (email Emma Yates Kassler), either with permission obtained ahead of time or, in extenuating circumstances, after the fact. Homework is due on time (see schedule below) regardless of attendance, or may be submitted one lecture late for half credit. Unexcused absences result in an incomplete grade.

Lecture & assignment schedule:

Tuesday, March 15 Bioinformatics (Hazelett)

Lecture topics:

  • Introduction
  • Course overview
  • History of bioinformatics
  • Introduction to Unix

Lecture slides:

Lecture 01

readings:

Thursday, March 17 The Human Genome (Hazelett)

Lecture topics:

  • The Human Genome
  • More basic unix and tour the unix file system

Lecture slides:

Lecture 02

Homework Assignment 1:

Due March 29th

Tuesday, March 22 Genomics File Formats (Coetzee)

Lecture topics:

  • Understanding NGS file formats
  • Understanding NGS quality assessment

Lecture slides:

Lecture 03

Homework Assignment 2:

Due March 31st

Thursday, March 24 Working with data on the command line: Searching NGS File Formats (Coetzee)

Lecture topics:

  • Search for characters or patterns in a text file using the grep command
  • Write to and append a file using output redirection
  • Use the pipe | character to chain together commands
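
For readers who think more readily in Python, the three command-line ideas above can be sketched as follows. This is only an illustrative analogue, not course material: the file names (`reads.fastq`, `headers.txt`) and the sample records are hypothetical, and the Python functions stand in for `grep`, `>` / `>>`, and `|` rather than reproducing them exactly.

```python
import re
import tempfile
from pathlib import Path


def grep(pattern, lines):
    """Yield lines matching a regular expression, like `grep PATTERN`."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def count_lines(lines):
    """Count lines, like `wc -l`."""
    return sum(1 for _ in lines)


# Hypothetical FASTQ content, for illustration only.
fastq_text = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGACCAA\n+\nIIIIIIII\n"
)

workdir = Path(tempfile.mkdtemp())
fastq_path = workdir / "reads.fastq"
headers_path = workdir / "headers.txt"

# Like `command > file`: mode "w" creates or overwrites the file.
fastq_path.write_text(fastq_text)

# Like `grep '^@read' reads.fastq > headers.txt`.
with fastq_path.open() as src, headers_path.open("w") as out:
    out.writelines(grep(r"^@read", src))

# Like `command >> file`: mode "a" appends instead of overwriting.
with headers_path.open("a") as out:
    out.write("@read3\n")

# Like `grep '^@read' headers.txt | wc -l`: one step's output feeds the next.
with headers_path.open() as src:
    n_headers = count_lines(grep(r"^@read", src))

print(n_headers)
```

Because `grep` here is a generator, each line flows to `count_lines` as it is read, which is loosely how a pipe streams data between two programs instead of storing an intermediate file.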

Lecture slides:

Lecture 04

Tuesday, March 29 Working with data on the command line: awk & bedtools - MORE Searching NGS File Formats (Coetzee)

Lecture topics:

  • Search for characters or patterns in a column-specific manner using awk.
  • Learn to use bedtools to accomplish genome arithmetic.
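
A small Python sketch of the two ideas above: filtering rows by a column value (roughly what `awk '$1 == "chr1"'` does on a tab-delimited file) and intersecting two sets of intervals (the simplest form of the genome arithmetic that `bedtools intersect` performs). The BED records are made up and the function names are hypothetical; the sketch assumes 0-based, half-open BED coordinates and does not attempt to reproduce bedtools' options or output format.

```python
def parse_bed(text):
    """Parse tab-delimited BED lines into (chrom, start, end) tuples."""
    records = []
    for line in text.strip().splitlines():
        chrom, start, end = line.split("\t")[:3]
        records.append((chrom, int(start), int(end)))
    return records


def filter_by_chrom(records, chrom):
    """Keep records whose first column equals `chrom` (column-specific match)."""
    return [r for r in records if r[0] == chrom]


def intersect(a_records, b_records):
    """Return the overlapping portion of every pair of intervals
    that share a chromosome (naive all-against-all comparison)."""
    overlaps = []
    for chrom_a, start_a, end_a in a_records:
        for chrom_b, start_b, end_b in b_records:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # half-open intervals overlap only if start < end
                overlaps.append((chrom_a, start, end))
    return overlaps


# Hypothetical intervals, for illustration only.
peaks = parse_bed("chr1\t100\t200\nchr1\t500\t600\nchr2\t100\t200\n")
genes = parse_bed("chr1\t150\t550\nchr2\t300\t400\n")

chr1_peaks = filter_by_chrom(peaks, "chr1")
shared = intersect(peaks, genes)

print(chr1_peaks)
print(shared)
```

The nested loop compares every interval pair, which is fine for a handful of records but far slower than a dedicated tool on genome-scale files.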

Lecture slides:

Lecture 05

Homework Assignment 3:

Due April 5

Thursday, March 31 Experimental Design, Enrichment and GO (Hazelett)

Lecture topics:

  • Experimental Design
  • Management of Big Data Projects
  • Biological Enrichment
  • Use (and Misuse) of Ontologies and Their Significance

Lecture slides:

Lecture 06

readings:

Homework Assignment 4:

Due April 14

Tuesday, April 5 RNA seq analysis (Coetzee)

Lecture topics:

  • Understanding computational context of RNA-Seq.
  • Learning to judge data for quality metrics of RNA-Seq.
  • Quick differential expression analysis.

Lecture 07

Example DE

Example MultiQC

Thursday, April 7 Single Cell analysis (Coetzee)

Lecture 08

analysis

Readings and websites:

Tuesday, April 12 Bioinformatics for bench scientists (Guest: Lawrenson)

Lecture topics:

TBD

Thursday, April 14 Microbiome (Vujkovic-Cvijin)

Lecture topics:

16S rRNA sequence data processing and analysis

Homework Assignment: Follow and complete the tutorial for dada2 prior to April 14. Please also install the following R packages: vegan, phyloseq, lmerTest, lme4, ggplot2, dplyr, ape, reshape2.

Tuesday, April 19 Project management and Git (Hazelett)

  • What is project management?
  • Git Basics; creating and cloning a repo
  • Adding, committing, and pull requests

Lecture 11

Homework Assignment:

Due April 26: Create a private git repo and populate with all prior homework assignments. See lecture slides.

Thursday, April 21 (Lecture 12) Automated Machine Learning in Biomedicine: AutoMLPipe-BC (Urbanowicz)

  • What is machine learning?
  • Biomedical data challenges
  • Elements of machine learning analysis pipeline
  • Automated Machine Learning
  • Demonstration of AutoMLPipe-BC

Lecture 12

Pre-Lecture: Follow the instructions on slides 109-111 to install AutoMLPipe-BC on your Google Drive.

Schedule and Due Dates:

Day  Date   Lecturer         Homework            Due
Tue  03/15  HAZELETT
Thu  03/17  HAZELETT
Tue  03/22  COETZEE          hw1                 03/29
Thu  03/24  COETZEE          hw2                 03/31
Tue  03/29  COETZEE          hw3                 04/05
Thu  03/31  HAZELETT         hw4                 04/14
Tue  04/05  COETZEE
Thu  04/07  COETZEE
Tue  04/12  LAWRENSON                            04/21
Thu  04/14  VUJKOVIC-CVIJIN  dada2 tutorial      04/14
Tue  04/19  HAZELETT         see lecture slides  04/26
Thu  04/21  URBANOWICZ
