Learn Bioinformatics in 100 hours

Important note

This course is being replaced by a newer course.

Please follow the newer course for the most up-to-date content.

Course information

The course represents all the training materials for the BMMB:852 Applied Bioinformatics course offered at Penn State in 2017.

The course offers a structured path through the Biostar Handbook. Sections of the book are presented as smaller, logically consistent units. We recommend learning two to four units per week.

The lectures consist of slides, links to various chapters, links to supporting materials and homework. There are no videos.

Please consult the synopsis for details on what is covered and how to learn the materials.

Note: This course follows the 1st edition of the Handbook and will not match the content of the 2nd Edition. There may be links and content that refer to sections that have been moved. For up-to-date content see Applied Bioinformatics (2020).

Lectures
Lecture 1: How is Bioinformatics practiced?

Course structure. How is bioinformatics practiced. Computer setup.

Test
Lecture 2: How do I use the command line?

Unix command line use. Find help on commands. Flag system.

Test
Lecture 3: How are Unix commands used for data analysis?

Examples of processing biological data from the command line.

Test
Lecture 4: What do the words mean?

How to make sense of terminology. Sequence and gene ontologies.

Test
Lecture 5: How to interpret a list of genes?

Functional enrichment, functional over-representation.

Lecture 6: How to access published data from the command line

Reproducibility. Data repositories. Entrez Direct

Test
Lecture 7: Data formats. Genbank, FASTA and FASTQ

Accessing and manipulating sequencing data.

Test
Lecture 8: Quality control of high throughput sequencing data

Quality visualization. Improving data quality. Adapter removal.

Test
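
As a rough illustration of the numbers that quality visualization starts from, the sketch below decodes a FASTQ quality string into Phred scores. It assumes the Phred+33 encoding (an assumption; older data may use a different offset), and the function names and the example string are hypothetical, not taken from the course materials.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Convert a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def mean_quality(quality: str) -> float:
    """Average Phred score of a read; an empty read is reported as 0.0."""
    scores = phred_scores(quality)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    qual = "II5!"  # made-up quality string for illustration
    print(phred_scores(qual))   # [40, 40, 20, 0]
    print(mean_quality(qual))   # 25.0
```

A score of 20 corresponds to an expected error rate of about 1 in 100, and 40 to about 1 in 10,000.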
Lecture 9: Advanced quality control of FASTQ data

Sequence duplication, read merging, MultiQC, error correction.

Lecture 10: Sequencing concepts, methods, coverage formula

Single-end and paired-end sequencing, computing sequencing depth.

Test
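
The coverage formula named in this lecture is commonly written as C = N * L / G (number of reads times read length, divided by genome size). The sketch below assumes that form; the function names and the example numbers are illustrative only and do not come from the course.

```python
import math


def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected average depth of coverage: C = N * L / G."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Rearranged formula: N = C * G / L, rounded up to a whole read."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # One million 100 bp reads over a 5 Mb genome (made-up numbers).
    print(coverage(1_000_000, 100, 5_000_000))   # 20.0
    # Reads of 150 bp needed for 30x over the same genome.
    print(reads_needed(30, 150, 5_000_000))      # 1000000
```

For paired-end data, count each mate as a read, or equivalently double the number of fragments.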
Lecture 11: Scripting and Automation

Automating tasks. Make analyses reproducible.

Test
Lecture 12: Accessing the Short Read Archive

Short read archive, fastq-dump, repeating commands

Lecture 13: Sequence Alignments

Alignment scoring, global, local alignments
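
To make the difference between global and local alignment scoring concrete, here is a minimal sketch of both using dynamic programming (Needleman-Wunsch style for global, Smith-Waterman style for local). The scoring values (match +1, mismatch -1, linear gap -2) are arbitrary assumptions chosen for illustration; the course may use different scoring schemes.

```python
def global_alignment_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Best end-to-end alignment score (Needleman-Wunsch, linear gap penalty)."""
    # Row 0: aligning an empty prefix of a against j characters of b costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: i characters of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_alignment_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    """Best score over all substring pairs (Smith-Waterman, linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: an alignment may restart anywhere.
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACT"))            # 1
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))    # 4
```

The global score charges for every unaligned end, while the local score reports only the best-matching region (here the shared ACGT).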

Lecture 14: BLAST, Basic Local Alignment Search Tool

Using blast online and at the command line

Lecture 15: BLAST databases

Make blast databases. BLAST search tasks.

Test
Lecture 16: Short Read Aligners

What is short read alignment. How to run bwa and bowtie2.

Test
Lecture 17: Sequence Alignment Maps (SAM)

SAM/BAM: the workhorse of high-throughput sequencing.

Test
Lecture 18: Paired end reads in BAM files.

Create and filter BAM files.

Test
Test
Lecture 20: Visualizing Large Genomic Variation

Large insertions, deletions, copy number variations

Test
Lecture 21: Filtering SAM files

Select alignments by their attributes

Test
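
Selecting alignments by their attributes mostly comes down to testing bits in the SAM FLAG column and thresholding MAPQ. The sketch below is a plain-Python illustration of that logic, with semantics modeled on requiring all bits of one mask and excluding any bit of another. The function names and the tiny SAM records are hypothetical; in practice a dedicated tool such as samtools would be used.

```python
# FLAG bit meanings as defined in the SAM specification.
FLAG_NAMES = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list[str]:
    """List the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_NAMES.items() if flag & bit]


def passes(flag: int, require: int = 0, exclude: int = 0) -> bool:
    """True if all `require` bits are set and none of the `exclude` bits are."""
    return (flag & require) == require and (flag & exclude) == 0


def filter_sam(lines, require=0, exclude=0, min_mapq=0):
    """Yield header lines plus the alignment lines that pass the filters."""
    for line in lines:
        if not line.strip():
            continue
        if line.startswith("@"):  # header lines are kept unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])  # columns 2 and 5
        if passes(flag, require, exclude) and mapq >= min_mapq:
            yield line


if __name__ == "__main__":
    print(decode_flag(99))
    # ['paired', 'proper_pair', 'mate_reverse', 'first_in_pair']
```

For example, `filter_sam(lines, exclude=0x4, min_mapq=10)` keeps mapped alignments with a mapping quality of at least 10.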
Lecture 22: Processing SAM/BAM files

Picard tools. Unaligned BAM files.

Test
Lecture 23: Short Genomic Variations

First steps in detecting short variations

Lecture 24: Let's call some SNPs

SNP calling with bcftools and freebayes

Lecture 25: The Variant Call Format

Understand the VCF format.

Test
Lecture 26: Making sense of variants

Variant effect prediction, interval data types, BED, GFF.

Lecture 27: Sequencing Application Domains

Re-sequencing, assembly, classification

Lecture 28: Quantifying with sequencing

Functional assays, computing coverages over intervals

Course Synopsis: How does this course work?

What are the structure and purpose of this course?

Test