Why bioinformatics?

In the modern world, it often seems that the age of exploration is over. We have been to the Moon, there are research bases in Antarctica, and we can look at individual atoms. You may ask yourself: are there any wonders still waiting to be discovered? After all, it may seem that all we are left with is to refine and improve on prior observations.

I have good news for you all. The information age has opened up an entirely new domain of science: that of understanding what life itself consists of. We are all beginners in this new science, and we are all starting from the same knowledge and principles. Everyone has the chance to make unique and fascinating discoveries using little more than their own computer.

What is this book about?

The answers to some of the greatest questions of life lie within ourselves. Bioinformatics is a new science created from the marriage of data science and biology. Through this emerging field of study, scientists can find and decode information hidden in our very own genes, allowing us to understand what none before us have known.

This book teaches you practical skills that will allow you to enter this fast-expanding industry. Beginning with fundamental concepts, such as understanding data formats, how analyses are performed, and what conclusions can be drawn from data, the Handbook eases you into a world of possibilities.

With the help of the book, you will reproduce results from realistic data analysis scenarios such as genome assembly or gene expression analysis. The methods and tools you will find in the Handbook were refined in world-class research facilities and will allow you to break into this new field and tackle some of the most significant challenges we face at the scientific frontier of the 21st century.

What is covered in the book?

The Handbook is divided into several sections. We cover both the foundations and their applications to realistic data analysis scenarios.

Bioinformatics analysis concepts:

Data formats and repositories.
Sequence alignments.
Data visualization.
Unix command line usage.

Bioinformatics protocols:

Genome variation and SNP calling.
RNA-seq and gene expression analysis.
Genome assembly.
Metagenomics classification.
ChIP-seq analysis.

Software tool usage:

Using short-read aligners.
Using quality control tools.
Manipulating sequence data.

The table of contents on the left allows you to jump to the corresponding sections.