Jan 15 11

BLAST+ (blastall) Tutorial

by Jared
blastall

This is a tutorial for NCBI’s BLAST+ tools (formerly blastall), which allow users to run the various BLAST programs on their own machines. BLAST+ can be downloaded from NCBI’s website.

BLAST is possibly one of the most popular and important tools in bioinformatics. It is an algorithm that lets researchers search a database of sequences (nucleotide or amino acid) for matches or similarities to their query sequence. BLAST is a heuristic, so unlike the Smith-Waterman algorithm it is not guaranteed to find the optimal local alignment, but it is much faster and therefore generally more practical for searching large databases. For a detailed look at the BLAST algorithm, check out the BLAST Wikipedia entry.

The first step is to download and install the BLAST+ tools onto your machine. Once you have done that, you’ll find a number of different tools in the package. The main one that you will use to run the different kinds of BLAST searches is blastall. Here is an example of its usage:

$ blastall -p blastn -d nt -e 10 -i inputfile -o outputfile
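In the BLAST+ package itself, each search program (blastn, blastp, and so on) is its own executable, and the single-letter blastall flags have longer names. As a rough sketch of how a call like the one above carries over, here is a small Python helper that rewrites a blastall command into BLAST+ form. The function name `blastall_to_plus` and the mapping table are my own illustration, not part of BLAST+, and only the five flags used in the example are covered. I have assumed the nucleotide database `nt` here, since blastn searches nucleotide sequences.

```python
import shlex

# Legacy blastall flag -> BLAST+ flag.
# Assumption: only the five flags from the example above are handled.
LEGACY_TO_PLUS = {
    "-d": "-db",      # database name
    "-e": "-evalue",  # expectation value cutoff
    "-i": "-query",   # input (query) file
    "-o": "-out",     # output file
}


def blastall_to_plus(command):
    """Return the BLAST+ argument list for a legacy blastall command string."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "blastall":
        raise ValueError("expected a command starting with 'blastall'")

    options = tokens[1:]
    if len(options) % 2 != 0:
        raise ValueError("every flag must be followed by a value")

    program = None
    translated = []
    # Walk the options as (flag, value) pairs.
    for flag, value in zip(options[0::2], options[1::2]):
        if flag == "-p":
            # The -p value becomes the name of the executable itself.
            program = value
        elif flag in LEGACY_TO_PLUS:
            translated += [LEGACY_TO_PLUS[flag], value]
        else:
            raise ValueError("flag not covered by this sketch: " + flag)

    if program is None:
        raise ValueError("missing -p <program>")
    return [program] + translated


plus_args = blastall_to_plus(
    "blastall -p blastn -d nt -e 10 -i inputfile -o outputfile"
)
print(" ".join(shlex.quote(arg) for arg in plus_args))
# prints: blastn -db nt -evalue 10 -query inputfile -out outputfile
```

The helper returns a list rather than a string so it could be handed straight to something like `subprocess.run` without any shell quoting worries. Treat it as a starting point only; the other blastall flags do not all translate one to one.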

Common options are: read more…

Nov 1 10

BioPerl Modules

by Jared
BioPerl Logo

Perl has long been a popular language among bioinformatics programmers, due in part to its exceptional text search and manipulation capabilities. It is also an easy-to-use yet powerful scripting language. No doubt anyone who has done any bioinformatics programming has written a bit of Perl and, hopefully, used BioPerl.

BioPerl is a great set of open-source modules for Perl programming. These modules simplify many of the common tasks that bioinformatics programmers regularly deal with. BioPerl saves the programmer lots of time, so it is worth putting in a little effort to become familiar with the modules.

Here are some of the tasks that BioPerl has modules for: read more…

Jul 25 10

What is Bioinformatics?

by Jared
Gene Expression Heat Map

I frequently get asked, “What is Bioinformatics?” This is not a question with an exact, easy answer, due to the size and complexity of the Bioinformatics field. I often find myself giving a different answer each time it is asked. I will attempt to answer it in a way that is neither too complex nor too shallow. However, I will inevitably be forced to leave some of the more complex parts of Bioinformatics out of this answer.

The basic answer is that Bioinformatics is the field where Computer Science, Biology, and Statistics meet. But even for a basic answer, that is still too vague. I like to think of it as using Computer Science and Statistics to find and solve biological problems. Personally, I see Bioinformatics as having two main focuses: read more…