Introduction
This package of tools encompasses many of the common pieces of software required for the analysis of short read sequences produced by second-generation DNA sequencing machines (e.g. Illumina/Solexa sequencers, ABI SOLiD and 454). The focus of this project is on post-alignment analysis. It includes interpreters for most common aligners and SNP callers and accepts input in a wide variety of formats.
There are several major components to this package:
- FindPeaks, a Peak Finder/Analysis application for ChIP-Seq or RNA-Seq experiments.
- Variations Database, a standalone database schema and API for the analysis of variations across large sets of variant calls and sequencing experiments.
- Utilities for NGS, tools for sorting, separating, parsing or otherwise manipulating common aligned-sequence input formats.
This package also includes work done by Anthony Fejes in fulfilling the requirements of his graduate thesis at the University of British Columbia/BC Cancer's Genome Science Centre.
The Vancouver Short Read Analysis Project is hosted by SourceForge.net. The project page is here. The source code is available from the download page. You can check out the latest source code with:
svn co https://vancouvershortr.svn.sourceforge.net/svnroot/vancouvershortr
Publications
- Fejes, AP, Robertson, G, Bilenky, M, Varhol, R, Bainbridge, M, Jones, SJ (2008). FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics, 24(15):1729-30. [PMID: 18599518]
- Fejes, AP, Hadj Khodabakhshi, A, Birol, I, Jones, SJM. Human Variation Database: An open source database template for genomic discovery. (submitted)