Released

Poster

RGASP evaluation of RNA-Seq read alignment algorithms

MPS-Authors
Kahles, A
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Bohnert, R
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Behr, J
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Rätsch, G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Citation

Kahles, A., Bohnert, R., Ribeca, P., Behr, J., & Rätsch, G. (2011). RGASP evaluation of RNA-Seq read alignment algorithms. Poster presented at 4th Berlin Summer Meeting: “Computational & Experimental Molecular Biology Meet”, Berlin, Germany.


Cite as: https://hdl.handle.net/21.11116/0000-000C-2933-D
Abstract
As the amount of high-throughput sequencing (HTS) data grows rapidly, fast and accurate analysis becomes increasingly important. Among the wide spectrum of algorithms developed to align reads from RNA-Seq experiments, those capable of performing spliced alignments form a particularly interesting subgroup. The results of these techniques are very valuable for downstream transcriptome analyses. Unfortunately, most of the original publications were not accompanied by a comparison of alignment performance and result quality. The RNASeq Genome Annotation Assessment Project (RGASP, carried out by the Wellcome Trust Sanger Institute) was launched to assess the current progress of automatic gene building using RNA-Seq as its primary dataset. Its goal was to assess how well computational methods map RNA-Seq data onto the genome, assemble transcripts, and quantify their abundance in particular datasets. The input data originated from three model organisms (human, Drosophila, and C. elegans) and comprised data from different sequencing platforms (Illumina, SOLiD, and Helicos). As part of RGASP, alignments from a variety of methods, including BLAT, GEM, PALMapper, SIBsim4, TopHat, and GSnap, were also submitted; here we present the results of the analysis we performed on these submissions. In addition to descriptive statistical criteria, such as sensitivity and precision of intron recognition and mismatch and indel distributions, we also compared the alignments with each other, e.g., with respect to the agreement of intron predictions and multiple mappings of reads. We further investigated the influence of different alignment filtering strategies on alignment performance in general and on downstream analyses such as transcript prediction and quantification. Our comparisons showed a great diversity in the behavior of the different alignment strategies, with surprisingly small agreement between a subset of methods.
We show that different filtering strategies influence performance significantly and can drastically increase the precision of transcript prediction and transcript quantification. Additionally, evaluating the transcript annotations derived from these alignments allows us to correlate alignment accuracy with the precision of exon, transcript, and gene prediction. We will discuss specific features of the different alignment strategies that most influence the success of subsequent analysis steps. The tools developed for this analysis are incorporated into the Galaxy instance at http://galaxy.fml.mpg.de.
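
The abstract mentions sensitivity and precision of intron recognition and the agreement of intron predictions between aligners, but does not give the exact definitions used. The following is a minimal sketch of one common way to compute such figures, under stated assumptions: an intron counts as correct only if its coordinates match an annotated intron exactly, and agreement between two aligners is measured as the Jaccard index of their intron sets. The Jaccard index is chosen here for illustration and is not taken from the poster. All names (`Intron`, `intron_scores`, `intron_agreement`) and the toy coordinates are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical intron representation: (chromosome, strand, start, end).
Intron = Tuple[str, str, int, int]


class IntronScores(NamedTuple):
    sensitivity: float  # fraction of annotated introns that were recovered
    precision: float    # fraction of predicted introns that are annotated


def intron_scores(predicted: Set[Intron], annotated: Set[Intron]) -> IntronScores:
    """Score predicted introns against an annotation (exact match assumed)."""
    true_positives = len(predicted & annotated)
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return IntronScores(sensitivity, precision)


def intron_agreement(a: Set[Intron], b: Set[Intron]) -> float:
    """Jaccard index of two intron sets (an assumed agreement measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Toy data for illustration only; not results from the poster.
    annotation = {("chr1", "+", 100, 200), ("chr1", "+", 300, 400)}
    aligner_a = {("chr1", "+", 100, 200), ("chr1", "+", 500, 600)}
    aligner_b = {("chr1", "+", 100, 200), ("chr1", "+", 300, 400)}
    print(intron_scores(aligner_a, annotation))
    print(intron_scores(aligner_b, annotation))
    print(intron_agreement(aligner_a, aligner_b))
```

In practice the predicted intron sets would be extracted from the spliced alignments themselves, and a filtering strategy (for example, requiring a minimum number of supporting reads per intron) could be applied before scoring to study its effect on precision.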