The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data

Ramette, A.; Buttigieg, P.


Journal Article


MPG Authors


Ramette, A.
HGF MPG Joint Research Group for Deep Sea Ecology & Technology, Max Planck Institute for Marine Microbiology, Max Planck Society;


Buttigieg, P.
HGF MPG Joint Research Group for Deep Sea Ecology & Technology, Max Planck Institute for Marine Microbiology, Max Planck Society;


Full texts (freely accessible)

Ramette14.pdf
(publisher version), 3 MB


Citation

Ramette, A., & Buttigieg, P. (2014). The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data. Frontiers in Microbiology, 5: 601, pp. 1-9.

Citation link: https://hdl.handle.net/21.11116/0000-0001-C4D8-E

Abstract

Oligotyping is a novel, supervised computational method that classifies closely related sequences into "oligotypes" (OTs) based on subtle nucleotide variation (Eren et al., 2013). Applied to microbial datasets, it has helped reveal ecological patterns that are often hidden by the way sequence data are currently clustered into operational taxonomic units (OTUs). Here, we implemented the OT entropy decomposition procedure and its unsupervised version, Minimal Entropy Decomposition (MED; Eren et al., 2014c), in the statistical programming language and environment R. The aim of this implementation is to facilitate the integration of computational routines, interactive statistical analyses, and visualization into a single framework. In addition, two complementary approaches are implemented: (1) an analytical method (the broken stick model) to help identify low-abundance OTs that could be generated by chance alone, and (2) a one-pass profiling (OP) method to efficiently identify the OTUs for which subsequent oligotyping is most promising. These enhancements are especially useful for large datasets, where manual screening of entropy analysis results and the creation of a full set of OTs may not be feasible. The package and procedures are illustrated by several tutorials and examples.
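
The two quantities named in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the otu2ot API; the function names `column_entropies` and `broken_stick` and the toy reads are assumptions made for illustration only. The first function computes the Shannon entropy at each position of a set of aligned reads, which is the signal an entropy decomposition uses to pick the positions that separate oligotypes. The second returns the expected piece proportions under MacArthur's broken stick model, a common null expectation against which observed relative abundances can be compared; how the package applies that comparison is not described in the abstract.

```python
import math
from collections import Counter


def column_entropies(seqs):
    """Shannon entropy (in bits) at each position of equal-length aligned reads."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    n = len(seqs)
    entropies = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        # sum of p * log2(1/p), with p = c/n for each observed nucleotide
        entropies.append(sum((c / n) * math.log2(n / c) for c in counts.values()))
    return entropies


def broken_stick(n):
    """Expected proportions of n pieces under the broken stick model, largest first."""
    return [sum(1.0 / i for i in range(k, n + 1)) / n for k in range(1, n + 1)]


# Hypothetical toy alignment: only position 2 (0-based) varies.
reads = ["ACGT", "ACGT", "ACTT", "ACTT"]
entropy = column_entropies(reads)

# Choose the highest-entropy position and group reads by the nucleotide found there.
top = max(range(len(entropy)), key=entropy.__getitem__)
groups = Counter(read[top] for read in reads)

# Null expectation for three groups, for comparison with observed proportions.
expected = broken_stick(3)
```

In this sketch a single decomposition step splits the reads on the most variable position. A full procedure would repeat the step within each resulting group until no position exceeds a chosen entropy threshold, and that threshold would be a user-supplied parameter rather than anything fixed here.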