Conference Paper

The mutual information: Detecting and evaluating dependencies between variables

MPG Authors

Daub, C. O.
BioinformaticsCRG, Cooperative Research Groups, Max Planck Institute of Molecular Plant Physiology, Max Planck Society

Selbig, J.
BioinformaticsCRG, Cooperative Research Groups, Max Planck Institute of Molecular Plant Physiology, Max Planck Society

Citation

Steuer, R., Kurths, J., Daub, C. O., Weise, J., & Selbig, J. (2002). The mutual information: Detecting and evaluating dependencies between variables. In European Conference on Computational Biology (ECCB 2002) (pp. S231-S240).


Cite link: https://hdl.handle.net/11858/00-001M-0000-0014-2E79-B
Abstract
Motivation: Clustering co-expressed genes usually requires a definition of 'distance' or 'similarity' between measured datasets, the most common choices being Pearson correlation or Euclidean distance. With the size of available datasets steadily increasing, it has become feasible to consider other, more general definitions as well. One alternative, based on information theory, is the mutual information, which provides a general measure of dependencies between variables. While the use of mutual information in cluster analysis and visualization of large-scale gene expression data has been suggested previously, earlier studies did not focus on comparing different algorithms for estimating the mutual information from finite data.

Results: Here we describe and review several approaches to estimating the mutual information from finite datasets. Our findings show that the algorithms used so far can be substantially improved. In particular, when dealing with small datasets, finite-sample effects and other sources of potentially misleading results have to be taken into account.
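
To make the idea of estimating mutual information from finite data concrete, the following is a minimal sketch of a naive histogram (plug-in) estimator. It is a generic textbook construction and is not claimed to be one of the algorithms compared in the paper. The function names `equal_width_bins` and `mutual_information`, the equal-width binning scheme, and the default of 4 bins are all assumptions made for illustration.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 (equal-width bins).

    Hypothetical helper for this sketch; other binning schemes are possible.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Naive plug-in estimate of I(X;Y) in bits from paired samples.

    Assumption: probabilities are replaced by relative bin frequencies,
    with no correction for finite-sample bias.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    bx = equal_width_bins(x, n_bins)
    by = equal_width_bins(y, n_bins)
    joint = Counter(zip(bx, by))  # counts of (bin_x, bin_y) pairs
    count_x = Counter(bx)         # marginal counts for x
    count_y = Counter(by)         # marginal counts for y
    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log2(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * math.log2(c * n / (count_x[i] * count_y[j]))
    return mi


if __name__ == "__main__":
    xs = [-3, -2, -1, 0, 1, 2, 3]
    ys = [v * v for v in xs]  # nonlinear dependence, zero Pearson correlation
    print(mutual_information(xs, ys))
```

In the usage example, `ys` depends on `xs` only through a symmetric quadratic relation, so the Pearson correlation is zero while the estimate above is positive. Estimators of this naive kind are known to be sensitive to the number of bins and tend to overestimate dependence on small samples, which is the sort of finite-sample effect the abstract warns about.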