Analysis, interpretation, and visualisation of DamID-seq experiments
Date
06/07/2019
Author
Ashmore, James
Abstract
DNA adenine methyltransferase identification with sequencing (abbreviated DamID-seq)
is a technique that can measure protein-DNA interactions in the genome. Unlike
chromatin immunoprecipitation with sequencing (abbreviated ChIP-seq), this technique
does not require validated antibodies, precipitation steps, or chemical crosslinking,
and can be used with minimal numbers of cells. Although the technique was
first developed in Drosophila nearly two decades ago, technical limitations mean that
only a handful of experiments using mammalian cells have been published. The optimisation
of mammalian DamID-seq in our lab has highlighted the need to survey potential
sources of bias, develop accurate analysis methods, and investigate the similarities
and differences with ChIP-seq for detecting protein-DNA interactions. Here, I describe
several variables that influence the accuracy of DamID-seq experiments, present
the Daim software package (pronounced “Dime”) for the comprehensive analysis of
DamID-seq data, and assess the sensitivity and specificity of DamID-seq compared
with competing techniques. In particular, I show that differences in the experimental
procedure (polymerase usage and restriction digest) and features in the sequencing
data (fragment length and nucleotide content) generate systematic bias and technical
variation. I also demonstrate that DamID-seq data can be re-purposed to measure
Dam-accessible DNA in the genome, comparable with other chromatin accessibility
techniques (ATAC-seq, DNase-seq, and FAIRE-seq). To analyse DamID-seq data, I
developed the Daim software package, which incorporates methods for preprocessing,
normalisation, and identification of DNA binding and accessibility sites. Several
options for functional and sequence analysis of results are also included. I demonstrate
the use of Daim with data for the transcription factors Oct4 and Sox2 in mouse
embryonic stem cells, embryonic fibroblast cells, and neural stem cells across a range
of cell numbers. Finally, I show that DNA binding and accessibility sites vary substantially
between and within techniques, yet no clear reason for these differences has
been detected, prompting careful consideration of any biological conclusions. These
results show that Daim can be successfully used for the analysis, interpretation, and
visualisation of DamID-seq experiments, and that to achieve comprehensive results,
different techniques should be treated as complementary rather than competing.