The Danko lab studies how our DNA sequences control complex programs of gene transcription. Our work focuses primarily on understanding how natural genetic differences between species affect the various steps in the RNA polymerase II transcription cycle, which provides insight into the molecular basis of phenotypic differences between species. In addition, the millions of naturally occurring random mutations serve as genetic “experiments” that present an exciting opportunity to understand the fundamental principles by which our DNA sequences encode gene expression.
We are an interdisciplinary research group making extensive use of both computational and molecular tools. Our specialty is developing statistical and machine learning approaches to analyze functional genomic sequencing data prepared using Hi-C, ATAC-seq, PRO-seq, RNA-seq, and related assays. Our tools borrow a variety of ideas from the fields of statistics and machine learning, including hidden Markov models, support vector machines, and artificial neural networks. We also run an active wet lab that has made great strides in developing and using run-on and sequencing technologies to map the location of RNA polymerase, including PRO-seq and ChRO-seq. More recently, we have begun using Hi-C/Hi-ChIP, single-cell RNA-seq, and CRISPR epigenome editing technologies.
Understanding the chain of molecular events that link DNA sequence ‘genotype’ to organism ‘phenotype’ is one of the most exciting frontiers in modern genetics. DNA sequences located in non-coding regions of the genome are critical drivers of phenotypic differences, both between and within species.
Our objective is to discover the fundamental rules by which transcriptional changes arise from differences in DNA sequence and chromatin packaging within the nucleus. To achieve this goal, we integrate genomic data collected using a combination of molecular assays (PRO-seq, RNA-seq, ATAC-seq, and Hi-C). Most of our work focuses on CD4+ T-cells, a linchpin of the adaptive immune system. These cells are undergoing rapid evolutionary changes that are relevant to autoimmune and allergic disorders.
Our most recent work has found that although changes in distal regulatory elements arise rapidly, these changes frequently do not lead to measurable differences in the transcription of nearby genes. We found evidence that gene transcription is stabilized by multiple compensatory changes acting across ensembles of distal enhancers. This finding suggests a model of regulatory evolution in which changes in regulatory activities arise rapidly, and gene expression is held constant through widespread compensation between regulatory elements targeting each gene.
Detecting biochemically active DNA sequences in a cell (one of the common definitions of a cell's “epigenome”) is a major challenge in genomics. Many approaches rely on using dozens of separate experimental assays, making the analysis of new cell systems expensive and time-consuming. We have recently demonstrated that RNA polymerase marks a surprisingly broad variety of functional elements across the genome. These functional elements can be recognized based on their characteristic “shapes” extracted from PRO-seq data using machine learning tools.
Our objective is to develop a computational toolkit that deconvolves a single PRO-seq assay into a rich source of information about multiple ‘layers’ of functional elements that are active in our genomes. We have developed a machine learning tool called dREG, which identifies the location of active regulatory DNA sequence elements using PRO-seq data as input. More recently, we have trained discriminative support vector machines to identify transcription factor binding sites with high accuracy, and we have begun using transcription to ‘impute’ the abundance of covalent modifications to core histones. These technologies allow comprehensive annotation of active functional elements in mammalian genomes using PRO-seq data alone.
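To make the idea of a signal “shape” concrete, the sketch below shows one simple way that per-base read counts around a candidate position could be turned into a feature vector for a classifier. It is an illustration only and is not the dREG implementation: the function names (`binned_counts`, `shape_features`), the bin sizes, and the peak-scaling step are all assumptions chosen for clarity.

```python
from typing import List, Sequence


def binned_counts(signal: Sequence[float], center: int,
                  bin_size: int, n_bins: int) -> List[float]:
    """Sum read counts in 2 * n_bins adjacent bins centered on `center`.

    Positions that fall outside the signal are treated as zero coverage.
    """
    start = center - n_bins * bin_size
    counts = []
    for b in range(2 * n_bins):
        lo = start + b * bin_size
        hi = lo + bin_size
        lo_c, hi_c = max(lo, 0), min(hi, len(signal))
        counts.append(float(sum(signal[lo_c:hi_c])) if lo_c < hi_c else 0.0)
    return counts


def shape_features(plus: Sequence[float], minus: Sequence[float], center: int,
                   bin_sizes: Sequence[int] = (10, 50),
                   n_bins: int = 4) -> List[float]:
    """Build a multi-scale, strand-specific feature vector around `center`.

    Each window is divided by its own maximum so that the features describe
    the shape of the signal rather than its absolute read depth
    (a hypothetical normalization, chosen here for simplicity).
    """
    features: List[float] = []
    for bin_size in bin_sizes:
        for strand in (plus, minus):
            counts = binned_counts(strand, center, bin_size, n_bins)
            peak = max(counts)
            features.extend(c / peak if peak > 0 else 0.0 for c in counts)
    return features


# Toy example of a divergent pattern: minus-strand reads upstream of the
# center and plus-strand reads downstream of it.
plus_strand = [0] * 100 + [2] * 100
minus_strand = [3] * 100 + [0] * 100
features = shape_features(plus_strand, minus_strand, center=100, n_bins=2)
```

A vector like `features` could then be passed to any standard classifier, such as a support vector machine. Using several bin sizes lets the same position be described at both fine and coarse resolution.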
Finally, a core mission of the wet lab is to extend run-on and sequencing assays to map the location of RNA polymerase across the genome in a wider range of biological conditions. We have recently introduced a new run-on and sequencing variant called ChRO-seq to address a key problem with PRO-seq: it requires nuclear isolation, which can be challenging in complex tissue samples such as muscle or brain. We have also made substantial progress on strategies to multiplex the PRO-seq and ChRO-seq assays using a 96-well plate format. Taken together, our efforts significantly expand the range of samples and conditions to which PRO-seq can be applied.
Identification of regulatory elements from nascent transcription using dREG.
Wang Z, Chu T, Choate LA, Danko CG.
Genome Research (2018).
Chromatin run-on and sequencing maps the transcriptional regulatory landscape of glioblastoma multiforme.
Chu T, Rice EJ, Booth GT, Salamanca HH, Wang Z, Core LJ, Longo SL, Corona RJ, Chin LS, Lis JT, Kwak H, Danko CG.
Nature Genetics (2018).
Dynamic evolution of regulatory element ensembles in primate CD4+ T cells.
Danko CG, Choate LA, Marks BA, Rice EJ, Wang Z, Chu T, Martins AL, Dukler N, Coonrod SA, Tait-Wojno E, Lis JT, Kraus WL, Siepel A.
Nature Ecology & Evolution (2018).
A unified architecture of transcriptional regulatory elements.
Andersson R, Sandelin A, Danko CG.
Trends in Genetics (2015).
Identification of active transcriptional regulatory elements from GRO-seq data.
Danko CG, Hyland SL, Core LJ, Martins AL, Waters CT, Lee HW, Cheung VG, Kraus WL, Lis JT, Siepel A.
Nature Methods (2015).