The transcription unit architecture of the Escherichia coli genome

Nat Biotechnol. 2009 Nov;27(11):1043-9. doi: 10.1038/nbt.1582. Epub 2009 Nov 1.

Abstract

Bacterial genomes are organized by structural and functional elements, including promoters, transcription start and termination sites, open reading frames, regulatory noncoding regions, untranslated regions and transcription units. Here, we iteratively integrate high-throughput, genome-wide measurements of RNA polymerase binding locations, mRNA transcript abundance, 5' sequences and translation into proteins to determine the organizational structure of the Escherichia coli K-12 MG1655 genome. Integration of the organizational elements provides an experimentally annotated transcription unit architecture, including the alternative transcription start sites, 5' untranslated regions, boundaries and open reading frames of each transcription unit. A total of 4,661 transcription units were identified, representing an increase of >530% over current knowledge. This comprehensive transcription unit architecture allows for the elucidation of condition-specific uses of alternative sigma factors at the genome scale. Furthermore, the transcription unit architecture provides a foundation on which to construct genome-scale transcriptional and translational regulatory networks.
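
The ">530%" figure can be turned into a rough upper bound on the number of transcription units known before this study. The sketch below is only a back-of-the-envelope check, not a calculation taken from the paper: it assumes "increase" means (new - old) / old, and the function and variable names are illustrative.

```python
# Back-of-the-envelope check of the ">530% increase" figure in the abstract.
# Assumption (not stated in the abstract): "increase" means (new - old) / old,
# so old = new / (1 + increase).

def implied_baseline(new_count: int, percent_increase: float) -> float:
    """Return the prior count implied by a new count and a percent increase."""
    return new_count / (1 + percent_increase / 100)

reported_units = 4661       # transcription units reported in the abstract
reported_increase = 530.0   # lower bound on the percent increase (">530%")

# The increase is given as a lower bound, so the result is an upper bound
# on the number of previously known transcription units.
baseline_upper_bound = implied_baseline(reported_units, reported_increase)
print(f"Implied prior count: fewer than about {baseline_upper_bound:.0f}")
```

Under that reading, the abstract's numbers imply fewer than roughly 740 previously annotated transcription units.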

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Binding Sites
  • DNA-Directed RNA Polymerases / metabolism
  • Escherichia coli / genetics*
  • Gene Expression Profiling
  • Gene Expression Regulation, Bacterial
  • Genome, Bacterial / genetics*
  • High-Throughput Screening Assays
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Transcription Initiation Site
  • Transcription, Genetic*

Substances

  • DNA-Directed RNA Polymerases

Associated data

  • GEO/GSE15534
  • GEO/GSE15588