Gene length and proximity to neighbors affect genome-wide expression levels

Genome Res. 2003 Dec;13(12):2602-8. doi: 10.1101/gr.1169203. Epub 2003 Nov 12.

Abstract

Steady-state levels of mRNA in cells theoretically depend on the rate and efficiency of transcription and posttranscriptional processing, on mRNA stability, on transcriptional interference from other genes, and on poorly defined long-range chromatin effects. Although each of these cellular processes has been studied in detail for a few genes, it is not possible to predict expression levels by simply examining gene sequences. In this report, we have used a bioinformatics approach to identify critical factors that influence expression levels. To simplify the problem, we have limited our analysis to the collection of genes expressed in all tissues, because such genes provide a unique opportunity to distinguish the role of general genomic features that constrain gene expression from the effect of tissue-specific factors. Using correlation and regression techniques, we have investigated the dependence between expression level and morphological parameters (distance to neighbors, gene, mRNA or 3'-UTR length, number of exons, etc.) that can be directly related to transcription, posttranscriptional processing, mRNA stability, or transcriptional interference. We found that, on a genome-wide scale, highly expressed genes are significantly farther from their closest neighboring genes, are smaller, contain a moderate number of exons, and produce shorter mRNAs with shorter 3'-UTRs. This confirms that transcriptional and posttranscriptional processes are highly interrelated and implies that transcriptional interference plays a role in determining steady-state levels of mRNA in cells.
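
As a rough illustration of the kind of correlation analysis described above, the following minimal sketch computes a rank (Spearman) correlation between expression level and two of the morphological parameters named in the abstract (gene length and distance to the nearest neighbor). It is not the authors' pipeline: the abstract does not say which correlation statistic or software was used, the function names (`ranks`, `pearson`, `spearman`) are introduced here for illustration, and the `toy_genes` values are invented purely to exercise the code and are not data from the study.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy records, NOT data from the paper:
# (gene length in bp, distance to nearest neighboring gene in bp, expression level)
toy_genes = [
    (5_000, 90_000, 950.0),
    (12_000, 60_000, 700.0),
    (30_000, 40_000, 420.0),
    (80_000, 15_000, 150.0),
    (150_000, 5_000, 60.0),
]

length, distance, expression = zip(*toy_genes)
rho_length = spearman(length, expression)
rho_distance = spearman(distance, expression)

print(f"expression vs. gene length:       rho = {rho_length:+.2f}")
print(f"expression vs. neighbor distance: rho = {rho_distance:+.2f}")
```

In this toy set, expression falls as gene length rises and rises with distance to the nearest neighbor, so the two coefficients have opposite signs, matching the direction of the trends reported in the abstract. A rank-based statistic is used here only because it makes no assumption about linearity; on real data one would also need significance testing and, for the regression step, a multivariate model.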

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / genetics*
  • Gene Order / genetics*
  • Genome, Human*
  • Humans