Genome size, protein-coding gene number, total gene number, and gene-coding percentage for the sequenced genomes of eukaryotes, bacteria, archaea, viruses, mitochondria, and chloroplasts

Range Table - link
Organism Various
Reference Hou Y, Lin S. Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes. PLoS One. 2009 Sep 14;4(9):e6978. doi: 10.1371/journal.pone.0006978. Supporting Information p.7, Supplemental table S1. PubMed ID 19750009
Primary Source See refs beneath table on p.34
Method P.2 left column 2nd paragraph: "Methods: Data collection: Data up to date by February 2009 were retrieved from the Reference Sequence (RefSeq) collection in the National Center for Biotechnology information (NCBI), the Integrated Microbial Genomes (IMG) system in DOE Joint Genome Institute (JGI), and peer-reviewed publications (Supplemental Table S1). Dataset included total number of nucleotide base pairs (i.e. genome size), number of protein-coding genes, and total number of genes (including protein-coding, rRNA, and tRNA), gene-coding percentage (percent of DNA bases that codes for genes in a genome) for 55 completely sequenced eukaryotic genomes and 1055 non-eukaryotic genomes including prokaryotes (478 from bacteria and 60 from archaea), viruses (260), and organelles (231 from mitochondria and 26 from chloroplasts). For gene-coding percentage, only data published in peer-reviewed articles were used in the analysis as data from JGI included introns and other untranslated regions and significantly overestimated gene-coding percentage in large eukaryotic genomes (Supplemental Table S1). Incomplete or draft genome sequence data were excluded from this study to avoid potential errors."
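Illustration The sketch below is not taken from the reference. It restates the gene-coding percentage definition quoted in the Method field (percent of DNA bases that code for genes) as a small Python function, and checks that the per-category genome counts quoted there add up to the stated 1055 non-eukaryotic genomes. The function name, the dictionary name, and the example values in the last lines are hypothetical and chosen only for illustration; they are not data from Supplemental Table S1.

```python
# Genome counts per category, copied from the quoted Method text.
NON_EUKARYOTIC_COUNTS = {
    "bacteria": 478,
    "archaea": 60,
    "viruses": 260,
    "mitochondria": 231,
    "chloroplasts": 26,
}
EUKARYOTIC_COUNT = 55


def gene_coding_percentage(coding_bp: int, genome_size_bp: int) -> float:
    """Percent of DNA bases in a genome that code for genes.

    Hypothetical helper: it assumes the number of coding bases is already
    known, which the reference took from peer-reviewed articles.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    if not 0 <= coding_bp <= genome_size_bp:
        raise ValueError("coding_bp must lie between 0 and genome_size_bp")
    return 100.0 * coding_bp / genome_size_bp


# Consistency check on the dataset composition quoted above.
total_non_eukaryotic = sum(NON_EUKARYOTIC_COUNTS.values())
total_genomes = total_non_eukaryotic + EUKARYOTIC_COUNT

# Made-up example: 50 coding bases in a 200-base genome.
example_percentage = gene_coding_percentage(50, 200)
```

The counts sum to 478 + 60 + 260 + 231 + 26 = 1055, in agreement with the quoted total, giving 1110 genomes once the 55 eukaryotic genomes are included.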
Comments P.2 right column 2nd paragraph: "On the contrary, the gene-coding fraction of the genome, i.e., gene-coding percentage, showed a different trend against genome size than the gene number trend (Figure 1B, 2B). In eukaryotes, the gene-coding percentage declined from 81.6% to 1.2% as the genome size increased (Figure 2B, Supplemental Table S1). The gene-coding percentage in non-eukaryotes was generally higher (97%–47%) and varied markedly less with genome size (Figure 1B, 2B) than in eukaryotes. The only exceptions were the organellar genomes, which exhibited a substantially lower gene-coding percentage than prokaryotes and viruses, indicating disproportionate loss of coding sequences during organellar genome reduction." See notes beneath table on pp.33-34
Entered by Uri M
ID 112819