Number of genes

Value 3767 unitless
Organism Bacteria Caulobacter crescentus
Reference Nierman et al., Complete genome sequence of Caulobacter crescentus. Proc Natl Acad Sci U S A. 2001 Mar 27 98(7):4136-41, p. 4138 table 1 & left column 2nd paragraph. PubMed ID 11259647
Method "ORFs were identified by using GLIMMER (6). Annotation of the identified ORFs was accomplished by manual curation of the outputs of a variety of similarity searches. Searches of the predicted coding regions were performed with BLASTP, as previously described (7). The protein–protein matches are aligned with blast_extend_repraze, a modified Smith–Waterman (8) algorithm that maximally extends regions of similarity across frameshifts. Gene identification is facilitated by searching against a database of nonredundant bacterial proteins (nraa) developed at TIGR and curated from the public archives GenBank, Genpept, PIR, and SwissProt. Searches matching entries in nraa have the corresponding role, gene common name, percent identity and similarity of match, pairwise sequence alignment, and taxonomy associated with the match assigned to the predicted coding region and stored in the database." "The genome sequence of C. crescentus CB15 was determined by the whole genome random sequencing method (Heidelberg et al., 2000 PMID 10952301)." (Numbers in parentheses point to refs in article).
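Illustration (not from the article): the method aligns protein matches with a modified Smith–Waterman algorithm. The sketch below shows only the plain textbook Smith–Waterman local-alignment score with a linear gap penalty; it does not implement the frameshift-extending modification described above. The function name and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions chosen for the example; a real protein search would use a substitution matrix instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score between sequences a and b.

    Textbook Smith-Waterman with a linear gap penalty. The scoring
    values are illustrative assumptions, not those used in the article.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTYY", "ACGT"))
```

Clamping each cell at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a well matching region later in the sequence.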
Comments "A total of 3,767 predicted ORFs were identified, of which 2,030 (53.9%) are assigned putative functions, 725 (19.2%) have matches to hypothetical proteins, and 1,012 (26.9%) have no database match (Fig. 4, published as supplemental data on the PNAS web site, www.pnas.org). Coding regions comprise 90.6% of the chromosome. Approximately 1/2 of the proteins (1,801) are members of 678 paralogous families."
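Illustration (not from the article): a quick consistency check of the figures quoted in the comment. The counts are taken from the quotation above; the variable names are hypothetical. It confirms that the three annotation categories sum to the 3,767 ORF total and recomputes each share to one decimal place.

```python
# Counts quoted in the comment above (Nierman et al. 2001)
total_orfs = 3767
categories = {
    "putative function": 2030,
    "hypothetical protein match": 725,
    "no database match": 1012,
}
paralog_members = 1801  # proteins belonging to the 678 paralogous families

# The three categories should account for every predicted ORF
assert sum(categories.values()) == total_orfs

# Share of each category, in percent, rounded to one decimal place
percentages = {
    name: round(100 * count / total_orfs, 1)
    for name, count in categories.items()
}
paralog_percent = round(100 * paralog_members / total_orfs, 1)

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
    print(f"in paralogous families: {paralog_percent}%")
```

The recomputed shares agree with the quoted 53.9%, 19.2% and 26.9%, and the paralogous-family members come to about 47.8% of the ORFs, which is the basis for the "approximately 1/2" in the quotation.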
Entered by Uri M
ID 105498