Structural assignments to the Mycoplasma genitalium proteins show extensive gene duplications and domain rearrangements

Proc Natl Acad Sci U S A. 1998 Dec 8;95(25):14658-63. doi: 10.1073/pnas.95.25.14658.

Abstract

The parasitic bacterium Mycoplasma genitalium has a small, reduced genome with close to a basic set of genes. As a first step toward determining the families of protein domains that form the products of these genes, we have used the multiple-sequence programs PSI-BLAST and GEANFAMMER to match the sequences of the 467 gene products of M. genitalium to the sequences of the domains that form proteins of known structure [Protein Data Bank (PDB) sequences]. A total of 274 PDB sequences match 106 M. genitalium sequences along their whole length and parts of another 85; thus, 191 of the 467 sequences (41%) are matched in whole or in part. The evolutionary relationships of the PDB domains that match M. genitalium are described in the Structural Classification of Proteins (SCOP) database. Using this information, we show that the domains in the matched M. genitalium sequences come from 114 superfamilies and that 58% of them have arisen by gene duplication. This level of duplication is more than twice that found by using pairwise sequence comparisons. The PDB domain matches also describe the domain structure of the matched sequences: just over a quarter contain one domain and the rest have combinations of two or more domains.
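
The 41% coverage figure follows directly from the counts quoted in the abstract. As a minimal sketch of that arithmetic only, the snippet below recomputes it. The constant and function names are illustrative assumptions, not anything taken from the paper or its software.

```python
# Counts quoted in the abstract; the names below are hypothetical labels.
TOTAL_SEQUENCES = 467   # M. genitalium gene products
FULLY_MATCHED = 106     # sequences matched along their whole length
PARTLY_MATCHED = 85     # sequences matched in part only


def matched_fraction(full: int, partial: int, total: int) -> float:
    """Return the fraction of sequences with at least one PDB match."""
    return (full + partial) / total


fraction = matched_fraction(FULLY_MATCHED, PARTLY_MATCHED, TOTAL_SEQUENCES)
print(f"{FULLY_MATCHED + PARTLY_MATCHED} of {TOTAL_SEQUENCES} matched ({fraction:.0%})")
# prints "191 of 467 matched (41%)"
```

The same check cannot be repeated for the 58% duplication figure, because the abstract does not give the underlying domain counts.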

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics*
  • Gene Duplication*
  • Gene Rearrangement*
  • Genes, Bacterial
  • Mycoplasma / genetics*

Substances

  • Bacterial Proteins