||Amoeba Dictyostelium discoideum
||Eichinger et al., The genome of the social amoeba Dictyostelium discoideum. Nature. 2005 May 5;435(7038):43-57. p. 47, right column, 2nd paragraph. PubMed ID 15875012
||"Full details are provided in the Supplementary Information. Briefly, automated gene
prediction was performed using a combination of programs that had been trained on
well-characterized D. discoideum genes, and the results integrated with reference to
D. discoideum complementary DNA sequences and homology to genes in other species.
Other features in the predicted proteins, and other sequence features, were identified using
a variety of software packages."
||"Of the 13,541 predicted proteins, 47.5%
are represented by qualified ESTs [expressed sequence tags], reflecting the inevitable bias in
EST sampling. Among the shortest predicted proteins, fewer are
represented by ESTs (for example, 21% of those of <60 amino
acids); this is at least partly due to a higher level of overprediction.
On the basis of the simplifying assumption that 50% of all genes
coding for proteins of <100 amino acids are mis-predictions, researchers
estimate the true number of genes at roughly 12,500. This number is
closer to that seen in multicellular organisms rather than in most
unicellular eukaryotes (Table 2)."
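
The arithmetic behind the corrected gene count can be sketched as follows. The excerpt does not state how many predicted proteins are shorter than 100 amino acids, so that count is back-calculated here from the two quoted totals and the 50% assumption. It is only approximate, because "roughly 12,500" is a rounded figure. All variable names are illustrative.

```python
# Figures quoted in the excerpt above.
predicted_proteins = 13_541      # total predicted proteins
estimated_true_genes = 12_500    # "roughly 12,500" (rounded in the source)
est_supported_fraction = 0.475   # share represented by qualified ESTs
mispredicted_fraction = 0.5      # assumed share of <100 aa predictions that are wrong

# Predictions removed by the correction.
implied_mispredictions = predicted_proteins - estimated_true_genes

# Back-calculated (NOT stated in the source): number of predicted
# proteins of <100 amino acids implied by the 50% assumption.
implied_short_proteins = implied_mispredictions / mispredicted_fraction

# Approximate number of predicted proteins with EST support.
est_supported_proteins = round(predicted_proteins * est_supported_fraction)

print(f"Implied mis-predictions:        {implied_mispredictions}")
print(f"Implied proteins of <100 aa:    {implied_short_proteins:.0f}")
print(f"Proteins represented by ESTs:   {est_supported_proteins}")
```

Under these assumptions the correction removes about 1,000 predictions, which implies roughly 2,000 predicted proteins of <100 amino acids. The exact count should be taken from the paper's Supplementary Information, not from this back-calculation.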