Abstract:
The draft of the human genome sequence is still incomplete. The outstanding tasks include filling in some gaps, finalizing the assembly of short sequences, improving sequence accuracy and correctly identifying coding regions. However, a closely related problem that receives little attention is the substantial number of incorrect annotations that have made their way into some of the widely used databases. This article illustrates the problem using the example of ubiquitin genes, and draws conclusions that apply to false annotations of other short open reading frames (ORFs). Although the focus is on the human genome, other genomes are equally prone to similar propagation of false annotations.