• DocumentCode
    3307387
  • Title
    Segmentation of the Yellow Pages
  • Author
    Fischer, S.; Amin, A.; Drivas, D.
  • Author_Institution
    Sch. of Comput. Sci. & Eng., New South Wales Univ., Sydney, NSW, Australia
  • Volume
    2
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    605
  • Abstract
    We present a fully automated process to scan the Australian Telecom Yellow Pages and produce a text document consisting only of the business entries, while removing the advertisements, graphics and notes about the Yellow Pages. The system contains four major components: digitisation and thresholding, skew detection, segmentation (removal of unwanted parts of the image), and finally the recognition engine utilising the principles of mathematical morphology. This paper presents the current research, which consists of the process described above up to image segmentation. All the algorithms are written in C on a 5000/20 DEC workstation. We have tested more than 30 images with extremely promising results.
  • Keywords
    DEC computers; business data processing; document image processing; image recognition; image segmentation; mathematical morphology; optical character recognition; 5000/20 DEC workstation; Australian Telecom Yellow Pages; C language; OCR; advertisements; business entries; digitisation; document image segmentation; document scanning; graphics; research; skew detection; testing; text document; thresholding; Australia; Business; Computer graphics; Computer science; Data mining; IEEE news; Optical character recognition software; Telecommunications
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.1995.601969
  • Filename
    601969