• DocumentCode
    3149599
  • Title
    A new method of document structure extraction using generic layout knowledge
  • Author
    Yashiro, Hiroshi; Murakami, Tatsuya; Shima, Yoshihiro; Nakano, Yasuaki; Fujisawa, Hiromichi
  • Author_Institution
    Hitachi Ltd., Tokyo, Japan
  • fYear
    1989
  • fDate
    10-12 Apr 1989
  • Firstpage
    282
  • Lastpage
    287
  • Abstract
    A method of document structure extraction using generic layout knowledge is described. With this method, it is possible to translate images of multimedia documents, i.e., documents that include pictures, graphics, and color information, into hypertext. Hypertext consists of decomposed elements linked to each other through logical relationships. The principal components of the method are the extraction of logical structure elements using a rectangular set operation and the generation of hierarchical links of the logical structure between the extracted document elements. It is shown experimentally that the logical structure of a technical paper can be extracted.
  • Keywords
    computerised pattern recognition; hypermedia; color information; decomposed elements; document structure extraction; generic layout knowledge; graphics; hierarchical links; hypertext; logical relationship; logical structure elements; multimedia documents; pictures; rectangular set operation; Data mining; Graphics; Image retrieval; Image sequence analysis; Industrial relations; Information retrieval; Laboratories; Layout; Text analysis; Writing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    International Workshop on Industrial Applications of Machine Intelligence and Vision, 1989
  • Conference_Location
    Tokyo
  • Type
    conf
  • DOI
    10.1109/MIV.1989.40564
  • Filename
    40564