DocumentCode :
1331281
Title :
Modified fractal signature (MFS): a new approach to document analysis for automatic knowledge acquisition
Author :
Tang, Yuan Y. ; Ma, Hong ; Xi, Dihua ; Mao, Xiaogang ; Suen, Ching
Author_Institution :
Dept. of Comput. Studies, Hong Kong Baptist Univ., Kowloon, Hong Kong
Volume :
9
Issue :
5
Year :
1997
Firstpage :
747
Lastpage :
762
Abstract :
One of the key technologies in knowledge and data engineering is the acquisition of knowledge and data during the development and use of information systems, together with strategies for capturing new knowledge and data. Millions of documents, including technical reports, government files, newspapers, books, magazines, letters, and bank checks, have to be processed every day, and knowledge has to be acquired from them. This paper presents a new approach to document analysis for automatic knowledge acquisition. Traditional approaches have two major disadvantages: (1) They are not effective for processing documents with high geometrical complexity. In particular, the top-down approach can process only simple documents that have a specific format or contain some a priori information. (2) The top-down approach needs to split large components into small ones iteratively, while the bottom-up approach needs to merge small components into large ones iteratively; both are time-consuming. The new approach is based on the modified fractal signature and can overcome the above weaknesses.
Keywords :
computational complexity; database management systems; document handling; knowledge acquisition; automatic knowledge acquisition; bottom-up approach; data engineering; document analysis; information system; knowledge engineering; modified fractal signature; top-down approach; Books; Fractals; Government; Information analysis; Information systems; Text analysis; Writing
Language :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/69.634753
Filename :
634753