DocumentCode :
2465922
Title :
Metadata Extraction Based on Mutual Information in Digital Libraries
Author :
Liu, Lizhen ; He, Guoqiang ; Shi, Xuling ; Song, Hantao
Author_Institution :
Capital Normal Univ., Beijing
fYear :
2007
fDate :
23-25 Nov. 2007
Firstpage :
209
Lastpage :
212
Abstract :
As a main infrastructure of Internet-two, digital libraries have developed rapidly and achieved a great deal in recent years. A key problem, however, is how to help users find satisfactory resources more efficiently among the abundant content held in the heterogeneous repositories of digital libraries. Metadata, as structured data about data, can describe the content, semantics and services of data. As a foundation for defining and organizing the resources in a digital library, metadata plays a pivotal role in constructing resources. Metadata extraction, semantic retrieval and semantic annotation in automatic metadata management are therefore challenging research tasks. Each kind of metadata can be regarded as a class, so metadata extraction amounts to classifying every document block. The paper focuses on automatic metadata extraction based on mutual information, a widely used information-theoretic measure of the stochastic dependency between discrete random variables. Metadata extraction is performed using maximum mutual information, including linear and non-linear feature conversions. Entropy is used and extended to find suitable features in digital library systems.
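The mutual information measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function name, the toy features (is_first_line, has_digit) and the labels are hypothetical and chosen only for illustration. The sketch estimates I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) ) from paired samples of a discrete feature and a metadata class.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired samples of two discrete variables."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: one metadata class per document block.
labels = ["title", "title", "author", "author"]
is_first_line = [1, 1, 0, 0]  # assumed feature that separates the classes
has_digit = [0, 1, 0, 1]      # assumed feature unrelated to the classes

print(mutual_information(is_first_line, labels))
print(mutual_information(has_digit, labels))
```

In this toy case a feature that fully determines the class scores higher than one that is independent of it. That ranking is the basis on which a mutual-information criterion would prefer one feature over another.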
Keywords :
Internet; digital libraries; entropy; information resources; information retrieval; meta data; Internet-two; digital libraries; entropy; linear feature conversions; metadata automatic management; metadata extraction; mutual information; nonlinear feature conversions; semantic annotate; semantic retrieval; Data mining; Educational institutions; Internet; Mutual information; Research and development; Resource description framework; Semantic Web; Software libraries; Vocabulary; XML; Digital Library; Metadata Extraction; Mutual Information;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technologies and Applications in Education, 2007. ISITAE '07. First IEEE International Symposium on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-1386-7
Electronic_ISBN :
978-1-4244-1386-7
Type :
conf
DOI :
10.1109/ISITAE.2007.4409272
Filename :
4409272