Title :
Extracting semantic metadata for effective spreadsheet search
Author :
Chatvichienchai, Somchai
Author_Institution :
Dept. of Inf. & Media Studies, Univ. of Nagasaki, Nishisonogi, Japan
Abstract :
Metadata is an essential part of modern information systems because it helps people find relevant documents in disparate repositories. This paper describes my effort to develop a system that automatically generates semantic metadata from large, diverse, and evolving spreadsheet collections. Semantic metadata is metadata that describes contextually relevant or domain-specific information about content, based on an industry-specific or enterprise-specific custom metadata model. To simplify the semantic metadata generation problem, spreadsheet collections are categorized by layout similarity. For each categorized collection, a set of properties and a set of semantic metadata extraction rules are defined from a sample spreadsheet selected from that collection. The category of a given spreadsheet is determined by checking its properties against the property sets of the registered collections. Semantic metadata for the spreadsheet is then generated with the extraction rules of the category to which it belongs. The hierarchical structure of the semantic metadata proposed in this paper lets end users define the meanings of search keywords, which allows them to find relevant spreadsheets efficiently.
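The workflow summarized in the abstract (match a spreadsheet to a registered category by its properties, then apply that category's extraction rules to build hierarchical metadata) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: a sheet is modeled as a plain mapping from cell address to text, a "property" is an expected label at a fixed cell, a "rule" maps a slash-separated metadata path to a cell address, and the names `Category`, `classify`, and `extract_metadata` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: a spreadsheet is reduced to {cell address: cell text}.
Sheet = Dict[str, str]


@dataclass
class Category:
    """A registered spreadsheet collection (hypothetical model)."""
    name: str
    properties: Dict[str, str]  # cell address -> label expected at that cell
    rules: Dict[str, str]       # hierarchical metadata path -> cell address


def classify(sheet: Sheet, categories: List[Category]) -> Optional[Category]:
    """Return the first category whose every property holds for the sheet."""
    for category in categories:
        if all(sheet.get(cell, "").strip() == label
               for cell, label in category.properties.items()):
            return category
    return None


def extract_metadata(sheet: Sheet, category: Category) -> dict:
    """Apply the category's rules and build nested (hierarchical) metadata."""
    metadata: dict = {}
    for path, cell in category.rules.items():
        *parents, leaf = path.split("/")
        node = metadata
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = sheet.get(cell, "")
    return metadata


# Illustrative data only; the layout and field names are made up.
invoice = Category(
    name="invoice",
    properties={"A1": "INVOICE", "A3": "Customer"},
    rules={"invoice/customer/name": "B3", "invoice/total": "B10"},
)
sheet = {"A1": "INVOICE", "A3": "Customer", "B3": "Acme Corp", "B10": "1200"}

category = classify(sheet, [invoice])
metadata = extract_metadata(sheet, category) if category else {}
```

Because the extracted values sit under a path such as `invoice/customer/name`, a search for a keyword can be restricted to one meaning of that keyword (a customer name rather than any cell containing the same text), which is the kind of keyword disambiguation the abstract attributes to hierarchical metadata.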
Keywords :
meta data; semantic networks; spreadsheet programs; categorized spreadsheet collection; contextually relevant information; domain-specific information; enterprise-specific custom metadata model; industry-specific custom metadata model; layout similarity; semantic metadata extraction; semantic metadata generation problem; spreadsheet collections; spreadsheet search; Companies; Crawlers; Layout; Semantics; Syntactics; XML; Binding; Metadata; Schema; Semantic
Conference_Title :
Computer Science and Engineering Conference (ICSEC), 2013 International
Conference_Location :
Nakorn Pathom
Print_ISBN :
978-1-4673-5322-9
DOI :
10.1109/ICSEC.2013.6694747