Title :
Internal Filtering Approach toward Efficiency Optimization of Matching Large Scale XML Schemas
Author :
Alqarni, Ahmad Abdullah ; Pardede, Eric
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, VIC, Australia
Abstract :
XML Schema matching plays a significant role in the integration of different XML Schemas by finding similar corresponding elements. The properties of XML Schema elements and their relation to surrounding elements play a significant role in improving the quality of the matching process. Investigating all measures for each element in two schemas can result in a long execution time, which reduces the performance of the matching process. Performance becomes a particular concern for large-scale XML Schemas, given all of these features and surroundings. Since the internal features of an element represent between 40% and 60% of the total similarity value, they should be utilised to filter out elements that yield a lower internal similarity value, based on a predefined threshold. Thus, we propose to use an element's internal features as a filter to exclude any element whose internal similarity falls below a certain predefined threshold. We also present an optimum threshold that can be used in the filtering approach. The idea is to use the internal features to detect the elements that are highly likely to be dissimilar and exclude them from the next phase, the investigation of the element's context (the element's surroundings). The outcome of imposing this approach is promising, not only for improving matching efficiency per se, but also for maintaining acceptable quality, with results that are very close to those of the non-filter approach.
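The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the element representation (a dict with "name" and "type"), the internal similarity measure, the 0.7/0.3 weights, the threshold of 0.5, and the internal weight of 0.5 (chosen inside the 40-60% range the abstract mentions) are all assumptions made for illustration, and the context similarity is supplied by the caller.

```python
from difflib import SequenceMatcher


def internal_similarity(a, b):
    """Hypothetical internal similarity in [0, 1]: name similarity plus type agreement."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    type_sim = 1.0 if a["type"] == b["type"] else 0.0
    return 0.7 * name_sim + 0.3 * type_sim  # weights are assumptions


def match_with_internal_filter(source, target, context_similarity,
                               threshold=0.5, internal_weight=0.5):
    """Score element pairs, skipping the context phase for pairs below the threshold."""
    results = []
    for s in source:
        for t in target:
            internal = internal_similarity(s, t)
            if internal < threshold:
                continue  # filtered out: the costly context phase is never run
            context = context_similarity(s, t)
            total = internal_weight * internal + (1 - internal_weight) * context
            results.append((s["name"], t["name"], total))
    # Best-scoring candidate pairs first
    return sorted(results, key=lambda r: r[2], reverse=True)
```

In this sketch the saving comes from calling context_similarity only for pairs that survive the filter, so a higher threshold trades matching quality for fewer context computations.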
Keywords :
XML; information filtering; information filters; query processing; Extensible Markup Language; XML schema element properties; execution time; filter elements; internal feature detection; internal filtering approach; large-scale XML schema matching efficiency optimization; matching process performance; matching process quality improvement; nonfilter approach; optimum threshold; total internal similarity value; Context; Educational institutions; Matched filters; Optimization; Weight measurement; XML; XML Schema; efficiency; matching; quality;
Conference_Title :
Network-Based Information Systems (NBiS), 2013 16th International Conference on
Conference_Location :
Gwangju
Print_ISBN :
978-1-4799-2509-4
DOI :
10.1109/NBiS.2013.77