DocumentCode :
2349943
Title :
Learning Structure and Schemas from Heterogeneous Domains in Networked Systems: A Survey
Author :
Biba, Marenglen ; Xhafa, Fatos
Author_Institution :
Dept. of Comput. Sci., Univ. of New York Tirana, Tirana, Albania
fYear :
2010
fDate :
24-26 Nov. 2010
Firstpage :
222
Lastpage :
229
Abstract :
The rapidly growing number of digital documents in various formats, and the possibility of accessing them through Internet-based technologies in distributed environments, have led to the need for solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections makes it impossible to organize such documents manually. Additionally, most of the documents exist in an unstructured form and do not follow any schema. Therefore, research efforts in this direction are being dedicated to automatically inferring structure and schemas. This is essential in order to better organize huge collections as well as to retrieve documents effectively and efficiently in heterogeneous domains in networked systems. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely bio-informatics, sensor networks, social networks, P2P systems, automation and control, transportation, and privacy preserving, for which we analyze recent developments in dealing with unstructured data.
Keywords :
Internet; data mining; digital libraries; document handling; learning (artificial intelligence); heterogeneous domains; internet-based technologies; machine learning; networked systems; repositories; structure documents; distributed systems; heterogeneous data; structure learning
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Networking and Collaborative Systems (INCOS), 2010 2nd International Conference on
Conference_Location :
Thessaloniki
Print_ISBN :
978-1-4244-8828-5
Electronic_ISBN :
978-1-4244-4278-2
Type :
conf
DOI :
10.1109/INCOS.2010.63
Filename :
5702099