DocumentCode :
3099541
Title :
Entity resolution for high velocity streams using semantic measures
Author :
Priya, P. Anu ; Prabhakar, S. ; Vasavi, S.
Author_Institution :
Comput. Sci. & Eng., VR Siddhartha Eng. Coll., Vijayawada, India
fYear :
2015
fDate :
12-13 June 2015
Firstpage :
35
Lastpage :
40
Abstract :
Large amounts of data are now generated by many sources: sensors and satellites reporting on environment and climate, social networking sites carrying messages, tweets, photos, and videos, and telecommunications systems. If processed in real time, this big data helps decision makers act promptly when an event occurs. When source data sets are large (velocity, variety, veracity), traditional ETL (Extract, Transform, Load) is a time-consuming process. This motivates extending traditional data management techniques to extract business value from big data. This paper extends the Hadoop framework to perform entity resolution in two phases. In Phase 1, MapReduce generates rules for matching two real-world objects based on their similarity: the greater the similarity, the more alike the objects are. Similarity is calculated using domain-dependent and domain-independent natural language processing measures. In Phase 2, these rules are used to match stream data. The proposed approach uses 13 semantic measures to resolve entities in stream data. Stream data such as tweets, messages, and e-catalogues are used to test the proposed system.
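The two-phase flow described in the abstract can be illustrated with a minimal single-machine sketch. Everything here is an assumption made for illustration: the abstract does not list the 13 semantic measures or the form of the generated rules, so a token-level Jaccard similarity stands in for the measures, a single learned threshold stands in for the rules, and plain Python functions stand in for the Hadoop/MapReduce jobs. The names `jaccard`, `learn_threshold`, and `resolve_stream` are hypothetical.

```python
def tokens(text):
    """Lower-case a record and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-level Jaccard similarity in [0, 1] (a stand-in measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def learn_threshold(labeled_pairs):
    """Phase 1 stand-in: derive one matching rule from labeled pairs.

    The rule is a similarity threshold placed midway between the lowest
    score of a known match and the highest score of a known non-match.
    Assumes at least one pair of each kind is present.
    """
    match_scores = [jaccard(a, b) for a, b, same in labeled_pairs if same]
    other_scores = [jaccard(a, b) for a, b, same in labeled_pairs if not same]
    return (min(match_scores) + max(other_scores)) / 2


def resolve_stream(stream, threshold):
    """Phase 2 stand-in: apply the rule to records as they arrive.

    Each record is compared with one representative per known entity.
    It joins the first entity that meets the threshold, otherwise it
    starts a new entity. Yields (entity_id, record) pairs.
    """
    representatives = []
    for record in stream:
        for entity_id, rep in enumerate(representatives):
            if jaccard(record, rep) >= threshold:
                yield entity_id, record
                break
        else:
            representatives.append(record)
            yield len(representatives) - 1, record


# Hypothetical training pairs and stream records, for illustration only.
labeled = [
    ("apple iphone 6 16gb", "iphone 6 apple 16gb", True),
    ("apple iphone 6 16gb", "apple iphone 6 16gb gold", True),
    ("apple iphone 6 16gb", "samsung galaxy s6 32gb", False),
    ("apple iphone 6 16gb", "apple ipad air 16gb", False),
]
threshold = learn_threshold(labeled)

stream = [
    "sony bravia tv 40 inch",
    "Sony Bravia TV 40 inch black",
    "canon eos 700d camera",
]
for entity_id, record in resolve_stream(stream, threshold):
    print(entity_id, record)
```

In this sketch the first two stream records are assigned to the same entity and the third starts a new one. A system closer to the one the abstract describes would combine several measures per pair and distribute the Phase 1 work across MapReduce tasks.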
Keywords :
Big Data; Internet; natural language processing; Hadoop framework; MapReduce; big data; data management techniques; domain dependent natural language processing; e-catalogues; entity resolution; high velocity streams; independent natural language processing; semantic measures; stream data matching; tweets; Accuracy; Erbium; Feeds; Satellites; Big data; entity resolution; stream processing; unstructured data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advance Computing Conference (IACC), 2015 IEEE International
Conference_Location :
Bangalore
Print_ISBN :
978-1-4799-8046-8
Type :
conf
DOI :
10.1109/IADCC.2015.7154663
Filename :
7154663