Title :
Association analysis and case study framework based on the name distinction
Author :
Wu, Bo ; Cai, Wandong ; Li, Yongjun
Author_Institution :
Dept. of Comput. Sci., Northwest Polytech. Univ., Xi'an, China
Abstract :
Resolving name ambiguity improves search quality in information retrieval, so mining name-ambiguity data for useful knowledge plays an important role. In this paper, we focus on a problem with the traditional evaluation and ranking methods used in clustering: they ignore the associations among pieces of sub-information and simply treat them as mutually independent. We present a data mining framework based on case study and association analysis. The framework is evaluated on a name-ambiguity dataset drawn from the CDBLP database, which includes coauthor names, workplace, publication, year, and the ranking of the author on each paper. The experimental results show that one main piece of sub-information, assisted by some minor ones, can form a stronger rule that is very useful for name distinction. Some combinations of minor pieces of sub-information can also produce a stronger rule. The association rules selected in the experiment are easy to explain and agree with common sense. Because they are derived from objective data by a data mining method, they are more reliable.
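As an illustration of the idea in the abstract, the following is a minimal sketch of how support and confidence could be computed for rules of the form "these pieces of sub-information match, therefore same author". It is not the authors' implementation: the toy data, the feature names, and the helpers `rule_stats` and `rank_rules` are all hypothetical, and the abstract does not state which rule-quality measures the paper uses.

```python
from itertools import combinations

# Hypothetical toy data (not from CDBLP): each entry describes a pair of
# papers sharing an ambiguous author name. The set lists which pieces of
# sub-information match; the boolean says whether it is the same person.
PAIRS = [
    ({"coauthor", "workplace"}, True),
    ({"coauthor", "workplace"}, True),
    ({"coauthor"}, True),
    ({"coauthor"}, False),
    ({"workplace", "year"}, True),
    ({"workplace"}, False),
    ({"year"}, False),
    ({"year"}, False),
]


def rule_stats(antecedent, pairs):
    """Return (support, confidence) for the rule: antecedent -> same author."""
    antecedent = set(antecedent)
    # Labels of all pairs whose matching features include the antecedent.
    covered = [same for feats, same in pairs if antecedent <= feats]
    if not covered:
        return 0.0, 0.0
    hits = sum(covered)  # covered pairs that really are the same author
    return hits / len(pairs), hits / len(covered)


def rank_rules(pairs, max_size=2):
    """Enumerate antecedents up to max_size features, strongest rule first."""
    features = sorted(set().union(*(feats for feats, _ in pairs)))
    rules = []
    for size in range(1, max_size + 1):
        for antecedent in combinations(features, size):
            support, confidence = rule_stats(antecedent, pairs)
            if support > 0:  # drop rules that never fire correctly
                rules.append((antecedent, support, confidence))
    # Sort by confidence, then support, both descending.
    return sorted(rules, key=lambda r: (-r[2], -r[1]))


if __name__ == "__main__":
    for antecedent, support, confidence in rank_rules(PAIRS):
        print(antecedent, round(support, 3), round(confidence, 3))
```

On this toy data, a coauthor match alone yields a weaker rule than a coauthor match combined with a workplace match, and the two minor features workplace and year together outrank either one alone. This mirrors the two observations reported in the abstract, though only on invented data.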
Keywords :
data mining; information retrieval; security of data; CDBLP database; association rules analysis; case study framework; coauthor name; evaluation and ranking method; name ambiguity; name distinction; objective data mining method; Artificial neural networks; association analysis; case study; feature extraction
Conference_Title :
Computer Application and System Modeling (ICCASM), 2010 International Conference on
Print_ISBN :
978-1-4244-7235-2
Electronic_ISBN :
978-1-4244-7237-6
DOI :
10.1109/ICCASM.2010.5620050