Title :
A Discriminant Framework for Detecting Similar Scientific Research Projects Based on Big Data Mining
Author :
Shanqing Li ; Lirong Song ; Hui Zhao
Author_Institution :
Inst. of Sci. & Tech. Inf. of China, Beijing, China
fDate :
June 27 2014-July 2 2014
Abstract :
Scientific research projects play an important role in promoting a country's competitiveness in science and technology. Because information is not openly shared, different government departments may approve similar or duplicate projects, which wastes scientific resources. To avoid this problem, this paper proposes a discriminant framework for detecting similar projects based on big data mining technologies, supporting evidence-based decision making by government departments during the project approval process. First, we construct a big data file of officially approved projects, including project titles, principal investigators, research organizations, keywords, and bibliographies of published scholarly papers. Second, we propose a discriminant framework that detects similar projects by mining information from this big data file. Finally, we adopt the Hadoop architecture to speed up the data mining algorithm. To demonstrate the effectiveness and feasibility of the framework, we implement a prototype system for similar-project detection.
Keywords :
Big Data; data mining; file organisation; government data processing; natural sciences computing; public domain software; Hadoop architecture; bibliographies; big data file; big data mining technologies; evidence-based decision making; government departments; information mining; keywords; principal investigators; project approval process; project titles; published scholar papers; research organizations; science and technology competitiveness; Distributed computing; Government; Proposals; Prototypes; big data mining; discriminant framework; similar scientific project detection
Conference_Title :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
DOI :
10.1109/BigData.Congress.2014.75