• DocumentCode
    185797
  • Title

    Internet information source discovery based on multi-seeds cocitation

  • Author

    Gao Hui ; Niu Haibo ; Luo Wei

  • Author_Institution
    China Defense Sci. & Technol. Inf. Center, Beijing, China
  • fYear
    2014
  • fDate
    18-19 Oct. 2014
  • Firstpage
    368
  • Lastpage
    371
  • Abstract
    Discovering Internet information sources on a specific topic is the groundwork of information acquisition in the current big data era. This paper presents a multi-seeds cocitation algorithm for finding new Internet information sources. The algorithm is based on cocitation but differs from traditional algorithms in that it takes multiple websites on a specific topic as input seeds. We then introduce the Combined Cocitation Degree (CCD) to measure the relevance of newly found websites: the higher a website's CCD, the more topic-related it is. Finally, the set of websites with the largest CCD is taken as the new Internet information sources on the specific topic. Experiments show that the proposed method outperforms traditional algorithms in the scenarios tested.
  • Keywords
    Big Data; Internet; Web sites; citation analysis; data mining; CCD; Internet information source discovery; combined cocitation degree; information acquisition; multiseeds cocitation; relevancy measurement; Algorithm design and analysis; Charge coupled devices; Google; Noise; Web pages; cocitation; information source; related website
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Security, Pattern Analysis, and Cybernetics (SPAC), 2014 International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4799-5352-3
  • Type

    conf

  • DOI
    10.1109/SPAC.2014.6982717
  • Filename
    6982717
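
The abstract describes ranking newly found websites by how strongly they are co-cited with several seed websites. The following is a minimal sketch of that idea, not the paper's method: the record does not give the CCD formula, so the scoring rule here (one point per seed linked from the same citing page) is an assumption, and the names `combined_cocitation`, `top_sources`, and `citing_pages` are hypothetical.

```python
from collections import Counter


def combined_cocitation(citing_pages, seeds):
    """Score non-seed sites by co-citation with multiple seeds.

    citing_pages: dict mapping a citing page to the sites it links to.
    seeds: iterable of seed sites on the topic of interest.

    ASSUMPTION: the score below is an illustrative stand-in for CCD.
    A candidate earns one point for every seed that is linked from the
    same citing page as the candidate.
    """
    seeds = set(seeds)
    scores = Counter()
    for links in citing_pages.values():
        links = set(links)
        seed_hits = len(links & seeds)
        if seed_hits == 0:
            continue  # this page cites no seed, so it carries no signal
        for site in links - seeds:
            scores[site] += seed_hits
    return scores


def top_sources(citing_pages, seeds, k=2):
    """Return the k highest-scoring candidates (ties broken by name)."""
    scores = combined_cocitation(citing_pages, seeds)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [site for site, _ in ranked[:k]]


# Toy link data, invented purely for illustration.
pages = {
    "p1": {"seedA", "seedB", "x.org"},
    "p2": {"seedA", "x.org", "y.org"},
    "p3": {"seedB", "y.org"},
    "p4": {"z.org", "y.org"},
}
result = top_sources(pages, {"seedA", "seedB"})
print(result)
```

In this toy data, `z.org` never appears on a page that also links to a seed, so it receives no score. Using several seeds rather than one is what lets a site co-cited with many seeds rise above a site that merely shares a page with a single seed.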