• DocumentCode
    606370
  • Title
    Scaling SeerSuite in the Cloud
  • Author
    Teregowda, P.; Giles, C. Lee
  • Author_Institution
    Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2013
  • fDate
    25-27 March 2013
  • Firstpage
    146
  • Lastpage
    155
  • Abstract
    The SeerSuite digital library search engine framework is used to build tools such as CiteSeerx. It includes a complex metadata extraction system capable of extracting elements such as author names, titles, citations, and citation contexts, which are crucial bibliometric data and are needed to build a citation graph. The workload faced by the extractor is dynamic, and this variability makes CiteSeerx attractive for hosting in a cloud computing environment. Because of its application binary dependencies and its reliance on specialized infrastructure, the current extractor has several limitations. These limitations motivated the design and implementation of the metadata extraction system proposed in this study. A message-oriented middleware architecture with a publish/subscribe pattern is used to build a scalable, flexible system that can be deployed across a range of cloud infrastructures. To demonstrate the broad applicability of the proposed system, we evaluate its reference implementation across different deployment scenarios and with regard to its scalability.
  • Keywords
    citation analysis; cloud computing; digital libraries; meta data; middleware; search engines; CiteSeerx; SeerSuite digital library search engine framework; application binary dependencies; bibliometric data; citation graph; cloud computing environment; complex metadata extraction system; message oriented middleware architecture; publish-subscribe pattern; Context; Crawlers; Data mining; Feature extraction; Message-oriented middleware; Portable document format; Cloud Computing; Information Extraction; Information Retrieval; Message Oriented Middleware;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Cloud Engineering (IC2E), 2013 IEEE International Conference on
  • Conference_Location
    Redwood City, CA
  • Print_ISBN
    978-1-4673-6473-7
  • Type
    conf
  • DOI
    10.1109/IC2E.2013.41
  • Filename
    6529279