• DocumentCode
    2731384
  • Title
    Web-Based Document Classification Using a Trie-Based Index Structure
  • Author
    Park, Jeahyun ; Park, Juyoung ; Choi, Joongmin
  • Author_Institution
    Hanyang Univ., Ansan
  • fYear
    2007
  • fDate
    5-12 Nov. 2007
  • Firstpage
    52
  • Lastpage
    55
  • Abstract
    An automatic document classification system is useful for managing massive document collections such as the Web. However, the complexity of the classification process has been a serious obstacle to applying it in general services. In this paper, we propose an efficient data structure for document classification and develop a classification system based on a trie-based index structure. This data structure reduces the overhead of document classification with naive Bayesian probabilistic models and makes commercial applications feasible. In our system, both learning and classification are performed through a Web-based user interface rather than by a remote application, which makes the classification process easy to control and gives flexibility in providing diverse documents.
  • Keywords
    Bayes methods; Internet; document handling; tree data structures; user interfaces; Bayesian probabilistic models; Web-based document classification; Web-based user interface; data structure; trie-based index structure; Bayesian methods; Computer science; Conference management; Data structures; Equations; Frequency; Intelligent agent; Intelligent structures; Technology management; User interfaces; document classification; Web-based classification interface; trie index structure
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology Workshops, 2007 IEEE/WIC/ACM International Conferences on
  • Conference_Location
    Silicon Valley, CA
  • Print_ISBN
    0-7695-3028-1
  • Type
    conf
  • DOI
    10.1109/WI-IATW.2007.70
  • Filename
    4427538