• DocumentCode
    1644656
  • Title
    Hierarchical semantic model for objectionable Web text content detection
  • Author
    Duan, Jiangjiao ; Zeng, Jianping ; Zhang, Shiyong
  • Author_Institution
    Sch. of Econ., Fudan Univ., Shanghai, China
  • fYear
    2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Objectionable Web text content has recently become common on many Web sites. Because this kind of text content has been shown to be very harmful to young children, several measures have been taken to detect it. Unlike current methods, a scene-based method is proposed to recognize objectionable text, with the aim of improving performance, especially in semantic detection. A scene, defined by a set of sentences, is assigned as the topic of objectionable content. A hierarchical semantic model that can describe the scene at different granularities is then learnt from the sentence set. Objectionable Web text detection is performed based on the similarity between the text and the model. Experiments are conducted on real-world text sets drawn from Web forums, and the results show that the proposed method achieves better performance than a keyword-based method with semantic feature selection. The ability to detect semantically objectionable text is studied by varying several key parameters of the model.
  • Keywords
    Internet; Web sites; text analysis; Web forums; hierarchical semantic model; keyword-based method; objectionable Web text content detection; real world text sets; scene-based method; semantic feature selection; Computational modeling; Discrete cosine transforms; Filtering; Mathematical model; Matrix converters; Semantics; Web pages; content filtering; objectionable text; semantic model; topic granularity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Anti-Counterfeiting, Security and Identification (ASID), 2012 International Conference on
  • Conference_Location
    Taipei
  • ISSN
    2163-5048
  • Print_ISBN
    978-1-4673-2144-0
  • Electronic_ISBN
    2163-5048
  • Type
    conf
  • DOI
    10.1109/ICASID.2012.6325325
  • Filename
    6325325