• DocumentCode
    1844706
  • Title
    A Method for Text Categorization Using BP Network Based on Hadoop
  • Author
    Jia Yusheng ; Zhu Qing
  • Author_Institution
    Sch. of Software Eng., Beijing Univ. of Technol., Beijing, China
  • fYear
    2013
  • fDate
    21-23 June 2013
  • Firstpage
    818
  • Lastpage
    821
  • Abstract
    Training a BP (backpropagation) network on a large number of texts is time-consuming. Based on an analysis of the Hadoop open-source distributed computing platform and of parallel training methods for BP networks, we design a BP network text categorization model that applies a data-parallel method on the Hadoop platform using the MapReduce programming model. The model uses batch training: it adjusts the network weights only after obtaining the accumulated error by summing the training error of every sample on each node. Text categorization is also performed in parallel. The Hadoop-based method improves the training speed of the BP network and the efficiency of text categorization, and achieves good categorization performance.
  • Keywords
    backpropagation; neural nets; parallel algorithms; public domain software; text analysis; BP network; Hadoop open source distributed computing platform; MapReduce programming model; batch training; data parallel method; parallel training methods; text categorization; Neural networks; Neurons; Program processors; Support vector machine classification; Text categorization; Training; Vectors; BP Neural Network; Hadoop; Text Categorization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational and Information Sciences (ICCIS), 2013 Fifth International Conference on
  • Conference_Location
    Shiyang
  • Type
    conf
  • DOI
    10.1109/ICCIS.2013.219
  • Filename
    6643135
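
The data-parallel batch training described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation and does not use Hadoop: the map and reduce steps are plain Python functions, and the network size, learning rate, and the names `map_shard`, `reduce_partials`, and `train` are all assumptions made for illustration. Each "node" sums the backpropagation gradients and squared error over its own shard of samples, the partial sums are added together, and the weights are updated once per pass.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w1, w2, x):
    """Return hidden and output activations for one input vector x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in w2]
    return h, o

def map_shard(w1, w2, shard):
    """Map step (hypothetical): sum gradients and squared error over one shard."""
    g1 = [[0.0] * len(row) for row in w1]
    g2 = [[0.0] * len(row) for row in w2]
    err = 0.0
    for x, t in shard:
        h, o = forward(w1, w2, x)
        # Output-layer deltas for squared error with sigmoid units.
        d_out = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(o, t)]
        # Hidden-layer deltas, back-propagated through w2.
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w2[k][j] for k in range(len(w2)))
                 for j, hj in enumerate(h)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                g2[k][j] += dk * hj
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                g1[j][i] += dj * xi
        err += 0.5 * sum((ok - tk) ** 2 for ok, tk in zip(o, t))
    return g1, g2, err

def add_matrices(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def reduce_partials(partials):
    """Reduce step (hypothetical): add up the per-shard gradients and errors."""
    g1, g2, err = partials[0]
    for p1, p2, e in partials[1:]:
        g1, g2, err = add_matrices(g1, p1), add_matrices(g2, p2), err + e
    return g1, g2, err

def train(w1, w2, shards, lr=0.5, epochs=200):
    """Batch training: one weight update per pass over all shards."""
    history = []
    for _ in range(epochs):
        g1, g2, err = reduce_partials([map_shard(w1, w2, s) for s in shards])
        w1 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w1, g1)]
        w2 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(w2, g2)]
        history.append(err)  # accumulated error measured before this update
    return w1, w2, history

def init_weights(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "documents": two term weights plus a constant bias input of 1.0,
    # each paired with a one-hot target for two categories (made-up data).
    shard_a = [([1.0, 0.0, 1.0], [1.0, 0.0]), ([0.9, 0.1, 1.0], [1.0, 0.0])]
    shard_b = [([0.0, 1.0, 1.0], [0.0, 1.0]), ([0.1, 0.8, 1.0], [0.0, 1.0])]
    w1, w2 = init_weights(3, 3, rng), init_weights(2, 3, rng)
    w1, w2, history = train(w1, w2, [shard_a, shard_b])
    print("accumulated error, first and last pass:", history[0], history[-1])
```

Because the gradient of a summed error is the sum of the per-sample gradients, splitting the samples across shards gives the same update as a single pass over all samples, which is the property that makes the batch scheme straightforward to parallelize. In an actual Hadoop job the map and reduce functions would run as MapReduce tasks and the weights would have to be distributed to every node at the start of each pass; those details are not given in this record and are not modeled here.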