• DocumentCode
    62845
  • Title
    Parallelized Jaccard-based learning method and MapReduce implementation for mobile devices recognition from massive network data
  • Author
    Liu Jun ; Li Yinzhou ; Cuadrado, Felix ; Uhlig, S. ; Lei Zhenming
  • Author_Institution
    Beijing Key Lab. of Network Syst. Archit. & Convergence, Beijing Univ. of Posts & Telecommun., Beijing, China
  • Volume
    10
  • Issue
    7
  • fYear
    2013
  • fDate
    Jul-13
  • Firstpage
    71
  • Lastpage
    84
  • Abstract
    Accurate and scalable mobile device recognition is critically important for mobile network operators and ISPs to understand their customers' behaviours and enhance their user experience. In this paper, we propose a novel method for mobile device model recognition that uses statistical information derived from large amounts of mobile network traffic data. Specifically, we create a Jaccard-based coefficient measure method to identify a proper keyword representing each mobile device model from massive unstructured textual HTTP access logs. To handle the large amount of traffic data generated by large mobile networks, this method is designed as a set of parallel algorithms and is implemented through the MapReduce framework, a distributed parallel programming model with proven low-cost and high-efficiency features. Evaluations using real data sets show that our method can accurately recognise mobile client models while meeting the scalability and producer-independence requirements of large mobile network operators. Results show that a 91.5% accuracy rate is achieved for recognising mobile client models from 2 billion records, which is dramatically higher than that of existing solutions.
  • Keywords
    consumer behaviour; hypermedia; learning (artificial intelligence); mobile computing; parallel algorithms; statistical analysis; transport protocols; ISP; Jaccard based coefficient measure method; MapReduce framework; MapReduce implementation; customers behaviours; distributed parallel programming model; massive network data; mobile device model; mobile devices recognition; mobile network operators; mobile network traffic data; parallel algorithms; parallelized Jaccard-based learning method; statistical information; textual HTTP access logs; Computational modeling; Data models; Distributed processing; Mobile communication; Mobile computing; Mobile handsets; Object recognition; Jaccard coefficient measurement; MapReduce; data mining; distributed computing; mobile device recognition;
  • fLanguage
    English
  • Journal_Title
    Communications, China
  • Publisher
    IEEE
  • ISSN
    1673-5447
  • Type
    jour
  • DOI
    10.1109/CC.2013.6571290
  • Filename
    6571290
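
The abstract describes selecting a representative keyword per device model with a Jaccard-based measure, computed in MapReduce style. The record does not give the paper's actual algorithm, so the following is only a minimal single-process sketch under stated assumptions: records are hypothetical (model label, User-Agent string) pairs, `tokenize` uses an invented token rule, and the score is the plain Jaccard coefficient |A ∩ B| / |A ∪ B| between the records of a model and the records containing a token.

```python
import re
from collections import Counter


def tokenize(user_agent):
    """Split a User-Agent string into lowercase tokens (hypothetical rule)."""
    return set(re.findall(r"[a-z0-9\-]+", user_agent.lower()))


def map_record(model, user_agent):
    """Map step: emit a count of 1 for the model, each token, and each pair."""
    yield ("model", model), 1
    for tok in tokenize(user_agent):
        yield ("token", tok), 1
        yield ("pair", model, tok), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def best_keywords(records):
    """Return {model: (keyword, jaccard)} with the highest-scoring token per model."""
    emitted = (kv for model, ua in records for kv in map_record(model, ua))
    totals = reduce_counts(emitted)
    best = {}
    for key, both in totals.items():
        if key[0] != "pair":
            continue
        _, model, tok = key
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = totals[("model", model)] + totals[("token", tok)] - both
        score = both / union
        current = best.get(model)
        # Ties are broken alphabetically so the result is deterministic.
        if current is None or score > current[1] or (score == current[1] and tok < current[0]):
            best[model] = (tok, score)
    return best
```

In a real MapReduce job the map and reduce functions would run on separate workers over log partitions; here they are chained in one process purely to show the counting and scoring logic.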