• DocumentCode
    2247649
  • Title
    Online topic detection and tracking of financial news based on hierarchical clustering
  • Author
    Dai, Xiang-ying ; Chen, Qing-cai ; Wang, Xiao-long ; Xu, Jun
  • Author_Institution
    Intell. Comput. Res. Center, Harbin Inst. of Technol., Shenzhen, China
  • Volume
    6
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    3341
  • Lastpage
    3346
  • Abstract
    In this paper, we apply topic detection and tracking (TDT) technology to a vertical search engine in the financial domain. The returned results are grouped into topics on a per-stock basis, and the topics are presented to users in chronological order. As a result, users can easily learn about the important events related to a stock, as well as the causes and effects of those events. We improve the common agglomerative hierarchical clustering algorithm based on the average-link method and use it to implement both retrospective and online topic detection for news stories about stocks. Additionally, an improved single-pass clustering algorithm is employed for topic tracking. Because feature terms that occur in the title of a news story contribute more to the similarity calculation, we increase their weights. Experiments are performed on two human-annotated datasets. The results show that the proposed method can effectively detect and track online financial topics.
  • Keywords
    information retrieval; pattern clustering; portals; search engines; stock markets; text analysis; time series; TDT technology; agglomerative hierarchical clustering algorithm; average-link method; financial news; online topic detection; online topic tracking; retrospective topic detection; single pass clustering algorithm; stock news; topic tracking; vertical search engine; Clustering algorithms; Clustering methods; Computational modeling; Cybernetics; Machine learning; Measurement; Web pages; Agglomerative Hierarchical Clustering; Topic Detection and Tracking; Vector Space Model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type
    conf
  • DOI
    10.1109/ICMLC.2010.5580677
  • Filename
    5580677