• DocumentCode
    2726196
  • Title
    Document Clustering for Event Identification and Trend Analysis in Market News
  • Author
    Dey, Lipika ; Mahajan, Anuj ; Haque, SK Mirajul
  • Author_Institution
    Innovation Labs., Tata Consultancy Services Ltd., Delhi
  • fYear
    2009
  • fDate
    4-6 Feb. 2009
  • Firstpage
    103
  • Lastpage
    106
  • Abstract
    In this paper we propose a stock market analysis system that analyzes financial news items to identify and characterize major events that impact the market. Events are identified using a Latent Dirichlet Allocation (LDA)-based topic extraction mechanism. The topic-document data is then clustered using the kernel k-means algorithm. The clusters are analyzed jointly with the raw SENSEX data to extract major events and their effects. The system has been implemented on capital market news about the Indian share market from the past three years.
  • Keywords
    data analysis; data mining; financial data processing; pattern clustering; stock markets; text analysis; document text clustering; event extraction system; event identification system; financial data analysis; financial market news analysis; kernel k means algorithm; latent dirichlet allocation; stock market analysis system; text mining system; topic extraction mechanism; topic-document data; trend analysis; Data analysis; Data mining; Economic forecasting; Information analysis; Kernel; Pattern analysis; Portfolios; Stock markets; Text mining; Time series analysis; Document Clustering; Event Analysis; Financial News; Stock Market; Topic Identification; Trend Analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Pattern Recognition, 2009. ICAPR '09. Seventh International Conference on
  • Conference_Location
    Kolkata
  • Print_ISBN
    978-1-4244-3335-3
  • Type
    conf
  • DOI
    10.1109/ICAPR.2009.84
  • Filename
    4782752