Title :
Document Clustering for Event Identification and Trend Analysis in Market News
Author :
Dey, Lipika ; Mahajan, Anuj ; Haque, SK Mirajul
Author_Institution :
Innovation Labs., Tata Consultancy Services Ltd., Delhi
Abstract :
This paper proposes a stock market analysis system that analyzes financial news items to identify and characterize major events that impact the market. Events are identified using a Latent Dirichlet Allocation (LDA) based topic extraction mechanism. The resulting topic-document data is then clustered using the kernel k-means algorithm. The clusters are analyzed jointly with raw SENSEX data to extract major events and their effects. The system has been implemented on capital market news about the Indian share market from the past three years.
Keywords :
data analysis; data mining; financial data processing; pattern clustering; stock markets; text analysis; document text clustering; event extraction system; event identification system; financial data analysis; financial market news analysis; kernel k means algorithm; latent dirichlet allocation; stock market analysis system; text mining system; topic extraction mechanism; topic-document data; trend analysis; Data analysis; Data mining; Economic forecasting; Information analysis; Kernel; Pattern analysis; Portfolios; Stock markets; Text mining; Time series analysis; Document Clustering; Event Analysis; Financial News; Stock Market; Topic Identification; Trend Analysis;
Conference_Title :
Advances in Pattern Recognition, 2009. ICAPR '09. Seventh International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4244-3335-3
DOI :
10.1109/ICAPR.2009.84