Title :
A probabilistic approach to multi-document summarization for generating a tiled summary
Author :
Saravanan, M. ; Raman, S. ; Ravindran, B.
Author_Institution :
Indian Inst. of Technol., Madras, India
Abstract :
Due to data overload and the time-critical nature of information need, automatic summarization of documents plays a significant role in information retrieval and text data mining. This paper discusses the design of a multi-document summarizer that uses Katz's K-mixture model for term distribution. The model helps rank sentences through a modified term weight assignment. The system has been evaluated against the frequently occurring sentences in the summaries generated by a set of human subjects. Our system outperforms other auto-summarizers at different extraction levels of summarization with respect to the ideal summary, and is close to the ideal summary at the 40% extraction level.
Keywords :
data mining; information retrieval; text analysis; Katz K-mixture model; automatic multidocument summarization; information need; modified term weight assignment; probabilistic approach; term distribution; text data mining; tiled summary generation; Computational intelligence; Data mining; Frequency; Humans; Information retrieval; Internet; Measurement standards; Natural languages; Time factors; Writing
Conference_Title :
Computational Intelligence and Multimedia Applications, 2005. Sixth International Conference on
Print_ISBN :
0-7695-2358-7
DOI :
10.1109/ICCIMA.2005.8