DocumentCode :
3188943
Title :
Bit Sequences and Biclustering of Text Documents
Author :
Mimaroglu, Selim ; Uehara, Kuniaki
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
51
Lastpage :
56
Abstract :
We propose a new technique for clustering of text documents that relies on a biclustering structure constructed on terms and documents. Our approach makes use of a greedy algorithm applied to bit sequences associated with each group of synonym terms. The use of bit sequences allows us to achieve superior time performance. Additionally, our algorithm provides meaningful cluster descriptions.
Keywords :
Clustering algorithms; Data mining; Dictionaries; Frequency; Matrix decomposition; Natural languages; Singular value decomposition; Terminology; Time division multiplexing; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
Print_ISBN :
978-0-7695-3019-2
Electronic_ISBN :
978-0-7695-3033-8
Type :
conf
DOI :
10.1109/ICDMW.2007.38
Filename :
4476646