Title :
One approach to combination of FCA-based local conceptual models for text analysis — grid-based approach
Author :
Butka, P. ; Sarnovsky, M. ; Bednar, P.
Author_Institution :
Center for Inf. Technol., Tech. Univ. of Kosice, Kosice
Abstract :
Formal concept analysis (FCA) is one of the approaches that can be applied to conceptual modeling in the domain of text documents. FCA can be used for the formal analysis of data tables and the identification of clusters of similar objects (concepts). An extension of classic FCA, which works on binary table data, is the one-sided fuzzy version, which works with real values in the object-attribute table (the document-term matrix in the case of a vector representation of text documents). The computational complexity of creating concept lattices from a large context is considerable. This paper describes one simple approach to creating a simple hierarchy of concepts. The starting set of documents is decomposed into smaller sets of similar documents using a clustering algorithm. One concept lattice is then built for every cluster using the FCA method, and these FCA-based models are combined into a simple hierarchy of concept lattices using an agglomerative clustering algorithm. In our experiments we used the GHSOM algorithm to find appropriate clusters and the 'upper neighbors' FCA algorithm to build the individual concept lattices. Finally, the individual FCA models were labeled with characteristic terms, and a simple agglomerative algorithm was used to cluster the local models, with a metric based on the characteristic terms of these lattices. We have also extended this approach to a Grid-based solution, in which the time-consuming building of the local models is distributed among grid computing nodes, which speeds up the computation of the final model.
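The pipeline outlined in the abstract (cluster the documents, build one local FCA model per cluster, label each model, then merge the models agglomeratively) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the clusters are given by hand instead of being produced by GHSOM, the context is binary rather than one-sided fuzzy, concepts are enumerated naively rather than with the 'upper neighbors' algorithm, labels are simply the most frequent terms, and the metric is the Jaccard distance between label sets. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a binary context.

    context maps a document name to its set of terms. Every subset of
    documents is closed, so this is only usable for tiny contexts.
    """
    objects = sorted(context)
    all_terms = set().union(*context.values())
    found = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Intent: terms shared by every document in the subset.
            intent = set(all_terms)
            for doc in subset:
                intent &= context[doc]
            # Extent: every document that contains the whole intent.
            extent = frozenset(d for d in objects if intent <= context[d])
            found.add((extent, frozenset(intent)))
    return found


def label(context, k=2):
    """Pick the k most frequent terms of a cluster (assumed labeling rule)."""
    counts = Counter(t for terms in context.values() for t in terms)
    ranked = sorted(counts, key=lambda t: (-counts[t], t))  # ties: alphabetical
    return frozenset(ranked[:k])


def jaccard_distance(a, b):
    """Jaccard distance between two term sets (assumed metric)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerate(labels):
    """Repeatedly merge the two closest local models; return the merge order."""
    nodes = dict(labels)
    merges = []
    while len(nodes) > 1:
        a, b = min(
            combinations(sorted(nodes), 2),
            key=lambda p: jaccard_distance(nodes[p[0]], nodes[p[1]]),
        )
        # The merged node is labeled with the union of its children's terms.
        nodes[f"({a}+{b})"] = nodes.pop(a) | nodes.pop(b)
        merges.append((a, b))
    return merges


# Toy clusters standing in for the output of a clustering step.
clusters = {
    "c1": {"d1": {"grid", "node"}, "d2": {"grid", "node", "job"}},
    "c2": {"d3": {"grid", "job"}, "d4": {"grid", "job", "queue"}},
    "c3": {"d5": {"lattice", "concept"}, "d6": {"lattice", "concept", "fuzzy"}},
}

# Each local model depends only on its own cluster.
local_models = {name: concepts(ctx) for name, ctx in clusters.items()}
labels = {name: label(ctx) for name, ctx in clusters.items()}
merges = agglomerate(labels)
print(merges)
```

Because each call to `concepts` uses only its own cluster, the local models can be built independently of one another, which is the property the Grid-based extension relies on when it distributes this step among computing nodes.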
Keywords :
computational complexity; fuzzy set theory; grid computing; text analysis; agglomerative clustering algorithm; binary table data; concept lattice; formal concept analysis; grid-based approach; local conceptual models; Buildings; Clustering algorithms; Data analysis; Distributed computing; Fuzzy sets; Grid computing; Informatics; Information technology; Lattices; Text analysis
Conference_Title :
Applied Machine Intelligence and Informatics, 2008. SAMI 2008. 6th International Symposium on
Conference_Location :
Herlany
Print_ISBN :
978-1-4244-2105-3
Electronic_ISBN :
978-1-4244-2106-0
DOI :
10.1109/SAMI.2008.4469150