Regional Information Center for Science and Technology - Clustering techniques for rule extraction from unstructured text fragments

DocumentCode :

2643859

Title :

Clustering techniques for rule extraction from unstructured text fragments

Author :

Clark, Alan ; Filev, Dimitar

Author_Institution :

Adv. Manuf. Technol. Dev., Ford Motor Co., Dearborn, MI, USA

fYear :

2005

fDate :

26-28 June 2005

Firstpage :

793

Lastpage :

798

Abstract :

This paper focuses on techniques for clustering unstructured text fragments that are generated by a rule extraction agent. The text fragments represent paragraphs of text containing potential rules. The latent semantic indexing method is applied to map the unstructured text into a linear vector space. Similar text fragments are identified based on the similarity between their vector representations. The problem of clustering based on general similarity measures that differ from the conventional distance-based measures is discussed. A new version of the mountain clustering method is developed to address the problem of identifying groups of similar vectors that correspond to documents with analogous content. Several clustering algorithms are compared in their ability to satisfactorily cluster these text fragments into sets of related concepts. An intelligent agent algorithm for extraction of rules from text documents is proposed and demonstrated.
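
The pipeline outlined in the abstract (latent semantic indexing, similarity between document vectors, then a mountain-style clustering) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not specify their modified mountain method, so the sketch substitutes a generic subtractive (mountain-style) potential computed from cosine similarity. The function names, the whitespace tokenizer, the `alpha` parameter, and the sample fragments are all assumptions made for illustration.

```python
import numpy as np


def lsi_embed(docs, k=2):
    """Map documents to a k-dimensional latent space via truncated SVD (LSI)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-document count matrix: rows are terms, columns are documents.
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # Document coordinates in the latent space: rows of V_k scaled by S_k.
    return Vt[:k].T * s[:k]


def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    Xn = X / norms
    return Xn @ Xn.T


def mountain_like_clusters(S, n_clusters=2, alpha=4.0):
    """Greedy mountain-style clustering on a similarity matrix S.

    Assumption: the potential of a document is the sum of
    exp(-alpha * (1 - similarity)) over all documents. The paper's own
    variant may differ.
    """
    K = np.exp(-alpha * (1.0 - S))
    potential = K.sum(axis=1)
    centers = []
    for _ in range(n_clusters):
        c = int(np.argmax(potential))
        centers.append(c)
        # Subtract the selected peak's influence so the next center
        # is found in a different region.
        potential = potential - potential[c] * K[c]
    # Assign each document to its most similar center.
    labels = np.argmax(S[:, centers], axis=1)
    return centers, labels


if __name__ == "__main__":
    # Hypothetical text fragments, invented for this example.
    fragments = [
        "weld steel bracket weld steel",
        "weld steel bracket frame",
        "inspect paint finish paint color",
        "inspect paint finish gloss",
    ]
    X = lsi_embed(fragments, k=2)
    S = cosine_matrix(X)
    centers, labels = mountain_like_clusters(S, n_clusters=2)
    print(centers, labels)
```

Working directly on a similarity matrix, rather than on Euclidean distances, mirrors the abstract's point that the clustering must accommodate general similarity measures.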

Keywords :

pattern clustering; programming language semantics; software agents; text analysis; intelligent agent; latent semantic indexing; linear vector space; mountain clustering; rule extraction; text document; unstructured text fragment clustering; Clustering algorithms; Clustering methods; Indexing; Information retrieval; Intelligent agent; Knowledge based systems; Manufacturing processes; Rain; Technology planning; Vectors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Fuzzy Information Processing Society, 2005. NAFIPS 2005. Annual Meeting of the North American

Print_ISBN :

0-7803-9187-X

Type :

conf

DOI :

10.1109/NAFIPS.2005.1548641

Filename :

1548641

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2643859