DocumentCode :
423547
Title :
Linguistic feature extraction using independent component analysis
Author :
Honkela, Timo ; Hyvarinen, Aapo
Author_Institution :
Neural Networks Res. Center, Helsinki Univ. of Technol., Finland
Volume :
1
fYear :
2004
fDate :
25-29 July 2004
Lastpage :
284
Abstract :
Our aim is to find syntactic and semantic relationships between words based on the analysis of corpora. We propose the application of independent component analysis, which seems to have clear advantages over two classic methods: latent semantic analysis and self-organizing maps. Latent semantic analysis is a simple method for the automatic generation of concepts that are useful, e.g., in encoding documents for information retrieval purposes. However, these concepts cannot easily be interpreted by humans. Self-organizing maps can be used to generate an explicit diagram that characterizes the relationships between words. The resulting map reflects syntactic categories in its overall organization and semantic categories at the local level. The self-organizing map does not, however, provide any explicit, distinct categories for the words. Independent component analysis applied to word context data gives distinct features that reflect syntactic and semantic categories. Thus, independent component analysis gives features or categories that are both explicit and easily interpreted by humans. This result can be obtained without any human supervision and without tagged corpora carrying predetermined morphological, syntactic or semantic information.
Keywords :
feature extraction; independent component analysis; linguistics; self-organising feature maps; latent semantic analysis; linguistic feature extraction; self-organizing maps; Frequency; Humans; Information analysis; Information retrieval; Laboratories; Matrix decomposition; Neural networks; Self organizing feature maps
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
ISSN :
1098-7576
Print_ISBN :
0-7803-8359-1
Type :
conf
DOI :
10.1109/IJCNN.2004.1379914
Filename :
1379914