DocumentCode :
525680
Title :
Feature extraction and clustering-based retrieval for mathematical formulas
Author :
Ma, Kai ; Hui, Siu Cheung ; Chang, Kuiyu
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear :
2010
fDate :
23-25 June 2010
Firstpage :
372
Lastpage :
377
Abstract :
Mathematical formulas or expressions are essential for presenting scientific knowledge in many research documents in academic areas such as physics and mathematics. Searching for related mathematical formulas is an important but challenging problem as formulas contain both structural and semantic information. Such information is hidden inside the mathematical expressions of the formulas. To support effective formula search, it is necessary to extract the structural and semantic features from the mathematical presentation of the formulas faithfully. In this paper, we propose an effective approach for formula feature extraction. To evaluate the proposed approach, the extracted features are tested with three popular clustering algorithms, namely K-means, Self Organizing Map (SOM), and Agglomerative Hierarchical Clustering (AHC), for formula retrieval. The performance of the clustering-based retrieval is measured based on a dataset of 881 formulas and promising results have been achieved.
Keywords :
feature extraction; information retrieval; mathematics computing; pattern clustering; self-organising feature maps; K-means clustering algorithms; agglomerative hierarchical clustering; clustering-based retrieval; mathematical formulas; self organizing map clustering algorithms; semantic feature extraction; semantic information; structural feature extraction; structural information; Automatic testing; Clustering algorithms; Data mining; Feature extraction; Information retrieval; Knowledge engineering; Mathematics; Organizing; Physics computing; Search engines; clustering; feature extraction; formula search; information retrieval
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Software Engineering and Data Mining (SEDM), 2010 2nd International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-7324-3
Electronic_ISBN :
978-89-88678-22-0
Type :
conf
Filename :
5542894