DocumentCode
525680
Title
Feature extraction and clustering-based retrieval for mathematical formulas
Author
Ma, Kai ; Hui, Siu Cheung ; Chang, Kuiyu
Author_Institution
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear
2010
fDate
23-25 June 2010
Firstpage
372
Lastpage
377
Abstract
Mathematical formulas and expressions are essential for presenting scientific knowledge in research documents in fields such as physics and mathematics. Searching for related formulas is an important but challenging problem because formulas carry both structural and semantic information, and that information is implicit in their mathematical notation. To support effective formula search, the structural and semantic features must be extracted faithfully from the mathematical presentation of each formula. In this paper, we propose an effective approach for formula feature extraction. To evaluate the proposed approach, the extracted features are tested for formula retrieval with three popular clustering algorithms: K-means, Self-Organizing Map (SOM), and Agglomerative Hierarchical Clustering (AHC). The performance of the clustering-based retrieval is measured on a dataset of 881 formulas, and promising results have been achieved.
Keywords
feature extraction; information retrieval; mathematics computing; pattern clustering; self-organising feature maps; K-means clustering algorithms; agglomerative hierarchical clustering; clustering-based retrieval; mathematical formulas; self organizing map clustering algorithms; semantic feature extraction; semantic information; structural feature extraction; structural information; Automatic testing; Clustering algorithms; Data mining; Feature extraction; Information retrieval; Knowledge engineering; Mathematics; Organizing; Physics computing; Search engines; clustering; feature extraction; formula search; information retrieval
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering and Data Mining (SEDM), 2010 2nd International Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4244-7324-3
Electronic_ISBN
978-89-88678-22-0
Type
conf
Filename
5542894