• DocumentCode
    525680
  • Title
    Feature extraction and clustering-based retrieval for mathematical formulas
  • Author
    Ma, Kai ; Hui, Siu Cheung ; Chang, Kuiyu
  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2010
  • fDate
    23-25 June 2010
  • Firstpage
    372
  • Lastpage
    377
  • Abstract
    Mathematical formulas or expressions are essential for presenting scientific knowledge in many research documents in academic areas such as physics and mathematics. Searching for related mathematical formulas is an important but challenging problem as formulas contain both structural and semantic information. Such information is hidden inside the mathematical expressions of the formulas. To support effective formula search, it is necessary to extract the structural and semantic features from the mathematical presentation of the formulas faithfully. In this paper, we propose an effective approach for formula feature extraction. To evaluate the proposed approach, the extracted features are tested with three popular clustering algorithms, namely K-means, Self Organizing Map (SOM), and Agglomerative Hierarchical Clustering (AHC), for formula retrieval. The performance of the clustering-based retrieval is measured based on a dataset of 881 formulas and promising results have been achieved.
  • Keywords
    feature extraction; information retrieval; mathematics computing; pattern clustering; self-organising feature maps; K-mean clustering algorithms; agglomerative hierarchical clustering; clustering-based retrieval; mathematical formulas; self organizing map clustering algorithms; semantic feature extraction; semantic information; structural feature extraction; structural information; Automatic testing; Clustering algorithms; Data mining; Feature extraction; Information retrieval; Knowledge engineering; Mathematics; Organizing; Physics computing; Search engines; clustering; feature extraction; formula search; information retrieval;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering and Data Mining (SEDM), 2010 2nd International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4244-7324-3
  • Electronic_ISBN
    978-89-88678-22-0
  • Type
    conf
  • Filename
    5542894