Regional Information Center for Science and Technology - Statistical Learning Algorithm for Tree Similarity

DocumentCode :

3166836

Title :

Statistical Learning Algorithm for Tree Similarity

Author :

Takasu, Atsuhiro ; Fukagawa, Daiji ; Akutsu, Tatsuya

Author_Institution :

Nat. Inst. of Inf., Tokyo

fYear :

2007

fDate :

28-31 Oct. 2007

Firstpage :

667

Lastpage :

672

Abstract :

Tree edit distance is one of the most frequently used distance measures for comparing trees. When using the tree edit distance, we need to determine the cost of each operation, but this is a labor-intensive and highly skilled task. This paper proposes an algorithm for learning the costs of tree edit operations from training data consisting of pairs of similar trees. To formalize the cost learning problem, we define a probabilistic model for tree alignment that is a variant of tree edit distance. Then, the parameters of the model are estimated using the expectation maximization (EM) technique. In this paper, we develop an algorithm for parameter learning that is polynomial in time (O(mn²d⁶)) and space (O(n²d⁴)), where n, d, and m represent the size of the trees, the maximum degree of trees, and the number of training pairs of trees, respectively.
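
To make concrete why the per-operation costs mentioned in the abstract matter, here is a minimal Python sketch of ordered tree edit distance with pluggable cost functions. It is not the paper's tree-alignment model or its EM learning procedure. It is a plain memoized forest recursion, suitable only for small trees, and the tree encoding `(label, children)` and the names `tree_edit_distance`, `delete_cost`, `insert_cost`, and `substitute_cost` are assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children is a tuple of trees.
# A forest is a tuple of trees. Nested tuples are hashable, so they can be memoized.

def tree_edit_distance(t1, t2, delete_cost, insert_cost, substitute_cost):
    """Ordered tree edit distance under caller-supplied operation costs."""

    @lru_cache(maxsize=None)
    def forest_dist(f, g):
        if not f and not g:
            return 0.0
        if not g:
            # Delete the rightmost root of f; its children take its place.
            label, kids = f[-1]
            return forest_dist(f[:-1] + kids, g) + delete_cost(label)
        if not f:
            # Insert the rightmost root of g.
            label, kids = g[-1]
            return forest_dist(f, g[:-1] + kids) + insert_cost(label)
        (a, a_kids), (b, b_kids) = f[-1], g[-1]
        return min(
            forest_dist(f[:-1] + a_kids, g) + delete_cost(a),
            forest_dist(f, g[:-1] + b_kids) + insert_cost(b),
            # Match the two rightmost roots, then solve the two remaining parts.
            forest_dist(a_kids, b_kids)
            + forest_dist(f[:-1], g[:-1])
            + substitute_cost(a, b),
        )

    return forest_dist((t1,), (t2,))


# Unit costs: every delete and insert costs 1, relabeling costs 1 unless labels match.
t1 = ("a", (("b", ()), ("c", ())))
t2 = ("a", (("b", ()),))
print(tree_edit_distance(
    t1, t2,
    delete_cost=lambda x: 1,
    insert_cost=lambda x: 1,
    substitute_cost=lambda x, y: 0 if x == y else 1,
))  # 1.0 (delete the node "c")
```

Changing the three cost functions changes which trees count as similar, which is the tuning burden the abstract describes and the quantity the paper proposes to learn from training pairs.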

Keywords :

computational complexity; expectation-maximisation algorithm; learning (artificial intelligence); trees (mathematics); cost learning problem; distance measures; expectation maximization technique; probabilistic model; statistical learning algorithm; tree edit distance; tree similarity; Classification tree analysis; Costs; Data mining; Filtering algorithms; Filters; Informatics; Polynomials; Statistical learning; Training data; XML;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on

Conference_Location :

Omaha, NE

ISSN :

1550-4786

Print_ISBN :

978-0-7695-3018-5

Type :

conf

DOI :

10.1109/ICDM.2007.38

Filename :

4470308

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3166836