DocumentCode
3716494
Title
Automatic Classification and Taxonomy Generation for Semi-structured Data
Author
Bernardo Pereira Nunes;Giseli Rabello Lopes;Marco Antonio Casanova
Author_Institution
Dept. of Inf., UNIRIO Rio de Janeiro, Rio de Janeiro, Brazil
fYear
2015
Firstpage
207
Lastpage
214
Abstract
The problem of data classification goes back to the definition of taxonomies covering knowledge areas. With the advent of the Web, the amount of available data has increased by several orders of magnitude, making manual data classification impossible. This work presents an approach based on prototype theory to automatically classify semi-structured data, represented by frames, without any prior knowledge of structured classes. Our approach uses a variation of the K-Means algorithm that organizes a set of frames into classes structured as a strict hierarchy.
Keywords
"Prototypes","Taxonomy","Libraries","Informatics","Electronic mail","Colon","XML"
Publisher
IEEE
Conference_Titel
Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/CIT/IUCC/DASC/PICOM.2015.30
Filename
7363072