DocumentCode :
1856020
Title :
HT2X[ML]: An HTML converter
Author :
Baghdadi, Hossein Shahsavand ; Ranaivo-Malançon, Bali
Author_Institution :
Fac. of Inf. Technol., Multimedia Univ., Cyberjaya, Malaysia
Volume :
1
fYear :
2010
fDate :
1-3 Aug. 2010
Abstract :
Capturing specific data from an HTML file and encapsulating it so that other tools can use it is a significant challenge in web mining. This paper introduces HT2X[ML], a tool that extracts information from HTML files, either in a user-customized way or automatically, and converts it into well-formed XML and plain text. The result is suitable for use by other tools for any purpose.
Keywords :
XML; data encapsulation; data mining; hypermedia markup languages; text analysis; HT2X[ML]; HTML converter; automatic way; data capturing; information extract; plain text format; user-customized way; web mining; well-formed XML; Converters; Graphical user interfaces; HTML; Web pages; Plain Text
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronics and Information Engineering (ICEIE), 2010 International Conference On
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-7679-4
Electronic_ISBN :
978-1-4244-7681-7
Type :
conf
DOI :
10.1109/ICEIE.2010.5559899
Filename :
5559899