Title : 
HT2X[ML]: An HTML converter
         
        
            Author : 
Baghdadi, Hossein Shahsavand ; Ranaivo-Malançon, Bali
         
        
            Author_Institution : 
Fac. of Inf. Technol., Multimedia Univ., Cyberjaya, Malaysia
         
        
        
        
        
            Abstract : 
Capturing specific data from an HTML file and encapsulating it so that other tools can use it is a significant challenge in web mining. This paper introduces HT2X[ML], a tool that extracts selected information from HTML files, either according to user-defined settings or automatically, and converts it into well-formed XML and plain text. The output is suitable for use by other tools for any purpose.
         
        
            Keywords : 
XML; data encapsulation; data mining; hypermedia markup languages; text analysis; HT2X[ML]; HTML converter; automatic way; data capturing; information extract; plain text format; user-customized way; web mining; well-formed XML; Converters; Graphical user interfaces; HTML; Web pages; Plain Text
         
        
        
        
            Conference_Title : 
Electronics and Information Engineering (ICEIE), 2010 International Conference On
         
        
            Conference_Location : 
Kyoto
         
        
            Print_ISBN : 
978-1-4244-7679-4
         
        
            Electronic_ISBN : 
978-1-4244-7681-7
         
        
        
            DOI : 
10.1109/ICEIE.2010.5559899