DocumentCode
456360
Title
Multilingual Web Documents: the system Hyperling
Author
Nguyen, Tuan-Dang ; Zreik, Khaldoun
Author_Institution
GREYC, Caen Univ.
Volume
1
fYear
0
fDate
0-0 0
Firstpage
578
Lastpage
582
Abstract
Hyperling is a formal, language-independent system for handling hyperdocuments (Web sites). It starts from the observation that link structure and context embed crucial information for both hyperdocument retrieval and hyperdocument mining. On this basis we propose a clustering-based Hyperling that deals with multilingual hyperdocuments (Web sites). To determine the number of languages used and the boundaries between them, we adopt a distributional approach to preprocess the hyperdocument structure before clustering it. Our main hypothesis is that links related to the same language are grouped together in one cluster. From this we conclude that the more important of the generated clusters represent the dominant languages.
Keywords
Web sites; document handling; natural languages; Hyperling; hyperdocuments; multilingual Web documents; Clustering algorithms; Data mining; Frequency; Information retrieval; Laboratories; Machine learning; Magnetohydrodynamics; Research and development; Statistics; Text analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Communication Technologies, 2006. ICTTA '06. 2nd
Conference_Location
Damascus
Print_ISBN
0-7803-9521-2
Type
conf
DOI
10.1109/ICTTA.2006.1684435
Filename
1684435