• DocumentCode
    456360
  • Title

    Multilingual Web Documents: the system Hyperling

  • Author

    Nguyen, Tuan-Dang ; Zreik, Khaldoun

  • Author_Institution
    GREYC, Caen Univ.
  • Volume
    1
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    578
  • Lastpage
    582
  • Abstract
    Hyperling is a formal, language-independent system for handling hyperdocuments (Web sites). It builds on the observation that link structure and context carry crucial information for both hyperdocument retrieval and hyperdocument mining. We therefore propose a clustering approach, Hyperling, that handles multilingual hyperdocuments (Web sites). To determine the number of languages used and the boundaries between them, we adopt a distributional approach to preprocess the hyperdocument structure before clustering it. Our main hypothesis is that links related to the same language are grouped together in one cluster. From this we conclude that the more important of the generated clusters represent the dominant languages.
  • Keywords
    Web sites; document handling; natural languages; Hyperling; hyperdocuments; multilingual Web documents; Clustering algorithms; Data mining; Frequency; Information retrieval; Laboratories; Machine learning; Magnetohydrodynamics; Research and development; Statistics; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technologies, 2006. ICTTA '06. 2nd
  • Conference_Location
    Damascus
  • Print_ISBN
    0-7803-9521-2
  • Type

    conf

  • DOI
    10.1109/ICTTA.2006.1684435
  • Filename
    1684435