• DocumentCode
    3207584
  • Title

    A Theme-based Search Technique

  • Author

    Al-Chalabi, Nida ; Shihab, Khalil

  • Author_Institution
    Sultan Qaboos Univ., Muscat
  • fYear
    2007
  • fDate
    23-26 July 2007
  • Firstpage
    699
  • Lastpage
    707
  • Abstract
    Current search engines usually return a large number of irrelevant documents for a given query. As a result, accessing such information and filtering out these documents can cause frustration and often results in a waste of time and effort for users surfing the web. This is mainly because of the underlying techniques used in these engines, which are mostly based on the frequency of the query keywords in the HTML code. In addition, issues such as classifying the pages found for a query according to previous visits, along with the features needed to make intelligent decisions about users' access patterns, are not considered. This work presents an intelligent search engine, called ORCA, that returns the most relevant documents for users' queries. This search engine analyses the queries and builds themes (models) to be used when the engine is confronted with similar queries. The intelligent component constructs a model of the user's behavior and uses that model to fetch, and even prefetch, information and documents considered of interest to the user. It uses both latent semantic analysis and web page feature selection for clustering web pages. Latent semantic analysis is used to find the semantic relations between keywords and between documents.
  • Keywords
    classification; decision making; information filtering; knowledge based systems; query processing; search engines; ORCA intelligent search engine; Web page classification; Web page clustering; Web page feature selection; document filtering; intelligent decision making; latent semantic analysis; theme-based search technique; Computer science; Explosives; Frequency; HTML; Information filtering; Information filters; Information retrieval; Search engines; Uniform resource locators; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Commerce Technology and the 4th IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services, 2007. CEC/EEE 2007. The 9th IEEE International Conference on
  • Conference_Location
    Tokyo
  • Print_ISBN
    0-7695-2913-5
  • Type

    conf

  • DOI
    10.1109/CEC-EEE.2007.15
  • Filename
    4285288