• DocumentCode
    120690
  • Title

    Content based hidden web ranking algorithm (CHWRA)

  • Author

    Batra, Nikhil ; Kumar, Ajit ; Singh, D. ; Rajotia, R.N.

  • Author_Institution
    Dept. of IT, MRIU, Faridabad, India
  • fYear
    2014
  • fDate
    21-22 Feb. 2014
  • Firstpage
    586
  • Lastpage
    589
  • Abstract
    The World Wide Web consists of millions of interconnected web pages that provide information to users in any part of the world. The Web keeps growing, both in size and in the complexity of its pages, so it is necessary to retrieve the web pages that are most relevant to the query a user enters in a search engine. To assess the relevance of a web page, the search engine needs a retrieval or ranking module that applies a ranking algorithm to the Web and returns pages in order of their importance to the information requested in the query. The ranking algorithm is efficient at ranking both the surface web, i.e. the web pages that can be indexed by the search engine, and the hidden web, i.e. the web pages that cannot be indexed by the search engine. This paper proposes an algorithm that consists of: 1) the PageRank algorithm, 2) a term weighting technique, 3) feedback (likes/dislikes) and 4) visitor count.
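    The abstract names four signals but does not say how they are combined. The following is a minimal sketch, not the authors' method: it assumes each signal is scaled to the range 0 to 1 and merged with a weighted sum. The function names, the equal weights, the neutral feedback value of 0.5 for pages without votes, and the sample pages are all hypothetical.

```python
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def term_weight(text, query):
    """Share of the page's words that match a query term (assumed weighting)."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[term] for term in query.lower().split()) / len(words)


def combined_scores(pages, links, query, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four signals; the weights are an assumption."""
    pr = pagerank(links)
    max_pr = max(pr.values())
    max_visits = max(p["visits"] for p in pages.values()) or 1
    scores = {}
    for name, p in pages.items():
        votes = p["likes"] + p["dislikes"]
        signals = (
            pr[name] / max_pr,                     # 1) PageRank
            term_weight(p["text"], query),         # 2) term weighting
            p["likes"] / votes if votes else 0.5,  # 3) feedback (assumed neutral default)
            p["visits"] / max_visits,              # 4) visitor count
        )
        scores[name] = sum(w * s for w, s in zip(weights, signals))
    return scores


# Hypothetical sample data, for illustration only.
sample_pages = {
    "a": {"text": "hidden web ranking", "likes": 8, "dislikes": 2, "visits": 100},
    "b": {"text": "cooking recipes", "likes": 1, "dislikes": 3, "visits": 20},
}
sample_links = {"a": ["b"], "b": ["a"]}

scores = combined_scores(sample_pages, sample_links, "hidden web")
ranking = sorted(scores, key=scores.get, reverse=True)
```

    Here `ranking` lists the page names from highest to lowest combined score. A weighted sum is only one plausible way to merge the signals; the paper itself should be consulted for the actual formula.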
  • Keywords
    Internet; Web sites; content-based retrieval; feedback; indexing; search engines; CHWRA; PageRank algorithm; World Wide Web; content based hidden Web ranking algorithm; interconnected Web pages; query processing; search engine; term weighting technique; Algorithm design and analysis; Computers; Crawlers; Databases; Educational institutions; Web pages; Hidden Web; PageRank
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advance Computing Conference (IACC), 2014 IEEE International
  • Conference_Location
    Gurgaon
  • Print_ISBN
    978-1-4799-2571-1
  • Type

    conf

  • DOI
    10.1109/IAdCC.2014.6779390
  • Filename
    6779390