Title :
Context driven approach for extracting relevant documents from WWW
Author :
Sarika; Chaudhary, Meena
Author_Institution :
Manav Rachna Coll. of Eng., Faridabad, India
Abstract :
The World Wide Web (WWW) holds more than 3 billion HTML pages, and these pages are accessed only through search engines. A search engine is a program that searches documents for a specified set of keywords and returns a list of the documents in which any or all of those keywords were found. As more information becomes available on the web, it becomes more difficult to provide effective search services for Internet users. Users do not always formulate search queries using the best terms, which increases the number of irrelevant search results. Moreover, synonyms of the query terms are not searched for. Another problem is improper indexing of web documents: retrieval suffers when the query terms do not correspond to the words by which the documents are indexed. Thus the indexing of web documents affects both their relevancy and web latency. A promising approach to overcoming these problems is Latent Semantic Indexing (LSI). This indexing scheme uses Singular Value Decomposition (SVD) to find the underlying latent semantic structure and, as a result, the relevant pages.
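The LSI idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`term_document_matrix`, `lsi_scores`), the raw term-frequency weighting, the choice of rank `k=2`, and the toy documents are all assumptions made for illustration. The sketch builds a term-document matrix, truncates its SVD to rank `k`, folds the query into the reduced space, and scores documents by cosine similarity.

```python
import numpy as np


def term_document_matrix(docs):
    """Build a raw term-frequency matrix (rows: terms, columns: documents)."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            matrix[index[word], j] += 1.0
    return matrix, index


def lsi_scores(docs, query, k=2):
    """Score each document against the query in a rank-k latent space."""
    matrix, index = term_document_matrix(docs)

    # Thin SVD: matrix = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)

    # Keep at most k non-negligible singular directions.
    k = min(k, int(np.sum(s > 1e-12)))
    u_k, s_k, doc_vecs = u[:, :k], s[:k], vt[:k, :].T

    # Fold the query into the latent space: q_hat = q^T U_k S_k^{-1}.
    q = np.zeros(matrix.shape[0])
    for word in query.lower().split():
        if word in index:
            q[index[word]] += 1.0
    q_hat = (q @ u_k) / s_k

    # Cosine similarity between the folded query and each document vector.
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    return (doc_vecs @ q_hat) / np.maximum(norms, 1e-12)


# Hypothetical toy corpus: two documents about cars, two about gardens.
DOCS = [
    "car automobile engine",
    "automobile engine repair",
    "flower garden rose",
    "garden rose petal bloom",
]
scores = lsi_scores(DOCS, "car", k=2)
for doc, score in zip(DOCS, scores):
    print(f"{score:6.3f}  {doc}")
```

In this toy corpus the second document never contains the word "car", yet it scores about as high as the first because it shares the co-occurring terms "automobile" and "engine". A plain keyword match would miss it, which is the synonym problem the abstract describes.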
Keywords :
Internet; Web sites; indexing; query formulation; search engines; singular value decomposition; HTML pages; Internet users; SVD; WWW; Web document indexing; Web latency; Web pages; context driven approach; information world; latent semantic indexing; latent semantic structure; relevant document extraction; search query formulation; Context; Crawlers; Matrix decomposition; Semantics; World wide web; relevant pages;
Conference_Title :
Computing, Communication & Automation (ICCCA), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8889-1
DOI :
10.1109/CCAA.2015.7148491