• DocumentCode
    2916640
  • Title
    Optimised phrase querying and browsing of large text databases
  • Author
    Bahle, Dirk ; Williams, Hugh E. ; Zobel, Justin
  • Author_Institution
    Dept. of Comput. Sci., R. Melbourne Inst. of Technol., Vic., Australia
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    11
  • Lastpage
    19
  • Abstract
    Most search systems for querying large document collections, e.g., Web search engines, are based on well-understood information retrieval principles. These systems are both efficient and effective in finding answers to many user information needs, expressed through informal ranked or structured Boolean queries. Phrase querying and browsing are additional techniques that can augment or replace conventional querying tools. We propose optimisations for phrase querying with a nextword index, an efficient structure for phrase-based searching. We show that careful consideration of which search terms are evaluated in a query plan, together with optimisation of the order in which the plan is evaluated, can reduce query evaluation costs by more than a factor of five. We conclude that an ordered query plan should be used for all phrase querying and browsing with nextword indexes. Moreover, we show that optimised phrase querying is practical on large text collections.
  • Keywords
    database indexing; query processing; text analysis; very large databases; Web search engines; information retrieval principles; large document collection querying; large text collections; large text database browsing; nextword index; optimised phrase querying; order of evaluation; ordered query plan; phrase based searching; phrase querying; query evaluation costs; query plan; search systems; search terms; structured Boolean queries; user information needs; Computer science; Cost function; Data structures; Databases; Information retrieval; Query processing; Search engines; Testing; Web search; World Wide Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science Conference, 2001. ACSC 2001. Proceedings. 24th Australasian
  • Conference_Location
    Gold Coast, Qld.
  • ISSN
    1530-0900
  • Print_ISBN
    0-7695-0963-0
  • Type
    conf
  • DOI
    10.1109/ACSC.2001.906618
  • Filename
    906618
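
The abstract describes phrase querying over a nextword index with an ordered query plan. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a toy in-memory index that maps each (word, nextword) pair to a set of (document, position) postings, and it orders the plan by evaluating the rarest pair first so that the candidate set shrinks early. The function names `build_nextword_index` and `phrase_query` are hypothetical, and the sketch evaluates every adjacent pair in the phrase rather than choosing a reduced set of search terms.

```python
from collections import defaultdict


def build_nextword_index(docs):
    """Map each (word, nextword) pair to a set of (doc_id, position) postings.

    Assumption: documents are plain strings tokenised on whitespace.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for pos in range(len(words) - 1):
            index[(words[pos], words[pos + 1])].add((doc_id, pos))
    return index


def phrase_query(index, phrase):
    """Return sorted (doc_id, start_position) for each occurrence of the phrase."""
    words = phrase.lower().split()
    if len(words) < 2:
        raise ValueError("phrase must contain at least two words")

    # One plan entry per adjacent word pair: (offset within phrase, postings).
    plan = [
        (i, index.get((words[i], words[i + 1]), set()))
        for i in range(len(words) - 1)
    ]
    # Ordered plan: evaluate the pair with the fewest postings first.
    plan.sort(key=lambda entry: len(entry[1]))

    candidates = None
    for offset, postings in plan:
        # Shift each posting back to the position where the phrase would start.
        starts = {(doc_id, pos - offset) for doc_id, pos in postings}
        candidates = starts if candidates is None else candidates & starts
        if not candidates:
            break  # no document can match; skip the remaining pairs
    return sorted(candidates)


docs = [
    "the quick brown fox",
    "a quick brown dog and the quick brown fox",
]
index = build_nextword_index(docs)
matches = phrase_query(index, "quick brown fox")
```

In this sketch the pair ("brown", "fox") has fewer postings than ("quick", "brown"), so it is evaluated first and the second pair only has to confirm the surviving candidates.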