Title :
Concurrency and recovery in full-text indexing
Author :
Soisalon-Soininen, Eljas ; Widmayer, Peter
Author_Institution :
Dept. of Comput. Sci. & Eng., Helsinki Univ. of Technol., Espoo, Finland
Abstract :
An important feature of a document database system is that the documents can be retrieved by searching for words from their contents. In a full-text index, each word of the stored documents can be used as a search key. Inserting a new document into the database automatically triggers a transaction that inserts the words together with their occurrence information into the index. We present solutions to problems that arise when full-text indexing is applied to constantly changing document data, such as WWW pages. We present and analyze an algorithm for full-text indexing with the following properties: concurrent searches are possible and efficient, and the algorithm can be designed such that several indexing processes can be performed concurrently. Moreover, the algorithm allows efficient recovery of the index after failures that can occur while the index is being modified. This is important for large indices because, without preparation for failures, the index may need to be reconstructed from the original documents.
Keywords :
document handling; full-text databases; indexing; information retrieval; system recovery; WWW pages; concurrent searches; constantly changing document data; document database system; full-text indexing; indexing processes; large indices; occurrence information; search key; stored documents; Algorithm design and analysis; Concurrent computing; Content based retrieval; Database systems; Indexes; Indexing; Information retrieval; Spatial databases; Transaction databases; World Wide Web
Conference_Title :
String Processing and Information Retrieval Symposium, 1999 and International Workshop on Groupware
Conference_Location :
Cancun
Print_ISBN :
0-7695-0268-7
DOI :
10.1109/SPIRE.1999.796595