Abstract:
As text databases become very large, conventional computers cannot provide satisfactory performance, and the use of unconventional hardware organisations is necessary. Textract is a hardware accelerator used in conjunction with a normal computer. When such systems are connected by fast local area networks, large parallel systems can be built that provide dramatic improvements in performance. Textract runs under UNIX on the SUN range of VME-based workstations to provide an integrated text retrieval and management system. The system takes advantage of SUN's Network File System to allow distributed databases to be unified by initiating parallel, network-wide searches. Textract is an information retrieval engine that has been specifically designed for fast linear searching, achieving high performance by using parallel search term processors.