DocumentCode
559641
Title
Text mining: Finding right documents from large collection of unstructured documents
Author
Amarakoon, Savidu ; Caldera, Amitha
Author_Institution
Sch. of Comput., Univ. of Colombo, Colombo, Sri Lanka
fYear
2011
fDate
24-26 Oct. 2011
Firstpage
5
Lastpage
10
Abstract
In our day-to-day life we come across unstructured data in many forms. These include books, journals, audio/video files and unstructured text such as emails, web pages and documents. Such data can be a vital source for making informed decisions. For example, in any company there is a set of people who can be identified as the foremost members of its workforce. Identifying what they have in common, and identifying others like them, would undoubtedly improve the output of the company. This is the basis on which this research was carried out. The central aspect of the research was to use text mining techniques to mine the data in a set of documents, identify the characteristics they have in common, and then identify other documents that contain these characteristics.
Keywords
data mining; text analysis; right document finding; text mining techniques; unstructured document large collection; Indexing; Java; Libraries; Portable document format; Text mining; Data Mining; Document-based Searching; Lucene; Text Mining; Unstructured Data
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining and Intelligent Information Technology Applications (ICMiA), 2011 3rd International Conference on
Conference_Location
Macao
Print_ISBN
978-1-4673-0231-9
Type
conf
Filename
6108390