DocumentCode
2756727
Title
Analysis and evaluation of unstructured data: text mining versus natural language processing
Author
Gharehchopogh, F.S. ; Khalifelu, Z.A.
Author_Institution
Dept. of Comput. Eng., Hacettepe Univ., Ankara, Turkey
fYear
2011
fDate
12-14 Oct. 2011
Firstpage
1
Lastpage
4
Abstract
Nowadays, most of the information stored in companies is in unstructured form. Retrieving and extracting this information is an essential and important task in semantic web areas. Many of these requirements depend on storage efficiency and unstructured data analysis. Merrill Lynch recently estimated that more than 80% of all potentially useful business information is unstructured data. The large volume and complexity of unstructured data open up many new possibilities for the analyst. We analyze both structured and unstructured data, individually and collectively. Text mining and natural language processing are two techniques, each with its own methods, for knowledge discovery from textual context in documents. In this study, text mining and natural language processing techniques are illustrated. The aim of this work is to compare and evaluate the similarities and differences between text mining and natural language processing for extracting useful information via their respective methods.
Keywords
business data processing; computational complexity; data analysis; data mining; information retrieval; information storage; semantic Web; text analysis; business information; document handling; information extraction; information retrieval; knowledge discovery; natural language processing; semantic Web; storage efficiency; text mining; textual context; unstructured data analysis; unstructured data complexity; unstructured data evaluation; Databases; Information retrieval; Natural language processing; Pragmatics; Text categorization; Text mining;
fLanguage
English
Publisher
IEEE
Conference_Titel
Application of Information and Communication Technologies (AICT), 2011 5th International Conference on
Conference_Location
Baku
Print_ISBN
978-1-61284-831-0
Type
conf
DOI
10.1109/ICAICT.2011.6111017
Filename
6111017