Title :
A Machine Learning Based Topic Exploration and Categorization on Surveys
Author :
George, C.P. ; Wang, Doris Z. ; Wilson, J.N. ; Epstein, L.M. ; Garland, Philip ; Suh, A.
Author_Institution :
Dept. of Comput. & Inf. Sci. & Eng., Univ. of Florida, Gainesville, FL, USA
Abstract :
This paper describes an automatic topic extraction, categorization, and relevance ranking model for multilingual surveys and questions that uses machine learning techniques such as topic modeling and fuzzy clustering. Automatically generated question and survey categories are used to build question banks and category-specific survey templates. First, we describe the pre-processing steps we considered for removing noise from the multilingual survey text. Second, we explain our strategy for automatically extracting survey categories from surveys using topic models. Third, we describe different methods to cluster questions under survey categories and group them by relevance. Last, we present our experimental results on a large collection of unique, real-world survey datasets in German, Spanish, French, and Portuguese, along with the refinement methods we used to determine meaningful and sensible categories for building question banks. We conclude with possible enhancements to the current system and its potential impact in the business domain.
Keywords :
document handling; fuzzy set theory; learning (artificial intelligence); pattern clustering; French languages; German languages; Portuguese languages; Spanish languages; automatically generated question; category-specific survey templates; fuzzy clustering; machine learning based topic exploration; multilingual survey text; multilingual surveys; question banks; relevance ranking model; survey categories; survey categorization; topic modeling; Buildings; Clustering algorithms; Computational modeling; Education; Large scale integration; Noise; Vocabulary; categorization; survey clustering
Conference_Title :
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location :
Boca Raton, FL
Print_ISBN :
978-1-4673-4651-1
DOI :
10.1109/ICMLA.2012.132