Title :
Tuning semantic association for modelling textual data
Author :
Otero, Juan M. ; Rodriguez, Ansel Y. ; Medina-Pagola, José E.
Author_Institution :
Dept. of Appl. Math., Havana Univ., Havana, Cuba
Abstract :
Text information processing depends critically on the proper representation of documents. Traditional models, such as the vector space model, have significant limitations because they do not consider semantic relations among terms. The Global Association Distance Model (GADM) is an alternative that takes these relations into account for document representation, on the basic assumption that two documents should be closer if the shortest formal distances among terms in each document are similar. The association strength function, which models the semantic relations among terms based on their formal distances, is a critical feature of GADM. In this paper, the association strength function is analyzed, a family of piecewise association strength functions is proposed, and a Simulated Annealing algorithm is used to tune it and obtain an optimal model of semantic relation. We evaluate the significance of the tuned model on a topic classification task.
Keywords :
document handling; sensor fusion; simulated annealing; association strength function; document representation; formal distances; global association distance model; simulated annealing algorithm; text information processing; textual data modelling; topic classification task; tuning semantic association; Annealing; Semantics; representation of documents; vector space model
Conference_Title :
Research Challenges in Information Science (RCIS), 2011 Fifth International Conference on
Conference_Location :
Gosier
Print_ISBN :
978-1-4244-8670-0
Electronic_ISBN :
2151-1349
DOI :
10.1109/RCIS.2011.6006821