Title :
Generating Cohesive Semantic Topics from Latent Factors
Author :
Viana Bicalho, Paulo ; De Oliveira Cunha, Tiago ; Jesus Mourao, Fernando Henrique ; Lobo Pappa, Gisele ; Meira, Wagner
Author_Institution :
Comput. Sci., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil
Abstract :
Extracting topics from posts in social networks is a challenging and relevant computational task. Traditionally, topics are extracted by analyzing syntactic properties in the messages, assuming a high correlation between syntax and semantics. This work proposes SToC, a new method for generating more cohesive and meaningful semantic topics within a context. SToC post-processes the output of a Non-Negative Matrix Factorization (NMF) method to determine which latent factors should be further merged to improve cohesion. Based on NMF's output, SToC defines a topics transition graph and uses Markovian theory to merge pairs of topics that are mutually reachable in this graph. Experiments on two real data samples from Twitter demonstrate that SToC is statistically better than fair baselines in supervised scenarios and is able to determine cohesive and semantically valid topics in unsupervised scenarios.
Keywords :
Markov processes; data mining; graph theory; matrix decomposition; social networking (online); text analysis; Markovian theory; SToC; Twitter; latent factor; nonnegative matrix factorization; semantic topic; social network; syntactic property; topics transition graph; Context; Entropy; Measurement; Merging; Nominations and elections; Observatories; Semantics; Latent Factors; Merging Topics; Semantic Topics; Social Networks;
Conference_Title :
Intelligent Systems (BRACIS), 2014 Brazilian Conference on
Conference_Location :
Sao Paulo
DOI :
10.1109/BRACIS.2014.56