DocumentCode :
3703551
Title :
Document similarity analysis via involving both explicit and implicit semantic couplings
Author :
Qianqian Chen;Liang Hu;Jia Xu;Wei Liu;Longbing Cao
Author_Institution :
Advanced Analytics Institute, University of Technology Sydney, Australia
fYear :
2015
Firstpage :
1
Lastpage :
10
Abstract :
Document similarity analysis is increasingly critical since roughly 80% of big data is unstructured. Accordingly, semantic couplings (relatedness) have been recognized as valuable for capturing the relationships between terms (words or phrases). Existing work focuses mainly on explicit relatedness and builds models for it. In this paper, we propose a comprehensive semantic similarity measure, Semantic Coupling Similarity (SCS), which (1) captures intra-term pair couplings within term pairs, represented by patterns of explicit term co-occurrences in a document set; (2) extracts inter-term pair couplings between term pairs, indicated by implicit couplings through indirectly linked terms and paths between terms after term connections are converted to a graph representation; and (3) integrates intra- and inter-term pair couplings into a semantic coupling similarity that comprehensively captures explicit and implicit couplings between terms across documents. SCS caters for both synonymy and polysemy, and consistently outperforms baseline methods on all real data sets.
Keywords :
"Semantics","Couplings","Text analysis","Information retrieval","Context modeling","Context","Probabilistic logic"
Publisher :
IEEE
Conference_Titel :
2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Print_ISBN :
978-1-4673-8272-4
Type :
conf
DOI :
10.1109/DSAA.2015.7344832
Filename :
7344832