Title :
Constructing Topical Hierarchies in Heterogeneous Information Networks
Author :
Chi Wang ; Marina Danilevsky ; Jialu Liu ; Narayan Desai ; Heng Ji ; Jiawei Han
Author_Institution :
Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Abstract :
A digital data collection (e.g., scientific publications, enterprise reports, news, and social media) can often be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality concept hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work, we present an algorithm for recursively constructing multi-typed topical hierarchies. In contrast to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by means of a newly designed clustering and ranking algorithm for heterogeneous network data, together with the mining and ranking of topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high-quality, multi-typed topical hierarchies.
Keywords :
data acquisition; data mining; network theory (graphs); pattern clustering; text analysis; clustering algorithm; digital data collection; heterogeneous information network data; high quality multityped topical hierarchies; high-quality concept hierarchies; multityped topical hierarchies; ranking algorithm; textual phrases; topical pattern mining; topical pattern ranking; Algorithm design and analysis; Clustering algorithms; Data mining; Data models; Distributed databases; Inference algorithms; Query processing; heterogeneous network; information network; topic hierarchy
Conference_Title :
2013 IEEE 13th International Conference on Data Mining (ICDM)
Conference_Location :
Dallas, TX
DOI :
10.1109/ICDM.2013.53