Title of article :
An ontology-based technique for preserving user preferences in document-category evolutions
Author/Authors :
Yen-Hsien Lee (author), Chih-Ping Wei (author), Paul Jen-Hwa Hu (author)
Issue Information :
Monthly, serially numbered issue, year 2011
Pages :
14
From page :
507
To page :
520
Abstract :
Influxes of new documents over time necessitate reorganization of document categories that a user has created previously. As documents become available in increasing quantities and at accelerating frequencies, the manual approach to reorganizing document categories becomes prohibitively tedious and ineffective, making a system-oriented approach appealing. Previous research (Larsen & Aone, 1999; Pantel & Lin, 2002) has largely followed the category-discovery approach, which groups documents by using a document-clustering technique to partition a document corpus. This approach does not consider the existing categories a user created previously, which in effect reflect his or her document-grouping preference. A handful of studies (Wei, Hu, & Dong, 2002; Wei, Hu, & Lee, 2009) have taken a category-evolution approach to develop lexicon-based techniques for preserving user preference in document-category reorganizations, but these techniques have serious limitations. Responding to the significance of document-category reorganizations and addressing the fundamental problems of salient lexicon-based techniques, we develop ontology-based category evolution (ONCE), a technique that first enriches a concept hierarchy by incorporating important concept descriptors (jointly referred to as an ontology) and then employs the resulting enriched ontology to support category evolutions at the concept level, rather than analyzing and comparing feature vectors at the lexicon level. We empirically evaluate our proposed technique and compare it with two benchmark techniques: CE2 (a lexicon-based category-evolution technique) and hierarchical agglomerative clustering (HAC; a conventional hierarchical document-clustering technique). Overall, our results show that the ONCE technique is more effective than CE2 and HAC across all the scenarios studied. Furthermore, the completeness of a concept hierarchy has an important impact on the performance of the proposed technique. Our results have some important implications for further research.
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2011
Record number :
994404