DocumentCode :
1088553
Title :
Subtopic Segmentation for Small Corpus Using a Novel Fuzzy Model
Author :
Chang, Tao-Hsing ; Lee, Chia-Hoang
Author_Institution :
National Chiao Tung University, Hsinchu
Volume :
15
Issue :
4
fYear :
2007
Firstpage :
699
Lastpage :
709
Abstract :
Subtopic segmentation is a critical task in numerous applications, including information retrieval, automatic summarization, and essay scoring. Although several approaches have been developed, many are ineffective for specific domains with a small corpus because of the fuzziness of the semantics of words and sentences in the corpus. This paper explores the problem of subtopic segmentation by proposing a fuzzy model for the semantics of both words and sentences. The model has three characteristics. First, it can deal with the uncertainty in the semantics of words and sentences. Second, it can measure the fuzzy similarity between the fuzzy semantics of sentences. Third, it supports a fuzzy algorithm for segmenting a text into several subtopic segments. The experiments, especially those on short texts with a small corpus in a specific domain, indicate that the method can efficiently increase the accuracy of subtopic segmentation over previous methods.
Keywords :
fuzzy set theory; information retrieval; text analysis; word processing; automatic summarization; corpus subtopic segmentation; essay scoring; fuzzy algorithm; fuzzy semantics; fuzzy similarity; text segmentation; Broadcasting; Computer errors; Computer science; Councils; Fuzzy sets; Internet; Multidimensional systems; Uncertainty; Fuzzy modeling; semantic similarity measurement; small corpus; topic segmentation
fLanguage :
English
Journal_Title :
IEEE Transactions on Fuzzy Systems
Publisher :
IEEE
ISSN :
1063-6706
Type :
jour
DOI :
10.1109/TFUZZ.2006.889911
Filename :
4286978