DocumentCode
2839468
Title
Analysis of paraphrased corpus and lexical-based approach to Chinese paraphrasing
Author
Zhang, Yan ; Kashioka, Hideki
Author_Institution
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
fYear
2004
fDate
15-18 Dec. 2004
Firstpage
325
Lastpage
328
Abstract
We first analyze the language phenomena and distribution characteristics of Chinese spontaneous utterances that have already been paraphrased by other approaches. Based on the information obtained from this corpus, we propose a lexical-based approach to paraphrasing spoken Chinese. Our purpose is to transform varied expressions into simplified expressions with the same meaning. Verbs are the main constituents of Chinese sentences, and their flexibility gives them an important role in expressing structure, especially in the case of transitive verbs. Furthermore, negative verb expressions frequently appear in question utterances to express enquiries. We therefore design four types of paraphrasing templates based on lexical information and the characteristics of the corpus: (1) synonym replacement; (2) Chinese transitive verbs; (3) verbs with two objects; (4) the transformation of negative expressions. Our experiment found that the lexical-based approach is effective for Chinese paraphrasing.
Keywords
linguistics; natural languages; speech processing; speech recognition; Chinese paraphrasing; Chinese spontaneous utterances; Chinese transitive verbs; language distribution characteristics; language phenomena; lexical-based approach; negative verb expressions; paraphrased corpus; paraphrasing templates; simplified expressions; spoken language translation; synonym replacement; transitive verbs; Cities and towns; Databases; Information analysis; Laboratories; Natural languages; Tagging;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN
0-7803-8678-7
Type
conf
DOI
10.1109/CHINSL.2004.1409652
Filename
1409652