Title :
Combining discriminative re-ranking and co-training for parsing Mandarin speech transcripts
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA
Abstract :
Discriminative re-ranking can significantly improve parsing performance, and co-training has proven to be an effective weakly supervised learning algorithm for bootstrapping parsers from a small labeled in-domain seed corpus using a large amount of unlabeled in-domain data. In this paper, we present a systematic investigation of combining discriminative re-ranking and co-training, including co-training re-ranked parsers and co-training re-rankers. We show that, for parsing Mandarin broadcast news and conversation transcripts, combining discriminative re-ranking and co-training improves the F-measure by 1.8% to 2% absolute compared to co-training two state-of-the-art Chinese parsers without re-ranking.
Keywords :
grammars; learning (artificial intelligence); natural language processing; speech processing; Chinese parsers; Mandarin broadcast news; Mandarin speech transcripts; bootstrap parsers; conversation transcripts; discriminative co-training; discriminative re-ranking; in-domain seed labeled corpus; parsing performance; supervised learning algorithm; Automatic speech recognition; Broadcasting; Degradation; Gold; Laboratories; Natural languages; Parameter estimation; Speech analysis; State estimation; Supervised learning; Mandarin speech; co-training; conversational speech; parsing
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960681