Title :
Performance maximization for question classification by subset tree kernel using support vector machines
Author :
Rahman, Muhammad Arifur ; Scurtu, Vitalie
Author_Institution :
Dept. of Phys., Jahangirnagar Univ., Dhaka
Abstract :
Question answering systems use information retrieval (IR) and information extraction (IE) methods to retrieve documents containing a valid answer. Question classification plays an important role in the question answering framework by narrowing the gap between question and answer. This paper presents our research on automatic question classification using machine learning. We experimented with Support Vector Machines (SVM) using kernel methods. Tree kernel (TK) functions are an effective way to integrate syntactic structures into machine learning algorithms for question classification. Here we use the SubSet Tree (SST) kernel together with bag of words. The trade-off between training error and margin, the cost-factor, and the decay factor have a significant impact when an SVM is used with this kernel type. The experiments determined the individual impact of each of these three parameters, and then the combined effect of the trade-off between training error and margin and the cost-factor. Based on these results, we also identify some hyperplanes that can maximize performance. On some standard data sets, the outcomes of our question classification experiments are promising.
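To make the role of the decay factor concrete, the following is a minimal sketch of a subset tree kernel in the commonly cited Collins and Duffy formulation. It is an illustration under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the helper names (`production`, `internal_nodes`, `delta`, `sst_kernel`), and the default value of `lam` are all hypothetical choices made for this example, and the bag-of-words component mentioned in the abstract is not included.

```python
from itertools import product

# Assumed encoding: a parse tree is a nested tuple (label, child, child, ...);
# a leaf word is a plain string.

def label(node):
    return node if isinstance(node, str) else node[0]

def children(node):
    return () if isinstance(node, str) else node[1:]

def production(node):
    # Grammar production at a node: its label plus the labels of its children.
    return (label(node), tuple(label(c) for c in children(node)))

def internal_nodes(tree):
    # All non-leaf nodes of the tree, in pre-order.
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in children(tree):
        nodes.extend(internal_nodes(child))
    return nodes

def delta(n1, n2, lam):
    # Decay-weighted count of the fragments rooted at both n1 and n2.
    if production(n1) != production(n2):
        return 0.0
    if all(isinstance(c, str) for c in children(n1)):
        return lam  # pre-terminal: the only shared fragment is the node itself
    score = lam
    for c1, c2 in zip(children(n1), children(n2)):
        if isinstance(c1, str) or isinstance(c2, str):
            continue  # leaf words carry no fragments of their own
        score *= 1.0 + delta(c1, c2, lam)
    return score

def sst_kernel(t1, t2, lam=0.4):
    # Sum delta over all pairs of internal nodes; lam is the decay factor.
    pairs = product(internal_nodes(t1), internal_nodes(t2))
    return sum(delta(a, b, lam) for a, b in pairs)

if __name__ == "__main__":
    t1 = ("NP", ("D", "the"), ("N", "dog"))
    t2 = ("NP", ("D", "a"), ("N", "dog"))
    print(sst_kernel(t1, t2, lam=1.0))
```

With `lam=1.0` the kernel simply counts the tree fragments two parses share; values below 1 down-weight larger fragments, which is the kind of effect a decay factor is meant to control.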
Keywords :
classification; error statistics; information retrieval; learning (artificial intelligence); optimisation; support vector machines; trees (mathematics); automatic question classification; cost-factor; decay factor; document retrieval; information extraction; information retrieval; machine learning approach; maximisation; question answering system; subset tree kernel function; support vector machine; training error statistics; Classification tree analysis; Computer interfaces; Information retrieval; Kernel; Machine learning; Machine learning algorithms; Optical computing; Support vector machine classification; Support vector machines; Text categorization; Precision; Question Answering; Question Classification; Recall; SST; SVM; kernel;
Conference_Title :
Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
Conference_Location :
Khulna
Print_ISBN :
978-1-4244-2135-0
Electronic_ISBN :
978-1-4244-2136-7
DOI :
10.1109/ICCITECHN.2008.4802979