DocumentCode :
1798837
Title :
POS weighted TF-IDF algorithm and its application for an MOOC search engine
Author :
Ruilin Xu
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign (UIUC), Urbana, IL, USA
fYear :
2014
fDate :
7-9 July 2014
Firstpage :
868
Lastpage :
873
Abstract :
Term Frequency-Inverse Document Frequency (TF-IDF) has been one of the most widely used information retrieval methods for many years. Although there are several variants of TF-IDF optimized for various problems, very few of them consider the properties of the query terms themselves. We found that there could be considerable potential for improvement. When people type out a query, usually the verbs and the nouns are the primary keywords that directly define the query. The adjectives and adverbs are generally the secondary keywords, which describe the query more accurately. Other terms might not be as important as the terms just mentioned and could be the tertiary keywords. Based on this observation, this paper proposes an improvement on the original TF-IDF algorithm: the POS Weighted TF-IDF algorithm. This algorithm takes every query term's part of speech (POS) into account and assigns each query term frequency a different weight value according to the POS of that term. Based on the POS Weighted TF-IDF algorithm, we developed COURSES, a massive open online courses (MOOC) search engine, and achieved very positive results, which show the effectiveness of the proposed algorithm.
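The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract names three keyword tiers but gives no weight values, so the numbers in `POS_WEIGHTS` and `TERTIARY_WEIGHT`, the toy `TOY_POS` lookup (standing in for a real POS tagger), the smoothed IDF formula, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Assumed weights: primary (nouns, verbs), secondary (adjectives, adverbs).
# The abstract does not state the actual values used in the paper.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 1.0, "ADJ": 0.5, "ADV": 0.5}
TERTIARY_WEIGHT = 0.2  # assumed weight for every other part of speech

# Hypothetical toy lookup standing in for a real POS tagger.
TOY_POS = {
    "learn": "VERB", "python": "NOUN", "programming": "NOUN",
    "course": "NOUN", "advanced": "ADJ", "quickly": "ADV",
}

def pos_weight(term):
    """Return the assumed weight for a query term based on its POS."""
    return POS_WEIGHTS.get(TOY_POS.get(term, "OTHER"), TERTIARY_WEIGHT)

def idf(term, docs):
    """Smoothed inverse document frequency over tokenized documents."""
    df = sum(1 for doc in docs if term in doc)
    return math.log((1 + len(docs)) / (1 + df)) + 1.0

def score(query, doc, docs):
    """Sum of POS-weighted query TF times document TF times IDF."""
    query_tf = Counter(query)
    doc_tf = Counter(doc)
    return sum(
        pos_weight(term) * q_count * doc_tf[term] * idf(term, docs)
        for term, q_count in query_tf.items()
    )

def rank(query, docs):
    """Return document indices ordered from best to worst match."""
    return sorted(range(len(docs)),
                  key=lambda i: score(query, docs[i], docs),
                  reverse=True)
```

With these assumed weights, a document matching only the noun of a query such as `["advanced", "python"]` outranks a document matching only the adjective, even when both terms are equally rare in the collection.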
Keywords :
courseware; query processing; search engines; COURSES; MOOC search engine; POS weighted TF-IDF algorithm; information retrieval methods; massive open online courses search engine; query term frequency; query term part of speech; term frequency-inverse document frequency; Algorithm design and analysis; Information filters; Search engines; User interfaces; XML; Information Retrieval; MOOC search engine; POS Weighted; TF-IDF;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Audio, Language and Image Processing (ICALIP), 2014 International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4799-3902-2
Type :
conf
DOI :
10.1109/ICALIP.2014.7009919
Filename :
7009919