Title :
Recommending Documents for Complex Question Exploration by Analyzing Collective Browsing Behavior
Author :
Asarina, Alya ; Simek, Olga
Author_Institution :
MIT Lincoln Lab., Lexington, MA, USA
Abstract :
Vast amounts of data available online and in other digital repositories make it challenging for users to find the right sources of information. In this paper, we present a novel approach for recommending documents to users by analyzing user browsing behavior, and demonstrate the effectiveness of our methods using an original data set. We conducted a study to collect a novel data set of document browsing behavior observed as users research complex questions. Based on this data set, we developed machine learning algorithms to predict which documents are useful and should therefore be recommended. Following the intuition that useful documents are likely to be similar to other documents that are useful for the same task, we incorporated features based on measures of similarity between pairs of documents. Accurately computing similarity between documents is challenging due to the high dimensionality of bag-of-words representations (tens of thousands of dimensions). We therefore used several dimensionality reduction techniques, including Canonical Correlation Analysis, in order to project bag-of-words representations into meaningfully reduced spaces. We show that our algorithms significantly outperform baseline approaches on three different prediction tasks. Our work thus lays out a new direction for recommendation algorithms based on a relatively small amount of labeled data.
Keywords :
data reduction; document handling; information retrieval; learning (artificial intelligence); recommender systems; bag-of-word representations; canonical correlation analysis; digital repositories; dimensionality reduction techniques; document similarity; machine learning algorithms; recommendation algorithms; user document browsing behavior analysis; Algorithm design and analysis; Correlation; History; Prediction algorithms; Semantics; Vectors; browsing history; dimensionality reduction; document prioritization; question answering
Conference_Title :
System Sciences (HICSS), 2015 48th Hawaii International Conference on
Conference_Location :
Kauai, HI
DOI :
10.1109/HICSS.2015.152