DocumentCode :
3106745
Title :
Similarity of Temporal Query Logs Based on ARIMA Model
Author :
Liu, Ning ; Nong, Shuzhen ; Yan, Jun ; Zhang, Benyu ; Chen, Zheng ; Li, Ying
Author_Institution :
Microsoft Research Asia, China
fYear :
2006
fDate :
Dec. 2006
Firstpage :
975
Lastpage :
979
Abstract :
A challenging issue faced by modern information retrieval is that of determining and satisfying users' requirements relying only on very short text queries. In this paper, we propose an algorithm to find related queries based on the Auto-Regressive Integrated Moving Average (ARIMA) model. First, we select and estimate an ARIMA model for each temporal query log, so that each query is represented by a sequence of coefficients. We then use the correlation of the ARIMA coefficients as the similarity measure, which we call the ARIMA Temporal Similarity (ARIMA TS). This similarity describes how strongly two time series are linearly related. The ARIMA model can also be treated as a dimensionality reduction procedure, which saves storage space for a large database of query logs. In addition, the ARIMA model can be used as a tool to predict the trend of a query. Experimental results on two query logs of the MSN search engine demonstrate that the proposed approach can achieve better similarity measurement efficiently.
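The idea in the abstract (fit a model to each query's time series, then correlate the coefficient vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified ARIMA(p, d, 0) model with no moving-average terms and no model selection, fitted with the Yule-Walker equations, and all function names and default parameters below are hypothetical.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first differencing (the 'I' in ARIMA)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def autocovariance(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def ar_coefficients(series, p=3, d=1):
    """Estimate AR(p) coefficients of the d-times differenced series.

    Assumption: the MA part is dropped, so this is ARIMA(p, d, 0).
    The Yule-Walker equations are solved by Levinson-Durbin recursion.
    """
    x = difference(series, d)
    r = [autocovariance(x, k) for k in range(p + 1)]
    if r[0] == 0:
        raise ValueError("differenced series is constant")
    phi = []      # coefficients of the current model order
    err = r[0]    # prediction error variance of the current order
    for k in range(1, p + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / err  # reflection coefficient for order k
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1 - kappa * kappa
    return phi


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0 or sv == 0:
        raise ValueError("coefficient vector has zero variance")
    return cov / (su * sv)


def coefficient_similarity(series_a, series_b, p=3, d=1):
    """Correlation of the two fitted coefficient vectors."""
    return pearson(ar_coefficients(series_a, p, d),
                   ar_coefficients(series_b, p, d))
```

Each series would be a query's frequency counts per time unit (for example, per day). Because only p coefficients are kept per query, the sketch also reflects the dimensionality-reduction point made in the abstract.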
Keywords :
Asia; Content based retrieval; Data mining; Databases; Euclidean distance; Frequency; Information retrieval; Predictive models; Search engines; Stochastic processes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong, China
ISSN :
1550-4786
Print_ISBN :
0-7695-2701-7
Type :
conf
DOI :
10.1109/ICDM.2006.144
Filename :
4053138