DocumentCode
2334207
Title
Distributed Web mining using Bayesian networks from multiple data streams
Author
Chen, R. ; Sivakumar, K. ; Kargupta, H.
Author_Institution
Washington State Univ., Pullman, WA, USA
fYear
2001
fDate
2001
Firstpage
75
Lastpage
82
Abstract
We present a collective approach to mining Bayesian networks from distributed heterogeneous Web-log data streams. In this approach, we first learn a local Bayesian network at each site using the local data. Each site then identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learnt at the central site using the data transmitted from the local sites. The local and central Bayesian networks are combined to obtain a collective Bayesian network that models the entire data. We applied this technique to mining multiple data streams, where data centralization is difficult because of large response time and scalability issues. We present experimental results and theoretical justification that demonstrate the feasibility of our approach.
Keywords
belief networks; data mining; distributed algorithms; information resources; learning (artificial intelligence); collective Bayesian network; collective approach; data centralization; distributed Web mining; distributed heterogeneous Web-log data streams; local Bayesian network; local variables; multiple data streams; nonlocal variables; response time; scalability issues; Bayesian methods; Data mining; Delay; Design optimization; Network servers; Web design; Web mining; Web server; Web sites; World Wide Web
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location
San Jose, CA
Print_ISBN
0-7695-1119-8
Type
conf
DOI
10.1109/ICDM.2001.989503
Filename
989503