DocumentCode
1802170
Title
Estimating Top N Hosts in Cardinality Using Small Memory Resources
Author
Ishibashi, Keisuke ; Mori, Tatsuya ; Kawahara, Ryoichi ; Hirokawa, Yutaka ; Kobayashi, Atsushi ; Yamamoto, Kimihiro ; Sakamoto, Hitoaki
Author_Institution
NTT Corporation
fYear
2006
fDate
2006
Firstpage
29
Lastpage
29
Abstract
We propose a method for finding the N hosts with the highest cardinalities, where cardinality is the number of distinct items associated with a host, such as flows, ports, or peer hosts. The method also estimates their cardinalities. Existing algorithms for finding the top N frequent items can be applied directly to find the N hosts that send the most packets in a packet data stream. Finding the hosts with the N highest cardinalities, however, requires a table of previously seen items for each host in order to check whether the item in an arriving packet is new, and this requires a lot of memory. Even with existing cardinality estimation methods, cardinality information must still be kept for each host. In this paper, we use a property of cardinality estimation: the cardinality of the intersection of multiple data sets can be estimated from the cardinality information of each data set. Using this property, we propose an algorithm that maintains tables not for each host but only for the partitioned addresses of a host, and that estimates the cardinality of a host as the cardinality of the intersection of the sets kept for its partitioned addresses. We also propose a method for finding the top N hosts in cardinality, which are to be monitored to detect anomalous behavior in networks. We evaluate our algorithm using actual backbone traffic data. Although the estimation accuracy of our scheme degrades for small cardinalities, for the top 100 hosts the accuracy of our algorithm with 4,096 tables is almost the same as that obtained with a table for every host.
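The partition-and-intersect idea in the abstract can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' algorithm. It assumes that a 32-bit host address is split into an upper and a lower 16-bit part, that each table records (host, item) pairs, and that exact Python sets stand in for the fixed-size cardinality sketches a real deployment would use. The names `PartitionedCounter`, `add`, `estimate`, and `top_n` are hypothetical. The intersection size is obtained by inclusion-exclusion, |A ∩ B| = |A| + |B| - |A ∪ B|, because a cardinality sketch typically exposes only set sizes and unions.

```python
from collections import defaultdict


class PartitionedCounter:
    """Keeps one table per address partition instead of one table per host."""

    def __init__(self):
        # Assumption: exact sets stand in for fixed-size cardinality sketches.
        self.hi = defaultdict(set)  # keyed by the upper 16 bits of the address
        self.lo = defaultdict(set)  # keyed by the lower 16 bits of the address

    @staticmethod
    def split(host):
        """Split a 32-bit address into (upper 16 bits, lower 16 bits)."""
        return host >> 16, host & 0xFFFF

    def add(self, host, item):
        """Record that `host` was seen with `item` (e.g. a port or peer)."""
        hi, lo = self.split(host)
        key = (host, item)  # duplicates collapse because sets are used
        self.hi[hi].add(key)
        self.lo[lo].add(key)

    def estimate(self, host):
        """Estimate the host's cardinality as |A| + |B| - |A union B|."""
        hi, lo = self.split(host)
        a = self.hi.get(hi, set())
        b = self.lo.get(lo, set())
        return len(a) + len(b) - len(a | b)

    def top_n(self, hosts, n):
        """Return the n candidate hosts with the largest estimates."""
        return sorted(hosts, key=self.estimate, reverse=True)[:n]


# Usage: three hosts, two of which share the upper half of the address
# and two of which share the lower half.
hosts = [0x0A000001, 0x0A000002, 0x0B000001]
pc = PartitionedCounter()
for port in range(5):
    pc.add(0x0A000001, port)
pc.add(0x0A000001, 0)  # repeated item, does not raise the cardinality
for port in range(3):
    pc.add(0x0A000002, port)
for port in range(2):
    pc.add(0x0B000001, port)

print(pc.estimate(0x0A000001))  # 5
print(pc.top_n(hosts, 2) == [0x0A000001, 0x0A000002])  # True
```

With exact sets the estimate equals the true cardinality, and nothing is saved in memory. The saving suggested by the abstract would come from replacing each set with a compact estimator, so that the number of tables depends on the number of address partitions rather than on the number of hosts, at the cost of estimation error that matters most for small cardinalities.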
Keywords
Communication system traffic control; Data security; Degradation; Hardware; Laboratories; Monitoring; Network servers; Partitioning algorithms; Spine; Telecommunication traffic;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on
Conference_Location
Atlanta, GA, USA
Print_ISBN
0-7695-2571-7
Type
conf
DOI
10.1109/ICDEW.2006.56
Filename
1623824