Author/Authors :
Rukhin، نويسنده , , Andrey and Priebe، نويسنده , , Carey E.، نويسنده ,
Abstract :
Let p , s ∈ ( 0 , 1 ] with s > p , let m , n ∈ N with 1 < m < n , and define V={1,…,n}. Let ER(n,p) denote the random graph model on V where each edge is independently included in the graph with probability p. Let κ ( n , p , m , s ) denote the random graph model on V where each edge among the m vertices {1,…,m} is independently included in the graph with probability s and all other edges are independently included with probability p. We view graphs from the ER(n,p) model as “homogeneous”: the probability of the presence of an edge is the same throughout such a graph. On the other hand, we view a graph generated by the κ model as “anomalous”; such a graph possesses increased edge probability among a certain subset of its vertices.
ference setting is to determine whether an observed graph G is “homogeneous” (with some known p) or “anomalous”. In this article, we analyze the statistical power β of the size invariant | E ( G ) | (the number of edges in the graph) and the maximum degree invariant Δ ( G ) in detecting such anomalies. In particular, we demonstrate an interesting phenomenon when comparing the powers of these statistics: the limit theory can be at odds with the finite-sample evidence even for astronomically large graphs. For example, under certain values of p,s and m=m(n), we show that the maximum degree statistic is more powerful ( β Δ > β | E | ) for n ⩽ 10 24 while lim n → ∞ β Δ / β | E | < 1 .