DocumentCode :
941923
Title :
The complexity of information extraction
Author :
Abu-Mostafa, Yaser S.
Volume :
32
Issue :
4
fYear :
1986
fDate :
7/1/1986 12:00:00 AM
Firstpage :
513
Lastpage :
525
Abstract :
How difficult are decision problems based on natural data, such as pattern recognition? To answer this question, decision problems are characterized by introducing four measures defined on a Boolean function f of N variables: the implementation cost C(f), the randomness R(f), the deterministic entropy H(f), and the complexity K(f). The highlights and main results are roughly as follows. 1) C(f) \approx R(f) \approx H(f) \approx K(f), all measured in bits. 2) Decision problems based on natural data are partially random (in the Kolmogorov sense) and have low entropy with respect to their dimensionality, and the relations between the four measures translate to lower and upper bounds on the cost of solving these problems. 3) Allowing small errors in the implementation of f saves a lot in the low-entropy case but saves nothing in the high-entropy case. If f is partially structured, the implementation cost is reduced substantially.
Keywords :
Boolean functions; Decision making; Computational complexity; Cost function; Data mining; Entropy; Helium; Pattern recognition; Upper bound;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.1986.1057209
Filename :
1057209