Title : 
Optimal stopping and effective machine complexity in learning
         
        
            Author : 
Wang, Changfeng ; Venkatesh, Santosh S. ; Judd, J. Stephen
         
        
            Author_Institution : 
Dept. of Electr. Eng., Pennsylvania Univ., Philadelphia, PA, USA
         
            Abstract : 
We study learning in a general class of machines which return a (variable) linear form of a (fixed) set of nonlinear transformations of points in an input space. A fixed machine in this class accepts inputs X from an arbitrary input space and produces scalar outputs

    Y = Σ_{i=1}^{d} ψ_i(X) w_i* + ξ = ψ(X)′w* + ξ.    (1)

Here, w* = (w_1*, …, w_d*)′ is a fixed vector of real weights representing the target concept to be learned; for each i, ψ_i(X) is a fixed real function of the inputs, with ψ(X) = (ψ_1(X), …, ψ_d(X))′ the corresponding vector of functions; and ξ is a random noise term. We suppose that the learner receives an i.i.d. random sample of examples (X_1, Y_1), …, (X_n, Y_n) generated according to the joint distribution on input-output pairs (X, Y) induced through the medium of the (unknown) relation (1) and a fixed (unknown) distribution on input-noise pairs (X, ξ). The goal of the learner is to infer a hypothesis w = (w_1, …, w_d)′ with small (mean-square) generalisation error E(Y − ψ(X)′w)² on future random examples (X, Y) generated independently of the training sample from the same underlying distribution, where E denotes expectation with respect to the underlying probability distribution generating the examples.
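The model class of relation (1) can be sketched concretely. The following is a minimal illustration, not the paper's method: the feature map ψ, the dimension d = 3, the noise level, and the use of ordinary least squares as the learner are all assumptions chosen for the example. The hypothesis w is fit from a training sample, and the mean-square generalisation error E(Y − ψ(X)′w)² is estimated by Monte Carlo on fresh examples from the same distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (not from the paper): a quadratic feature map psi on scalar
# inputs, giving d = 3 fixed nonlinear transformations of X.
def psi(x):
    return np.stack([np.ones_like(x), x, x**2], axis=-1)  # psi(X) in R^3

w_star = np.array([0.5, -1.0, 2.0])   # target weight vector w* (assumed)
noise_std = 0.1                       # std of the noise term xi (assumed)

# Training sample (X_1, Y_1), ..., (X_n, Y_n) drawn i.i.d. via relation (1)
n = 200
X = rng.uniform(-1.0, 1.0, size=n)
Y = psi(X) @ w_star + noise_std * rng.normal(size=n)

# Learner: least-squares hypothesis w fitted to the training sample
w_hat, *_ = np.linalg.lstsq(psi(X), Y, rcond=None)

# Monte Carlo estimate of E(Y - psi(X)'w)^2 on independent future examples
m = 100_000
X_test = rng.uniform(-1.0, 1.0, size=m)
Y_test = psi(X_test) @ w_star + noise_std * rng.normal(size=m)
gen_err = np.mean((Y_test - psi(X_test) @ w_hat) ** 2)
```

Since the noise ξ has variance 0.01 here, the generalisation error of any hypothesis is bounded below by that value; a good learner's estimate should approach it as n grows.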
         
        
            Keywords : 
computational complexity; generalisation (artificial intelligence); learning (artificial intelligence); minimisation; neural nets; random processes; transforms; effective machine complexity; fixed machine; generalisation error; input space; input-output pairs; joint distribution; learner; learning; nonlinear transformations; optimal stopping; probability distribution; random noise term; random sample; real weights; scalar outputs; H infinity control; Machine learning; Probability distribution; Vectors;
         
        
            Conference_Titel : 
Proceedings of the 1995 IEEE International Symposium on Information Theory
         
        
            Conference_Location : 
Whistler, BC, Canada
         
        
            Print_ISBN : 
0-7803-2453-6
         
        
            DOI : 
10.1109/ISIT.1995.531518