  • DocumentCode
    5413
  • Title
    Efficient Methods for Overlapping Group Lasso
  • Author
    Lei Yuan ; Jun Liu ; Jieping Ye
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Arizona State Univ., Tempe, AZ, USA
  • Volume
    35
  • Issue
    9
  • fYear
    2013
  • fDate
    Sept. 2013
  • Firstpage
    2104
  • Lastpage
    2116
  • Abstract
    The group Lasso is an extension of the Lasso for feature selection on (predefined) nonoverlapping groups of features. The nonoverlapping group structure limits its applicability in practice. There have been several recent attempts to study a more general formulation where groups of features are given, potentially with overlaps between the groups. The resulting optimization is, however, much more challenging to solve due to the group overlaps. In this paper, we consider the efficient optimization of the overlapping group Lasso penalized problem. We reveal several key properties of the proximal operator associated with the overlapping group Lasso, and compute the proximal operator by solving the smooth and convex dual problem, which allows the use of gradient-descent-type algorithms for the optimization. Our methods and theoretical results are then generalized to tackle the general overlapping group Lasso formulation based on the ℓq norm. We further extend our algorithm to solve a nonconvex overlapping group Lasso formulation based on the capped norm regularization, which reduces the estimation bias introduced by the convex penalty. We have performed empirical evaluations using both a synthetic dataset and the breast cancer gene expression dataset, which consists of 8,141 genes organized into (overlapping) gene sets. Experimental results show that the proposed algorithm is more efficient than existing state-of-the-art algorithms. Results also demonstrate the effectiveness of the nonconvex formulation for overlapping group Lasso.
  • Keywords
    cancer; concave programming; gradient methods; group theory; learning (artificial intelligence); breast cancer gene expression dataset; capped norm regularization; convex dual problem; feature selection; gradient descent type; nonconvex overlapping group Lasso formulation; nonoverlapping group structure; optimization; overlapping group Lasso penalized problem; proximal operator; smooth problem; Acceleration; Algorithm design and analysis; Convergence; Convex functions; Indexes; Optimization; Silicon; Sparse learning; difference of convex programming; overlapping group Lasso; proximal operator; Algorithms; Breast Neoplasms; Computer Simulation; Female; Gene Expression Profiling; Humans; Pattern Recognition, Automated
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2013.17
  • Filename
    6409353