[Paper Notes] Multiple Kernel Learning, Conic Duality, and the SMO Algorithm (2004)
Francis R. Bach,Gert R. G. Lanckriet
DOI: 10.1145/1015330.1015424
Abstract:
While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.
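The "conic combination" in the abstract just means a nonnegative weighting of base kernel matrices, which preserves positive semidefiniteness. A minimal numpy sketch (the data, kernel choices, and weights below are made up for illustration; MKL learns the weights by solving the QCQP/SOCP described in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # 20 toy points, 3 features

# Two base kernel matrices: linear and Gaussian (RBF).
K_lin = X @ X.T
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_rbf = np.exp(-sq / 2.0)

# A conic combination: nonnegative weights eta_k (fixed by hand here;
# the paper's algorithm optimizes them jointly with the SVM).
eta = np.array([0.3, 0.7])
K = eta[0] * K_lin + eta[1] * K_rbf

# Nonnegative weights keep the combination positive semidefinite.
assert np.all(np.linalg.eigvalsh(K) > -1e-8)
```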
Conclusion:
We have presented an algorithm for efficient learning of kernels for the support vector machine. Our algorithm is based on applying sequential minimization techniques to a smoothed version of a convex nonsmooth optimization problem. The good scaling with respect to the number of data points makes it possible to learn kernels for large-scale problems, while the good scaling with respect to the number of basis kernels opens up the possibility of application to large-scale feature selection, in which the algorithm selects kernels that define non-linear mappings on subsets of input features.
1. What does the paper study?
How to make multiple kernel learning tractable at scale: the QCQP of Lanckriet et al. (2004) is re-derived via a novel dual formulation as a second-order cone program, and Moreau-Yosida regularization is used to smooth the nonsmooth cost so that sequential minimal optimization (SMO) can be applied.
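The Moreau-Yosida regularization mentioned above replaces a nonsmooth function f with its Moreau envelope, min_y f(y) + ||x - y||^2 / (2λ), which is differentiable. A sketch for the textbook case f(y) = |y|, whose envelope is the Huber function (the grid-minimization helper is my own illustration, not the paper's construction):

```python
import numpy as np

def moreau_envelope(f, x, lam=1.0, grid=None):
    """Moreau envelope M_lam f(x) = min_y f(y) + (x - y)^2 / (2*lam),
    approximated by minimizing over a dense grid of y values."""
    if grid is None:
        grid = np.linspace(-5, 5, 200001)
    return np.min(f(grid) + (x - grid) ** 2 / (2 * lam))

def huber(x, lam=1.0):
    """Closed-form Moreau envelope of f(y) = |y|: the Huber function,
    quadratic near 0 (smooth) and linear in the tails."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= lam, x ** 2 / (2 * lam), np.abs(x) - lam / 2)

# The smoothed version agrees with the closed form; the kink of |x| at 0
# has been replaced by a differentiable quadratic.
for x in [-2.0, -0.5, 0.0, 0.3, 1.7]:
    assert abs(moreau_envelope(np.abs, x) - huber(x)) < 1e-3
```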
2. What is the innovation?
Hard for me to follow in full (a translation of the original is linked in the comments), but from the abstract the contributions are: (a) a new dual formulation of the QCQP as a second-order cone program; (b) Moreau-Yosida regularization to smooth the non-differentiable cost function, which makes SMO applicable; (c) an SMO-based solver that is significantly faster than general-purpose interior-point methods.
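To see why SMO matters here, a toy sketch of the SMO idea on the plain SVM dual (not the paper's MKL variant): true SMO updates a pair of dual variables analytically to respect the constraint sum(alpha * y) = 0; the bias-free simplification below drops that constraint so each alpha_i can be updated alone with the same clipped analytic step. All names and the data are illustrative.

```python
import numpy as np

def smo_like_train(K, y, C=1.0, iters=200):
    """Coordinate ascent on the bias-free SVM dual:
        max  sum(alpha) - 0.5 * (alpha*y)' K (alpha*y),  0 <= alpha <= C.
    Each step is an exact 1-D maximization followed by clipping to the box,
    mirroring SMO's analytic pairwise update."""
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            f_i = (alpha * y) @ K[:, i]              # current decision value at point i
            # exact optimum along coordinate i (Newton step), clipped to [0, C]
            alpha[i] = np.clip(alpha[i] + (1 - y[i] * f_i) / K[i, i], 0.0, C)
    return alpha

# Toy linearly separable problem.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (10, 2)), rng.normal(2, 0.5, (10, 2))])
y = np.array([-1.0] * 10 + [1.0] * 10)
K = X @ X.T + 1e-8 * np.eye(20)       # linear kernel, jittered so K[i, i] > 0
alpha = smo_like_train(K, y)
pred = np.sign((alpha * y) @ K)
assert np.all(pred == y)               # fits the separable training set
```

The point of the paper is that this kind of cheap analytic per-coordinate update breaks when the objective is nonsmooth, and Moreau-Yosida smoothing restores it.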
3. What is the research method?
Dualize the QCQP as a second-order cone program, smooth the non-differentiable cost with Moreau-Yosida regularization, apply SMO to the smoothed problem, and compare experimentally against general-purpose interior-point solvers.
4. What are the conclusions?
The SMO-based algorithm is significantly more efficient than the general-purpose interior-point methods in current optimization toolboxes. It scales well in the number of data points, enabling kernel learning on large-scale problems, and in the number of basis kernels, opening up large-scale feature selection.
[1] Bach F. R., Lanckriet G. R. G. Multiple kernel learning, conic duality, and the SMO algorithm. Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.