A Proof of the Analytical Solution in the SMO (Sequential Minimal Optimization) Algorithm
In the SMO post we already walked through the steps of implementing SMO from scratch. When we got to computing $\alpha_2^{new,unc}$, we only stated the theorem and did not give its proof. The theorem is the following:
For the optimization problem

$$\begin{aligned}
\min\limits_{\alpha_1,\alpha_2}\quad &W(\alpha_1,\alpha_2)=\frac{1}{2}K_{11}\alpha_1^2+\frac{1}{2}K_{22}\alpha_2^2+y_1y_2K_{12}\alpha_1\alpha_2-(\alpha_1+\alpha_2)+y_1\alpha_1\sum\limits_{i=3}^N y_i\alpha_iK_{i1}+y_2\alpha_2\sum\limits_{i=3}^N y_i\alpha_iK_{i2}\\
s.t.\quad &\alpha_1y_1+\alpha_2y_2=-\sum\limits_{i=3}^N y_i\alpha_i=k\\
&0\le\alpha_i\le C,\quad i=1,2
\end{aligned}$$

the solution along the constraint direction, before clipping, is:
$$\alpha_2^{new,unc}=\alpha_2^{old}+\frac{y_2(E_1-E_2)}{\eta}$$

where $E_i=g(x_i)-y_i$ denotes the prediction error on sample $i$ (the function $g$ is defined in the proof below) and
$$\eta=K_{11}+K_{22}-2K_{12}=||\phi(x_1)-\phi(x_2)||^2$$
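Because $\eta$ equals the squared distance between $\phi(x_1)$ and $\phi(x_2)$ in feature space, it is always non-negative. As a quick illustration, here is a minimal sketch (the kernel choice and function names are my own, not from the original SMO post) that computes $\eta$ directly from a kernel function:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=0.5):
    """Gaussian (RBF) kernel: K(x1, x2) = exp(-gamma * ||x1 - x2||^2)."""
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def compute_eta(x1, x2, kernel=rbf_kernel):
    """eta = K11 + K22 - 2*K12, i.e. the squared feature-space distance."""
    return kernel(x1, x1) + kernel(x2, x2) - 2.0 * kernel(x1, x2)

x1 = np.array([1.0, 2.0])
x2 = np.array([2.0, 0.0])
print(compute_eta(x1, x2))  # non-negative for any valid (positive semi-definite) kernel
```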
Only the result of the theorem was given there; a detailed proof follows.

Proof: first introduce the notation
$$v_i = \sum\limits_{j=3}^N \alpha_jy_jK(x_i,x_j) = g(x_i) - \sum\limits_{j=1}^2 \alpha_jy_jK(x_i,x_j) - b$$

where
$$g(x_i)=\sum\limits_{j=1}^N\alpha_jy_jK(x_i,x_j)+b$$

is the output of the decision function at $x_i$.
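With a precomputed kernel matrix, $g(x_i)$, the prediction error $E_i=g(x_i)-y_i$, and $v_i$ are one-liners. The following is only an illustrative sketch; the array names `alpha`, `y`, `K`, `b` and the 0/1 indexing of the two chosen samples are assumptions, not code from the original post:

```python
import numpy as np

def g(i, alpha, y, K, b):
    """Decision function g(x_i) = sum_j alpha_j * y_j * K(x_i, x_j) + b."""
    return np.sum(alpha * y * K[i, :]) + b

def E(i, alpha, y, K, b):
    """Prediction error E_i = g(x_i) - y_i."""
    return g(i, alpha, y, K, b) - y[i]

def v(i, alpha, y, K, b):
    """v_i = g(x_i) - sum_{j=1,2} alpha_j * y_j * K(x_i, x_j) - b
    (indices 0 and 1 play the role of samples 1 and 2)."""
    return g(i, alpha, y, K, b) - alpha[0]*y[0]*K[i, 0] - alpha[1]*y[1]*K[i, 1] - b
```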
Substituting $v_1$ and $v_2$, the objective $W(\alpha_1,\alpha_2)$ can be rewritten as

$$\min\limits_{\alpha_1,\alpha_2}\quad W(\alpha_1,\alpha_2)=\frac{1}{2}K_{11}\alpha_1^2+\frac{1}{2}K_{22}\alpha_2^2+y_1y_2K_{12}\alpha_1\alpha_2-(\alpha_1+\alpha_2)+y_1\alpha_1v_1+y_2\alpha_2v_2$$

Since our goal is to find $\alpha_2^{new}$, we can use the equality constraint to reduce the objective to a function of $\alpha_2$ alone. From the constraint we have
$$\alpha_1y_1=k-\alpha_2y_2$$

and, since $y_i^2=1$, multiplying both sides by $y_1$ gives

$$\alpha_1=(k-\alpha_2y_2)y_1$$

Substituting this expression for $\alpha_1$ into $W(\alpha_1,\alpha_2)$ yields
$$W(\alpha_2)=\frac{1}{2}K_{11}(k-\alpha_2y_2)^2+\frac{1}{2}K_{22}\alpha_2^2+y_2K_{12}(k-\alpha_2y_2)\alpha_2-((k-\alpha_2y_2)y_1+\alpha_2)+(k-\alpha_2y_2)v_1+y_2\alpha_2v_2$$

The expression above involves only the single variable $\alpha_2$, so we differentiate with respect to $\alpha_2$ and set the derivative to zero (again using $y_2^2=1$ to simplify):
$$\frac{\partial W}{\partial \alpha_2} = K_{11}\alpha_2 + K_{22}\alpha_2 - 2K_{12}\alpha_2 - K_{11}ky_2 + K_{12}ky_2 + y_1y_2 - 1 - v_1y_2 + y_2v_2 = 0$$
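The differentiation is routine but easy to slip on by hand. As a sanity check, here is a small SymPy sketch (symbol names are my own) that differentiates $W(\alpha_2)$ and confirms the result matches the expression above once $y_2^2=1$ is applied:

```python
import sympy as sp

a2, y1, y2, k, K11, K22, K12, v1, v2 = sp.symbols(
    'alpha_2 y_1 y_2 k K_11 K_22 K_12 v_1 v_2')

# Single-variable objective W(alpha_2) from the previous step.
W = (sp.Rational(1, 2) * K11 * (k - a2 * y2)**2
     + sp.Rational(1, 2) * K22 * a2**2
     + y2 * K12 * (k - a2 * y2) * a2
     - ((k - a2 * y2) * y1 + a2)
     + (k - a2 * y2) * v1
     + y2 * a2 * v2)

# Derivative as stated in the text.
stated = (K11 * a2 + K22 * a2 - 2 * K12 * a2
          - K11 * k * y2 + K12 * k * y2
          + y1 * y2 - 1 - v1 * y2 + y2 * v2)

dW = sp.expand(sp.diff(W, a2)).subs(y2**2, 1)  # use y_2^2 = 1
print(sp.simplify(dW - stated))                # prints 0
```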
Collecting the $\alpha_2$ terms on the left and substituting the definition of $v_i$ gives

$$\begin{aligned} (K_{11}+K_{22}-2K_{12})\alpha_2 &= y_2(y_2-y_1+kK_{11}-kK_{12}+v_1-v_2)\\ &= y_2\Big[y_2-y_1+kK_{11}-kK_{12}+\Big(g(x_1)-\sum\limits_{j=1}^2 y_j\alpha_jK_{1j}-b\Big)-\Big(g(x_2)-\sum\limits_{j=1}^2 y_j\alpha_jK_{2j}-b\Big)\Big] \end{aligned}$$

Next, substituting
$$k=\alpha_1^{old}y_1+\alpha_2^{old}y_2$$

into the right-hand side (note that $\sum_{j=1}^2 y_j\alpha_jK_{1j}=y_1\alpha_1^{old}K_{11}+y_2\alpha_2^{old}K_{12}$, and similarly for the second sum) gives
$$\begin{aligned} (K_{11}+K_{22}-2K_{12})\alpha_2^{new,unc} &= y_2\big((K_{11}+K_{22}-2K_{12})\alpha_2^{old}y_2+y_2-y_1+g(x_1)-g(x_2)\big)\\ &= (K_{11}+K_{22}-2K_{12})\alpha_2^{old}+y_2(E_1-E_2) \end{aligned}$$

The second equality uses $y_2^2=1$ together with $E_i=g(x_i)-y_i$, so that $y_2-y_1+g(x_1)-g(x_2)=E_1-E_2$. Setting
$$\eta = K_{11} + K_{22} - 2K_{12}$$

and dividing both sides by $\eta$, we finally obtain
$$\alpha_2^{new,unc}=\alpha_2^{old}+\frac{y_2(E_1-E_2)}{\eta}$$

This completes the proof. On close inspection the steps are not difficult: they amount to variable substitution followed by unconstrained minimization via setting a derivative to zero. With the analytical solution now proved, we can use the theorem with confidence.
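As a closing check, the closed-form update can also be compared against a brute-force search over $W(\alpha_2)$ on random data. The sketch below is purely illustrative (the data, kernel, and variable names are assumptions, not taken from the original post): it eliminates $\alpha_1$ through the equality constraint, scans $W(\alpha_2)$ on a fine grid, and verifies that the grid minimizer agrees with $\alpha_2^{old}+y_2(E_1-E_2)/\eta$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
X = rng.normal(size=(N, 2))
y = rng.choice([-1.0, 1.0], size=N)
alpha = rng.uniform(0.0, 1.0, size=N)    # current (old) multipliers
b = 0.1
K = X @ X.T                               # linear kernel matrix

def g(i):
    return np.sum(alpha * y * K[i, :]) + b

E1, E2 = g(0) - y[0], g(1) - y[1]
eta = K[0, 0] + K[1, 1] - 2 * K[0, 1]
a2_closed = alpha[1] + y[1] * (E1 - E2) / eta   # theorem's unclipped update

# Brute force: evaluate W(alpha_2) with alpha_1 eliminated via the constraint.
k_const = alpha[0] * y[0] + alpha[1] * y[1]
v1 = g(0) - alpha[0]*y[0]*K[0, 0] - alpha[1]*y[1]*K[0, 1] - b
v2 = g(1) - alpha[0]*y[0]*K[1, 0] - alpha[1]*y[1]*K[1, 1] - b

def W(a2):
    a1 = (k_const - a2 * y[1]) * y[0]
    return (0.5*K[0, 0]*a1**2 + 0.5*K[1, 1]*a2**2 + y[0]*y[1]*K[0, 1]*a1*a2
            - (a1 + a2) + y[0]*a1*v1 + y[1]*a2*v2)

grid = np.linspace(a2_closed - 5.0, a2_closed + 5.0, 200001)
a2_grid = grid[np.argmin(W(grid))]
print(a2_closed, a2_grid)   # should agree up to the grid spacing (~5e-5)
```

The two printed values should coincide up to the grid resolution, which is exactly what the theorem predicts for the unclipped minimizer.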