Solving Univariate Linear Regression with Least Squares
Given a dataset $\{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$, assume the univariate linear regression model is $y^* = b x + a$. We now use least squares to solve for $a$ and $b$.
The loss function is

$$\mathfrak{L}(a, b) = \sum_{i=1}^N (y_i^* - y_i)^2 = \sum_{i=1}^N (b x_i + a - y_i)^2$$
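For reference, this loss is easy to write directly in code. A minimal sketch, assuming NumPy arrays `xs` and `ys` hold the observed data (the function and variable names are illustrative, not from the original):

```python
import numpy as np

def squared_loss(a, b, xs, ys):
    """Sum of squared residuals: L(a, b) = sum_i (b*x_i + a - y_i)^2."""
    residuals = b * xs + a - ys
    return np.sum(residuals ** 2)
```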
Taking the partial derivative with respect to $a$:

$$\frac{\partial \mathfrak{L}}{\partial a} = \sum_{i=1}^N 2(b x_i + a - y_i) = 2b\sum_{i=1}^N x_i + 2aN - 2\sum_{i=1}^N y_i = 2bN\overline{x} + 2aN - 2N\overline{y} = 2N(b\overline{x} + a - \overline{y})$$

Setting $\frac{\partial \mathfrak{L}}{\partial a} = 0$ gives $a = \overline{y} - b\overline{x}$. Substituting this back into $\mathfrak{L}(a, b)$:
$$\mathfrak{L}(a, b) = \sum_{i=1}^N (b x_i + \overline{y} - b\overline{x} - y_i)^2 = \sum_{i=1}^N \left[b(x_i - \overline{x}) - (y_i - \overline{y})\right]^2$$
Taking the partial derivative with respect to $b$:

$$\frac{\partial \mathfrak{L}}{\partial b} = \sum_{i=1}^N 2(x_i - \overline{x})\left[b(x_i - \overline{x}) - (y_i - \overline{y})\right] = \sum_{i=1}^N \left[2b(x_i - \overline{x})^2 - 2(x_i - \overline{x})(y_i - \overline{y})\right] = 2b\sum_{i=1}^N (x_i - \overline{x})^2 - 2\sum_{i=1}^N (x_i - \overline{x})(y_i - \overline{y})$$

Since $\sum_{i=1}^N (x_i - \overline{x})^2 = N\,Var(x)$ and $\sum_{i=1}^N (x_i - \overline{x})(y_i - \overline{y}) = N\,Cov(x, y)$, this equals $2N\left[b\,Var(x) - Cov(x, y)\right]$. Setting $\frac{\partial \mathfrak{L}}{\partial b} = 0$ gives

$$b = \frac{Cov(x, y)}{Var(x)}$$
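Putting the two closed-form expressions together, here is a small sketch (again assuming NumPy arrays `xs` and `ys`; the synthetic data and names are illustrative) that computes $b = Cov(x, y)/Var(x)$ and $a = \overline{y} - b\overline{x}$, and checks the result against `np.polyfit`:

```python
import numpy as np

def fit_univariate(xs, ys):
    """Closed-form least squares: b = Cov(x, y) / Var(x), a = mean(y) - b * mean(x)."""
    x_bar, y_bar = xs.mean(), ys.mean()
    b = np.sum((xs - x_bar) * (ys - y_bar)) / np.sum((xs - x_bar) ** 2)
    a = y_bar - b * x_bar
    return a, b

# Example: noisy samples from y = 2x + 1
rng = np.random.default_rng(0)
xs = np.linspace(0, 10, 50)
ys = 2 * xs + 1 + rng.normal(scale=0.1, size=xs.shape)
a, b = fit_univariate(xs, ys)
# np.polyfit(xs, ys, 1) returns [slope, intercept]; it should agree closely
print(a, b, np.polyfit(xs, ys, 1))
```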
Solving Multivariate Linear Regression with Least Squares
Above we handled the case $x_i, y_i \in R$. We now consider $\boldsymbol{x}_i \in R^{1 \times D}$, $y_i \in R$, $\boldsymbol{x} \in R^{N \times D}$, $\boldsymbol{y} \in R^N$, i.e. multivariate linear regression. Assume the linear regression model is $\hat{\boldsymbol{y}} = \boldsymbol{x} \boldsymbol{\theta}$ (the intercept can be absorbed by appending a constant feature of 1 to each $\boldsymbol{x}_i$). We now use least squares to solve for $\boldsymbol{\theta} \in R^D$.
The loss function is

$$\mathfrak{L}(\boldsymbol{\theta}) = ||\boldsymbol{x}\boldsymbol{\theta} - \boldsymbol{y}||^2 = ||\boldsymbol{e}||^2 = \boldsymbol{e}^\mathrm{T}\boldsymbol{e}, \quad \text{where } \boldsymbol{e} = \boldsymbol{x}\boldsymbol{\theta} - \boldsymbol{y}$$

By the chain rule,
$$\frac{\partial \mathfrak{L}}{\partial \boldsymbol{\theta}} = \frac{\partial \mathfrak{L}}{\partial \boldsymbol{e}} \frac{\partial \boldsymbol{e}}{\partial \boldsymbol{\theta}} = 2\boldsymbol{e}^\mathrm{T}\boldsymbol{x} = 2(\boldsymbol{x}\boldsymbol{\theta} - \boldsymbol{y})^\mathrm{T}\boldsymbol{x} = 2\boldsymbol{\theta}^\mathrm{T}\boldsymbol{x}^\mathrm{T}\boldsymbol{x} - 2\boldsymbol{y}^\mathrm{T}\boldsymbol{x}$$
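To sanity-check this gradient, one can compare $2(\boldsymbol{x}\boldsymbol{\theta} - \boldsymbol{y})^\mathrm{T}\boldsymbol{x}$ against central finite differences. A minimal sketch with made-up shapes ($N = 6$, $D = 3$; all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 3))      # N x D design matrix
y = rng.normal(size=6)
theta = rng.normal(size=3)

loss = lambda t: np.sum((x @ t - y) ** 2)          # L(theta) = ||x theta - y||^2
analytic = 2 * (x @ theta - y) @ x                 # 2 (x theta - y)^T x

# Central finite-difference approximation of the gradient
eps = 1e-6
numeric = np.array([
    (loss(theta + eps * np.eye(3)[j]) - loss(theta - eps * np.eye(3)[j])) / (2 * eps)
    for j in range(3)
])
print(np.allclose(analytic, numeric, atol=1e-4))   # expected: True
```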
Setting $\frac{\partial \mathfrak{L}}{\partial \boldsymbol{\theta}} = 0$ gives $\boldsymbol{\theta}^\mathrm{T}\boldsymbol{x}^\mathrm{T}\boldsymbol{x} = \boldsymbol{y}^\mathrm{T}\boldsymbol{x}$. Transposing both sides yields the normal equation

$$\boldsymbol{x}^\mathrm{T}\boldsymbol{x}\boldsymbol{\theta} = \boldsymbol{x}^\mathrm{T}\boldsymbol{y}$$
Note that $\boldsymbol{x}^\mathrm{T}\boldsymbol{x} \in R^{D \times D}$ is a symmetric positive semidefinite matrix; when $\boldsymbol{x}$ has full column rank it is positive definite and therefore invertible. In that case the solution is

$$\boldsymbol{\theta} = (\boldsymbol{x}^\mathrm{T}\boldsymbol{x})^{-1}\boldsymbol{x}^\mathrm{T}\boldsymbol{y}$$
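In code, the normal equation is usually solved with `np.linalg.solve` (or, more robustly, `np.linalg.lstsq`) rather than by forming the explicit inverse. A minimal sketch with illustrative shapes and data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(100, 4))            # N x D, assumed full column rank
true_theta = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_theta + rng.normal(scale=0.05, size=100)

# Normal equation: solve (x^T x) theta = x^T y
theta = np.linalg.solve(x.T @ x, x.T @ y)

# Equivalent, numerically more stable route
theta_lstsq, *_ = np.linalg.lstsq(x, y, rcond=None)
print(theta, theta_lstsq)                # both should be close to true_theta
```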
References: Mathematics for Machine Learning; http://detexify.kirelabs.org/symbols.html