[Artificial Intelligence] Matrix Calculus 1: Numerator Layout and Denominator Layout

Main reference article 1
Main reference article 2

output = function(input)
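Both the input and the output can be a scalar, a vector (all vectors in this article are column vectors; their transposes are row vectors), or a matrix. The input is written $x$, $\pmb x$, $\pmb X$ and the output is written $f$, $\pmb f$, $\pmb F$, which gives 9 cases in total:
$f(x),\ f(\pmb x),\ f(\pmb X),\ \pmb f(x),\ \pmb f(\pmb x),\ \pmb f(\pmb X),\ \pmb F(x),\ \pmb F(\pmb x),\ \pmb F(\pmb X)$
Each case can be written in two ways: numerator layout and denominator layout.
Numerator layout: the numerator is a column vector and the denominator is a row vector.
Denominator layout: the denominator is a column vector and the numerator is a row vector.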
$$\pmb{x}=\begin{bmatrix} x_1\\x_2\\\vdots\\x_n \end{bmatrix}$$

$$\pmb{X}=\begin{bmatrix} x_{11}&x_{12}&\cdots&x_{1n}\\x_{21}&x_{22}&\cdots&x_{2n}\\\vdots&\vdots&\ddots&\vdots\\x_{m1}&x_{m2}&\cdots&x_{mn} \end{bmatrix}$$

$$\pmb{f}=\begin{bmatrix} f_1\\f_2\\\vdots\\f_n \end{bmatrix}$$

$$\pmb{F}=\begin{bmatrix} f_{11}&f_{12}&\cdots&f_{1n}\\f_{21}&f_{22}&\cdots&f_{2n}\\\vdots&\vdots&\ddots&\vdots\\f_{m1}&f_{m2}&\cdots&f_{mn} \end{bmatrix}$$

$$\mathrm{vec}(\pmb X)=[x_{11},x_{21},\dots,x_{m1},x_{12},x_{22},\dots,x_{m2},\dots,x_{1n},x_{2n},\dots,x_{mn}]^T$$
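The vec operator stacks the columns of a matrix one after another (column-major order). A minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the helper name `vec` and the sample entries are illustrative only:

```python
def vec(X):
    """Stack the columns of a matrix (given as a list of rows) into one flat list."""
    m, n = len(X), len(X[0])
    # Outer loop over columns j, inner loop over rows i: column-major order.
    return [X[i][j] for j in range(n) for i in range(m)]


# Entries are chosen so that the value "ij" sits at row i, column j.
X = [[11, 12, 13],
     [21, 22, 23]]

print(vec(X))  # [11, 21, 12, 22, 13, 23]
```

The first column (11, 21) comes first, then the second, then the third, matching the index order in the definition of vec.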
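Notes:
1) In deep learning, the denominator layout is the more commonly used one.
2) The two layouts are only notational conventions of two schools; authors in different fields adopt one or the other.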

  1. $f(x)$
    $$\frac{\partial f}{\partial x}\tag{1}$$

  2. $f(\pmb x)$
    Denominator layout (gradient vector form / column-vector partial derivative form / column partial derivative vector form):
    $$\nabla_{\pmb x}f(\pmb x)=\frac{\partial f(\pmb x)}{\partial \pmb x}=\begin{bmatrix} \frac{\partial f}{\partial x_1}\\\frac{\partial f}{\partial x_2}\\\vdots\\\frac{\partial f}{\partial x_n} \end{bmatrix}\tag{2}$$

    Numerator layout (row-vector partial derivative form / row partial derivative vector form):
    $$D_{\pmb x}f(\pmb x)=\frac{\partial f(\pmb x)}{\partial \pmb x^T}=\begin{bmatrix} \frac{\partial f}{\partial x_1}&\frac{\partial f}{\partial x_2}&\cdots&\frac{\partial f}{\partial x_n} \end{bmatrix}\tag{3}$$
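The two layouts hold the same partial derivatives and differ only in shape. The sketch below illustrates this with central finite differences; the test function `f`, the helper `partials`, and the step size `h` are assumptions made for the example, not part of the original article.

```python
def partials(f, x, h=1e-6):
    """Central-difference estimates of df/dx_i for a scalar-valued f."""
    out = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        dn = list(x)
        dn[i] -= h
        out.append((f(up) - f(dn)) / (2 * h))
    return out


def f(x):
    # Hypothetical example: f(x) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


# Analytic partials are [2*x1 + 3*x2, 3*x1], i.e. [8, 3] at x = (1, 2).
p = partials(f, [1.0, 2.0])

grad_denominator = [[v] for v in p]  # n x 1 column, the shape of (2)
deriv_numerator = [p]                # 1 x n row, the shape of (3)

print(len(grad_denominator), len(grad_denominator[0]))  # 2 1
print(len(deriv_numerator), len(deriv_numerator[0]))    # 1 2
```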

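  3. $f(\pmb X)$

    (4) and (6) are transposes of each other, and so are (5) and (7).

    When $\pmb X$ is a column vector, (2), (4) and (5) coincide, and (3), (6) and (7) coincide.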
    Gradient vector form / column-vector partial derivative form / column partial derivative vector form:
    $$\nabla_{\mathrm{vec}\,\pmb X}f(\pmb X)=\frac{\partial f(\pmb X)}{\partial\,\mathrm{vec}\,\pmb X}=\begin{bmatrix} \frac{\partial f}{\partial x_{11}}\\\frac{\partial f}{\partial x_{21}}\\\vdots\\\frac{\partial f}{\partial x_{m1}}\\ \frac{\partial f}{\partial x_{12}}\\\frac{\partial f}{\partial x_{22}}\\\vdots\\\frac{\partial f}{\partial x_{m2}}\\\vdots\\\frac{\partial f}{\partial x_{1n}}\\ \frac{\partial f}{\partial x_{2n}}\\\vdots\\\frac{\partial f}{\partial x_{mn}} \end{bmatrix}\tag{4}$$

    Gradient matrix:
    $$\nabla_{\pmb X}f(\pmb X)=\frac{\partial f(\pmb X)}{\partial \pmb X}=\begin{bmatrix} \frac{\partial f}{\partial x_{11}}&\frac{\partial f}{\partial x_{12}}&\cdots&\frac{\partial f}{\partial x_{1n}}\\ \frac{\partial f}{\partial x_{21}}&\frac{\partial f}{\partial x_{22}}&\cdots&\frac{\partial f}{\partial x_{2n}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f}{\partial x_{m1}}&\frac{\partial f}{\partial x_{m2}}&\cdots&\frac{\partial f}{\partial x_{mn}} \end{bmatrix}\tag{5}$$

    Row-vector partial derivative form / row partial derivative vector form:
    $$D_{\mathrm{vec}\,\pmb X}f(\pmb X)=\frac{\partial f(\pmb X)}{\partial\,\mathrm{vec}^T\,\pmb X}=\begin{bmatrix} \frac{\partial f}{\partial x_{11}}& \frac{\partial f}{\partial x_{21}}& \cdots& \frac{\partial f}{\partial x_{m1}}& \frac{\partial f}{\partial x_{12}}& \frac{\partial f}{\partial x_{22}}& \cdots& \frac{\partial f}{\partial x_{m2}}& \cdots& \frac{\partial f}{\partial x_{1n}}& \frac{\partial f}{\partial x_{2n}}& \cdots& \frac{\partial f}{\partial x_{mn}} \end{bmatrix}\tag{6}$$

    Jacobian matrix:
    $$D_{\pmb X}f(\pmb X)=\frac{\partial f(\pmb X)}{\partial \pmb X^T}=\begin{bmatrix} \frac{\partial f}{\partial x_{11}}&\frac{\partial f}{\partial x_{21}}&\cdots&\frac{\partial f}{\partial x_{m1}}\\ \frac{\partial f}{\partial x_{12}}&\frac{\partial f}{\partial x_{22}}&\cdots&\frac{\partial f}{\partial x_{m2}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f}{\partial x_{1n}}&\frac{\partial f}{\partial x_{2n}}&\cdots&\frac{\partial f}{\partial x_{mn}} \end{bmatrix}\tag{7}$$
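For a scalar function of a matrix, the gradient matrix (5) has the shape of $\pmb X$, the Jacobian matrix (7) is its transpose, and the gradient vector (4) is the gradient matrix read column by column. A small numerical sketch, assuming the hypothetical function $f(\pmb X)=\sum_{ij} a_i x_{ij} b_j$ with made-up constants `a` and `b`, whose partial derivatives are $a_i b_j$:

```python
def f(X):
    """Hypothetical scalar function of a 2x3 matrix: sum_ij a_i * X_ij * b_j."""
    a, b = [1.0, 2.0], [1.0, 10.0, 100.0]
    return sum(a[i] * X[i][j] * b[j] for i in range(2) for j in range(3))


def gradient_matrix(f, X, h=1e-6):
    """Entry (i, j) is a central-difference estimate of df/dx_ij, laid out as in (5)."""
    m, n = len(X), len(X[0])
    G = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            up = [row[:] for row in X]
            up[i][j] += h
            dn = [row[:] for row in X]
            dn[i][j] -= h
            G[i][j] = (f(up) - f(dn)) / (2 * h)
    return G


X = [[0.5, -1.0, 2.0], [3.0, 0.0, 1.0]]

G = gradient_matrix(f, X)            # m x n, same shape as X: layout (5)
J = [list(col) for col in zip(*G)]   # n x m, the transpose of G: layout (7)
# Column-major reading of G gives the gradient vector of layout (4).
grad_vec = [G[i][j] for j in range(3) for i in range(2)]

print(len(G), len(G[0]))  # 2 3
print(len(J), len(J[0]))  # 3 2
```

Up to finite-difference error, `G` is close to `[[1, 10, 100], [2, 20, 200]]` and `grad_vec` to `[1, 2, 10, 20, 100, 200]`.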


    The following is my own summary and may contain errors.

  4. $\pmb f(x)$
    $$\nabla_x\pmb f(x)= \frac{\partial \pmb f(x)}{\partial x}= \begin{bmatrix} \frac{\partial f_1}{\partial x}\\ \frac{\partial f_2}{\partial x}\\ \vdots\\ \frac{\partial f_n}{\partial x} \end{bmatrix}$$

    $$D_x\pmb f(x)= \frac{\partial \pmb f^T(x)}{\partial x}= \begin{bmatrix} \frac{\partial f_1}{\partial x}& \frac{\partial f_2}{\partial x}& \cdots& \frac{\partial f_n}{\partial x} \end{bmatrix}$$

  5. $\pmb f(\pmb x)$
    $$\nabla_{\pmb x}\pmb f(\pmb x)=\frac{\partial \pmb f^T(\pmb x)}{\partial \pmb x}=\begin{bmatrix} \frac{\partial f_1}{\partial x_1}&\frac{\partial f_2}{\partial x_1}&\cdots&\frac{\partial f_n}{\partial x_1}\\ \frac{\partial f_1}{\partial x_2}&\frac{\partial f_2}{\partial x_2}&\cdots&\frac{\partial f_n}{\partial x_2}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f_1}{\partial x_n}&\frac{\partial f_2}{\partial x_n}&\cdots&\frac{\partial f_n}{\partial x_n} \end{bmatrix}$$

    $$D_{\pmb x}\pmb f(\pmb x)=\frac{\partial \pmb f(\pmb x)}{\partial \pmb x^T}=\begin{bmatrix} \frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\cdots&\frac{\partial f_1}{\partial x_n}\\ \frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\cdots&\frac{\partial f_2}{\partial x_n}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f_n}{\partial x_1}&\frac{\partial f_n}{\partial x_2}&\cdots&\frac{\partial f_n}{\partial x_n} \end{bmatrix}$$
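For a vector-valued function of a vector, $D_{\pmb x}\pmb f$ puts one output per row and $\nabla_{\pmb x}\pmb f$ puts one output per column, so the two are transposes. The sketch below checks this numerically. The map `f` is a made-up example that goes from 2 inputs to 3 outputs (different sizes, unlike the square case written above) so that the two shapes are easy to tell apart; the helper `jacobian` and step size `h` are likewise assumptions.

```python
def f(x):
    """Hypothetical map from R^2 to R^3."""
    return [x[0] * x[1], x[0] + x[1], x[1] ** 2]


def jacobian(f, x, h=1e-6):
    """Entry (i, j) estimates df_i/dx_j by central differences (one output per row)."""
    cols = []
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        dn = list(x)
        dn[j] -= h
        # Derivative of every output with respect to x_j: one column of the result.
        cols.append([(u - d) / (2 * h) for u, d in zip(f(up), f(dn))])
    return [list(row) for row in zip(*cols)]  # cols[j][i] -> J[i][j]


# Analytic Jacobian is [[x2, x1], [1, 1], [0, 2*x2]].
J = jacobian(f, [2.0, 3.0])          # 3 x 2: rows follow outputs (D form)
G = [list(r) for r in zip(*J)]       # 2 x 3: rows follow inputs (nabla form)

print(len(J), len(J[0]))  # 3 2
print(len(G), len(G[0]))  # 2 3
```

At the point (2, 3) the entries of `J` are close to `[[3, 2], [1, 1], [0, 6]]`.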

  6. $\pmb f(\pmb X)$
    $$\nabla_{\pmb X}\pmb f(\pmb X)=\frac{\partial \pmb f^T(\pmb X)}{\partial\,\mathrm{vec}\,\pmb X}$$

    $$D_{\pmb X}\pmb f(\pmb X)=\frac{\partial \pmb f(\pmb X)}{\partial\,\mathrm{vec}^T\,\pmb X}$$

  7. $\pmb F(x)$
    $$\nabla_x\pmb F(x)= \frac{\partial \pmb F(x)}{\partial x}= \begin{bmatrix} \frac{\partial f_{11}}{\partial x}&\frac{\partial f_{12}}{\partial x}&\cdots&\frac{\partial f_{1n}}{\partial x}\\ \frac{\partial f_{21}}{\partial x}&\frac{\partial f_{22}}{\partial x}&\cdots&\frac{\partial f_{2n}}{\partial x}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f_{m1}}{\partial x}&\frac{\partial f_{m2}}{\partial x}&\cdots&\frac{\partial f_{mn}}{\partial x} \end{bmatrix}$$

    $$D_x\pmb F(x)= \frac{\partial \pmb F^T(x)}{\partial x}= \begin{bmatrix} \frac{\partial f_{11}}{\partial x}&\frac{\partial f_{21}}{\partial x}&\cdots&\frac{\partial f_{m1}}{\partial x}\\ \frac{\partial f_{12}}{\partial x}&\frac{\partial f_{22}}{\partial x}&\cdots&\frac{\partial f_{m2}}{\partial x}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial f_{1n}}{\partial x}&\frac{\partial f_{2n}}{\partial x}&\cdots&\frac{\partial f_{mn}}{\partial x} \end{bmatrix}$$

  8. $\pmb F(\pmb x)$
    $$\nabla_{\pmb x}\pmb F(\pmb x)=\frac{\partial\,\mathrm{vec}^T(\pmb F(\pmb x))}{\partial \pmb x}$$

    $$D_{\pmb x}\pmb F(\pmb x)=\frac{\partial\,\mathrm{vec}(\pmb F(\pmb x))}{\partial \pmb x^T}$$

  9. $\pmb F(\pmb X)$
    $$\nabla_{\pmb X}\pmb F(\pmb X)=\frac{\partial\,\mathrm{vec}^T(\pmb F(\pmb X))}{\partial\,\mathrm{vec}\,\pmb X}$$

    $$D_{\pmb X}\pmb F(\pmb X)=\frac{\partial\,\mathrm{vec}(\pmb F(\pmb X))}{\partial\,\mathrm{vec}^T\,\pmb X}$$
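In the matrix-to-matrix case both sides are vectorized first, so the derivative is again an ordinary matrix: entry $(r, k)$ of $D_{\pmb X}\pmb F$ is the derivative of the $r$-th entry of $\mathrm{vec}(\pmb F)$ with respect to the $k$-th entry of $\mathrm{vec}(\pmb X)$. A rough numerical sketch, assuming the hypothetical function $\pmb F(\pmb X)=\pmb A\pmb X$ with a made-up constant matrix `A`; the helpers `vec`, `unvec` and `vec_jacobian` are illustrative names only:

```python
def vec(M):
    """Column-major stacking of a matrix stored as a list of rows."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]


def unvec(v, m, n):
    """Inverse of vec for an m x n matrix: entry (i, j) sits at index j*m + i."""
    return [[v[j * m + i] for j in range(n)] for i in range(m)]


A = [[1.0, 2.0], [3.0, 4.0]]


def F(X):
    """Hypothetical matrix function F(X) = A X for 2x2 matrices."""
    return [[sum(A[i][k] * X[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def vec_jacobian(F, X, h=1e-6):
    """Central-difference estimate of d vec(F) / d vec^T(X)."""
    m, n = len(X), len(X[0])
    x = vec(X)
    cols = []
    for k in range(len(x)):
        up = list(x)
        up[k] += h
        dn = list(x)
        dn[k] -= h
        fu = vec(F(unvec(up, m, n)))
        fd = vec(F(unvec(dn, m, n)))
        cols.append([(u - d) / (2 * h) for u, d in zip(fu, fd)])
    return [list(r) for r in zip(*cols)]


D = vec_jacobian(F, [[1.0, 0.0], [0.0, 1.0]])   # 4 x 4 for 2x2 input and output
nabla = [list(r) for r in zip(*D)]              # the transposed (nabla) form
```

For this particular `F`, each column of $\pmb A\pmb X$ depends only on the matching column of $\pmb X$, so `D` comes out close to the block-diagonal matrix with two copies of `A` on the diagonal.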

Added: 2022-04-01 00:02:58  Updated: 2022-04-01 00:07:21