Neural Networks

Model Representation I

How do we represent our hypothesis, or our model, when using neural networks?

  • Neurons are cells in the brain.
  • A neuron has a cell body.
  • A neuron has a number of input wires (dendrites), which receive inputs from other locations.
  • A neuron has an output wire (the axon).

Neurons communicate with little pulses of electricity (also called spikes).

Each neuron does computations and passes messages to other neurons based on the inputs it has received.

An artificial neuron is modeled as a unit that does computation:

  • it gets a number of inputs, $x_1$, $x_2$, $x_3$, and
  • it outputs some value computed from them on its output wire (in the biological neuron, this is the axon);
  • $x_0$ is the bias unit (or bias neuron).

Sigmoid (logistic) activation function: $g(z)=\frac{1}{1+e^{-z}}$
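As a minimal sketch, the sigmoid function can be written in a few lines of Python. The function name `sigmoid` is chosen here for illustration, and the sketch does not guard against overflow for very large negative inputs.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# g(z) approaches 0 for large negative z, equals 0.5 at z = 0,
# and approaches 1 for large positive z.
for z in (-10.0, 0.0, 10.0):
    print(z, sigmoid(z))
```

The output always lies strictly between 0 and 1, which is why it can be read as the unit's activation.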

A simplistic representation looks like:

$$[x_0\;x_1\;x_2]\rightarrow[\;\;]\rightarrow h_\theta(x)$$

where $h_\theta(x)=\frac{1}{1+e^{-\theta^Tx}}$
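The single-neuron hypothesis above can be sketched as follows. The helper name `logistic_unit` and the weight values are hypothetical, chosen only to make the example runnable; the bias input $x_0=1$ is prepended inside the function.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_unit(theta, x):
    """Compute h_theta(x) = g(theta^T x), with the bias input x0 = 1 prepended."""
    inputs = [1.0] + list(x)                      # [x0, x1, x2, ...]
    z = sum(t * xi for t, xi in zip(theta, inputs))
    return sigmoid(z)


theta = [-1.0, 2.0, 0.5]                          # hypothetical weights [theta0, theta1, theta2]
print(logistic_unit(theta, [1.0, 0.0]))           # z = -1 + 2*1 + 0.5*0 = 1
```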

“Input layer” (layer 1) $\rightarrow$ layer 2 $\rightarrow$ “output layer” (layer 3)

Intermediate layers of nodes between the input and output layers are called “hidden layers”.

“Weights” is another name for “parameters”.

Notation:

  • $a_i^{(j)}$: the activation of neuron (unit) $i$ in layer $j$
  • $\Theta^{(j)}$: the weight matrix controlling the mapping from layer $j$ to layer $j+1$

If we have one hidden layer, it would look like:

$$[x_0\;x_1\;x_2\;x_3]\rightarrow[a_1^{(2)}\;a_2^{(2)}\;a_3^{(2)}]\rightarrow h_\Theta(x)$$

The values of the “activation” nodes are obtained as follows:

$$a_1^{(2)}=g(\Theta_{10}^{(1)}x_0+\Theta_{11}^{(1)}x_1+\Theta_{12}^{(1)}x_2+\Theta_{13}^{(1)}x_3)$$

$$a_2^{(2)}=g(\Theta_{20}^{(1)}x_0+\Theta_{21}^{(1)}x_1+\Theta_{22}^{(1)}x_2+\Theta_{23}^{(1)}x_3)$$

$$a_3^{(2)}=g(\Theta_{30}^{(1)}x_0+\Theta_{31}^{(1)}x_1+\Theta_{32}^{(1)}x_2+\Theta_{33}^{(1)}x_3)$$

$$h_\Theta(x)=a_1^{(3)}=g(\Theta_{10}^{(2)}a_0^{(2)}+\Theta_{11}^{(2)}a_1^{(2)}+\Theta_{12}^{(2)}a_2^{(2)}+\Theta_{13}^{(2)}a_3^{(2)})$$
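The four equations above can be written out directly as weighted sums, one per unit. This is a sketch under assumptions: the weight values in `Theta1` and `Theta2` and the helper name `layer_activations` are hypothetical, picked only so the code runs.

```python
import math


def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_activations(Theta, inputs):
    """Return one activation per row of Theta.

    `inputs` must already include the bias value at index 0.
    """
    return [g(sum(w * v for w, v in zip(row, inputs))) for row in Theta]


# Hypothetical weights: Theta1 maps layer 1 -> 2 (3 x 4), Theta2 maps layer 2 -> 3 (1 x 4).
Theta1 = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 0.0, 0.0, 0.0],
    [-1.0, 1.0, 0.0, 0.0],
]
Theta2 = [[0.5, -0.5, 1.0, 0.25]]

x = [1.0, 2.0, 3.0, 4.0]                 # x0 = 1 is the bias unit
a2 = layer_activations(Theta1, x)        # a_1^(2), a_2^(2), a_3^(2)
a2_with_bias = [1.0] + a2                # prepend a_0^(2) = 1
h = layer_activations(Theta2, a2_with_bias)[0]
print(a2, h)
```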

If a network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$, the matrix controlling the mapping from layer $j$ to layer $j+1$, has dimension **$s_{j+1}\times(s_j+1)$**. The $+1$ accounts for the bias unit in layer $j$.
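The dimension rule can be checked with a small helper. The function name `theta_shapes` is hypothetical; it takes the number of non-bias units in each layer and returns the shape of each weight matrix.

```python
def theta_shapes(layer_sizes):
    """Return the shape (s_{j+1}, s_j + 1) of each weight matrix.

    layer_sizes lists the number of non-bias units per layer, input layer first.
    """
    return [
        (s_next, s_cur + 1)
        for s_cur, s_next in zip(layer_sizes, layer_sizes[1:])
    ]


# The network above: 3 inputs, 3 hidden units, 1 output unit.
print(theta_shapes([3, 3, 1]))  # prints [(3, 4), (1, 4)]
```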

Model Representation II

Vectorize

$$a_1^{(2)}=g(z_1^{(2)})$$

$$a_2^{(2)}=g(z_2^{(2)})$$

$$a_3^{(2)}=g(z_3^{(2)})$$

For layer $j=2$ and node $k$, the variable $z$ will be:

$$z_k^{(2)}=\Theta_{k,0}^{(1)}x_0+\Theta_{k,1}^{(1)}x_1+\cdots+\Theta_{k,n}^{(1)}x_n$$

Vector representation of $x$ and $z^{(j)}$:

$$x=\begin{bmatrix}x_0\\x_1\\\vdots\\x_n\end{bmatrix}\quad z^{(j)}=\begin{bmatrix}z^{(j)}_1\\z^{(j)}_2\\\vdots\\z^{(j)}_{s_j}\end{bmatrix}$$

Setting $x=a^{(1)}$, we can rewrite the equation as:

$$z^{(j)}=\Theta^{(j-1)}a^{(j-1)}$$

We multiply the matrix $\Theta^{(j-1)}$, with dimensions $s_j\times(n+1)$ (where $s_j$ is the number of activation nodes in layer $j$ and $n$ is the number of non-bias units in layer $j-1$), by the vector $a^{(j-1)}$, with height $(n+1)$. This gives us the vector $z^{(j)}$ with height $s_j$.
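A short NumPy sketch can confirm that the matrix-vector product gives the same values as the per-node sums. The random weights and the sizes `n` and `s_j` are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, s_j = 3, 3                                    # assumed layer sizes
Theta_prev = rng.standard_normal((s_j, n + 1))   # hypothetical weights, shape s_j x (n + 1)
a_prev = np.concatenate(([1.0], rng.standard_normal(n)))  # bias unit followed by n values

# Vectorized form: z^(j) = Theta^(j-1) a^(j-1)
z = Theta_prev @ a_prev

# Per-node form: z_k = sum_i Theta[k, i] * a[i]
z_loop = np.array([
    sum(Theta_prev[k, i] * a_prev[i] for i in range(n + 1))
    for k in range(s_j)
])

print(z.shape)                  # height s_j
print(np.allclose(z, z_loop))   # both forms agree
```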

For layer $j$:

$a^{(j)}=g(z^{(j)})$, where $g$ is applied element-wise to the vector $z^{(j)}$.

After we have computed $a^{(j)}$, we add a bias unit (equal to 1) to layer $j$: $a_0^{(j)}=1$.

To compute the final hypothesis, first compute another $z$ vector:

$$z^{(j+1)}=\Theta^{(j)}a^{(j)}$$

We get this final $z$ vector by multiplying the next theta matrix after $\Theta^{(j-1)}$ with the values of all the activation nodes we just computed. This last theta matrix $\Theta^{(j)}$ has only one row, which is multiplied by the one column $a^{(j)}$, so the result is a single number. We then get our final result with:

$$h_\Theta(x)=a^{(j+1)}=g(z^{(j+1)})$$

Notice that in this last step, between layer j and layer j+1, we are doing exactly the same thing as we did in logistic regression.

This layer-by-layer computation is called forward propagation.
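Putting the vectorized steps together, forward propagation might look like the sketch below. The function name `forward_propagate` and the all-zero weights in the usage example are assumptions for illustration, not values from the course.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def forward_propagate(thetas, x):
    """Compute h_Theta(x) for a list of weight matrices.

    thetas[0] maps layer 1 to layer 2, thetas[1] maps layer 2 to layer 3, and so on.
    x holds the input features without the bias unit.
    """
    a = np.asarray(x, dtype=float)
    for Theta in thetas:
        a = np.concatenate(([1.0], a))   # add the bias unit a_0 = 1
        z = Theta @ a                    # z^(j+1) = Theta^(j) a^(j)
        a = sigmoid(z)                   # a^(j+1) = g(z^(j+1))
    return a


# With all-zero weights every z is 0, so every activation is g(0) = 0.5.
thetas = [np.zeros((3, 4)), np.zeros((1, 4))]
h = forward_propagate(thetas, [1.0, 2.0, 3.0])
print(h)  # prints [0.5]
```

The last iteration of the loop is the logistic regression step described above, applied to the hidden-layer activations instead of the raw inputs.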

Applications

Non-linear classification example: XOR/XNOR

$x_1, x_2$ are binary (0 or 1).

  • $y=x_1\ \text{XOR}\ x_2$: true ($y=1$) when exactly one of $x_1$ and $x_2$ equals 1.
  • $x_1\ \text{XNOR}\ x_2$: $y=1$ when both inputs are true or both are false, that is, $\text{NOT}(x_1\ \text{XOR}\ x_2)$.
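One way to see why XNOR needs a hidden layer is to build it from simpler logical units. The sketch below is an assumption-laden illustration: the weight values (`AND_W`, `NOR_W`, `OR_W`) are not given in these notes and are chosen here so that each sigmoid unit saturates near 0 or 1. The hidden layer computes $x_1\ \text{AND}\ x_2$ and $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$, and the output unit takes the OR of the two.

```python
import math


def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def unit(theta, inputs):
    """One sigmoid unit; theta[0] is the weight on the bias unit (+1)."""
    return g(theta[0] + sum(w * v for w, v in zip(theta[1:], inputs)))


# Assumed weights (bias weight first), chosen for illustration only.
AND_W = (-30.0, 20.0, 20.0)    # x1 AND x2
NOR_W = (10.0, -20.0, -20.0)   # (NOT x1) AND (NOT x2)
OR_W = (-10.0, 20.0, 20.0)     # a1 OR a2


def xnor(x1, x2):
    a1 = unit(AND_W, (x1, x2))       # hidden unit 1
    a2 = unit(NOR_W, (x1, x2))       # hidden unit 2
    return unit(OR_W, (a1, a2))      # output unit


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))
```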

Simple example: AND

$x_1,x_2\in\{0,1\}$

$y=x_1\ \text{AND}\ x_2$

Add a bias unit (the +1 unit) so that a neural network with a single neuron can compute this AND function, then assign values to the weights (parameters).
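As a sketch of what such an assignment could look like, the weights below are an assumption (the notes do not list the values): a large negative bias weight and two positive input weights, so that the weighted sum is positive only when both inputs are 1.

```python
import math


def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def and_unit(x1, x2):
    """Single neuron approximating x1 AND x2."""
    theta = (-30.0, 20.0, 20.0)   # assumed weights for (+1, x1, x2)
    return g(theta[0] * 1 + theta[1] * x1 + theta[2] * x2)


# Truth table: the output rounds to 1 only for x1 = x2 = 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(and_unit(x1, x2)))
```

With these assumed weights, $z$ takes the values $-30$, $-10$, $-10$, and $10$ over the four input pairs, so $g(z)$ is close to 0 in the first three cases and close to 1 in the last.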

Multi-class classification

Added: 2022-02-07 13:43:59  Updated: 2022-02-07 13:44:41
 