[Artificial Intelligence] 2021-07-09

（1）Encoder

1. Given a degraded image I ∈ R^{3×H×W}, Uformer first applies a 3 × 3 convolutional layer with LeakyReLU to extract low-level features X_0 ∈ R^{C×H×W}.

The feature maps X_0 are then passed through K encoder stages. Each stage contains a stack of the proposed LeWin Transformer blocks and one down-sampling layer.

2. LeWin Transformer block: it uses the self-attention mechanism to capture long-range dependencies, and it cuts the computational cost by applying self-attention within non-overlapping windows on the feature maps.

3. In the down-sampling layer, we first reshape the flattened features into 2D spatial feature maps, and then down-sample the maps and double the channels using a 4 × 4 convolution with stride 2.

For example, given the input feature maps X_0 ∈ R^{C×H×W}, the l-th stage of the encoder produces the feature maps X_l ∈ R^{2^l C × H/2^l × W/2^l}.
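
The shape bookkeeping of the encoder can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the values C = 32 and H = W = 128 and the padding of 1 for the 4 × 4 convolution are assumptions (the notes do not state them), and the helper names are hypothetical.

```python
def conv_out_size(size, kernel, stride, padding):
    """Standard output-size formula for a strided convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_shapes(c, h, w, k, padding=1):
    """Return the (channels, height, width) of X_0 .. X_k.

    padding=1 is an assumption; with it, a 4 x 4 convolution with
    stride 2 exactly halves an even-sized feature map.
    """
    shapes = [(c, h, w)]
    for _ in range(k):
        c = 2 * c  # the down-sampling layer doubles the channels
        h = conv_out_size(h, kernel=4, stride=2, padding=padding)
        w = conv_out_size(w, kernel=4, stride=2, padding=padding)
        shapes.append((c, h, w))
    return shapes


# Assumed sizes, for illustration only.
shapes = encoder_shapes(c=32, h=128, w=128, k=4)
for stage, shape in enumerate(shapes):
    print(stage, shape)
```

Each stage doubles the channels and halves the height and width, so after l stages the feature maps have 2^l · C channels and spatial size H/2^l × W/2^l.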

（2）Bottleneck stage (Figure 1, the two LeWin Transformer blocks at the bottom)

4. Then, a bottleneck stage with a stack of LeWin Transformer blocks is added at the end of the encoder. In this stage, thanks to the hierarchical structure, the Transformer blocks capture longer-range dependencies (even global ones when the window size equals the feature map size).
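
To make the window argument concrete, the sketch below counts query-key pairs for global attention and for attention restricted to non-overlapping windows at each level of the hierarchy. The input size 128 × 128 and the window size M = 8 are assumed values chosen for illustration, and the function names are hypothetical.

```python
def global_attention_pairs(h, w):
    """Query-key pairs when every position attends to every position."""
    n = h * w
    return n * n


def window_attention_pairs(h, w, m):
    """Query-key pairs when attention stays inside non-overlapping m x m windows."""
    if h % m or w % m:
        raise ValueError("feature map size must be divisible by the window size")
    num_windows = (h // m) * (w // m)
    tokens_per_window = m * m
    return num_windows * tokens_per_window * tokens_per_window


h = w = 128  # assumed input resolution
k = 4        # number of encoder stages, as in the remark
m = 8        # assumed window size

for stage in range(k + 1):
    side = h // 2 ** stage  # each encoder stage halves the spatial size
    full = global_attention_pairs(side, side)
    windowed = window_attention_pairs(side, side, m)
    print(f"stage {stage}: {side}x{side} map, saving factor {full // windowed}")
```

Under these assumptions the saving factor shrinks from 256 at full resolution to 1 at the bottleneck, where a single 8 × 8 window covers the whole 8 × 8 feature map and window attention becomes global attention.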

（3）Decoder

1. We use a 2 × 2 transposed convolution with stride 2 for up-sampling. This layer halves the number of feature channels and doubles the size of the feature maps.

2. After that, the input to the LeWin Transformer blocks consists of the up-sampled features and the corresponding encoder features, passed in through skip-connections. The LeWin Transformer blocks then learn to restore the image.

3. After the K decoder stages, we reshape the flattened features to 2D feature maps and apply a 3 × 3 convolution layer to obtain a residual image R ∈ R^{3×H×W}.

4. Finally, the restored image is obtained by I' = I + R.
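
Two of the decoder steps above are easy to check by hand: the output size of the up-sampling layer and the final residual addition. The sketch below is a minimal illustration using plain Python lists. It assumes zero padding for the transposed convolution, the example shape (512, 8, 8) is an assumed input, and all helper names are hypothetical.

```python
def conv_transpose_out_size(size, kernel, stride, padding=0):
    """Output-size formula for a transposed convolution
    (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def upsample_shape(c, h, w):
    """Shape after a 2 x 2, stride-2 transposed convolution that halves the channels."""
    return (
        c // 2,
        conv_transpose_out_size(h, kernel=2, stride=2),
        conv_transpose_out_size(w, kernel=2, stride=2),
    )


def add_residual(image, residual):
    """Element-wise I' = I + R for images stored as [channel][row][col] lists."""
    return [
        [[p + r for p, r in zip(row_i, row_r)] for row_i, row_r in zip(ch_i, ch_r)]
        for ch_i, ch_r in zip(image, residual)
    ]


up_shape = upsample_shape(512, 8, 8)  # assumed example shape
print(up_shape)

# Toy 3 x 1 x 2 "image" and residual, values chosen for illustration.
degraded = [[[0.5, 0.25]], [[0.5, 0.25]], [[0.5, 0.25]]]
residual = [[[0.25, -0.25]], [[0.25, -0.25]], [[0.25, -0.25]]]
restored = add_residual(degraded, residual)
print(restored)
```

Because the network predicts only the residual R, its output must have the same 3 × H × W shape as the degraded input I for the addition to be defined.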

Remark: In our experiments, we empirically set K = 4, and each stage contains two LeWin Transformer blocks. We train Uformer using the Charbonnier loss.
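
The Charbonnier loss is a smooth variant of the L1 distance, usually written as sqrt(||I' − Î||² + ε²), where Î is the ground-truth image. The sketch below is a minimal pure-Python version over flattened pixel lists. The value ε = 1e-3 is an assumption (the notes do not state it), and the function name is hypothetical.

```python
import math


def charbonnier_loss(restored, target, eps=1e-3):
    """Charbonnier loss sqrt(||restored - target||^2 + eps^2).

    `restored` and `target` are flat lists of pixel values of equal length.
    eps=1e-3 is an assumed default, not taken from the notes.
    """
    if len(restored) != len(target):
        raise ValueError("restored and target must have the same length")
    squared_error = sum((r - t) ** 2 for r, t in zip(restored, target))
    return math.sqrt(squared_error + eps ** 2)


# With a perfect restoration the loss bottoms out near eps instead of 0,
# which keeps the gradient well defined at zero error.
print(charbonnier_loss([1.0, 2.0], [1.0, 2.0]))
print(charbonnier_loss([3.0, 0.0], [0.0, 4.0], eps=0.0))
```

Many implementations apply the square root per pixel and average the result instead. That is a different but closely related formulation.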