Introduction
There are plenty of explanations of BN (Batch Normalization) online. Its predecessor is whitening, which first removes the correlations between features (via PCA, matrix decomposition, and the like), an expensive step, and then normalizes each feature dimension to zero mean and unit standard deviation. BN borrows the normalization idea from whitening but applies it per batch and per channel: given an input of shape [N, C, H, W], it treats each channel as a sample of size N×H×W and maps that sample to a standard distribution. To verify this understanding, I ran the experiments below.
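To make the per-channel statistics concrete, here is a minimal sketch of that computation (my addition, not part of the original experiment). It normalizes a [N, C, H, W] tensor by hand, taking each channel's mean and biased variance over its N×H×W sample, and compares against torch.nn.BatchNorm2d in training mode (whose learnable affine parameters start at γ=1, β=0):
import torch
x = torch.randn(4, 3, 8, 8)  # [N, C, H, W]
# Per-channel statistics over the N*H*W sample (dims 0, 2, 3).
mean = x.mean(dim=(0, 2, 3), keepdim=True)                 # shape [1, C, 1, 1]
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)   # biased variance, as BN uses
x_hat = (x - mean) / torch.sqrt(var + 1e-5)                # eps=1e-5 is BatchNorm2d's default
bn = torch.nn.BatchNorm2d(3)
print(torch.allclose(bn(x), x_hat, atol=1e-6))             # True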
Preparing the data
input1: a 1-channel tensor with batch=2
input2: a 1-channel tensor with batch=2
input: a 2-channel tensor with batch=2, formed by concatenating input1 and input2 along the channel dimension
import numpy as np
import torch
m1 = torch.nn.BatchNorm2d(1)  # BN over 1 channel
m2 = torch.nn.BatchNorm2d(2)  # BN over 2 channels
input1 = torch.randint(5, [2, 1, 2, 2], dtype=torch.float32)
input2 = torch.randint(5, [2, 1, 2, 2], dtype=torch.float32)
input = torch.cat((input1, input2), 1)  # concatenate along the channel dimension
print('input size: -----------------')
print(input.size())
print('input1: -----------------')
print(input1)
print('input2: -----------------')
print(input2)
print('input: -----------------')
print(input)
'''
input size: -----------------
torch.Size([2, 2, 2, 2])
input1: -----------------
tensor([[[[3., 3.],
          [1., 1.]]],
        [[[2., 2.],
          [4., 1.]]]])
input2: -----------------
tensor([[[[2., 1.],
          [2., 2.]]],
        [[[1., 1.],
          [4., 1.]]]])
input: -----------------
tensor([[[[3., 3.],
          [1., 1.]],
         [[2., 1.],
          [2., 2.]]],
        [[[2., 2.],
          [4., 1.]],
         [[1., 1.],
          [4., 1.]]]])
'''
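As a quick sanity check (my addition, not in the original transcript): torch.cat along dim=1 stacks input1 and input2 as separate channels of input, which the slices below confirm:
print(torch.equal(input[:, 0:1], input1))  # True: channel 0 of input is input1
print(torch.equal(input[:, 1:2], input2))  # True: channel 1 of input is input2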
Verification
The experiment shows that if out1 is the result of applying BN to input1 and out2 is the result of applying BN to input2, then concatenating out1 and out2 along the channel dimension gives exactly the same result as applying BN to input directly, confirming that BN normalizes each channel independently. (A programmatic check with torch.allclose follows the transcript below.)
out1 = m1(input1)
# torch.std uses the unbiased estimator (n-1 in the denominator), so the printed
# std is sqrt(8/7) ≈ 1.0690 rather than exactly 1.
print('mean: ' + str(torch.mean(out1)))
print('std: ' + str(torch.std(out1)))
out2 = m1(input2)
out = m2(input)
print(out1)
print('#'*80 + 'manual calculation')
# numpy's std is the population std (ddof=0), matching the biased variance BN
# uses internally; the eps=1e-5 inside BatchNorm2d is ignored here.
out1_cal = (input1 - torch.mean(input1)) / torch.tensor(input1.numpy().std())
print(out1_cal)
print('#'*100)
print(out2)
print('#'*80 + 'manual calculation')
out2_cal = (input2 - torch.mean(input2)) / torch.tensor(input2.numpy().std())
print(out2_cal)
print('#'*100)
print(out)
'''
mean: tensor(1.4901e-08, grad_fn=<MeanBackward0>)
std: tensor(1.0690, grad_fn=<StdBackward0>)
tensor([[[[ 0.8307,  0.8307],
          [-1.0681, -1.0681]]],
        [[[-0.1187, -0.1187],
          [ 1.7802, -1.0681]]]], grad_fn=<NativeBatchNormBackward>)
################################################################################manual calculation
tensor([[[[ 0.8307,  0.8307],
          [-1.0681, -1.0681]]],
        [[[-0.1187, -0.1187],
          [ 1.7802, -1.0681]]]])
####################################################################################################
tensor([[[[ 0.2582, -0.7746],
          [ 0.2582,  0.2582]]],
        [[[-0.7746, -0.7746],
          [ 2.3238, -0.7746]]]], grad_fn=<NativeBatchNormBackward>)
################################################################################manual calculation
tensor([[[[ 0.2582, -0.7746],
          [ 0.2582,  0.2582]]],
        [[[-0.7746, -0.7746],
          [ 2.3238, -0.7746]]]])
####################################################################################################
tensor([[[[ 0.8307,  0.8307],
          [-1.0681, -1.0681]],
         [[ 0.2582, -0.7746],
          [ 0.2582,  0.2582]]],
        [[[-0.1187, -0.1187],
          [ 1.7802, -1.0681]],
         [[-0.7746, -0.7746],
          [ 2.3238, -0.7746]]]], grad_fn=<NativeBatchNormBackward>)
'''
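To turn the visual comparison into a programmatic one, here is a minimal check (my addition; it assumes the variables above are still in scope). Since the manual results were computed without BatchNorm2d's eps=1e-5, the comparison against them uses a small tolerance rather than exact equality:
# BN on the two-channel input equals the per-channel BN results, concatenated.
print(torch.allclose(out, torch.cat((out1, out2), 1)))  # True
# The eps-free manual computation matches BN up to eps = 1e-5.
print(torch.allclose(out1, out1_cal, atol=1e-4))  # True
print(torch.allclose(out2, out2_cal, atol=1e-4))  # True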