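This post uses the Grad-CAM method to visualize the outputs of different layers of a neural network feature extractor; my example visualizes resnet18. Code (Baidu Netdisk): resnet18_visual, extraction code: 4ocv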
resnet18:
- architecture
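The figure above shows the architectures of resnet18, resnet34, resnet50, resnet101 and resnet152. As it shows, the convolutional part of ResNet consists of four main modules (self.layer1 to self.layer4 in the code).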
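As a rough illustration of what the four modules produce, the sketch below computes the feature-map shape after each of them for resnet18. The helper name `resnet18_stage_shapes` is made up for this post. The channel counts (64, 128, 256, 512) and strides are those of the standard resnet18, and the input side length is assumed to be divisible by 32.

```python
def resnet18_stage_shapes(input_size=224):
    """Return {layer name: (channels, height, width)} for a square input.

    Assumes the standard resnet18 layout and an input size divisible by 32.
    """
    # conv1 (stride 2) followed by maxpool (stride 2) reduces the size by 4
    size = input_size // 4
    shapes = {}
    for i, channels in enumerate((64, 128, 256, 512), start=1):
        if i > 1:
            size //= 2  # layer2, layer3 and layer4 each halve the resolution
        shapes[f"layer{i}"] = (channels, size, size)
    return shapes


shapes = resnet18_stage_shapes(224)
for name, shape in shapes.items():
    print(name, shape)
```

The deeper the module, the coarser its feature map, which is why a heat map taken from layer4 has to be upsampled much more than one taken from layer1.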
Code
Adding hooks to the network
import torch


class CamExtractor:
    """
    Extracts CAM features from the model
    """

    def __init__(self, model):
        self.model = model
        self.gradients = []

    def save_gradient(self, grad):
        # Called by autograd during backward(). Hooks fire from the last
        # layer to the first, so gradients[0] belongs to layer4.
        self.gradients.append(grad)

    def forward_pass_on_convolutions(self, x):
        """
        Does a forward pass on convolutions, hooks the output of each layer
        """
        self.gradients = []  # drop gradients left over from a previous call
        conv_output = []
        # Stem: conv1 -> bn1 -> relu -> maxpool
        x = self.model.conv1(x)
        x = self.model.bn1(x)
        x = self.model.relu(x)
        x = self.model.maxpool(x)
        # Four residual stages; keep each output and hook its gradient
        x = self.model.layer1(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)
        x = self.model.layer2(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)
        x = self.model.layer3(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)
        x = self.model.layer4(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)
        return conv_output, x

    def forward_pass(self, x):
        """
        Does a full forward pass on the model
        """
        conv_output, x = self.forward_pass_on_convolutions(x)
        # Classifier head: global average pooling -> flatten -> fc
        x = self.model.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.model.fc(x)
        return conv_output, x
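This code extracts the gradient values. Reading the source code of the ResNet class shows how it is structured: it has four layers (layer1 to layer4). forward_pass_on_convolutions registers a hook on the output of each of them, and the hooks store the gradients of those outputs in a list. Because the hooks run during the backward pass, the list is filled from layer4 back to layer1.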
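To make the hook mechanism concrete, here is a minimal sketch on two toy tensors instead of a real network. The names `early` and `late` are placeholders that stand in for the outputs of an earlier and a later layer. It only relies on `torch.Tensor.register_hook` and shows that the hooks fire in reverse order during the backward pass.

```python
import torch

grads = []  # filled by the hooks while backward() runs

x = torch.ones(2, requires_grad=True)

early = x * 2  # stands in for the output of an earlier layer
early.register_hook(lambda g: grads.append(("early", g.clone())))

late = early * 3  # stands in for the output of a later layer
late.register_hook(lambda g: grads.append(("late", g.clone())))

late.sum().backward()

order = [name for name, _ in grads]
print(order)  # ['late', 'early']
```

This ordering is the reason a list of hooked gradients has to be indexed from the end when the layers are counted from the input side.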
Grad-CAM
import numpy as np
import torch
from PIL import Image


class GradCam:
    """
    Produces class activation map
    """

    def __init__(self, model):
        self.model = model
        self.model.eval()
        self.extractor = CamExtractor(self.model)

    def generate_cam(self, input_image, target_layer, target_class=None):
        # target_layer indexes conv_output: 0 -> layer1, ..., 3 -> layer4
        conv_output, model_output = self.extractor.forward_pass(input_image)
        if target_class is None:
            # Default to the class with the highest score
            target_class = np.argmax(model_output.data.numpy())
        # One-hot vector that selects the target class for backprop
        one_hot_output = torch.zeros(1, model_output.size()[-1])
        one_hot_output[0][target_class] = 1
        self.model.zero_grad()
        model_output.backward(gradient=one_hot_output, retain_graph=True)
        # Hooks fire in reverse layer order, hence the index from the end
        guided_gradients = self.extractor.gradients[-1 - target_layer].data.numpy()[0]
        target = conv_output[target_layer].data.numpy()[0]
        # Channel weights: spatial mean of the gradients
        weights = np.mean(guided_gradients, axis=(1, 2))
        # Weighted sum of the feature maps (the accumulator starts from ones)
        cam = np.ones(target.shape[1:], dtype=np.float32)
        for i, w in enumerate(weights):
            cam += w * target[i, :, :]
        cam = np.maximum(cam, 0)
        # Normalize to [0, 1]; the epsilon avoids division by zero on a flat map
        cam = (cam - np.min(cam)) / (np.max(cam) - np.min(cam) + 1e-8)
        cam = np.uint8(cam * 255)
        # PIL expects (width, height), while input_image is (N, C, H, W)
        cam_resize = Image.fromarray(cam).resize(
            (input_image.shape[3], input_image.shape[2]), Image.LANCZOS)
        cam = np.uint8(cam_resize) / 255
        return cam
This part generates the heat map from the gradient values.
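The core of the heat-map computation can be sketched on tiny hand-made arrays. The function name `grad_cam_map` and the toy values below are made up for illustration. The sketch follows the usual statement of Grad-CAM (channel weights from the spatial mean of the gradients, a weighted sum of the feature maps, then ReLU) and starts the sum from zero.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """activations and gradients are arrays of shape (C, H, W)."""
    # One weight per channel: the spatial mean of that channel's gradient
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum over channels gives an (H, W) map
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only the regions that support the target class
    cam = np.maximum(cam, 0)
    span = cam.max() - cam.min()
    # Scale to [0, 1]; a flat map is returned as all zeros
    return (cam - cam.min()) / span if span > 0 else np.zeros_like(cam)


# Two 2x2 feature maps: channel 0 fires top-left, channel 1 bottom-right
activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 2.0]]])
# Channel 0 supports the class (positive gradient), channel 1 opposes it
gradients = np.array([np.full((2, 2), 1.0),
                      np.full((2, 2), -1.0)])

cam = grad_cam_map(activations, gradients)
print(cam)  # only the top-left cell stays active
```

The bottom-right cell is removed by the ReLU because its channel has a negative weight, which is how Grad-CAM suppresses evidence against the target class.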
Results
The output of the third layer gives the best result: it separates foreground from background fairly well.
Conclusion
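Grad-CAM is a classic visualization method for neural networks; I have only used resnet18 as an example here. To visualize your own network, you only need to modify the CamExtractor class.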
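One possible way to adapt the extractor to another network is sketched below. It assumes the network can be split into an ordered list of stages plus a classifier head. The class name `SequentialCamExtractor` and the small toy network are hypothetical and are not part of the downloadable code.

```python
import torch
import torch.nn as nn


class SequentialCamExtractor:
    """Hypothetical variant that hooks the output of every given stage."""

    def __init__(self, stages, head):
        self.stages = stages  # list of nn.Module, applied in order
        self.head = head      # maps the last feature map to class scores
        self.gradients = []

    def forward_pass(self, x):
        self.gradients = []  # start fresh for every forward pass
        conv_output = []
        for stage in self.stages:
            x = stage(x)
            # list.append returns None, so the gradient is left unchanged
            x.register_hook(self.gradients.append)
            conv_output.append(x)
        return conv_output, self.head(x)


# Toy two-stage network standing in for "your own network"
torch.manual_seed(0)
stages = [
    nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(4, 8, 3, padding=1), nn.ReLU()),
]
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 5))

extractor = SequentialCamExtractor(stages, head)
conv_output, logits = extractor.forward_pass(torch.randn(1, 3, 16, 16))
logits[0, logits.argmax()].backward()  # backprop the top-scoring class

# Gradients arrive last stage first, so index from the end
print(extractor.gradients[-1].shape, conv_output[0].shape)
```

With this layout, only the lists passed to the constructor change from one architecture to another, while the heat-map code that consumes the outputs and gradients can stay the same.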