[Artificial Intelligence] PyTorch visualization example: applying Grad-CAM to resnet18 (quick code tutorial)



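This post uses the Grad-CAM method to visualize the outputs of different layers of a neural network's feature extractor. My example visualizes resnet18.
Code (Baidu Netdisk): resnet18_visual
Extraction code: 4ocv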

resnet18:

[Figure: architecture]

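The figure above shows the architectures of resnet18, resnet34, resnet50, resnet101 and resnet152.
As the figure shows, the convolutional part of resnet is made up of four main modules (self.layer1 to self.layer4 in torchvision's ResNet class; the code below indexes them as layers 0 to 3).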
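To make the four modules concrete, here is a minimal sketch that traces the feature-map shape through each stage by plain arithmetic. It assumes the standard resnet18 layout (a stride-2 stem convolution, a stride-2 max pool, then four stages with 64, 128, 256 and 512 channels) and a square input. The helper name `resnet18_stage_shapes` is hypothetical and not part of the original code.

```python
def resnet18_stage_shapes(input_size=224):
    """Return (channels, height, width) after each of the four stages."""

    def downsample(size, stride):
        # Ceiling division: matches a padded stride-2 conv or pool
        return (size + stride - 1) // stride

    size = downsample(input_size, 2)  # conv1: 7x7, stride 2
    size = downsample(size, 2)        # maxpool: 3x3, stride 2
    shapes = []
    # (output channels, stride of the first block) for the four stages
    for channels, stride in [(64, 1), (128, 2), (256, 2), (512, 2)]:
        size = downsample(size, stride)
        shapes.append((channels, size, size))
    return shapes


for index, shape in enumerate(resnet18_stage_shapes()):
    print(f"layer index {index}: {shape}")
```

For a 224x224 input this gives 56x56, 28x28, 14x14 and 7x7 maps, which is why the deeper stages produce coarser heatmaps.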

Code

Adding hooks to the network


import numpy as np
import torch
from PIL import Image


class CamExtractor():
    """
        Extracts cam features from the model
    """

    def __init__(self, model):
        self.model = model
        self.gradients = []

    def save_gradient(self, grad):
        self.gradients.append(grad)

    def forward_pass_on_convolutions(self, x):
        """
            Does a forward pass on convolutions, hooks the function at given layer
        """
        self.gradients = []  # Start fresh so gradients from earlier calls do not pile up
        conv_output = []
        x = self.model.conv1(x)
        x = self.model.bn1(x)
        x = self.model.relu(x)
        x = self.model.maxpool(x)

        x = self.model.layer1(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)  # Save the convolution output on that layer

        x = self.model.layer2(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)

        x = self.model.layer3(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)

        x = self.model.layer4(x)
        x.register_hook(self.save_gradient)
        conv_output.append(x)

        return conv_output, x

    def forward_pass(self, x):
        """
            Does a full forward pass on the model
        """
        # Forward pass on the convolutions
        conv_output, x = self.forward_pass_on_convolutions(x)

        x = self.model.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.model.fc(x)
        return conv_output, x

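This part of the code extracts the gradient values.
Reading the torchvision source shows how the ResNet class is structured: it has 4 layers (stages), and the gradient of each stage's output is stored in a list.
The hooks are registered in forward_pass_on_convolutions and save the gradients of the target layers during the backward pass.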
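The hook mechanism can be seen in isolation with a minimal sketch. It uses `torch.Tensor.register_hook` on two intermediate tensors that merely stand in for layer outputs (the tensors `a` and `b` are illustrative, not part of the original code). The point is the order: hooks fire during the backward pass, so the gradient of the later tensor is stored first.

```python
import torch

gradients = []  # filled by the hooks while backward() runs

x = torch.ones(3, requires_grad=True)
a = x * 2                          # stands in for an early layer's output
a.register_hook(gradients.append)  # append returns None, so the gradient is left unchanged
b = a * 3                          # stands in for a later layer's output
b.register_hook(gradients.append)

b.sum().backward()

# b's hook fires before a's, so the list is in reverse forward order
print(gradients[0])  # gradient with respect to b
print(gradients[1])  # gradient with respect to a
```

This reversed ordering is why the Grad-CAM code has to index the stored gradients from the end of the list.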

Grad-CAM


class GradCam():
    """
        Produces class activation map
    """

    def __init__(self, model):
        self.model = model
        self.model.eval()
        # Define extractor
        self.extractor = CamExtractor(self.model)

    def generate_cam(self, input_image, target_layer, target_class=None):

        # Full forward pass
        # conv_output is the output of convolutions at specified layer
        # model_output is the final output of the model (1, 1000)
        conv_output, model_output = self.extractor.forward_pass(input_image)
        if target_class is None:
            target_class = np.argmax(model_output.data.numpy())
        # Target for backprop
        one_hot_output = torch.zeros_like(model_output)  # Same shape, dtype and device as the output
        one_hot_output[0][target_class] = 1

        # Zero grads
        self.model.zero_grad()

        # Backward pass with specified target
        model_output.backward(gradient=one_hot_output, retain_graph=True)

        # Get hooked gradients. Hooks fire in backward order, so the list holds
        # gradients[0]:(1,512,7,7), gradients[1]:(1,256,14,14), gradients[2]:(1,128,28,28), gradients[3]:(1,64,56,56).
        # This is the reverse of conv_output below, so index from the end
        guided_gradients = self.extractor.gradients[-1 - target_layer].data.numpy()[0]

        # Get convolution outputs
        # layer0.shape:(64,56,56) layer1:(128,28,28) layer2:(256,14,14) layer3:(512,7,7)
        target = conv_output[target_layer].data.numpy()[0]
        # Get weights from gradients
        weights = np.mean(guided_gradients, axis=(1, 2))  # Take averages for each gradient

        # Create empty numpy array for cam
        cam = np.ones(target.shape[1:], dtype=np.float32)

        # Have a look at issue #11 to check why the above is np.ones and not np.zeros
        # Multiply each weight with its conv output and then, sum
        for i, w in enumerate(weights):
            cam += w * target[i, :, :]
        cam = np.maximum(cam, 0)
        cam = (cam - np.min(cam)) / (np.max(cam) - np.min(cam))  # Normalize between 0-1
        cam = np.uint8(cam * 255)  # Scale between 0-255 to visualize
        # PIL's resize takes (width, height); Image.LANCZOS is the filter formerly named Image.ANTIALIAS
        cam_resize = Image.fromarray(cam).resize((input_image.shape[3],
                                                  input_image.shape[2]), Image.LANCZOS)
        cam = np.uint8(cam_resize) / 255

        return cam
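
The core arithmetic of generate_cam can be checked on a tiny hand-made example. The sketch below is a simplified illustration, not the author's implementation: it starts the map from zeros rather than np.ones, skips the resize step, and guards the normalization against a flat map. The function name `grad_cam_map` and the toy arrays are hypothetical.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Combine (C, H, W) activations and gradients into an (H, W) map in [0, 1]."""
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                          # keep positive evidence only
    span = cam.max() - cam.min()
    if span > 0:                                      # avoid dividing by zero on a flat map
        cam = (cam - cam.min()) / span
    return cam


# Two 2x2 channels: channel 0 lights up top-left, channel 1 bottom-right
activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 1.0]]])
# Channel 0 has positive gradient (weight 2), channel 1 negative (weight -1)
gradients = np.array([[[2.0, 2.0], [2.0, 2.0]],
                      [[-1.0, -1.0], [-1.0, -1.0]]])

cam = grad_cam_map(activations, gradients)
print(cam)
```

Only the top-left cell survives: the channel with a negative weight is suppressed by the ReLU step, which is the behavior the heatmap relies on.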