开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 人工智能 -> TensorFlow1.x如何使用官方预训练模型 -> 正文阅读

[人工智能]TensorFlow1.x如何使用官方预训练模型

环境：tensorflow1.13
模型：使用vgg19作为例子

注意：本文档的结果是使用CPU模式跑出来的，因为显卡是30系，系统是win10，没法在tf1.13版本下使用GPU模式。所以如果读者使用GPU模式跑出来的结果与本文档有些许出入，应该是正常现象。

文档背景是为了在tensorflow1.x中使用预训练的vgg模型计算vgg loss。

模型下载

官方的预训练模型在tensorflow的model仓库中，完整路径是tensorflow/models/research/slim，请注意选择tf1.13分支：
https://github.com/tensorflow/models/tree/r1.13.0/research/slim

vgg19的模型下载地址：
http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz
解压后文件名是vgg_19.ckpt。

辅助函数

在下面的代码中，为了对比不同的模型使用方式是否带来一致的结果，写了一个可视化feature map的辅助函数来方便对比。下面的代码将直接使用，而不再赘述，读者如果想要尝试本文档代码，请自行粘贴过去。

def visualize_feature_map(feature_map,
                          col_nums=None,
                          gap_value=0.5,
                          gap_width=10,
                          gap_height=10):
    """
    Visualize feature map in one image.

    Parameters
    ----------
    feature_map: numpy array, shape is (height, width, channel)
    col_nums: number of feature map columns
    gap_value: value for feature map gap
    gap_width: width of gap
    gap_height: height of gap

    Returns
    -------
    image: image to show feature map
    """
    eps = 1e-6

    if feature_map.ndim == 4:
        if feature_map.shape[0] == 1:
            feature_map = np.squeeze(feature_map)
        else:
            raise ValueError("feature map must be 3 dims ndarray (height, "
                             "width, channel) or 4 dims ndarray whose shape "
                             "must be (1, height, width, channel)")

    # compute col_nums (if not set) and row_nums
    height, width, channel = feature_map.shape
    if col_nums is None:
        col_nums = int(round(np.sqrt(channel)))
    row_nums = int(np.ceil(channel / col_nums))

    # compute final image width and height
    image_width = col_nums * (width + gap_width) - gap_width
    image_height = row_nums * (height + gap_height) - gap_height

    image = np.ones(shape=(image_height, image_width),
                    dtype=feature_map.dtype) * gap_value
    cnt = 0
    while cnt < channel:
        row = cnt // col_nums
        col = cnt % col_nums

        row_beg = row * (height + gap_height)
        row_end = row_beg + height
        col_beg = col * (width + gap_width)
        col_end = col_beg + width

        image[row_beg:row_end, col_beg:col_end] = \
            feature_map[:, :, cnt] / (np.std(feature_map[:, :, cnt]) + eps)
        cnt += 1

    return image

模型使用

有三种典型的方式使用官方模型：

使用官方提供的模型文件加载模型。
需要首先使用placeholder在计算图中定义模型，然后使用tf.train.Saver()的restore方法加载模型参数。这种方式要求我们新定义的模型的节点名和参数名必须与vgg_19.ckpt中保存的一致，这也是为什么建议直接使用官方的模型定义文件。
魔改官方的模型文件。
在卷积部分，官方的模型文件只提供了relu层的feature map，有时候我们可能需要conv层的feature map，这时候就需要魔改。魔改后的节点名和参数名仍然必须与vgg_19.ckpt中保存的一致。
使用NewCheckpointReader并自定义模型
使用pywrap_tensorflow.NewCheckpointReader(model_path)可以读取权重参数，然后自己重新定义模型结构并把权重参数赋值过去。采用这种办法可以灵活定义模型结构和节点名称，但是代码写起来比较麻烦。（官方定义的模型文件只能拿到relu层的feature map，拿不到conv层的，所以灵活性欠佳）

1. 使用官方提供的模型文件加载模型

官方的模型定义文件路径（网址）：
https://github.com/tensorflow/models/blob/r1.13.0/research/slim/nets/vgg.py
找到vgg19的定义如下：

def vgg_19(inputs,
           num_classes=1000,
           is_training=True,
           dropout_keep_prob=0.5,
           spatial_squeeze=True,
           scope='vgg_19',
           fc_conv_padding='VALID',
           global_pool=False):
    """
    Oxford Net VGG 19-Layers version E Example.
    Note: All the fully_connected layers have been transformed to conv2d
    layers. To use in classification mode, resize input to 224x224.

    Args:
        inputs: a tensor of size [batch_size, height, width, channels].
        num_classes: number of predicted classes. If 0 or None, the logits
            layer is omitted and the input features to the logits layer are
            returned instead.
        is_training: whether or not the model is being trained.
        dropout_keep_prob: the probability that activations are kept in the
            dropout layers during training.
        spatial_squeeze: whether or not should squeeze the spatial dimensions
            of the outputs. Useful to remove unnecessary dimensions for
            classification.
        scope: Optional scope for the variables.
        fc_conv_padding: the type of padding to use for the fully connected
            layer that is implemented as a convolutional layer. Use 'SAME'
            padding if you are applying the network in a fully convolutional
            manner and want to get a prediction map downsampled by a factor of
            32 as an output. Otherwise, the output prediction map will be
            (input / 32) - 6 in case of 'VALID' padding.
        global_pool: Optional boolean flag. If True, the input to the
            classification layer is avgpooled to size 1x1, for any input size.
            (This is not part of the original VGG architecture.)
    Returns:
        net: the output of the logits layer (if num_classes is a non-zero
            integer), or the non-dropped-out input to the logits layer (if
            num_classes is 0 or None).
        end_points: a dict of tensors with intermediate activations.
    """
    with tf.variable_scope(scope, 'vgg_19', [inputs]) as sc:
        end_points_collection = sc.original_name_scope + '_end_points'
        # Collect outputs for conv2d, fully_connected and max_pool2d.
        with slim.arg_scope(
                [slim.conv2d, slim.fully_connected, slim.max_pool2d],
                outputs_collections=end_points_collection):
            net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3],
                              scope='conv1')
            net = slim.max_pool2d(net, [2, 2], scope='pool1')
            net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
            net = slim.max_pool2d(net, [2, 2], scope='pool2')
            net = slim.repeat(net, 4, slim.conv2d, 256, [3, 3], scope='conv3')
            net = slim.max_pool2d(net, [2, 2], scope='pool3')
            net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv4')
            net = slim.max_pool2d(net, [2, 2], scope='pool4')
            net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv5')
            net = slim.max_pool2d(net, [2, 2], scope='pool5')

            # Use conv2d instead of fully_connected layers.
            net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding,
                              scope='fc6')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                               scope='dropout6')
            net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
            # Convert end_points_collection into a end_point dict.
            end_points = slim.utils.convert_collection_to_dict(
                end_points_collection)
            if global_pool:
                net = tf.reduce_mean(net, [1, 2], keep_dims=True,
                                     name='global_pool')
                end_points['global_pool'] = net
            if num_classes:
                net = slim.dropout(net, dropout_keep_prob,
                                   is_training=is_training,
                                   scope='dropout7')
                net = slim.conv2d(net, num_classes, [1, 1],
                                  activation_fn=None,
                                  normalizer_fn=None,
                                  scope='fc8')
                if spatial_squeeze:
                    net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
                end_points[sc.name + '/fc8'] = net
            return net, end_points

关于上述模型定义，有两个函数需要稍加解释：

slim.conv2d
slim.conv2d的定义在Pycharm中通过ctrl加单击的方式追进去后发现不对，真正的定义在下面路径的文件中：
D:\Program\anaconda3\envs\tf13\Lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py
（请注意D:\Program\anaconda3\envs\tf13是我电脑上tf1.13的环境路径，需要根据自己环境修改）

其中第1117行的def convolution2d是函数定义及实现，3327行的conv2d = convolution2d起了一个短名字。convolution2d的参数列表中有activation_fn=nn.relu，所以这个卷积默认带有relu作为激活函数。

slim.repeat
作用是把某个算子重复n次。函数实现跟slim.conv2d同一个文件，文件中的函数定义和部分解释如下：

def repeat(inputs, repetitions, layer, *args, **kwargs):
  """Applies the same layer with the same arguments repeatedly.

y = repeat(x, 3, conv2d, 64, [3, 3], scope='conv1')
# It is equivalent to:

x = conv2d(x, 64, [3, 3], scope='conv1/conv1_1')
x = conv2d(x, 64, [3, 3], scope='conv1/conv1_2')
y = conv2d(x, 64, [3, 3], scope='conv1/conv1_3')

......

"""

使用脚本如下，注意其中vgg_19和visualize_feature_map两个函数在本文档上面已经出现过，并且比较长，所以下面的脚本中将其省略：

# -*- coding: utf-8 -*-
import os
import cv2
import tensorflow as tf
import numpy as np

os.environ['CUDA_VISIBLE_DEVICES'] = "/gpu:0"
slim = tf.contrib.slim


def vgg_19(inputs,
           num_classes=1000,
           is_training=True,
           dropout_keep_prob=0.5,
           spatial_squeeze=True,
           scope='vgg_19',
           fc_conv_padding='VALID',
           global_pool=False):
    # 见本文档前面
    pass


def visualize_feature_map(feature_map,
                          col_nums=None,
                          gap_value=0.5,
                          gap_width=10,
                          gap_height=10):
    # 见本文档前面
    pass


def main():
    image_file = r'E:\images\lena512color.tiff'
    model_path = r'E:\pretrained_model\tf1x\vgg_19.ckpt'
    inputs_ = tf.placeholder(dtype=tf.float32, shape=[None, None, None, 3])
    outputs, feature_map_dict = vgg_19(inputs_,
                                       num_classes=0,
                                       is_training=False,
                                       global_pool=True)

    # print trainable variables
    for var in tf.trainable_variables():
        print(var)

    # load pretrained model
    saver = tf.train.Saver()
    sess = tf.Session()
    saver.restore(sess, model_path)

    # running test
    inputs = cv2.imread(image_file)
    inputs = np.expand_dims(inputs, axis=0)
    out, feature_maps = sess.run([outputs, feature_map_dict],
                                 feed_dict={
                                     inputs_: inputs,
                                 })

    # print shape of feature maps
    for key in feature_maps.keys():
        print(key, feature_maps.get(key).shape)

    feature_map = feature_maps.get('vgg_19/conv3/conv3_4')
    feature_map = np.squeeze(feature_map)
    image = visualize_feature_map(feature_map)
    image = np.clip(image * 255, 0, 255).astype(np.uint8)
    # cv2.imwrite('lena_feature_map_vgg_conv3_4.png', image)

    # print statistics for feature map
    for i in range(5):
        mean_val = np.mean(feature_map[:, :, i])
        std = np.std(feature_map[:, :, i])
        min_val = np.min(feature_map[:, :, i])
        max_val = np.max(feature_map[:, :, i])
        print(i + 1, " min=%.4f, max=%.4f, mean=%.4f, std=%.4f" % (
            min_val, max_val, mean_val, std))

    # print part of final global feature vector
    feature_vec = feature_maps.get('global_pool')
    feature_vec = np.squeeze(feature_vec)
    for i in range(10):
        print(feature_vec[i])


if __name__ == '__main__':
    main()

对上述脚本需要做如下说明：

vgg_19的参数设置需要多加注意。
如果只是做推理的话，is_training一定要设置成False；
我使用的目的是为了计算vgg loss，所以不需要全连接的部分，因此为了使全连接部分不要报错，把num_classes设置成0，并且global_pool设置为True。

使用placeholder创建计算图以后才能restore权重。
脚本在计算图和restore权重的中间部分(# print trainable variables部分)打印了权重变量，包括名称，shape，dtype。

<tf.Variable 'vgg_19/conv1/conv1_1/weights:0' shape=(3, 3, 3, 64) dtype=float32_ref>
<tf.Variable 'vgg_19/conv1/conv1_1/biases:0' shape=(64,) dtype=float32_ref>
<tf.Variable 'vgg_19/conv1/conv1_2/weights:0' shape=(3, 3, 64, 64) dtype=float32_ref>
<tf.Variable 'vgg_19/conv1/conv1_2/biases:0' shape=(64,) dtype=float32_ref>
<tf.Variable 'vgg_19/conv2/conv2_1/weights:0' shape=(3, 3, 64, 128) dtype=float32_ref>
<tf.Variable 'vgg_19/conv2/conv2_1/biases:0' shape=(128,) dtype=float32_ref>
......

vgg_19的输出有两个。
第一个很容易理解，就是网络推理的输出，但对于计算vgg loss而言没有用。
第二个输出以dict的形式保存了网络的feature map，dict的key是feature map的节点名，value是feature map的数值，这是计算vgg loss真正需要的东西。在# print shape of feature maps部分打印了feature map的name和shape，另外还把conv3_4画在一张图上，用于一些简单直观的测试和检查。
```
vgg_19/conv1/conv1_1 (1, 512, 512, 64)
vgg_19/conv1/conv1_2 (1, 512, 512, 64)
vgg_19/pool1 (1, 256, 256, 64)
vgg_19/conv2/conv2_1 (1, 256, 256, 128)
vgg_19/conv2/conv2_2 (1, 256, 256, 128)
vgg_19/pool2 (1, 128, 128, 128)
......
```
打印feature map的一些统计数值，可以检查并确认以下事实：
feature map只有relu，没有conv，因为feature map的最小值都是0.0；
vgg出现在BN之前，所以网络中没有BN，导致feature map的数值很大（如果有BN的话数值一般不会超过5），因此计算vgg loss时根据具体情况一般需要乘上一个很小的权重系数。
```
1  min=0.0000, max=9201.5811, mean=386.3745, std=737.3252
2  min=0.0000, max=7389.5913, mean=1412.0540, std=616.6437
3  min=0.0000, max=3323.7239, mean=400.2662, std=522.4063
4  min=0.0000, max=4319.3765, mean=369.9904, std=644.4222
5  min=0.0000, max=8997.2305, mean=905.1512, std=1288.8953
......
```

打印一部分最后的feature vector，用于魔改模型定义函数后检查正确性：

0.00055606366
0.0
0.0
0.15579844
0.0
1.0548652
0.0
0.0
0.05207316
0.29752082
......

2. 魔改官方的模型文件

魔改后的模型和测试代码如下，同样visualize_feature_map需要从上面粘贴过来：

# -*- coding: utf-8 -*-
import os
import cv2
import tensorflow as tf
import numpy as np

slim = tf.contrib.slim


def vgg19(inputs,
          num_classes=1000,
          is_training=True,
          dropout_keep_prob=0.5,
          spatial_squeeze=True,
          scope='vgg_19',
          fc_conv_padding='VALID',
          global_pool=False):
    with tf.variable_scope(scope, 'vgg_19', [inputs]) as sc:
        end_points_collection = sc.original_name_scope + '_end_points'
        # Collect outputs for conv2d, fully_connected and max_pool2d.
        with slim.arg_scope(
                [slim.conv2d, slim.fully_connected, slim.max_pool2d],
                outputs_collections=end_points_collection):
            # conv blocks are modified as follows
            net_config = [
                [64, 2],
                [128, 2],
                [256, 4],
                [512, 4],
                [512, 4],
            ]  # [filters, blocks]

            net = inputs
            relu_dict = {}
            for i, config in enumerate(net_config):
                filters = config[0]
                for j in range(config[1]):
                    conv_scope = 'conv%d/conv%d_%d' % (i + 1, i + 1, j + 1)
                    relu_name = 'conv%d/relu%d_%d' % (i + 1, i + 1, j + 1)
                    net = slim.conv2d(net, filters, [3, 3],
                                      activation_fn=None,
                                      scope=conv_scope)
                    net = tf.nn.relu(net, name=relu_name)
                    relu_dict[net.op.name] = net
                net = slim.max_pool2d(net, [2, 2], scope='pool%d' % (i + 1))

            # Use conv2d instead of fully_connected layers.
            net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding,
                              scope='fc6')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                               scope='dropout6')
            net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
            # Convert end_points_collection into a end_point dict.
            end_points = slim.utils.convert_collection_to_dict(
                end_points_collection)
            if global_pool:
                net = tf.reduce_mean(net, [1, 2], keep_dims=True,
                                     name='global_pool')
                end_points['global_pool'] = net
            if num_classes:
                net = slim.dropout(net, dropout_keep_prob,
                                   is_training=is_training,
                                   scope='dropout7')
                net = slim.conv2d(net, num_classes, [1, 1],
                                  activation_fn=None,
                                  normalizer_fn=None,
                                  scope='fc8')
                if spatial_squeeze:
                    net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
                end_points[sc.name + '/fc8'] = net

            end_points.update(relu_dict)
            return net, end_points


def visualize_feature_map(feature_map,
                          col_nums=None,
                          gap_value=0.5,
                          gap_width=10,
                          gap_height=10):
    # 见本文档前面
    pass


def main():
    image_file = r'D:\data\test_images\lena512color.tiff'
    model_path = r'E:\pretrained_model\tensorflow1.13\vgg_19.ckpt'
    inputs_ = tf.placeholder(dtype=tf.float32, shape=[None, None, None, 3])
    outputs, feature_map_dict = vgg19(inputs_,
                                      num_classes=0,
                                      is_training=False,
                                      global_pool=True)

    # check trainable variables
    for var in tf.trainable_variables():
        print(var)

    # load pretrained model
    saver = tf.train.Saver()
    sess = tf.Session()
    saver.restore(sess, model_path)

    # running test
    inputs = cv2.imread(image_file)
    inputs = np.expand_dims(inputs, axis=0)
    out, feature_maps = sess.run([outputs, feature_map_dict],
                                 feed_dict={
                                     inputs_: inputs,
                                 })

    # print shape of feature maps
    for key in feature_maps.keys():
        print(key, feature_maps.get(key).shape)

    feature_map = feature_maps.get('vgg_19/conv3/relu3_4')
    feature_map = np.squeeze(feature_map)
    image = visualize_feature_map(feature_map)
    image = np.clip(image * 255, 0, 255).astype(np.uint8)
    cv2.imwrite('lena_feature_map_vgg_relu3_4--2.png', image)

    # print statistics for relu3_4
    for i in range(5):
        mean_val = np.mean(feature_map[:, :, i])
        std = np.std(feature_map[:, :, i])
        min_val = np.min(feature_map[:, :, i])
        max_val = np.max(feature_map[:, :, i])
        print(i + 1, " min=%.4f, max=%.4f, mean=%.4f, std=%.4f" % (
            min_val, max_val, mean_val, std))

    # print statistics for conv3_4
    print('\n')
    feature_map = feature_maps.get('vgg_19/conv3/conv3_4')
    feature_map = np.squeeze(feature_map)
    for i in range(5):
        mean_val = np.mean(feature_map[:, :, i])
        std = np.std(feature_map[:, :, i])
        min_val = np.min(feature_map[:, :, i])
        max_val = np.max(feature_map[:, :, i])
        print(i + 1, " min=%.4f, max=%.4f, mean=%.4f, std=%.4f" % (
            min_val, max_val, mean_val, std))

    feature_vec = feature_maps.get('global_pool')
    feature_vec = np.squeeze(feature_vec)
    for i in range(10):
        print(feature_vec[i])


if __name__ == '__main__':
    main()

说明如下：

改模型结构的目的：将conv和relu分离开来，方便使用conv层的结果来作为vgg loss的输入。
改模型定义函数的关键点：确保结构不能变；确保节点名不能变；由于原始代码不能直接收集relu层的feature map，所以需要自行定义一个dict进行收集。

conv3_4的feature map统计信息如下，可以看到min value已经出现负数，所以确实实现了与relu层的分离。

1  min=-2909.7209, max=9201.5811, mean=92.2542, std=991.5225
2  min=-431.0446, max=7389.5913, mean=1411.6982, std=617.5237
3  min=-1092.2075, max=3323.7239, mean=339.5828, std=582.6731
4  min=-2396.4478, max=4319.3765, mean=106.6278, std=852.0536
5  min=-3547.4551, max=8997.2305, mean=699.2141, std=1488.1344

feature_maps里面的内容多了relu部分，因为是在代码的最后update进去的，所以这部分在feature_maps的最后：

......
vgg_19/conv1/relu1_1 (1, 512, 512, 64)
vgg_19/conv1/relu1_2 (1, 512, 512, 64)
vgg_19/conv2/relu2_1 (1, 256, 256, 128)
vgg_19/conv2/relu2_2 (1, 256, 256, 128)
vgg_19/conv3/relu3_1 (1, 128, 128, 256)
vgg_19/conv3/relu3_2 (1, 128, 128, 256)
vgg_19/conv3/relu3_3 (1, 128, 128, 256)
vgg_19/conv3/relu3_4 (1, 128, 128, 256)
......

其他输出变量已经检查，与第一种方式没有差异，说明魔改的结果是正确的。

3. 使用NewCheckpointReader并自定义模型

这种方法分两部分说明。
第一部分简单说明如何从预训练模型中拿到权重参数；第二部分详细说明如何将预训练权重系数赋值给新定义的模型，并进行测试。

下面是第一部分的代码：

# -*- coding: utf-8 -*-
from tensorflow.python import pywrap_tensorflow as wrap


def main():
    model_path = r'E:\pretrained_model\tf1x\vgg_19.ckpt'
    reader = wrap.NewCheckpointReader(model_path)

    variables_shape = reader.get_variable_to_shape_map()
    variables_dtype = reader.get_variable_to_dtype_map()
    for key in variables_shape.keys():
        print(key, variables_shape.get(key), variables_dtype.get(key))

    print('\n')
    print(reader.has_tensor("vgg_19/mean_rgb"))
    rgb_mean = reader.get_tensor("vgg_19/mean_rgb")
    print(rgb_mean)


if __name__ == '__main__':
    main()

上述代码有几个需说明的地方：

NewCheckpointReader用于加载预训练模型的权重
get_variable_to_shape_map()和get_variable_to_dtype_map()可以查看权重参数的shape和dtype
get_tensor()可以获得权重参数，返回的是numpy数组

打印出来的结果如下，其中有两个比较巧妙的参数，global_step和vgg_19/mean_rgb，mean_rgb打印出来有具体数值：

global_step [] <dtype: 'int64'>
vgg_19/conv2/conv2_2/biases [128] <dtype: 'float32'>
vgg_19/conv2/conv2_2/weights [3, 3, 128, 128] <dtype: 'float32'>
vgg_19/conv1/conv1_1/biases [64] <dtype: 'float32'>
vgg_19/conv1/conv1_1/weights [3, 3, 3, 64] <dtype: 'float32'>
vgg_19/conv1/conv1_2/biases [64] <dtype: 'float32'>
vgg_19/conv1/conv1_2/weights [3, 3, 64, 64] <dtype: 'float32'>
......
vgg_19/mean_rgb [3] <dtype: 'float32'>
......
vgg_19/fc8/weights [1, 1, 4096, 1000] <dtype: 'float32'>


[123.68 116.78 103.94]

下面是第二部分的代码：

# -*- coding: utf-8 -*-
import os
import cv2
import tensorflow as tf
import numpy as np
from tensorflow.python import pywrap_tensorflow as wrap

os.environ['CUDA_VISIBLE_DEVICES'] = "/gpu:0"
slim = tf.contrib.slim


def vgg19(inputs,
          scope_name='vgg_19'):
    with tf.variable_scope(scope_name):

        net_config = [
            [64, 2],
            [128, 2],
            [256, 4],
            [512, 4],
            [512, 4],
        ]  # [filters, blocks]

        feature_maps = {}
        x = inputs
        for i, config in enumerate(net_config):
            filters = config[0]
            for j in range(config[1]):
                conv_name = 'conv%d_%d' % (i + 1, j + 1)
                relu_name = 'relu%d_%d' % (i + 1, j + 1)

                x = tf.layers.conv2d(x, filters, [3, 3],
                                     padding='same',
                                     name=conv_name)
                feat_map_name = x.op.name.replace('/BiasAdd', '')
                feature_maps[feat_map_name] = x

                x = tf.nn.relu(x, name=relu_name)
                feature_maps[x.op.name] = x

            x = tf.layers.max_pooling2d(x, (2, 2), (2, 2),
                                        name='pool%d' % (i + 1))
            feat_map_name = x.op.name.replace('/MaxPool', '')
            feature_maps[feat_map_name] = x

        return x, feature_maps


def visualize_feature_map(feature_map,
                          col_nums=None,
                          gap_value=0.5,
                          gap_width=10,
                          gap_height=10):
    # 见本文档前面
    pass


def _get_pretrained_tensor_name(name):
    block_num = int(name.split('/')[1][4:].split('_')[0])
    name = name.replace('vgg_19', 'vgg_19/conv%d' % block_num)
    name = name.replace('kernel', 'weights').replace('bias', 'biases')
    return name


def main():
    image_file = r'E:\images\lena512color.tiff'
    model_path = r'E:\pretrained_model\tf1x\vgg_19.ckpt'
    inputs_ = tf.placeholder(dtype=tf.float32, shape=[None, None, None, 3])
    outputs, feature_map_dict = vgg19(inputs_)
    trainable_vars = tf.trainable_variables()

    # use NewCheckpointReader to get weights
    reader = wrap.NewCheckpointReader(model_path)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    # print trainable variables before assignment
    for var in trainable_vars:
        print(var)
        print(sess.run(var)[:, :, 0, 0])
        break

    # trainable variables assignment
    print('\n')
    for i, var in enumerate(trainable_vars):
        name = _get_pretrained_tensor_name(var.op.name)
        sess.run(var.assign(reader.get_tensor(name)))

    # print trainable variables after assignment
    for var in trainable_vars:
        print(var)
        name = _get_pretrained_tensor_name(var.op.name)
        print(sess.run(var)[:, :, 0, 0])
        print('pretrained weight:')
        print(reader.get_tensor(name)[:, :, 0, 0])
        break

    # test case
    inputs = cv2.imread(image_file)
    inputs = np.expand_dims(inputs, axis=0)
    out, feature_maps = sess.run([outputs, feature_map_dict],
                                 feed_dict={
                                     inputs_: inputs,
                                 })

    # print shape of feature maps
    print('\n')
    for key in feature_maps.keys():
        print(key, feature_maps.get(key).shape)

    feature_map = feature_maps.get('vgg_19/relu3_4')
    feature_map = np.squeeze(feature_map)
    image = visualize_feature_map(feature_map)
    image = np.clip(image * 255, 0, 255).astype(np.uint8)
    cv2.imwrite('lena_feature_map_vgg_conv3_4--2.png', image)

    # print statistics for feature map
    print('\n')
    for i in range(5):
        mean_val = np.mean(feature_map[:, :, i])
        std = np.std(feature_map[:, :, i])
        min_val = np.min(feature_map[:, :, i])
        max_val = np.max(feature_map[:, :, i])
        print(i + 1, " min=%.4f, max=%.4f, mean=%.4f, std=%.4f" % (
            min_val, max_val, mean_val, std))


if __name__ == '__main__':
    main()

比如针对vgg loss这个需求，通常我们不需要最后的全连接层，所以本着节省算力及显存的目的，上述新定义的模型将全连接部分给去掉了，并且feature map / variables 的名字也重新进行了定义，这种情况下就无法使用restore功能加载预训练参数，只能使用赋值的方式。

上述代码整体分为两个部分，一是权重参数赋值，二是跟之前一样的测试用例。

下面对赋值的流程进行说明：

使用placeholder创建计算图并获取trainable_vars
使用NewCheckpointReader加载预训练模型的权重参数
创建Session并初始化全局变量
使用var.assign()方法给权重参数赋值

上述代码打印出来的结果如下：

<tf.Variable 'vgg_19/conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>
[[ 0.04975817 -0.0374901  -0.04425776]
 [ 0.03555809  0.08642714  0.05649987]
 [-0.07783681 -0.03184588 -0.07609541]]
（sess.run(tf.global_variables_initializer())之后打印了kernel的一部分，为随机初始化的结果）

<tf.Variable 'vgg_19/conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>
[[ 0.39416704  0.37740308 -0.04594866]
 [ 0.2671299   0.09986369 -0.34100872]
 [-0.07573577 -0.2803425  -0.41602272]]
pretrained weight:
[[ 0.39416704  0.37740308 -0.04594866]
 [ 0.2671299   0.09986369 -0.34100872]
 [-0.07573577 -0.2803425  -0.41602272]]
 （权重参数赋值之后，有一次打印了kernel的一部分，同时也打印了预训练模型中对应的部分，可以看到kernel被成功赋值）


vgg_19/conv1_1 (1, 512, 512, 64)
vgg_19/relu1_1 (1, 512, 512, 64)
vgg_19/conv1_2 (1, 512, 512, 64)
vgg_19/relu1_2 (1, 512, 512, 64)
vgg_19/pool1 (1, 256, 256, 64)
......
vgg_19/conv5_4 (1, 32, 32, 512)
vgg_19/relu5_4 (1, 32, 32, 512)
vgg_19/pool5 (1, 16, 16, 512)
（检查featuremap的名字和shape）


1  min=0.0000, max=9201.5811, mean=386.3745, std=737.3252
2  min=0.0000, max=7389.5913, mean=1412.0540, std=616.6437
3  min=0.0000, max=3323.7239, mean=400.2662, std=522.4063
4  min=0.0000, max=4319.3765, mean=369.9904, std=644.4222
5  min=0.0000, max=8997.2305, mean=905.1512, std=1288.8953
（打印 relu3_4，并与之前的两种方法对比数值，结果是一样的，说明整体流程没什么问题）

最后来看一下代码中保存的那张feature map图：
在这里插入图片描述