V. Advanced TensorFlow

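These notes are based on the open-source book 《TensorFlow2深度学习》 (Deep Learning with TensorFlow 2), available at https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book. They are only my brief study notes, kept for later review.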

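1. Merging and splitting

Tensors can be merged by concatenation (Concatenate) or by stacking (Stack). Concatenation joins tensors along an existing dimension and creates no new one, while stacking creates a new dimension. Which one to use depends on whether the scenario calls for a new dimension.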

Concatenation: tf.concat(tensors, axis)

import tensorflow as tf

a = tf.random.normal([4,35,8]) # simulated gradebook A
b = tf.random.normal([6,35,8]) # simulated gradebook B
tf.concat([a,b],axis=0) # concatenate the gradebooks  > shape=(10, 35, 8)

Stacking: tf.stack(tensors, axis)

a = tf.random.normal([35,8])
b = tf.random.normal([35,8])
tf.stack([a,b],axis=0) # stack into 2 classes; the class dimension is inserted at the front > shape=(2, 35, 8)
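
To make the difference concrete, here is a minimal sketch (the variable names stacked, via_concat and joined are illustrative, not from the book). It checks that stacking gives the same result as adding a length-1 axis to each tensor and then concatenating along that axis, whereas concatenating directly only lengthens an existing axis.

```python
import tensorflow as tf

a = tf.ones([35, 8])
b = tf.zeros([35, 8])

# Stacking creates a new leading "class" dimension.
stacked = tf.stack([a, b], axis=0)

# Same result: add a length-1 axis to each tensor, then concatenate along it.
via_concat = tf.concat([tf.expand_dims(a, 0), tf.expand_dims(b, 0)], axis=0)

# Concatenating the original tensors directly only lengthens axis 0.
joined = tf.concat([a, b], axis=0)

print(stacked.shape.as_list(), joined.shape.as_list())  # [2, 35, 8] [70, 8]
```

Stacking requires all inputs to have identical shapes, while concatenation only requires the dimensions other than the concatenation axis to match.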

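Splitting: tf.split(x, num_or_size_splits, axis)

The num_or_size_splits parameter sets the split scheme. When it is a single number, such as 10, the tensor is split into 10 equal parts, so the length of that axis must be divisible by it. When it is a list, each element gives the length of one part, e.g. [2,4,2,2] splits the tensor into 4 parts of lengths 2, 4, 2 and 2.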

x = tf.random.normal([10,35,8])
# split into 10 equal parts
result = tf.split(x, num_or_size_splits=10, axis=0)
len(result) # 10: the result is a list of 10 tensors
# Note that each part has shape [1,35,8]: the class dimension is kept.
result[0] # gradebook tensor of the first class, shape=(1, 35, 8)

x = tf.random.normal([10,35,8])
# split into 4 parts of custom lengths; result is a list of 4 tensors
result = tf.split(x, num_or_size_splits=[4,2,2,2], axis=0)
len(result) # 4
result[0] # shape=(4, 35, 8)

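In particular, to split a dimension entirely into pieces of length 1, tf.unstack(x, axis) can be used instead. Unlike tf.split, it removes the split dimension from each returned tensor.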

x = tf.random.normal([10,35,8])
result = tf.unstack(x,axis=0) # unstack along axis 0 into pieces of length 1
len(result) # a list of 10 tensors
result[0] # first class, shape=(35, 8): the class dimension is gone
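
As a quick check of how the splitting and merging operations relate, the sketch below (using a deterministic tensor built with tf.range, which is my own choice rather than the book's) shows that tf.concat undoes tf.split and tf.stack undoes tf.unstack.

```python
import tensorflow as tf

# Deterministic stand-in for the gradebook tensor [class, student, subject].
x = tf.reshape(tf.range(10 * 35 * 8), [10, 35, 8])

parts = tf.split(x, num_or_size_splits=[4, 2, 2, 2], axis=0)
rows = tf.unstack(x, axis=0)

print([p.shape.as_list() for p in parts])  # [[4, 35, 8], [2, 35, 8], [2, 35, 8], [2, 35, 8]]
print(len(rows), rows[0].shape.as_list())  # 10 [35, 8]

# tf.concat reverses tf.split; tf.stack reverses tf.unstack.
restored_from_split = tf.concat(parts, axis=0)
restored_from_unstack = tf.stack(rows, axis=0)
```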

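2. Data statistics

Vector norms
- L1 norm: the sum of the absolute values of all elements of the vector 𝒙.
- L2 norm: the square root of the sum of the squares of all elements of the vector 𝒙.
- ∞-norm: the maximum of the absolute values of all elements of the vector 𝒙.

In TensorFlow, tf.norm(x, ord) computes the L1, L2 and ∞ norms of a tensor: ord=1 and ord=2 give the L1 and L2 norms, and ord=np.inf gives the ∞-norm. For example: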

import numpy as np

x = tf.ones([2,2])
tf.norm(x,ord=1) # L1 norm > 4.0
tf.norm(x,ord=2) # L2 norm > 2.0
tf.norm(x,ord=np.inf) # ∞-norm > 1.0
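
The three definitions above can also be written out directly with reduction operations. The following sketch, on a small vector of my own choosing, compares the hand-written versions with tf.norm.

```python
import numpy as np
import tensorflow as tf

x = tf.constant([3.0, -4.0])

# Norms written out from their definitions.
l1 = tf.reduce_sum(tf.abs(x))               # |3| + |-4| = 7
l2 = tf.sqrt(tf.reduce_sum(tf.square(x)))   # sqrt(9 + 16) = 5
linf = tf.reduce_max(tf.abs(x))             # max(|3|, |-4|) = 4

# The same quantities via tf.norm.
n1 = tf.norm(x, ord=1)
n2 = tf.norm(x, ord=2)
ninf = tf.norm(x, ord=np.inf)

print(float(l1), float(l2), float(linf))  # 7.0 5.0 4.0
```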

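tf.reduce_max, tf.reduce_min, tf.reduce_mean and tf.reduce_sum compute the maximum, minimum, mean and sum of a tensor along a given dimension. When axis is not specified, they compute the same statistics over all elements.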

x = tf.random.normal([4,10]) # simulated model outputs: 4 samples, 10 classes
tf.reduce_max(x,axis=1) # maximum along the class dimension, shape=(4,)
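
A small deterministic example may help to see what the axis argument does. The tensor values and the use of keepdims below are my own additions for illustration.

```python
import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])

row_max = tf.reduce_max(x, axis=1)   # per-row maximum: [3., 6.]
col_sum = tf.reduce_sum(x, axis=0)   # per-column sum: [5., 7., 9.]
global_mean = tf.reduce_mean(x)      # mean over all elements: 3.5

# keepdims=True keeps the reduced axis with length 1, giving shape (2, 1).
row_min_kept = tf.reduce_min(x, axis=1, keepdims=True)

print(row_max.numpy(), col_sum.numpy(), float(global_mean))
```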

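tf.argmax(x, axis) and tf.argmin(x, axis) return the indices of the maximum and minimum values of x along the axis dimension.

out = tf.nn.softmax(tf.random.normal([2,10]), axis=1) # simulated probabilities for 2 samples
pred = tf.argmax(out, axis=1) # index of the largest probability per sample, e.g. numpy=array([0, 0])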

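3. Tensor comparison

tf.equal(a, b), or equivalently tf.math.equal(a, b), compares two tensors element-wise and returns a boolean tensor.

# Table 5.1 Common comparison functions
# Function                 Comparison
# tf.math.greater          a > b
# tf.math.less             a < b
# tf.math.greater_equal    a ≥ b
# tf.math.less_equal       a ≤ b
# tf.math.not_equal        a ≠ b
# tf.math.is_nan           a = nan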
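
A common use of these comparisons is computing classification accuracy. The sketch below uses made-up prediction scores and labels (not data from the book) and combines tf.argmax, tf.equal and a cast to count correct predictions.

```python
import tensorflow as tf

# Hypothetical scores for 4 samples over 3 classes.
out = tf.constant([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.3, 0.4, 0.3]])
# tf.argmax returns int64 by default, so the labels use int64 as well.
y = tf.constant([1, 0, 1, 1], dtype=tf.int64)

pred = tf.argmax(out, axis=1)      # predicted classes: [1, 0, 2, 1]
correct = tf.equal(pred, y)        # [True, True, False, True]
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

print(float(accuracy))  # 0.75
```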

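4. Padding and tiling

Padding: tf.pad(x, paddings), where paddings holds one [before, after] pair per dimension

a = tf.constant([1,2,3,4,5,6]) # first sentence
b = tf.constant([7,8,1,6]) # second sentence
b = tf.pad(b, [[0,2]]) # pad 2 zeros at the end of the sentence  > numpy=array([7, 8, 1, 6, 0, 0])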

# images
x = tf.random.normal([4,28,28,1])
# pad 2 cells on the top, bottom, left and right of each image
tf.pad(x,[[0,0],[2,2],[2,2],[0,0]]) # shape=(4, 32, 32, 1)
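
Padding is typically what makes tensors of different lengths mergeable. As a rough sketch, the helper pad_to_length below (a hypothetical name of mine, not a TensorFlow function) right-pads each sentence with zeros so that the two sentences can be stacked into one batch.

```python
import tensorflow as tf

def pad_to_length(sentence, length):
    """Right-pad a 1-D tensor with zeros up to `length` (hypothetical helper)."""
    missing = length - int(sentence.shape[0])
    return tf.pad(sentence, [[0, missing]])

a = tf.constant([1, 2, 3, 4, 5, 6])  # first sentence, length 6
b = tf.constant([7, 8, 1, 6])        # second sentence, length 4

# After padding, both sentences have length 6 and can be stacked.
batch = tf.stack([pad_to_length(a, 6), pad_to_length(b, 6)], axis=0)
print(batch.numpy())
```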

Tiling: tf.tile(x, multiples)

x = tf.random.normal([4,32,32,3])
tf.tile(x,[2,3,3,1]) # repeat the data 2, 3, 3 and 1 times along each dimension  > shape=(8, 96, 96, 3)
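
On a tiny matrix the effect of each entry of multiples is easier to see. The example values below are my own.

```python
import tensorflow as tf

x = tf.constant([[1, 2],
                 [3, 4]])

# Repeat twice along axis 0 (rows), once along axis 1.
rows_tiled = tf.tile(x, [2, 1])
# Repeat once along axis 0, twice along axis 1 (columns).
cols_tiled = tf.tile(x, [1, 2])

print(rows_tiled.numpy())  # [[1 2] [3 4] [1 2] [3 4]]
print(cols_tiled.numpy())  # [[1 2 1 2] [3 4 3 4]]
```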

5. Clipping

In TensorFlow, tf.maximum(x, a) clips data from below, so that 𝑥 ∈ [𝑎, +∞), and tf.minimum(x, a) clips data from above, so that 𝑥 ∈ (−∞, 𝑎]. For example:

x = tf.range(9)
tf.maximum(x,2) # clip from below at 2
tf.minimum(x,7) # clip from above at 7

More conveniently, tf.clip_by_value clips from both sides at once:

x = tf.range(9)
tf.clip_by_value(x,2,7) # clip to the range 2~7
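
The sketch below checks that tf.clip_by_value matches a tf.maximum followed by a tf.minimum. It also shows, as an aside of my own, that clipping from below at 0 is exactly the ReLU function.

```python
import tensorflow as tf

x = tf.range(9)

# Two-sided clipping composed from the one-sided operations.
composed = tf.minimum(tf.maximum(x, 2), 7)
clipped = tf.clip_by_value(x, 2, 7)
print(clipped.numpy())  # [2 2 2 3 4 5 6 7 7]

# Clipping from below at 0 is the ReLU function.
v = tf.constant([-1.0, 0.0, 2.0])
relu_by_max = tf.maximum(v, 0.0)
```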

6. Advanced operations

tf.gather collects data according to index numbers along a single dimension.

x = tf.random.uniform([4,35,8],maxval=100,dtype=tf.int32) # gradebook tensor
tf.gather(x,[0,1],axis=0) # collect the gradebooks of classes 1~2 along the class dimension
# e.g. to sample the scores of students 1, 4, 9, 12, 13 and 27 in every class
tf.gather(x,[0,3,8,11,12,26],axis=1)

tf.gather_nd samples several points at once, each specified by its multi-dimensional coordinates.

# collect data by multi-dimensional coordinates
tf.gather_nd(x,[[1,1],[2,2],[3,3]])
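
To see exactly which elements the two functions pick, here is a sketch on a small tensor filled with consecutive integers (my own stand-in for the gradebook, so that x[i, j, k] = 6i + 2j + k).

```python
import tensorflow as tf

# Small stand-in gradebook: 4 classes, 3 students, 2 subjects.
x = tf.reshape(tf.range(4 * 3 * 2), [4, 3, 2])

# tf.gather: pick whole slices along one axis (classes 0 and 2).
classes = tf.gather(x, [0, 2], axis=0)            # shape (2, 3, 2)

# tf.gather_nd: each coordinate [class, student] selects one score vector.
picked = tf.gather_nd(x, [[1, 1], [2, 2]])        # x[1, 1] and x[2, 2]
print(picked.numpy())  # [[ 8  9] [16 17]]

# A full coordinate [class, student, subject] selects a single scalar.
single = tf.gather_nd(x, [[1, 1, 1]])             # [9]
```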

tf.boolean_mask samples by means of a given mask.

# x has shape [4,35,8]
tf.boolean_mask(x,mask=[True, False,False,True],axis=0) # keeps classes 1 and 4 > shape=(2, 35, 8)

tf.where(cond, a, b) reads each element from a or b, depending on whether the corresponding element of cond is true or false.

a = tf.ones([3,3]) # a is an all-ones matrix
b = tf.zeros([3,3]) # b is an all-zeros matrix
cond =tf.constant([[True,False,False],[False,True,False],[True,True,False]])
tf.where(cond,a,b) # sample from a and b by condition > array([[1., 0., 0.],[0., 1., 0.],[1., 1., 0.]])

# When a and b are both None, i.e. not specified, tf.where returns the index coordinates of all True elements of cond.
# cond=[[ True, False, False],[False, True, False],[ True, True, False]]
tf.where(cond) # indices of the True elements of cond > array([[0, 0],[1, 1],[2, 0],[2, 1]], dtype=int64)

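# Suppose we need to extract all positive values of a tensor together with their indices
x = tf.random.normal([3,3]) # construct x
mask=x>0 # comparison, equivalent to tf.math.greater()
indices=tf.where(mask) # indices of all elements greater than 0
# with the indices, tf.gather_nd recovers all positive elements
tf.gather_nd(x,indices) # values of the positive elements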

# In fact, once we have the mask, tf.boolean_mask also returns the vector of all positive elements directly:
tf.boolean_mask(x,mask) # values of the positive elements, extracted via the mask
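
Since the random example above gives different values on every run, the following sketch repeats it on a fixed tensor of my own choosing, so that the two routes to the positive elements can be compared directly.

```python
import tensorflow as tf

x = tf.constant([[1., -2., 3.],
                 [-4., 5., -6.]])
mask = x > 0

# Route 1: coordinates of the True entries, then gather by coordinate.
indices = tf.where(mask)                   # [[0, 0], [0, 2], [1, 1]]
by_indices = tf.gather_nd(x, indices)      # [1., 3., 5.]

# Route 2: apply the mask directly.
by_mask = tf.boolean_mask(x, mask)         # [1., 3., 5.]

# Three-argument form: keep positives, replace everything else with 0.
zeroed = tf.where(mask, x, tf.zeros_like(x))
print(zeroed.numpy())  # [[1. 0. 3.] [0. 5. 0.]]
```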

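tf.scatter_nd(indices, updates, shape) efficiently writes updates into part of a tensor. However, it can only write onto a blank all-zero tensor, so updating an existing tensor may require combining it with other operations.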

# positions to update: indices 4, 3, 1 and 7
indices = tf.constant([[4], [3], [1], [7]])
# data to write: 4.4 goes to position 4, 3.3 to position 3, and so on
updates = tf.constant([4.4, 3.3, 1.1, 7.7])
# write updates into an all-zero vector of length 8 according to indices
tf.scatter_nd(indices, updates, [8]) # numpy=array([0. , 1.1, 0. , 3.3, 4.4, 0. , 0. , 7.7])

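Consider a 3-D example, shown in Figure 5.4 of the book. The blank tensor has shape [4,4,4], i.e. 4 feature-map channels of size 4 × 4 each, and updates, with shape [2,4,4], holds new data for 2 channels that must be written to the channels with indices [1,3].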

# write positions: 2 positions
indices = tf.constant([[1],[3]])
updates = tf.constant([ # data to write: 2 matrices
    [[5,5,5,5],[6,6,6,6],[7,7,7,7],[8,8,8,8]],
    [[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4]]
])
# write updates onto a blank tensor of shape [4,4,4] according to indices
tf.scatter_nd(indices,updates,[4,4,4])
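
The notes above mention that updating an existing tensor needs tf.scatter_nd combined with other operations, without showing how. One possible approach, which is my own sketch rather than the book's method, scatters both the new values and a mask of touched positions and then merges them with tf.where. TensorFlow also provides tf.tensor_scatter_nd_update, which is used here only to cross-check the result.

```python
import tensorflow as tf

# Existing (non-zero) tensor that should be partially overwritten.
x = tf.constant([10., 20., 30., 40., 50., 60., 70., 80.])
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([4.4, 3.3, 1.1, 7.7])

# Scatter the new values and a marker for the touched positions onto blank tensors.
new_vals = tf.scatter_nd(indices, updates, [8])
touched = tf.scatter_nd(indices, tf.ones_like(updates), [8]) > 0

# Take the new value where a position was touched, the old value elsewhere.
result = tf.where(touched, new_vals, x)

# Cross-check with the built-in op for this use case.
reference = tf.tensor_scatter_nd_update(x, indices, updates)
print(result.numpy())
```

The mask-based version assumes that indices contains no duplicates, because tf.scatter_nd sums values written to the same position.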

tf.meshgrid conveniently generates the coordinates of sampling points on a 2-D grid, which is handy for visualization and similar uses.

import tensorflow as tf
from matplotlib import pyplot as plt
# registers the 3D projection on older matplotlib versions
from mpl_toolkits.mplot3d import Axes3D

x = tf.linspace(-8., 8, 100)  # sampling points on the x axis
y = tf.linspace(-8., 8, 100)  # sampling points on the y axis
x, y = tf.meshgrid(x, y)  # generate the grid; returns the x and y coordinates as separate tensors
x.shape, y.shape  # shapes of the coordinate tensors: (TensorShape([100, 100]), TensorShape([100, 100]))
z = tf.sqrt(x ** 2 + y ** 2)
z = tf.sin(z) / z  # sinc function

fig = plt.figure()
# create the 3D axes; Axes3D(fig) no longer attaches itself to the figure in recent matplotlib
ax = fig.add_subplot(projection='3d')
# draw the sinc function as 3D contours over the grid
ax.contour3D(x.numpy(), y.numpy(), z.numpy(), 50)
plt.show()
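
With 100 × 100 points it is hard to see what tf.meshgrid actually returns, so here is a sketch on a 3 × 2 grid with sample values of my own. With the default indexing, the x coordinates vary along the columns and the y coordinates along the rows.

```python
import tensorflow as tf

xs = tf.constant([0., 1., 2.])   # 3 sampling points on the x axis
ys = tf.constant([10., 20.])     # 2 sampling points on the y axis

gx, gy = tf.meshgrid(xs, ys)     # both have shape (2, 3)
print(gx.numpy())  # [[0. 1. 2.] [0. 1. 2.]]
print(gy.numpy())  # [[10. 10. 10.] [20. 20. 20.]]

# Pair the coordinates up: points[i, j] is the (x, y) of one grid point.
points = tf.stack([gx, gy], axis=-1)   # shape (2, 3, 2)
```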


7. MNIST in practice

import matplotlib
from matplotlib import pyplot as plt

# Default parameters for plots
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus'] = False
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
print(tf.__version__)


def preprocess(x, y):
    # [b, 28, 28], [b]
    print(x.shape, y.shape)
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = tf.reshape(x, [-1, 28 * 28])
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)

    return x, y


# %%
(x, y), (x_test, y_test) = datasets.mnist.load_data()
print('x:', x.shape, 'y:', y.shape, 'x test:', x_test.shape, 'y test:', y_test.shape)
# %%
batchsz = 512
train_db = tf.data.Dataset.from_tensor_slices((x, y))
train_db = train_db.shuffle(1000)
train_db = train_db.batch(batchsz)
train_db = train_db.map(preprocess)
train_db = train_db.repeat(20)

# %%

test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_db = test_db.shuffle(1000).batch(batchsz).map(preprocess)
x, y = next(iter(train_db))
print('train sample:', x.shape, y.shape)


# print(x[0], y[0])


# %%
def main():
    # learning rate
    lr = 1e-2
    accs, losses = [], []

    # 784 => 256
    w1, b1 = tf.Variable(tf.random.normal([784, 256], stddev=0.1)), tf.Variable(tf.zeros([256]))
    # 256 => 128
    w2, b2 = tf.Variable(tf.random.normal([256, 128], stddev=0.1)), tf.Variable(tf.zeros([128]))
    # 128 => 10
    w3, b3 = tf.Variable(tf.random.normal([128, 10], stddev=0.1)), tf.Variable(tf.zeros([10]))

    for step, (x, y) in enumerate(train_db):

        # x is already [b, 784] after preprocess, so this reshape is only a safeguard
        x = tf.reshape(x, (-1, 784))

        with tf.GradientTape() as tape:

            # layer1.
            h1 = x @ w1 + b1
            h1 = tf.nn.relu(h1)
            # layer2
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # output
            out = h2 @ w3 + b3
            # out = tf.nn.relu(out)

            # compute loss
            # [b, 10] - [b, 10]
            loss = tf.square(y - out)
            # [b, 10] => scalar
            loss = tf.reduce_mean(loss)

        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        for p, g in zip([w1, b1, w2, b2, w3, b3], grads):
            p.assign_sub(lr * g)

        # print
        if step % 80 == 0:
            print(step, 'loss:', float(loss))
            losses.append(float(loss))

        if step % 80 == 0:
            # evaluate/test
            total, total_correct = 0., 0

            for x, y in test_db:
                # layer1.
                h1 = x @ w1 + b1
                h1 = tf.nn.relu(h1)
                # layer2
                h2 = h1 @ w2 + b2
                h2 = tf.nn.relu(h2)
                # output
                out = h2 @ w3 + b3
                # [b, 10] => [b]
                pred = tf.argmax(out, axis=1)
                # convert one_hot y to number y
                y = tf.argmax(y, axis=1)
                # bool type
                correct = tf.equal(pred, y)
                # bool tensor => int tensor => numpy
                total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
                total += x.shape[0]

            print(step, 'Evaluate Acc:', total_correct / total)

            accs.append(total_correct / total)

    plt.figure()
    x = [i * 80 for i in range(len(losses))]
    plt.plot(x, losses, color='C0', marker='s', label='train')
    plt.ylabel('MSE')
    plt.xlabel('Step')
    plt.legend()
    plt.savefig('train.svg')
    plt.show()

    plt.figure()
    plt.plot(x, accs, color='C1', marker='s', label='test')
    plt.ylabel('Accuracy')
    plt.xlabel('Step')
    plt.legend()
    plt.savefig('test.svg')
    plt.show()


if __name__ == '__main__':
    main()
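
The training loop above computes gradients with tf.GradientTape and applies them by hand with assign_sub. To isolate that pattern from the MNIST details, the sketch below applies the same update rule to a toy linear regression. The data, learning rate and step count are hypothetical choices of mine, not taken from the book.

```python
import tensorflow as tf

# Toy data following y = 2x (hypothetical, not MNIST).
x = tf.constant([[1.0], [2.0], [3.0]])
y = tf.constant([[2.0], [4.0], [6.0]])

w = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
lr = 0.05

losses = []
for step in range(200):
    with tf.GradientTape() as tape:
        out = x @ w + b                            # forward pass
        loss = tf.reduce_mean(tf.square(y - out))  # MSE, as in the notes above
    grads = tape.gradient(loss, [w, b])
    # Manual gradient descent step: p <- p - lr * g
    for p, g in zip([w, b], grads):
        p.assign_sub(lr * g)
    losses.append(float(loss))

print('first loss:', losses[0], 'last loss:', losses[-1])
```

The loss should shrink steadily as w moves toward 2, which is the same behaviour the MNIST loop logs every 80 steps.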