[Artificial Intelligence] Transfer learning for image classification and object detection with TensorFlow Lite Model Maker


My environment is described in an earlier post:

Introduction to TensorFlow Lite (竹叶青lvye's blog, CSDN): https://blog.csdn.net/jiugeshao/article/details/124145815?spm=1001.2014.3001.5501

In brief: python 3.6.5, tensorflow-gpu 2.6.2, CUDA 11.2, cuDNN 8.1.1.33 (cudnn-11.2-linux-x64-v8.1.1.33). That post mainly follows the official TensorFlow Lite documentation (https://tensorflow.google.cn/lite/guide?hl=zh-cn), most of which is available in Chinese.

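1. Transfer learning for image classification

This section mainly follows the official tutorial, Image classification with TensorFlow Lite Model Maker. First install the tflite-model-maker library: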

pip install -q tflite-model-maker --ignore-installed

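I use the default model to run transfer learning on my own dataset. The task here is a simple one: classifying dogs and cats.

This dataset was used in an earlier post, where it can be downloaded. The code below is adapted from the official example. Downloading the model from the default path times out (also analyzed in an earlier post), so the model spec points to a different URL.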

import os
import numpy as np
import tensorflow as tf
assert tf.__version__.startswith('2')

from tflite_model_maker import model_spec
from tflite_model_maker import image_classifier
from tflite_model_maker.config import ExportFormat
from tflite_model_maker.config import QuantizationConfig
from tflite_model_maker.image_classifier import DataLoader
import matplotlib.pyplot as plt
from tensorflow_examples.lite.model_maker.core.task import model_spec as ms
import tensorflow_hub as hub
import time
import cv2

# Enable memory growth on the first GPU, if one is available.
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
  tf.config.experimental.set_memory_growth(physical_devices[0], True)

# A helper function that returns 'black' if its two input parameters match
# and 'red' otherwise.
def get_label_color(val1, val2):
  if val1 == val2:
    return 'black'
  else:
    return 'red'

data = DataLoader.from_folder("./classifyData")
train_data, rest_data = data.split(0.8)
validation_data, test_data = rest_data.split(0.5)
print(train_data.size)
print(validation_data.size)
print(test_data.size)

# Plot 25 training images
plt.figure(figsize=(10,10))
for i, (image, label) in enumerate(data.gen_dataset().unbatch().take(25)):
  plt.subplot(5,5,i+1)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  plt.imshow(image.numpy(), cmap=plt.cm.gray)
  plt.xlabel(data.index_to_label[label.numpy()])

plt.show()


spec = image_classifier.ModelSpec(uri='https://storage.googleapis.com/tfhub-modules/tensorflow/efficientnet/lite0/feature-vector/2.tar.gz')
spec.input_image_shape = [121, 121]
model = image_classifier.create(train_data, model_spec=spec, validation_data=validation_data, epochs=5)
model.summary()

loss, accuracy = model.evaluate(test_data)
print(loss)
print(accuracy)

#config = QuantizationConfig.for_float16()
#model.export(export_dir='.', tflite_filename='model_fp16.tflite', quantization_config=config)

model.export(export_dir='./classifyMode')

# Then plot up to 100 test images and their predicted labels.
# If a prediction differs from the label provided in the "test" dataset,
# we highlight it in red.
plt.figure(figsize=(20, 20))
predicts = model.predict_top_k(test_data)
for i, (image, label) in enumerate(test_data.gen_dataset().unbatch().take(100)):
  ax = plt.subplot(10, 10, i+1)
  plt.xticks([])
  plt.yticks([])
  plt.grid(False)
  plt.imshow(image.numpy(), cmap=plt.cm.gray)

  predict_label = predicts[i][0][0]
  color = get_label_color(predict_label,
                          test_data.index_to_label[label.numpy()])
  ax.xaxis.label.set_color(color)
  plt.xlabel('Predicted: %s' % predict_label)
plt.show()

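# Load the TFLite model in TFLite Interpreter
tflite_file_path = "./classifyMode/model.tflite"
interpreter = tf.lite.Interpreter(model_path=tflite_file_path)
interpreter.allocate_tensors()

# cv2.imread returns BGR, while the training pipeline decodes images as RGB,
# so convert before displaying the image and feeding it to the model.
img = cv2.imread('./classifyData/dog/dog001.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()

# Add the batch dimension: (H, W, 3) -> (1, H, W, 3). The image is expected to
# already match the model input size set above (121 x 121).
img = np.expand_dims(img, axis=0)
print(img.shape)
np.save('array', img)

# Avoid shadowing the built-in input() by using explicit names.
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

interpreter.set_tensor(input_details['index'], img)

t_model = time.perf_counter()
interpreter.invoke()
print(f'do inference cost:{time.perf_counter() - t_model:.8f}s')

output = interpreter.get_tensor(output_details['index'])
print(output)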

Part of the output is shown below:

38
5
5
2022-04-17 22:07:03.038871: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
hub_keras_layer_v1v2 (HubKer (None, 1280)              3413024   
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 2)                 2562      
=================================================================
Total params: 3,415,586
Trainable params: 2,562
Non-trainable params: 3,413,024
_________________________________________________________________
None
/home/sxhlvye/anaconda3/envs/testTF/lib/python3.6/site-packages/keras/optimizer_v2/optimizer_v2.py:356: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
  "The `lr` argument is deprecated, use `learning_rate` instead.")
Epoch 1/5
2022-04-17 22:07:07.756702: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
1/1 [==============================] - 4s 4s/step - loss: 0.9480 - accuracy: 0.5312 - val_loss: 0.6193 - val_accuracy: 0.6000
Epoch 2/5
1/1 [==============================] - 0s 63ms/step - loss: 0.6182 - accuracy: 0.7500 - val_loss: 0.4740 - val_accuracy: 0.6000
Epoch 3/5
1/1 [==============================] - 0s 63ms/step - loss: 0.4008 - accuracy: 0.8750 - val_loss: 0.3973 - val_accuracy: 1.0000
Epoch 4/5
1/1 [==============================] - 0s 65ms/step - loss: 0.2639 - accuracy: 1.0000 - val_loss: 0.3816 - val_accuracy: 0.8000
Epoch 5/5
1/1 [==============================] - 0s 66ms/step - loss: 0.2988 - accuracy: 0.9688 - val_loss: 0.3701 - val_accuracy: 0.8000
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
hub_keras_layer_v1v2 (HubKer (None, 1280)              3413024   
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 2)                 2562      
=================================================================
Total params: 3,415,586
Trainable params: 2,562
Non-trainable params: 3,413,024
_________________________________________________________________
1/1 [==============================] - 0s 43ms/step - loss: 0.4794 - accuracy: 0.8000
0.4793754518032074
0.800000011920929
2022-04-17 22:07:11.215935: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2022-04-17 22:07:13.662071: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.662357: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2022-04-17 22:07:13.662422: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2022-04-17 22:07:13.662706: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.662998: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.663264: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.663575: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.663846: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-04-17 22:07:13.664073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3855 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1660 Ti with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5
2022-04-17 22:07:13.689299: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1137] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 913 nodes (656), 923 edges (664), time = 14.248ms.
  function_optimizer: function_optimizer did nothing. time = 0.289ms.

2022-04-17 22:07:14.357053: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2022-04-17 22:07:14.357079: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
2022-04-17 22:07:14.389818: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
fully_quantize: 0, inference_type: 6, input_inference_type: 3, output_inference_type: 3
WARNING:absl:For model inputs containing unsupported operations which cannot be quantized, the `inference_input_type` attribute will default to the original type.
(1, 121, 121, 3)
do inference cost:0.32141421s
[[  9 247]]

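After obtaining the tflite model, I loaded it as described in my earlier post and ran a prediction on an image; the result was correct. The batch predictions for the 5 test images are shown below:

One image was misclassified. Accuracy can be improved by tuning hyperparameters and similar measures, which I won't cover in detail here.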
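The raw output printed in the log, [[  9 247]], is a pair of quantized uint8 scores rather than probabilities. The following is a minimal sketch of how such scores could be turned into a label and a probability. The helper name decode_quantized_scores, the label order ['cat', 'dog'], and the quantization parameters (scale 1/256, zero point 0) are assumptions for illustration; with a real model, read the scale and zero point from interpreter.get_output_details()[0]['quantization'] and the label order from the data loader.

```python
import numpy as np

def decode_quantized_scores(raw_scores, scale, zero_point, labels):
    """Dequantize uint8 scores and return the top label with its probability."""
    raw = np.asarray(raw_scores, dtype=np.float32).reshape(-1)
    # Standard affine dequantization: real_value = (quantized - zero_point) * scale
    probs = (raw - zero_point) * scale
    top = int(np.argmax(probs))
    return labels[top], float(probs[top])

# Assumed values, not taken from the exported model:
# - labels are in alphabetical folder order
# - scale = 1/256 and zero_point = 0, which is common for uint8 softmax outputs
labels = ['cat', 'dog']
label, prob = decode_quantized_scores([[9, 247]], scale=1 / 256,
                                      zero_point=0, labels=labels)
print(label, round(prob, 3))  # prints: dog 0.965
```

Under these assumptions the second class wins with a probability of about 0.965, which agrees with the statement that the dog image was classified correctly.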

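The official site has plenty of further material that is worth reading in detail:

Transfer learning and fine-tuning | TensorFlow Core

Common signatures for images | TensorFlow Hub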

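2. Transfer learning for object detection

This section mainly follows the official tutorial, Object Detection with TensorFlow Lite Model Maker.

Install the libraries as the official tutorial instructs: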

pip install -q --use-deprecated=legacy-resolver tflite-model-maker
pip install -q pycocotools

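As before, I use my self-built sidewalk dataset for the object detection experiment. This dataset is referenced in several of my posts; its annotation and usage are covered in these earlier ones:

Configuring ultralytics/yolov3 to train and predict on your own dataset (竹叶青lvye's blog, CSDN)

Using balancap/SSD-Tensorflow to train and predict on your own dataset (竹叶青lvye's blog, CSDN)

I load the dataset with object_detector.DataLoader.from_pascal_voc. Following the Pascal VOC directory structure, I built three folders that serve as the training, validation, and test sets.

Dataset download link: https://pan.baidu.com/s/1AA6wm6QvT76f95ihiOWr2Q extraction code: kesp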
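Before training, it can help to check that the annotation files really contain the expected class name and sensible boxes. The sketch below parses one Pascal VOC style XML annotation using only the standard library. The helper name parse_voc_annotation and the sample annotation (file name, image size, box coordinates) are made up for illustration and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from VOC XML."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext('filename')
    boxes = []
    for obj in root.iter('object'):
        bndbox = obj.find('bndbox')
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((obj.findtext('name'), *coords))
    return filename, boxes

# Hypothetical annotation; all values are invented for this example.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>sidewalk</name>
    <bndbox><xmin>40</xmin><ymin>120</ymin><xmax>300</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""

filename, boxes = parse_voc_annotation(SAMPLE_XML)
print(filename, boxes)
```

Running a check like this over every file in an Annotations folder is a quick way to catch label names that do not match the label map passed to the data loader.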

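The full code is below (adapted from the official example, for reference). Why the model download path differs from the official one is explained in my earlier post.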

import numpy as np
import os

from tflite_model_maker.config import QuantizationConfig
from tflite_model_maker.config import ExportFormat
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector
import cv2
from PIL import Image

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)

classes = ['sidewalk']
# Define a list of colors for visualization
COLORS = np.random.randint(0, 255, size=(len(classes), 3), dtype=np.uint8)

def train():
  labels = {1: 'sidewalk'}
  train_imgs_dir = "./VOC2007_train/JPEGImages"
  train_Anno_dir = "./VOC2007_train/Annotations"

  valide_imgs_dir = "./VOC2007_valide/JPEGImages"
  valide_Anno_dir = "./VOC2007_valide/Annotations"

  test_imgs_dir = "./VOC2007_test/JPEGImages"
  test_Anno_dir = "./VOC2007_test/Annotations"

  traindata = object_detector.DataLoader.from_pascal_voc(train_imgs_dir, train_Anno_dir, labels)
  validata = object_detector.DataLoader.from_pascal_voc(valide_imgs_dir, valide_Anno_dir, labels)
  testdata = object_detector.DataLoader.from_pascal_voc(test_imgs_dir, test_Anno_dir, labels)
  spec = model_spec.get('efficientdet_lite0')
  spec.uri = 'https://storage.googleapis.com/tfhub-modules/tensorflow/efficientdet/lite0/feature-vector/1.tar.gz'
  spec.input_image_shape = [512, 288]

  model = object_detector.create(traindata, model_spec=spec, batch_size=6, train_whole_model=True, validation_data=validata, epochs=120)
  model.summary()

  model.evaluate(testdata)
  model.export(export_dir='./detectMode')

def preprocess_image(image_path, input_size):
  """Preprocess the input image to feed to the TFLite model"""
  img = tf.io.read_file(image_path)
  img = tf.io.decode_image(img, channels=3)
  img = tf.image.convert_image_dtype(img, tf.uint8)
  original_image = img
  resized_img = tf.image.resize(img, input_size)
  resized_img = resized_img[tf.newaxis, :]
  resized_img = tf.cast(resized_img, dtype=tf.uint8)
  return resized_img, original_image

def detect_objects(interpreter, image, threshold):
  """Returns a list of detection results, each a dictionary of object info."""

  signature_fn = interpreter.get_signature_runner()
  # Feed the input image to the model
  output = signature_fn(images=image)
  # Get all outputs from the model
  count = int(np.squeeze(output['output_0']))
  scores = np.squeeze(output['output_1'])
  # Named class_ids so it does not shadow the module-level `classes` list
  class_ids = np.squeeze(output['output_2'])
  boxes = np.squeeze(output['output_3'])

  results = []
  for i in range(count):
    if scores[i] >= threshold:
      result = {
        'bounding_box': boxes[i],
        'class_id': class_ids[i],
        'score': scores[i]
      }
      results.append(result)
  return results

def run_odt_and_draw_results(image_path, interpreter, threshold=0.5):
  """Run object detection on the input image and draw the detection results"""
  # Load the input shape required by the model
  _, input_height, input_width, _ = interpreter.get_input_details()[0]['shape']

  # Load the input image and preprocess it
  preprocessed_image, original_image = preprocess_image(
      image_path,
      (input_height, input_width)
    )

  # Run object detection on the input image
  results = detect_objects(interpreter, preprocessed_image, threshold=threshold)

  # Plot the detection results on the input image
  original_image_np = original_image.numpy().astype(np.uint8)
  for obj in results:
    # Convert the object bounding box from relative coordinates to absolute
    # coordinates based on the original image resolution
    ymin, xmin, ymax, xmax = obj['bounding_box']
    xmin = int(xmin * original_image_np.shape[1])
    xmax = int(xmax * original_image_np.shape[1])
    ymin = int(ymin * original_image_np.shape[0])
    ymax = int(ymax * original_image_np.shape[0])

    # Find the class index of the current object
    class_id = int(obj['class_id'])

    # Draw the bounding box and label on the image
    color = [int(c) for c in COLORS[class_id]]
    cv2.rectangle(original_image_np, (xmin, ymin), (xmax, ymax), color, 2)
    # Make adjustments to make the label visible for all objects
    y = ymin - 15 if ymin - 15 > 15 else ymin + 15
    label = "{}: {:.0f}%".format(classes[class_id], obj['score'] * 100)
    cv2.putText(original_image_np, label, (xmin, y),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

  # Return the final image
  original_uint8 = original_image_np.astype(np.uint8)
  return original_uint8



if __name__ == '__main__':
  # Train the model and export it to ./detectMode
  train()

  # Load the exported tflite model and run prediction on one image
  DETECTION_THRESHOLD = 0.3
  model_path = './detectMode/model.tflite'
  TEMP_FILE = './135.bmp'

  # Load the TFLite model
  interpreter = tf.lite.Interpreter(model_path=model_path)
  interpreter.allocate_tensors()

  # Run inference and draw detection result on the local copy of the original file
  detection_result_image = run_odt_and_draw_results(
      TEMP_FILE,
      interpreter,
      threshold=DETECTION_THRESHOLD
  )

  # Show the detection result
  image = Image.fromarray(detection_result_image)
  image.show()
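Both scripts in this post point the model spec at a storage.googleapis.com archive rather than at a tfhub.dev handle. Judging from those two URLs, the archive path appears to mirror the handle path with a .tar.gz suffix. The sketch below encodes that pattern as plain string handling. The helper name tfhub_to_mirror_url and the tfhub.dev handle in the example are assumptions and do not appear in the original code.

```python
def tfhub_to_mirror_url(handle):
    """Rewrite a tfhub.dev handle into the storage.googleapis.com archive URL.

    Assumes the mirror layout seen in the URLs used in this post:
    https://storage.googleapis.com/tfhub-modules/<handle path>.tar.gz
    """
    prefix = 'https://tfhub.dev/'
    if not handle.startswith(prefix):
        raise ValueError('expected a https://tfhub.dev/ handle, got: ' + handle)
    path = handle[len(prefix):].strip('/')
    return 'https://storage.googleapis.com/tfhub-modules/' + path + '.tar.gz'

# Assumed handle for the classification backbone used in section 1.
mirror = tfhub_to_mirror_url(
    'https://tfhub.dev/tensorflow/efficientnet/lite0/feature-vector/2')
print(mirror)
```

For the assumed handle above, the result equals the URL passed to image_classifier.ModelSpec in the classification script.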

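The script trains first. Once training finishes, it loads the generated TensorFlow Lite model and runs prediction on an image, in the same way as in my earlier posts, and draws the detection information on the original image. Part of the output from running this script: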

Epoch 115/120
3/3 [==============================] - 1s 386ms/step - det_loss: 0.3284 - cls_loss: 0.1994 - box_loss: 0.0026 - reg_l2_loss: 0.0632 - loss: 0.3916 - learning_rate: 2.7517e-05 - gradient_norm: 2.6857 - val_det_loss: 0.3057 - val_cls_loss: 0.2118 - val_box_loss: 0.0019 - val_reg_l2_loss: 0.0632 - val_loss: 0.3689
Epoch 116/120
3/3 [==============================] - 1s 341ms/step - det_loss: 0.2448 - cls_loss: 0.1637 - box_loss: 0.0016 - reg_l2_loss: 0.0632 - loss: 0.3080 - learning_rate: 1.6866e-05 - gradient_norm: 2.1395 - val_det_loss: 0.2840 - val_cls_loss: 0.1994 - val_box_loss: 0.0017 - val_reg_l2_loss: 0.0632 - val_loss: 0.3472
Epoch 117/120
3/3 [==============================] - 1s 430ms/step - det_loss: 0.2688 - cls_loss: 0.1589 - box_loss: 0.0022 - reg_l2_loss: 0.0632 - loss: 0.3320 - learning_rate: 8.8173e-06 - gradient_norm: 2.5229 - val_det_loss: 0.2777 - val_cls_loss: 0.1958 - val_box_loss: 0.0016 - val_reg_l2_loss: 0.0632 - val_loss: 0.3409
Epoch 118/120
3/3 [==============================] - 1s 408ms/step - det_loss: 0.2737 - cls_loss: 0.1875 - box_loss: 0.0017 - reg_l2_loss: 0.0632 - loss: 0.3369 - learning_rate: 3.3753e-06 - gradient_norm: 3.0734 - val_det_loss: 0.2729 - val_cls_loss: 0.1928 - val_box_loss: 0.0016 - val_reg_l2_loss: 0.0632 - val_loss: 0.3361
Epoch 119/120
3/3 [==============================] - 1s 363ms/step - det_loss: 0.2607 - cls_loss: 0.1643 - box_loss: 0.0019 - reg_l2_loss: 0.0632 - loss: 0.3239 - learning_rate: 5.4449e-07 - gradient_norm: 2.8871 - val_det_loss: 0.2664 - val_cls_loss: 0.1890 - val_box_loss: 0.0015 - val_reg_l2_loss: 0.0632 - val_loss: 0.3296
Epoch 120/120
3/3 [==============================] - 1s 433ms/step - det_loss: 0.2595 - cls_loss: 0.1696 - box_loss: 0.0018 - reg_l2_loss: 0.0632 - loss: 0.3227 - learning_rate: 3.2667e-07 - gradient_norm: 2.3589 - val_det_loss: 0.2606 - val_cls_loss: 0.1856 - val_box_loss: 0.0015 - val_reg_l2_loss: 0.0632 - val_loss: 0.3238
Model: ""
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
keras_layer (KerasLayer)     multiple                  3234464   
_________________________________________________________________
class_net/class-predict (Sep multiple                  1746      
_________________________________________________________________
box_net/box-predict (Separab multiple                  2916      
=================================================================
Total params: 3,239,126
Trainable params: 3,191,990
Non-trainable params: 47,136
_________________________________________________________________
1/1 [==============================] - 3s 3s/step
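The drawing step in run_odt_and_draw_results maps each normalized [ymin, xmin, ymax, xmax] box back to pixel coordinates of the original image. Here is a small standalone sketch of that conversion with concrete numbers. The helper name to_pixel_box and the sample box are hypothetical, and the clipping to [0, 1] is an addition that the script above does not perform; it assumes a model may return coordinates slightly outside the image.

```python
import numpy as np

def to_pixel_box(box, image_height, image_width):
    """Map a normalized [ymin, xmin, ymax, xmax] box to (xmin, ymin, xmax, ymax) pixels."""
    # Clip first so boxes that spill past the border stay inside the image.
    ymin, xmin, ymax, xmax = np.clip(np.asarray(box, dtype=np.float32), 0.0, 1.0)
    return (int(xmin * image_width), int(ymin * image_height),
            int(xmax * image_width), int(ymax * image_height))

# Hypothetical detection on an image that is 288 pixels high and 512 wide.
pixel_box = to_pixel_box([0.25, 0.5, 1.2, 0.75], image_height=288, image_width=512)
print(pixel_box)  # prints: (256, 72, 384, 288)
```

Note the coordinate order: the model reports y before x, while cv2.rectangle expects (x, y) corner points, which is why the values are reordered on return.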

The prediction result is shown below: