[Artificial Intelligence] [MATLAB Deep Learning] Full-Scene Semantic Segmentation with DeepLab v3+




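A semantic segmentation network classifies every pixel in an image, which partitions the image by class. Applications include road segmentation for autonomous driving and cancer cell segmentation for medical diagnosis. This article shows how to train the DeepLab v3+ semantic segmentation network in MATLAB to segment entire street scenes for autonomous driving.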
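To make "classifying every pixel" concrete, the following minimal sketch assumes a network has already produced one score per class at each pixel and simply picks the highest-scoring class. The 2-by-2 score array and the two class names are made-up values for illustration only; they are not outputs of the network trained in this article.

```matlab
% Hypothetical per-pixel class scores, size H-by-W-by-numClasses (here 2-by-2-by-2).
scores = cat(3, [0.9 0.2; 0.1 0.3], ...   % scores for class 1
                [0.1 0.8; 0.9 0.7]);      % scores for class 2

% Pick the best class at each pixel by taking the maximum along dimension 3.
[~, labelIdx] = max(scores, [], 3);

% Map the class indices to names (illustrative names only).
exampleNames = ["Road" "Car"];
labelMap = exampleNames(labelIdx);
disp(labelMap)
```

The result has the same height and width as the input, with one label per pixel, which is the form of output that semanticseg returns later in the article.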

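This example trains on the CamVid dataset from the University of Cambridge, a collection of street-level images captured while driving. The dataset provides pixel-level labels for 32 semantic classes, including car, pedestrian, and road, as shown in the figure below: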

[Figure: example from the CamVid dataset]


I. Download the Pretrained Model

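This example creates a DeepLab v3+ network whose weights are initialized from a pretrained ResNet-18 network. ResNet-18 is an efficient network that suits applications with limited processing resources. Depending on the application requirements, other pretrained networks such as MobileNet v2 or ResNet-50 can be used instead.

Before using ResNet-18, open the Add-On Explorer and install Deep Learning Toolbox Model for ResNet-18 Network.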

Once the installation finishes, download the pretrained model:

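% Download the pretrained DeepLab v3+ network if it is not already present.
pretrainedFolder = 'pretrainedNetwork';
pretrainedNetwork = fullfile(pretrainedFolder,'deeplabv3plusResnet18CamVid.mat');
if ~exist(pretrainedNetwork,'file')
    if ~exist(pretrainedFolder,'dir'), mkdir(pretrainedFolder); end  % the target folder must exist before websave
    disp('Downloading pretrained network (58 MB)...');
    pretrainedURL = 'https://www.mathworks.com/supportfiles/vision/data/deeplabv3plusResnet18CamVid.mat';
    websave(pretrainedNetwork, pretrainedURL);
end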

II. Prepare the Dataset

1. Download the dataset

This experiment uses the CamVid dataset. The following code downloads and extracts it:

imageURL = 'http://web4.cs.ucl.ac.uk/staff/g.brostow/MotionSegRecData/files/701_StillsRaw_full.zip';
labelURL = 'http://web4.cs.ucl.ac.uk/staff/g.brostow/MotionSegRecData/data/LabeledApproved_full.zip';

outputFolder = 'CamVid';
labelsZip = fullfile(outputFolder,'labels.zip');
imagesZip = fullfile(outputFolder,'images.zip');

% Create the dataset folder on the first run.
if ~exist(outputFolder,'dir')
    mkdir(outputFolder);
end

% Download and extract the labels only if the archive is missing.
if ~exist(labelsZip,'file')
    disp('Downloading 16 MB CamVid dataset labels...');
    websave(labelsZip, labelURL);
    disp('Finished downloading CamVid dataset labels.');
    unzip(labelsZip, fullfile(outputFolder,'labels'));
    disp('Finished extracting CamVid dataset labels.');
end

% Download and extract the images only if the archive is missing.
if ~exist(imagesZip,'file')
    disp('Downloading 557 MB CamVid dataset images...');
    websave(imagesZip, imageURL);
    disp('Finished downloading CamVid dataset images.');
    unzip(imagesZip, fullfile(outputFolder,'images'));
    disp('Finished extracting CamVid dataset images.');
end

2. Load the CamVid images

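% The image archive extracts into a 701_StillsRaw_full subfolder.
imgDir = fullfile(outputFolder,'images','701_StillsRaw_full');
imds = imageDatastore(imgDir);

% Display one image; histeq is applied only to improve contrast for viewing.
Img = readimage(imds,559);
Img = histeq(Img);
imshow(Img)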

3. Create the image and label datastores

classes = [
    "Sky"
    "Building"
    "Pole"
    "Road"
    "Pavement"
    "Tree"
    "SignSymbol"
    "Fence"
    "Car"
    "Pedestrian"
    "Bicyclist"
    ];

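% Label conversion: camvidPixelLabelIDs is not a built-in function. It is a
% supporting function from the MathWorks semantic segmentation example that
% groups the 32 original CamVid label IDs into the 11 classes above.
labelIDs = camvidPixelLabelIDs();
% Create the pixelLabelDatastore from the class names and label IDs.
labelDir = fullfile(outputFolder,'labels');
pxds = pixelLabelDatastore(labelDir,classes,labelIDs);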
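The label conversion step can be pictured with a small sketch. It assumes the label IDs are given as a cell array with one M-by-3 matrix of RGB triplets per class, so that several original CamVid colors can be merged into one class. Only two classes are shown, the RGB values are taken from the public CamVid color list as best understood here, and the variable names are hypothetical; this is not the actual camvidPixelLabelIDs function.

```matlab
% Illustrative subset only; the real helper covers all 11 classes.
exampleClasses = ["Sky" "Road"];
exampleLabelIDs = {
    [128 128 128]                 % Sky
    [128 64 128; 128 0 192]       % Road plus driving-lane markings, merged into "Road"
    };

% A tiny 1-by-3 RGB label image: one sky pixel, one road pixel, one lane-marking pixel.
labelImage = uint8(cat(3, [128 128 128], [128 64 0], [128 128 192]));

% Look up the class index of every pixel by comparing its RGB triplet to each ID.
classIdx = zeros(size(labelImage,1), size(labelImage,2));
for k = 1:numel(exampleLabelIDs)
    ids = exampleLabelIDs{k};
    for r = 1:size(ids,1)
        match = labelImage(:,:,1) == ids(r,1) & ...
                labelImage(:,:,2) == ids(r,2) & ...
                labelImage(:,:,3) == ids(r,3);
        classIdx(match) = k;
    end
end
disp(exampleClasses(classIdx))
```

Both the road pixel and the lane-marking pixel end up in the same class, which is how 32 original labels can collapse into 11.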

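4. Split the dataset

% partitionCamVidData is a user-supplied helper (not a built-in function) that
% splits both datastores into training and validation sets.
[imdsTrain, imdsVal, pxdsTrain, pxdsVal] = partitionCamVidData(imds, pxds);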
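The article does not list the body of partitionCamVidData, so the following is only a sketch of how such a split might work: shuffle the file indices with a fixed seed and cut them at a chosen ratio. The 80/20 ratio, the seed, and the variable names are assumptions, and numFiles is hard-coded to the 701 labeled CamVid frames so that the snippet runs without the dataset.

```matlab
% Hypothetical random split of file indices into training and validation sets.
rng(0);                         % fixed seed so the split is reproducible
numFiles = 701;                 % in practice: numel(imds.Files)
trainFraction = 0.8;            % assumed ratio; not stated in the article

shuffledIdx = randperm(numFiles);
numTrain = round(trainFraction * numFiles);
trainIdx = shuffledIdx(1:numTrain);
valIdx   = shuffledIdx(numTrain+1:end);

fprintf('%d training images, %d validation images\n', numel(trainIdx), numel(valIdx));

% With the datastores from the article, the indices could then be applied as:
%   imdsTrain = imageDatastore(imds.Files(trainIdx));
%   pxdsTrain = pixelLabelDatastore(pxds.Files(trainIdx), classes, labelIDs);
% and likewise for valIdx.
```

Using the same index vectors for images and labels keeps each image paired with its own label file.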

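III. Create the DeepLab v3+ Network

1. Initialize parameters

% Input image size: CamVid frames are 720-by-960 RGB images.
imageSize = [720 960 3];

% Number of classes.
numClasses = numel(classes);

2. Create the network

% Create DeepLab v3+ with a ResNet-18 backbone.
lgraph = deeplabv3plusLayers(imageSize, numClasses, "resnet18");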

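IV. Train the Network

1. Balance classes with class weights

% Count the pixels of each class, then weight the classes by median frequency
% balancing so that rare classes such as Pedestrian are not swamped by Road and Sky.
tbl = countEachLabel(pxds);
imageFreq = tbl.PixelCount ./ tbl.ImagePixelCount;
classWeights = median(imageFreq) ./ imageFreq;

% Replace the default output layer with a class-weighted pixel classification layer.
pxLayer = pixelClassificationLayer('Name','labels','Classes',tbl.Name,'ClassWeights',classWeights);
lgraph = replaceLayer(lgraph,"classification",pxLayer);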
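The weighting formula is easier to follow with numbers. The sketch below applies the same two lines to three made-up classes; the pixel counts are invented for illustration and do not come from CamVid.

```matlab
% Hypothetical counts for three classes (common, medium, rare).
pixelCount      = [8e6; 2e6; 1e5];   % pixels labeled with each class
imagePixelCount = [1e7; 1e7; 5e6];   % total pixels in the images where the class appears

% Same computation as in the article: frequency per class, then median / frequency.
imageFreq    = pixelCount ./ imagePixelCount;      % 0.8, 0.2, 0.02
classWeights = median(imageFreq) ./ imageFreq;     % 0.25, 1, 10

disp(table(imageFreq, classWeights))
```

The class at the median frequency gets weight 1, more common classes get weights below 1, and rarer classes get weights above 1, so errors on rare classes count more in the loss.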

2. Set training options

% Pair each image with its labels for training and validation.
dsVal = combine(imdsVal,pxdsVal);
dsTrain = combine(imdsTrain, pxdsTrain);

% Define training options.
options = trainingOptions('sgdm', ...
    'LearnRateSchedule','piecewise',...
    'LearnRateDropPeriod',10,...
    'LearnRateDropFactor',0.3,...
    'Momentum',0.9, ...
    'InitialLearnRate',1e-3, ...
    'L2Regularization',0.005, ...
    'ValidationData',dsVal,...
    'MaxEpochs',30, ...  
    'MiniBatchSize',8, ...
    'Shuffle','every-epoch', ...
    'CheckpointPath', tempdir, ...
    'VerboseFrequency',2,...
    'Plots','training-progress',...
    'ValidationPatience', 4);
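With the 'piecewise' schedule, the learning rate is multiplied by LearnRateDropFactor every LearnRateDropPeriod epochs. The sketch below works out the resulting rate per epoch for the values used above; it is a plain arithmetic illustration with hypothetical variable names, not a call into the training code.

```matlab
% Values copied from the training options above.
initialLearnRate = 1e-3;
dropFactor = 0.3;
dropPeriod = 10;
maxEpochs  = 30;

% Learning rate in effect during each epoch.
epochs = 1:maxEpochs;
learnRate = initialLearnRate * dropFactor .^ floor((epochs - 1) / dropPeriod);

% Epochs 1-10 use 1e-3, epochs 11-20 use 3e-4, epochs 21-30 use 9e-5.
disp(unique(learnRate, 'stable'))
```

Training may stop before epoch 30, because 'ValidationPatience' is set to 4 in the options above.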

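3. Start training

% Set doTraining to true to train from scratch; otherwise load the pretrained network.
doTraining = false;
if doTraining
    [net, info] = trainNetwork(dsTrain,lgraph,options);
else
    % Path of the .mat file downloaded earlier.
    pretrainedNetwork = fullfile('pretrainedNetwork','deeplabv3plusResnet18CamVid.mat');
    data = load(pretrainedNetwork);
    net = data.net;
end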

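V. Test the Segmentation

% Segment one validation image.
I = readimage(imdsVal,35);
C = semanticseg(I, net);

% Overlay the predicted labels on the image. camvidColorMap and pixelLabelColorbar
% are supporting functions (not built-in) that must be on the MATLAB path.
cmap = camvidColorMap;
B = labeloverlay(I,C,'Colormap',cmap,'Transparency',0.4);
imshow(B);
pixelLabelColorbar(cmap, classes);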
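As a rough illustration of what the colormap contributes, the sketch below turns a small matrix of class indices into an RGB image with the base MATLAB function ind2rgb. The two-row colormap and the index matrix are made-up values; the camvidColorMap helper is assumed here to return one normalized RGB row per class, in the same order as classes.

```matlab
% Hypothetical colormap: one normalized RGB row per class.
exampleCmap = [0.50 0.50 0.50;    % class 1 (for example Sky)
               0.50 0.25 0.50];   % class 2 (for example Road)

% Hypothetical 2-by-2 label map: top row is class 1, bottom row is class 2.
classIdx = [1 1; 2 2];

% Look up the color of each pixel's class.
rgbLabels = ind2rgb(classIdx, exampleCmap);
disp(size(rgbLabels))
```

labeloverlay then blends such a color image with the original photo, with the 'Transparency' value controlling how much of the photo shows through.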