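Introduction
This article shows how to use a convolutional neural network (CNN) for parameter regression. We build a regression network that predicts face landmarks, focusing on the simplest case: the coordinates of five points.
I. Five-Point Face Landmark Data and Its Preparation
1. Data Collection and Description
- Face image collection: 1046 face images
2. Annotation Tool and Annotation
Tool download link: Face-Annotation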
python annotate_faces.py -d D:\Face-Annotation-Tool\demo
- GUI:
- Sample output:
D:\Face-Annotation-Tool\demo\1488.jpg 87,120,218,146,149,208,102,242,163,250,0,0,1,1
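The sample line above can be split into an image path and a list of integers. The following is a minimal sketch, not the annotation tool's own code. It assumes that the first ten integers are five (x, y) pairs; the meaning of the trailing four values is not stated here, so they are returned untouched. `parse_annotation_line` and `to_tab_separated` are hypothetical helper names, and the tab-separated layout is inferred only from the `split('\t')` call in the dataset class later in this article.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, five (x, y) points, extra values)."""
    # The path and the numbers are separated by the last space on the line,
    # so rsplit leaves any spaces inside the path itself intact.
    image_path, numbers = line.strip().rsplit(" ", 1)
    values = [int(v) for v in numbers.split(",")]
    # Assumption: the first ten integers are five (x, y) pairs.
    points = [(values[i], values[i + 1]) for i in range(0, 10, 2)]
    # Remaining values are kept as-is; they are not interpreted here.
    extra = values[10:]
    return image_path, points, extra


def to_tab_separated(image_path, points):
    """Hypothetical helper: build a 'path<TAB>x1<TAB>y1...<TAB>x5<TAB>y5' line."""
    fields = [image_path]
    for x, y in points:
        fields.extend([str(x), str(y)])
    return "\t".join(fields)


line = r"D:\Face-Annotation-Tool\demo\1488.jpg 87,120,218,146,149,208,102,242,163,250,0,0,1,1"
path, points, extra = parse_annotation_line(line)
print(path)
print(points)
print(extra)
print(to_tab_separated(path, points))
```

A line produced by `to_tab_separated` has eleven tab-separated fields (the path plus ten coordinates), which is the field count the dataset class below iterates over.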
3. Custom Dataset Class
- Map-style dataset
- Five-point landmark dataset and data loading
- Image: the face image
- Landmarks: the five-point coordinates
4. Code
import torch
import numpy as np
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import cv2 as cv


class FaceLandmarksDataset(Dataset):
    def __init__(self, txt_file):
        # Resize every face to 64x64 and scale pixel values to [-1, 1]
        self.transform = transforms.Compose([transforms.ToPILImage(),
                                             transforms.Resize((64, 64)),
                                             transforms.ToTensor(),
                                             transforms.Normalize(mean=[0.5, 0.5, 0.5],
                                                                  std=[0.5, 0.5, 0.5]),
                                             ])
        # Each non-empty line: image path followed by 10 tab-separated coordinates
        lines = []
        with open(txt_file) as read_file:
            for line in read_file:
                line = line.rstrip('\n')
                if line:
                    lines.append(line)
        self.landmarks_frame = lines

    def __len__(self):
        return len(self.landmarks_frame)

    def num_of_samples(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        contents = self.landmarks_frame[idx].split('\t')
        image_path = contents[0]
        img = cv.imread(image_path)
        if img is None:
            # cv.imread returns None instead of raising when the file cannot be read
            raise FileNotFoundError(image_path)
        h, w, c = img.shape
        # Normalize x by the image width and y by the image height
        landmarks = np.zeros(10, dtype=np.float32)
        for i in range(1, len(contents), 2):
            landmarks[i - 1] = np.float32(contents[i]) / w
            landmarks[i] = np.float32(contents[i + 1]) / h
        # Shape (5, 2): one (x, y) row per landmark
        landmarks = landmarks.reshape(-1, 2)
        sample = {'image': self.transform(img), 'landmarks': torch.from_numpy(landmarks)}
        return sample


if __name__ == "__main__":
    ds = FaceLandmarksDataset("D:/AllKindsOfCode/PytorchClass/face_landmark_src/landmark_output.txt")
    print(len(ds))
    print("start")
    # Inspect the first four samples
    for i in range(len(ds)):
        sample = ds[i]
        print(sample)
        if i == 3:
            break
    # Load the data in shuffled batches of 4
    dataloader = DataLoader(ds, batch_size=4, shuffle=True)
    for i_batch, sample_batched in enumerate(dataloader):
        print(i_batch, sample_batched['image'].size(), sample_batched['landmarks'].size())
    print("")
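Because the dataset divides each x by the image width and each y by the image height, the landmarks it returns are fractions in [0, 1] rather than pixel positions. To draw them on the original image, the mapping has to be inverted. The following is a minimal pure-Python sketch of that round trip; the function names and the 320x320 image size are illustrative assumptions, not values from the dataset.

```python
def normalize_landmarks(points, width, height):
    """Map pixel (x, y) pairs to fractions of the image width and height."""
    return [(x / width, y / height) for x, y in points]


def denormalize_landmarks(norm_points, width, height):
    """Map normalized (x, y) pairs back to integer pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in norm_points]


# Five (x, y) points; the image size is a hypothetical value for illustration.
pixel_points = [(87, 120), (218, 146), (149, 208), (102, 242), (163, 250)]
width, height = 320, 320

norm_points = normalize_landmarks(pixel_points, width, height)
restored = denormalize_landmarks(norm_points, width, height)
print(norm_points)
print(restored)
```

Normalizing this way keeps the regression targets independent of the original image resolution, which matters here because every image is resized to 64x64 before it reaches the network.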
II. Model Design and Inference
III. Deployment and Inference