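The code comes from a Mask R-CNN implementation; the GitHub link is: https://github.com/multimodallearning/pytorch-mask-rcnn/blob/809abba590db89779ac02c42286135f18ea08b53/utils.py#L270
The image needs to be resized and padded into a square. The code is as follows: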
import numpy as np
import scipy.misc  # NOTE: imresize was removed in SciPy 1.3.0; this call needs an older SciPy


def resize_image(image, min_dim=None, max_dim=None, padding=False):
    # Default window is the whole image: (y1, x1, y2, x2)
    h, w = image.shape[:2]
    window = (0, 0, h, w)
    scale = 1
    # Scale?
    if min_dim:
        # Scale up but not down
        scale = max(1, min_dim / min(h, w))
    # Does it exceed max dim?
    if max_dim:
        image_max = max(h, w)
        if round(image_max * scale) > max_dim:
            scale = max_dim / image_max
    # Resize image
    if scale != 1:
        image = scipy.misc.imresize(
            image, (round(h * scale), round(w * scale)))
    # Need padding? (requires max_dim; assumes an H x W x C image)
    if padding:
        # Get new height and width
        h, w = image.shape[:2]
        top_pad = (max_dim - h) // 2
        bottom_pad = max_dim - h - top_pad
        left_pad = (max_dim - w) // 2
        right_pad = max_dim - w - left_pad
        padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
        image = np.pad(image, padding, mode='constant', constant_values=0)
        # Window is the region of the padded image that holds real pixels
        window = (top_pad, left_pad, h + top_pad, w + left_pad)
    return image, window, scale, padding
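To make the scale and padding arithmetic concrete, here is a minimal sketch that repeats only the numeric part of the function above, without touching any pixels. The helper name `plan_resize` and the 480 x 640 input with `min_dim=800`, `max_dim=1024` are assumptions chosen for illustration; they are not taken from the repository.

```python
def plan_resize(h, w, min_dim, max_dim):
    """Return (scale, (new_h, new_w), padding, window) for an h x w image."""
    # Scale up so the shorter side reaches min_dim, but never below 1.
    scale = max(1, min_dim / min(h, w))
    # If the longer side would then exceed max_dim, cap the scale instead.
    if round(max(h, w) * scale) > max_dim:
        scale = max_dim / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Centre the resized image inside a max_dim x max_dim square.
    top = (max_dim - new_h) // 2
    bottom = max_dim - new_h - top
    left = (max_dim - new_w) // 2
    right = max_dim - new_w - left
    window = (top, left, new_h + top, new_w + left)
    return scale, (new_h, new_w), ((top, bottom), (left, right)), window


scale, size, pads, window = plan_resize(480, 640, min_dim=800, max_dim=1024)
print(scale, size, pads, window)
# 1.6 (768, 1024) ((128, 128), (0, 0)) (128, 0, 896, 1024)
```

Here `min_dim` alone would give a scale of about 1.67, but that would push the longer side past 1024, so the scale is capped at 1.6 and the remaining 256 rows are split evenly between top and bottom padding. The same cap also means that an image larger than `max_dim` is scaled down, despite the "scale up but not down" rule for `min_dim`.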
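For images of different sizes, another option is to feed the original image in directly with the batch size set to 1, then use gradient accumulation to emulate a larger batch size. I have tried this approach, though, and the results were not very good. Next time I will try the resize function above to turn the original image into a square and see how well that works.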
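For reference, gradient accumulation can be sketched as follows. This is a toy, pure-Python illustration on a one-parameter model `y = w * x` with squared error, not the Mask R-CNN training loop; the names `grad`, `train_accumulated`, and `train_batched` are hypothetical. It shows that summing per-sample gradients scaled by `1 / accum_steps` and stepping once per group matches a true mini-batch step with the mean gradient.

```python
def grad(w, x, y):
    """Gradient of the squared error (w * x - y) ** 2 with respect to w."""
    return 2 * (w * x - y) * x


def train_accumulated(samples, w=0.0, lr=0.01, accum_steps=4):
    """One pass with batch size 1, updating w once every accum_steps samples."""
    acc = 0.0
    for i, (x, y) in enumerate(samples, start=1):
        # Scale each gradient so the accumulated sum is the group mean.
        acc += grad(w, x, y) / accum_steps
        if i % accum_steps == 0:
            w -= lr * acc  # one update for the whole group
            acc = 0.0      # reset the accumulator for the next group
    return w


def train_batched(samples, w=0.0, lr=0.01, batch_size=4):
    """Reference: real mini-batches using the mean gradient."""
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        g = sum(grad(w, x, y) for x, y in batch) / len(batch)
        w -= lr * g
    return w


samples = [(float(x), 3.0 * x) for x in range(1, 9)]
w_acc = train_accumulated(samples)
w_ref = train_batched(samples)
print(abs(w_acc - w_ref) < 1e-12)
```

In PyTorch the same pattern usually means calling `loss.backward()` on the loss divided by the number of accumulation steps for each sample, then calling `optimizer.step()` and `optimizer.zero_grad()` once per group. One known limitation is that batch-normalization layers still compute their statistics per forward pass, so accumulation does not reproduce large-batch behavior for those layers.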