Monitoring GPU utilization in real time with gpustat or nvidia-smi

Install gpustat

apt install gpustat

Start gpustat

watch -n1 --color gpustat --color

This prints updated monitoring results once per second.

You can also monitor in real time with nvidia-smi, which exposes more parameters:

$ watch -n 1 nvidia-smi --query-gpu=index,gpu_name,memory.total,memory.used,memory.free,temperature.gpu,pstate,utilization.gpu,utilization.memory --format=csv
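The same CSV query can also be read from a script instead of being watched in a terminal. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and `noheader,nounits` is added to the format so each row contains only values. The sample row at the end is made up for illustration and is not real output.

```python
import csv
import io
import subprocess

# A subset of the fields used in the watch command above.
FIELDS = ["index", "gpu_name", "memory.total", "memory.used", "utilization.gpu"]

def parse_gpu_csv(text, fields=FIELDS):
    """Turn nvidia-smi CSV rows (no header, no units) into a list of dicts."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(fields, (cell.strip() for cell in row))) for row in rows if row]

def query_gpus(fields=FIELDS):
    """Run nvidia-smi once and return one dict per GPU (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out, fields)

# Illustrative, made-up row in the same layout nvidia-smi would print.
sample = "0, Tesla T4, 15360, 1024, 37\n"
print(parse_gpu_csv(sample))
```

On a machine with a GPU, calling `query_gpus()` in a loop with a short sleep gives roughly the same information as the `watch` command, but in a form that can be logged next to training metrics.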


Printing the device torch is using

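First, check from Python. This is the approach most people use: it tells you whether a GPU is available, but not whether your program is actually using it.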

torch.cuda.is_available()

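To be more rigorous, check while the program is running whether the GPU is really in use. Insert the code below to print the device torch is using at run time; if it prints cpu, the program is definitely not running on the GPU.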

    import torch

    # Select the GPU if one is available, otherwise fall back to the CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print('Using device:', device)
    print()

    # Additional info when using CUDA
    if device.type == 'cuda':
        print(torch.cuda.get_device_name(0))
        print('Memory Usage:')
        print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
        print('Reserved: ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')
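Selecting a device does not by itself move anything onto it; the model and its inputs still have to be placed there. A minimal sketch of checking that placement directly follows. The helper `devices_in_use` and the small `nn.Linear` stand-in model are assumptions made for illustration, not part of the original code.

```python
import torch
import torch.nn as nn

def devices_in_use(module):
    """Return the set of device types holding the module's parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {t.device.type for t in tensors}

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(4, 2).to(device)      # stand-in model, for illustration only
x = torch.randn(3, 4, device=device)    # the input must live on the same device
y = model(x)

print('Parameters on:', devices_in_use(model))
print('Input on:', x.device, '| output on:', y.device)
```

If the printed set contains only `cpu` on a machine where `torch.cuda.is_available()` is true, the model was most likely never moved with `.to(device)`.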

Reference: How to check if pytorch is using the GPU?

Checking GPU usage with a simple network

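You can also run a small network directly and watch the GPU usage. Note that the network below is a LeNet-style model: two convolutional layers followed by three fully connected layers.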

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # 16*5*5 is the size of conv2's output after pooling, i.e. the input size of fc1.
        # It is hard-coded here, so the input image size cannot be changed freely later.
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
    def num_flat_features(self, x):
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
net = Net()
print(net)

params = list(net.parameters())
print(len(params))
# Print the shape of every parameter tensor (weight and bias of each layer)
for p in params:
    print(p.size())

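# Move the model to the GPU if one is available, otherwise stay on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = net.to(device)

# Run one forward pass on the selected device so the GPU actually does work
x = torch.randn(1, 1, 32, 32, device=device)
out = net(x)
print(out)
print('Output device:', out.device)

summary(net, (1, 32, 32))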
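The hard-coded `16*5*5` in `fc1` follows from the 32x32 input size. A small worked sketch of that arithmetic is shown below, assuming 5x5 convolutions with no padding and stride 1, and 2x2 max pooling, as in the network above. The helper names `conv_out`, `pool_out` and `flat_features` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel):
    """Output side length of max pooling with stride equal to the kernel size."""
    return size // kernel

def flat_features(side, channels=16):
    """Number of features entering fc1 for a square input of the given side."""
    side = pool_out(conv_out(side, 5), 2)   # conv1 (5x5) then 2x2 pooling
    side = pool_out(conv_out(side, 5), 2)   # conv2 (5x5) then 2x2 pooling
    return channels * side * side

print(flat_features(32))  # 32 -> 28 -> 14 -> 10 -> 5, so 16*5*5 = 400
print(flat_features(28))  # 28 -> 24 -> 12 -> 8 -> 4, so 16*4*4 = 256
```

A 28x28 input would therefore produce 256 features and fail against `nn.Linear(16*5*5, 120)`, which is why the input size cannot be changed without also changing `fc1`.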