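Memory allocation failures in Linux applications
Large embedded applications often consume huge amounts of memory. Some are cobbled together from unrelated parts and pack several functional modules, such as audio/video processing and system control, into a single application. This design brings a series of memory problems; the main one is that the audio/video code occupies a large amount of memory, which hurts the performance of the application as a whole. Drawing on past experience, I describe here one memory allocation failure that is tied to child process creation. When an application requests memory that the Linux kernel cannot provide, the kernel may, depending on its configuration, kill that process. This is the work of the OOM killer (not to be confused with the lowmemory-killer found in Android kernels):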
[ 113.092762] Tasks state (memory values in pages):
[ 113.097532] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 113.106311] [    136]    81   136      651       40          32768        0             0 ubusd
[ 113.114629] [    137]     0   137      962       45          32768        0             0 ash
[ 113.122769] [    138]     0   138      599       18          32768        0             0 askfirst
[ 113.131360] [    284]   514   284      756       72          32768        0             0 logd
[ 113.139589] [    419]   453   419      698       63          36864        0             0 dnsmasq
[ 113.148071] [    481]     0   481      666       17          32768        0             0 dropbear
[ 113.156652] [    579]     0   579     1455       50          36864        0             0 hostapd
[ 113.165144] [    580]     0   580     1455       51          49152        0             0 wpa_supplicant
[ 113.174253] [    641]     0   641      766       90          32768        0             0 netifd
[ 113.182655] [    698]     0   698      693       65          36864        0             0 odhcpd
[ 113.191060] [    912]     0   912      928       20          36864        0             0 udhcpc
[ 113.199463] [    927]     0   927      984       41          36864        0             0 ntpd
[ 113.207679] [   1099]     0  1099   209490   208986        1703936        0             0 fork-vfork
[ 113.216431] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=fork-vfork,pid=1099,uid=0
[ 113.229448] Out of memory: Killed process 1099 (fork-vfork) total-vm:837960kB, anon-rss:835944kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1664kB oom_score_adj:0
[ 113.460009] oom_reaper: reaped process 1099 (fork-vfork), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Killed
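As shown above, fork-vfork, a test program I wrote, was killed because the kernel could not satisfy its memory requests. The task table reports memory in pages: with 4 KiB pages, the 209490 pages of total_vm correspond to the 837960 kB printed as total-vm in the kill message. The system configuration can be changed so that a failed allocation returns a null pointer to the application instead. Strictly speaking, the settings below do not switch the OOM killer off: vm.overcommit_memory=2 makes the kernel refuse requests that would exceed its commit limit, and vm.oom_kill_allocating_task=0 (the default) makes the OOM killer pick a victim by score rather than kill the task that triggered it: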
sysctl vm.oom_kill_allocating_task=0
sysctl vm.overcommit_memory=2
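With vm.overcommit_memory=2 the kernel compares the memory already promised to processes (Committed_AS) against a ceiling (CommitLimit); both values appear in /proc/meminfo. The following is a minimal sketch, not part of the original test program, of how an application might check the remaining headroom before a large allocation. The function names meminfo_field_kb and commit_headroom_kb are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the value (in kB) of `key` found in
 * /proc/meminfo-style text, or -1 if the key is absent. */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* a field line looks like "CommitLimit:     500 kB" */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical helper: store CommitLimit - Committed_AS (kB) in *out.
 * Returns 0 on success, -1 if /proc/meminfo cannot be read or parsed.
 * The result may be negative when overcommit is allowed. */
int commit_headroom_kb(long *out)
{
    char buf[8192];
    size_t len;
    long limit, used;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (fp == NULL)
        return -1;
    len = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[len] = '\0';

    limit = meminfo_field_kb(buf, "CommitLimit");
    used = meminfo_field_kb(buf, "Committed_AS");
    if (limit < 0 || used < 0)
        return -1;
    *out = limit - used;
    return 0;
}
```

A caller would pass a long to commit_headroom_kb and compare the result with the size it is about to request. The check is only advisory, since other processes can commit memory between the check and the allocation.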
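Child process creation and copy-on-write (COW)
When fork is called to create a child process, the kernel duplicates the parent's address space for the child. In my tests, copy-on-write did not appear to help. COW is meant to be transparent to the application, and the kernel may well be using it, but for practical purposes it is safer to ignore it. Even with COW, fork (implemented through clone) still has to duplicate the parent's page tables, and under strict overcommit accounting the parent's private writable memory is charged a second time for the child. It is therefore reasonable to treat every fork as if the kernel had to copy the whole address space. When the system is short of memory, creating a child process fails: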
Memory allocated: 0x15/0x80, 168.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1445
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 10336 micro seconds
Memory allocated: 0x16/0x80, 176.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1446
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 10743 micro seconds
Memory allocated: 0x17/0x80, 184.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1447
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 11156 micro seconds
Memory allocated: 0x18/0x80, 192.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1448
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 11720 micro seconds
Memory allocated: 0x19/0x80, 200.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1449
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 12192 micro seconds
Memory allocated: 0x1a/0x80, 208.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1450
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 12589 micro seconds
Memory allocated: 0x1b/0x80, 216.00 MB / 1024.00 MB
`forked parent pid: 1423, child pid: 1451
Parent: parent pid: 1423, child pid: 0
time spent in `fork: 12950 micro seconds
Memory allocated: 0x1c/0x80, 224.00 MB / 1024.00 MB
Error, failed to `fork process: Cannot allocate memory
Memory allocated: 0x1d/0x80, 232.00 MB / 1024.00 MB
Error, failed to `fork process: Cannot allocate memory
Memory allocated: 0x1e/0x80, 240.00 MB / 1024.00 MB
Error, failed to `fork process: Cannot allocate memory
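Note that fork-vfork could no longer create child processes once it had allocated too much memory, yet after fork failed it could still allocate new memory for itself: a case of "one mountain cannot hold two tigers". In addition, the time needed to create a child grew with the amount of allocated memory, from 10336 microseconds at 168 MB to 12950 microseconds at 216 MB. COW therefore does not make fork cheap for a large process; the cost still scales with the size of the parent's address space. More importantly, applications with a large memory footprint should avoid creating child processes frequently, or their performance will suffer.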
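The measurement can be reproduced in isolation with a sketch like the one below. It is not the original test program: timed_fork_us is a hypothetical helper, and it uses CLOCK_MONOTONIC (wall-clock time) as an assumption, whereas fork-vfork reads CLOCK_PROCESS_CPUTIME_ID. The helper allocates and touches a buffer, forks once, and reports how long the fork call took, returning -1 with errno set (for example to ENOMEM) when the allocation or the fork fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: touch `bytes` of heap, fork once, and return the
 * wall-clock time spent in fork() in microseconds.
 * Returns -1 with errno set if malloc() or fork() fails. */
long timed_fork_us(size_t bytes)
{
    struct timespec t0, t1;
    unsigned char *buf;
    pid_t pid;
    int saved;

    buf = malloc(bytes);
    if (buf == NULL)
        return -1;
    memset(buf, 0xA5, bytes);       /* make the pages resident */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pid = fork();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (pid == 0)
        _exit(0);                   /* child: leave immediately */

    saved = errno;                  /* keep fork's errno across free() */
    free(buf);
    if (pid == (pid_t) -1) {
        errno = saved;
        return -1;
    }
    /* reap the child, retrying if a signal interrupts the wait */
    while (waitpid(pid, NULL, 0) == (pid_t) -1 && errno == EINTR)
        ;
    return (long) (t1.tv_sec - t0.tv_sec) * 1000000L
         + (t1.tv_nsec - t0.tv_nsec) / 1000L;
}
```

Calling timed_fork_us with increasing sizes should show the same upward trend as the log above, although the absolute numbers depend on the hardware and kernel.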
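Creating processes with vfork to avoid the memory copy
When a child is created with the vfork system call, the calling thread of the parent is suspended until the child exits or calls execve. This avoids copying the parent's address space into the child and saves time. One consequence is that the child can "accidentally" modify the parent's memory: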
Memory allocated: 0x1f/0x80, 248.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1483
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 199 micro seconds
Memory allocated: 0x20/0x80, 256.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1484
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 189 micro seconds
Memory allocated: 0x21/0x80, 264.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1485
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 179 micro seconds
Memory allocated: 0x22/0x80, 272.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1486
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 204 micro seconds
Memory allocated: 0x23/0x80, 280.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1487
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 178 micro seconds
Memory allocated: 0x24/0x80, 288.00 MB / 1024.00 MB
vforked parent pid: 1452, child pid: 1488
Parent: parent pid: 1452, child pid: 2021
time spent in vfork: 201 micro seconds
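As shown above, the time spent in vfork is essentially constant and does not grow with the amount of allocated memory. No memory is copied: the child runs directly in the parent's address space. The price is that the child overwrote the memory behind the parent's pidvp pointer with 2021, and that change is visible in the parent: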
if (pid == 0) {
    *pidvp = getpid();
    fprintf(stdout, "\t%sed parent pid: %ld, child pid: %ld\n",
        method, (long) getppid(), (long) *pidvp);
    fflush(stdout);
    *pidvp = 2021;
    _exit(0);
}
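The difference in visibility can be shown side by side with a small sketch. The helper value_seen_by_parent is hypothetical and not part of fork-vfork. On Linux the parent is expected to read 0 after fork and 2021 after vfork; POSIX leaves the result of writing to memory in a vfork child undefined, so this should be read as an illustration of Linux behaviour only.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child with vfork() (use_vfork != 0) or
 * fork(), let the child store 2021 into a heap cell, and return what
 * the parent reads from that cell afterwards. Returns -1 on failure. */
long value_seen_by_parent(int use_vfork)
{
    volatile long *cell = malloc(sizeof(*cell));
    long seen;
    pid_t pid;

    if (cell == NULL)
        return -1;
    *cell = 0;

    /* call vfork() directly so the compiler knows it returns twice */
    pid = use_vfork ? vfork() : fork();
    if (pid == 0) {
        *cell = 2021;           /* child writes, then exits at once */
        _exit(0);
    }
    if (pid == (pid_t) -1) {
        free((void *) cell);
        return -1;
    }
    while (waitpid(pid, NULL, 0) == (pid_t) -1 && errno == EINTR)
        ;
    seen = *cell;               /* 0 after fork, 2021 after vfork on Linux */
    free((void *) cell);
    return seen;
}
```

The child does nothing except one store and _exit, which keeps the sketch away from the stdio calls that the original snippet makes inside the vfork child.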
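Creating a child process stalls the parent. With fork, the parent has to wait while the kernel duplicates its address space, and other threads that touch the address space during that time may be held up as well. With vfork, the calling thread stays suspended until the child exits or loads a new program image, while the remaining threads keep running next to a child that shares their memory. In addition, a vfork child that reacts to an asynchronous event, for example by handling a signal, can modify the parent's memory or system resources. For these reasons vfork cannot be recommended as a replacement for fork in large-memory applications; such applications should avoid creating child processes as far as possible. A scheme common on desktop systems is to implement each module as its own process and to connect the modules through an inter-process communication mechanism such as D-Bus. This reduces coupling between modules, keeps each module cohesive, and improves maintainability and extensibility, which in turn raises the development efficiency of embedded application software.
Source code of the test program
The source code of the fork-vfork test program is as follows: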
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <time.h>

#define MEMBLOCK_MAX_BLOCKS     512
#define MEMBLOCK_DEFAULT_BLOCKS 64
#define MEMBLOCK_MAX_SIZE       0x1000000
#define MEMBLOCK_DEFAULT_SIZE   65536

typedef pid_t (* fork_func_t)(void);
static fork_func_t g_fork = fork;

struct memblock {
    unsigned int num_blocks;
    unsigned int size_block;
    uint64_t size_total;
    unsigned char * * blocks;
    int ranfd;
};

static void nano_sleep(unsigned int msec)
{
    struct timespec tspec;

    tspec.tv_sec = (time_t) (msec / 1000);
    tspec.tv_nsec = (long) ((msec % 1000) * 1000000);
    nanosleep(&tspec, NULL);
}

static unsigned int memblock_count(const struct memblock * mb, int verbose)
{
    unsigned int idx;
    unsigned int rval;

    rval = 0;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] != NULL)
            rval++;
    }

    if (verbose) {
        double dval0, dval1;
        uint64_t size;

        size = (uint64_t) rval;
        size *= (uint64_t) mb->size_block;
        dval0 = (double) size;
        dval1 = (double) mb->size_total;
        fprintf(stdout, "Memory allocated: %#x/%#x, %.02f MB / %.02f MB\n",
            rval, mb->num_blocks, dval0 / 1048576.0, dval1 / 1048576.0);
        fflush(stdout);
    }
    return rval;
}

static void memblock_destory(struct memblock * mb)
{
    unsigned int idx;

    close(mb->ranfd);
    mb->ranfd = -1;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] != NULL) {
            free(mb->blocks[idx]);
            mb->blocks[idx] = NULL;
        }
    }
    /* also release the pointer table allocated in memblock_init */
    free((void *) mb->blocks);
    mb->blocks = NULL;
}

static int memblock_free(struct memblock * mb)
{
    ssize_t rl1;
    unsigned int avail;
    unsigned int freeidx;
    unsigned int idx, count;

    avail = memblock_count(mb, 0);
    if (avail == 0)
        goto err0;

    freeidx = 0;
    rl1 = read(mb->ranfd, &freeidx, sizeof(freeidx));
    if (rl1 != (ssize_t) sizeof(freeidx)) {
        fprintf(stderr, "Error, failed to read random device: %s\n",
            strerror(errno));
        fflush(stderr);
        return -1;
    }

    freeidx = freeidx % avail;
    count = 0;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] == NULL)
            continue;
        if (count == freeidx) {
            free(mb->blocks[idx]);
            mb->blocks[idx] = NULL;
            return 0;
        }
        count++;
    }

err0:
    fputs("Warning, no available memory block to free!\n", stderr);
    fflush(stderr);
    return -1;
}

static int memblock_alloc(struct memblock * mb)
{
    size_t sizeb, sizer;
    unsigned char * newblock;
    unsigned int idx, empty;

    empty = ~0u;
    newblock = NULL;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] == NULL) {
            empty = idx;
            break;
        }
    }

    if (empty == ~0u) {
        fputs("Warning, no memory block available\n", stderr);
        fflush(stderr);
        return -1;
    }

    sizeb = (size_t) mb->size_block;
    if (sizeb == 0 || sizeb > MEMBLOCK_MAX_SIZE) {
        fprintf(stderr, "Error, invalid memory block size: %#x\n",
            (unsigned int) sizeb);
        fflush(stderr);
        return -1;
    }

    newblock = (unsigned char *) malloc(sizeb);
    if (newblock == NULL) {
        fprintf(stderr, "Error, system out of memory: %#x\n",
            (unsigned int) sizeb);
        fflush(stderr);
        return -1;
    }

    sizer = 0;
    while (sizer < sizeb) {
        ssize_t rval;
        size_t sizec = sizeb - sizer;

        if (sizec > 65536) {
            sizec = 65536;
        }
        rval = read(mb->ranfd, newblock + sizer, sizec);
        if (rval != (ssize_t) sizec) {
            fprintf(stderr, "Error, failed to read random device: %s\n",
                strerror(errno));
            fflush(stderr);
            break;
        }
        sizer += sizec;
    }

    if (sizer < sizeb) {
        free(newblock);
        return -1;
    }

    mb->blocks[empty] = newblock;
    return 0;
}

static int memblock_init(struct memblock * mb,
    int argc, char *argv[])
{
    unsigned int num_blocks;
    unsigned int size_block;
    uint64_t size_total = 0;

    num_blocks = MEMBLOCK_DEFAULT_BLOCKS;
    if (argc > 0x1) {
        num_blocks = (unsigned int) strtoul(argv[1], NULL, 0x0);
        if (num_blocks == 0 || num_blocks > MEMBLOCK_MAX_BLOCKS) {
            num_blocks = MEMBLOCK_DEFAULT_BLOCKS;
            fprintf(stderr, "Error, invalid number of blocks: %s\n", argv[1]);
            fflush(stderr);
        }
    }

    size_block = MEMBLOCK_DEFAULT_SIZE;
    if (argc > 0x2) {
        size_block = (unsigned int) strtoul(argv[2], NULL, 0x0);
        if (size_block == 0 || size_block > MEMBLOCK_MAX_SIZE) {
            size_block = MEMBLOCK_DEFAULT_SIZE;
            fprintf(stderr, "Error, invalid block size: %s\n", argv[2]);
            fflush(stderr);
        }
    }

    size_total = (uint64_t) num_blocks;
    size_total *= (uint64_t) size_block;
    fprintf(stdout, "INFO: memory blocks: %u, block size: %#x, total: 0x%x%08x\n",
        num_blocks, size_block, (unsigned int) (size_total >> 32), (unsigned int) size_total);
    fflush(stdout);

    mb->num_blocks = num_blocks;
    mb->size_block = size_block;
    mb->size_total = size_total;
    mb->blocks = (unsigned char * *) calloc(
        (size_t) (num_blocks + 0x1), sizeof(unsigned char *));
    if (mb->blocks == NULL) {
        fputs("Error, system out of memory!\n", stderr);
        fflush(stderr);
        return -1;
    }

    mb->ranfd = open("/dev/urandom", O_RDONLY);
    if (mb->ranfd == -1) {
        fprintf(stderr, "Error, failed to open random device: %s\n", strerror(errno));
        fflush(stderr);
        free((void *) mb->blocks);
        mb->blocks = NULL;
        return -1;
    }
    return 0;
}

static int fork_process(void)
{
    int stats;
    long interval;
    pid_t pid, pidr;
    const char * method;
    struct timespec now, then;
    volatile pid_t * pidvp = NULL;

    now.tv_sec = 0;
    now.tv_nsec = 0;
    then.tv_sec = 0;
    then.tv_nsec = 0;

    /* heap cell written by the child; the parent sees the write only with vfork */
    pidvp = (volatile pid_t *) malloc(4906);
    if (pidvp == NULL) {
        fputs("Error, system out of memory for pidvp!\n", stderr);
        fflush(stderr);
        return -1;
    }
    *pidvp = 0;

    method = (g_fork == fork) ? "`fork" : "vfork";
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &now);
    pid = g_fork();
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &then);
    if (pid == (pid_t) -1) {
        fprintf(stderr, "\tError, failed to %s process: %s\n",
            method, strerror(errno));
        fflush(stderr);
        free((void *) pidvp);
        pidvp = NULL;
        return -1;
    }

    if (pid == 0) {
        *pidvp = getpid();
        fprintf(stdout, "\t%sed parent pid: %ld, child pid: %ld\n",
            method, (long) getppid(), (long) *pidvp);
        fflush(stdout);
        *pidvp = 2021;
        _exit(0);
    }

again:
    stats = 0;
    pidr = waitpid(pid, &stats, 0);
    if (pidr == (pid_t) -1) {
        int err_n;

        err_n = errno;
        if (err_n == EINTR)
            goto again;
        fprintf(stderr, "\tError, failed to waitpid(%ld): %s\n",
            (long) pid, strerror(err_n));
        fflush(stderr);
        free((void *) pidvp);
        pidvp = NULL;
        return -1;
    }

    /* elapsed CPU time in microseconds: nanoseconds / 1000 plus seconds * 1000000 */
    interval = then.tv_nsec - now.tv_nsec;
    interval /= 1000;
    interval += (long) ((then.tv_sec - now.tv_sec) * 1000000);
    fprintf(stdout, "\tParent: parent pid: %ld, child pid: %ld\n\t\ttime spent in %s: %ld micro seconds\n",
        (long) getpid(), (long) *pidvp, method, interval);
    fflush(stdout);
    free((void *) pidvp);
    pidvp = NULL;
    return 0;
}

int main(int argc, char *argv[])
{
    int ret, rval;
    unsigned int count;
    unsigned int max_blocks;
    struct memblock mblock;

    rval = 0;
    count = 0;
    if (argc > 0x3) {
        g_fork = vfork;
        fputs("Using vfork to create child process.\n", stdout);
        fflush(stdout);
    } else {
        fputs("Using `fork to create child process.\n", stdout);
        fflush(stdout);
    }

    ret = memblock_init(&mblock, argc, argv);
    if (ret < 0)
        return 1;

    max_blocks = (mblock.num_blocks * 0x3) >> 0x2;
    while (count < max_blocks) {
        ret = memblock_alloc(&mblock);
        if (ret < 0) {
            rval = 2;
            goto err0;
        }
        count = memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(250);
    }

again:
    while (count < mblock.num_blocks) {
        memblock_alloc(&mblock);
        count++;
        memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(750);
    }

    while (count >= max_blocks) {
        ret = memblock_free(&mblock);
        if (ret < 0) {
            rval = 3;
            goto err0;
        }
        count = memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(750);
    }
    goto again;

err0:
    memblock_destory(&mblock);
    return rval;
}
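One detail of fork_process worth isolating is the conversion of two timespec readings into microseconds: the nanosecond difference is divided by 1000 and the second difference is multiplied by 1,000,000. The helper below is a hypothetical standalone sketch of that arithmetic (the name timespec_diff_us does not appear in the program), with an explicit borrow when the nanosecond field wraps.

```c
#include <time.h>

/* Hypothetical helper: elapsed time from `start` to `end` in
 * microseconds. Assumes `end` is not earlier than `start` and that the
 * interval is short enough for the result to fit in a long. */
long timespec_diff_us(const struct timespec *start,
                      const struct timespec *end)
{
    long sec = (long) (end->tv_sec - start->tv_sec);
    long nsec = end->tv_nsec - start->tv_nsec;

    if (nsec < 0) {             /* borrow one second */
        sec -= 1;
        nsec += 1000000000L;
    }
    return sec * 1000000L + nsec / 1000L;
}
```

For example, readings of 1.9 s and 2.1 s give 200000 microseconds.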
Build and run the program as follows:
aarch64-linux-gnu-gcc -Wall -O1 -ggdb -fPIC -o fork-vfork fork-vfork.c
./fork-vfork 128 0x800000      # use fork
./fork-vfork 128 0x800000 1    # use vfork