[Development & Testing] Resizing uploaded images causes out-of-memory (OOM) errors


Load-testing the personnel image upload endpoint

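I. Understanding the scene of the incident:

Concurrency 1, 10-minute load test: normal
Concurrency 10, 10-minute load test: normal
Concurrency 20, 10-minute load test: memory usage grew and QPS dropped, but there were no OOM errors and no failed requests; the average response time increased
Concurrency 30, 10-minute load test: the service slowed down, the number of failed requests rose, threads kept throwing OOM errors, and the service eventually crashed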

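II. Analyzing the cause

Analyzing the request path
1. Load-test tool -> openapI ms -> middle-base ms -> file server. The base microservice is clearly where things go wrong.
The base service logs show a large volume of image upload requests being processed. Later on, every request reports an OOM, and finally the service crashes.
2. The suspected cause is memory exhaustion.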

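Reproducing the load-test scenario

Running the Arthas dashboard command while repeating the load test at concurrency 30 showed about 30 request-handling threads. Most were in a wait state, and more than ten of them had been running for about 2 minutes. A single image upload request should never take that long.
Running thread -n 3 to see what the busiest threads were doing showed that most of them were stuck in a resize method. This method rescales the image to a different resolution. The load test used 1 MB and 4 MB images with no constraint on their resolution, and the backend automatically rescales every image to a specified resolution, so this method is very likely the problem.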

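The next part has two steps.

Step 1: export the OOM dump and analyze it with MAT. There are exactly 30 threads. Sorting by memory usage shows that each thread holds a large field storing the 4 MB image. With G1 and a 1 GB heap, objects of this size are allocated directly as humongous (large) objects.
Step 2: reproduce locally. Copy the upload code into a main method, read a local file, and sleep the thread. After startup, watching memory usage in the jvisualvm GUI shows that memory grows by 60 MB after the two resize calls. With 30 concurrent requests, memory jumps by 600 MB and minor and mixed GCs are triggered continuously.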

public static byte[] fullHandler(File file) throws IOException {
    ImageInfo image = ImageUtil.resize(file, 480, 0); // first resize call
    List<AnalyzeResp> analyzeRespList = FoliageUtil.analyzeMulti(image.getImage());
    if(CollectionUtils.isEmpty(analyzeRespList)){
        log.info("no face found");
        return  null;
    }
    AnalyzeResp analyzeResp = analyzeRespList.get(0);
    int x1 = analyzeResp.getFace_info().getRect().getLeft();
    // int y1 = image.getHeight() - analyzeResp.getFace_info().getRect().getTop();
    int y1 = analyzeResp.getFace_info().getRect().getTop();
    int x0 = (int) (x1 * ((double) image.getOrgWidth() / (double) image.getWidth()));
    int y0 = (int) (y1 * ((double) image.getOrgHeight() / image.getHeight()));
    int faceHeight =
            analyzeResp.getFace_info().getRect().getBottom() - analyzeResp.getFace_info().getRect().getTop();
    int faceWidth =
            analyzeResp.getFace_info().getRect().getRight() - analyzeResp.getFace_info().getRect().getLeft();
    int faceTrueHeight = (int) (faceHeight * ((double) image.getOrgHeight() / image.getHeight()));
    int faceTrueWidth = (int) (faceWidth * ((double) image.getOrgWidth() / image.getWidth()));
    //
    boolean flag = true;
    int x00 = (int) (x0 - (faceTrueWidth / 2d));
    if (x00 < 0) {
        flag = false;
    }
    int faceTrueWidthExtend = faceTrueWidth * 2;
    if (flag && (x00 + faceTrueWidthExtend > image.getOrgWidth())) {
        flag = false;
    }
    int y00 = (int) (y0 - (faceTrueHeight * 0.7));
    if (flag && y00 < 0) {
        flag = false;
    }
    int faceTrueHeightExtend = faceTrueHeight * 2;
    if (flag && y00 + faceTrueHeightExtend > image.getOrgHeight()) {
        flag = false;
    }
    byte[] ret = null;
    if (flag) {
        ret = ImageUtil.getBytesByCmpImageSub(file, x00, y00, faceTrueWidthExtend, faceTrueHeightExtend);
    }
    if (ret == null) {
        ret = image.getImage();
    }
    ImageInfo result = ImageUtil.resize(ret, 800, 200); // second resize call
    if (result == null) {
        return ret;
    }
    return result.getImage();
}

public static ImageInfo resize(byte[] image, int objectLength, int minlength) throws IOException {
    ImageInfo imageInfo = new ImageInfo();
    InputStream byteArrayInputStream = new ByteArrayInputStream(image);
    BufferedImage sourceImg = ImageIO.read(byteArrayInputStream);
    try {
        int minLength = sourceImg.getHeight() > sourceImg.getWidth() ? sourceImg.getWidth() : sourceImg.getHeight();
        if (minLength < minlength) {
            return null;
        }
        imageInfo.setOrgHeight(sourceImg.getHeight());
        imageInfo.setOrgWidth(sourceImg.getWidth());
        if (minLength < objectLength) {
            imageInfo.setImage(image);
            imageInfo.setHeight(sourceImg.getHeight());
            imageInfo.setWidth(sourceImg.getWidth());
            return imageInfo;
        }
        // Dimensions do not meet the target: scale down, keeping the aspect ratio
        try {
            double height, width;
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            if (sourceImg.getHeight() > sourceImg.getWidth()) {
                width = objectLength;
                height = (double) sourceImg.getHeight() * ((double) objectLength / (double) sourceImg.getWidth());
            } else {
                width = (double) sourceImg.getWidth() * ((double) objectLength / (double) sourceImg.getHeight());
                height = objectLength;
            }
            Thumbnails.of(new ByteArrayInputStream(image)).size((int) width, (int) height).toOutputStream(bos);
            imageInfo.setImage(bos.toByteArray());
            imageInfo.setHeight((int) height);
            imageInfo.setWidth((int) width);
            return imageInfo;
        } catch (Throwable e) {
            e.printStackTrace();
            return null;
        }
    } catch (Exception e) {
        e.printStackTrace();
        return imageInfo;

    }
}
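
The local reproduction described before the listing can be approximated without the business code. The sketch below is not the original test harness: the class name ResizeRepro, the worker count, and the bitmap dimensions are all assumptions, and a blank BufferedImage stands in for the decoded image that resize() creates. It only shows how several requests holding a decoded bitmap at the same time add up on the heap.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ResizeRepro {
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Starts `concurrency` workers that each hold one bitmap at the same time.
     * Prints heap usage while all bitmaps are live and returns how many workers finished.
     */
    static int holdConcurrently(int concurrency, int width, int height) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        CountDownLatch allAllocated = new CountDownLatch(concurrency);
        CountDownLatch release = new CountDownLatch(1);
        List<Future<Integer>> futures = new ArrayList<>();
        try {
            long before = usedHeapBytes();
            for (int i = 0; i < concurrency; i++) {
                futures.add(pool.submit(() -> {
                    BufferedImage bitmap;
                    try {
                        // Hypothetical stand-in for the decoded upload: 3 bytes per pixel.
                        bitmap = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
                    } finally {
                        allAllocated.countDown(); // count down even if allocation fails
                    }
                    release.await();          // keep the bitmap reachable, like a slow request
                    return bitmap.getWidth(); // used afterwards so it stays live until here
                }));
            }
            allAllocated.await();
            long during = usedHeapBytes();
            System.out.printf("heap used: before=%d MB, with %d bitmaps live=%d MB%n",
                    before >> 20, concurrency, during >> 20);
            release.countDown();
            int finished = 0;
            for (Future<Integer> f : futures) {
                if (f.get() == width) finished++;
            }
            return finished;
        } catch (Exception e) {
            // An OutOfMemoryError inside a worker surfaces here as an ExecutionException.
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int workers = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        int width = args.length > 1 ? Integer.parseInt(args[1]) : 2000;
        int height = args.length > 2 ? Integer.parseInt(args[2]) : 1500;
        System.out.println("finished workers: " + holdConcurrently(workers, width, height));
    }
}
```

The defaults are deliberately small. Running it with arguments such as 30 4000 3000 under -Xmx1G asks for roughly 30 x 36 MB of bitmaps, which is more than the heap can hold, so it is a cheap way to see the same failure mode in isolation.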

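Summary: so far, with the load test sustaining 30 concurrent requests, this method appears to be the cause. It resizes 4 MB images, and according to a Google search, resizing first decodes (unpacks) the image and then recompresses it. Once decoded, the image takes up far more memory than the file on disk.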
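
The gap between the file size and the decoded size can be measured with the JDK alone. This is a minimal sketch under stated assumptions: the class name DecodedSizeDemo and the 4000x3000 dimensions are illustrative, and the test image is blank, so it compresses far better than a real photo would. What carries over is the rule of thumb that the decoded pixel buffer costs about width x height x 3 (or 4) bytes regardless of how small the file is.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

class DecodedSizeDemo {
    /** Encodes a blank width x height RGB image as JPEG and returns the compressed bytes. */
    static byte[] encodeJpeg(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, "jpg", out)) {
                throw new IOException("no JPEG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decodes the bytes with ImageIO.read, as resize() does, and returns the pixel buffer size in bytes. */
    static long decodedBytes(byte[] encoded) {
        try {
            BufferedImage img = ImageIO.read(new ByteArrayInputStream(encoded));
            if (img == null) {
                throw new IOException("unsupported image format");
            }
            DataBuffer buf = img.getRaster().getDataBuffer();
            long bytesPerElement = DataBuffer.getDataTypeSize(buf.getDataType()) / 8;
            return (long) buf.getSize() * buf.getNumBanks() * bytesPerElement;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jpeg = encodeJpeg(4000, 3000); // hypothetical photo resolution
        System.out.printf("encoded: %d KB, decoded: %d MB%n",
                jpeg.length >> 10, decodedBytes(jpeg) >> 20);
    }
}
```

Note that the resize() listing decodes every upload twice, once in ImageIO.read and once more inside Thumbnails.of(...), so the decoded cost is paid more than once per request.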

III. Exporting and analyzing the OOM dump

Images, stack logs, and analysis report: link: https://pan.baidu.com/s/1NVrPtcYwuTPZEq3l2_vg-w extraction code: mbil

https://gceasy.ycrash.cn/my-gc-report.jsp?p=c2hhcmVkLzIwMjIvMDEvMTcvLS1nYy5sb2ctLTgtNS00NA==&channel=WEB

The garbage collector is G1, and the memory parameters are set as follows:

CMD java -Xms1G -Xmx1G -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:MaxGCPauseMillis=30 -XX:+DisableExplicitGC -XX:TargetSurvivorRatio=90 -XX:G1NewSizePercent=25 -XX:G1MaxNewSizePercent=40 -XX:G1MixedGCLiveThresholdPercent=35 -XX:+AlwaysPreTouch -XX:+ParallelRefProcEnabled -jar micro-server-base-1.0.6.2-SNAPSHOT-exec.jar

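1. System time is greater than User time

At the start of the load test everything runs normally. As more requests are processed, each request needs about 60 MB out of a 1 GB heap, and the 4 MB image buffers are allocated directly as humongous objects. As memory fills up, minor (young) GCs run first, followed by mixed GCs and, eventually, full GCs. At first the collector reclaims memory normally, but later the System time keeps growing.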
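
Why a 4 MB buffer counts as a humongous object under these settings can be checked with a little arithmetic. The sketch below is an approximation of HotSpot's default region sizing, not the JVM's actual code: it assumes -XX:G1HeapRegionSize is not set (it is absent from the command line above), that G1 aims for about 2048 regions, that the region size is a power of two between 1 MB and 32 MB, and that an object larger than half a region is treated as humongous. The class name G1HumongousEstimate is made up for illustration.

```java
class G1HumongousEstimate {
    static final long MB = 1024L * 1024L;

    /** Approximate default G1 region size for a given heap size. */
    static long regionSizeBytes(long heapBytes) {
        long target = Math.max(1, heapBytes / 2048);   // aim for roughly 2048 regions
        long powerOfTwo = Long.highestOneBit(target);  // round down to a power of two
        return Math.min(32 * MB, Math.max(MB, powerOfTwo));
    }

    /** An object larger than half a region is allocated as a humongous object. */
    static boolean isHumongous(long objectBytes, long heapBytes) {
        return objectBytes > regionSizeBytes(heapBytes) / 2;
    }

    /** Humongous objects occupy whole contiguous regions; header overhead is ignored here. */
    static long regionsNeeded(long objectBytes, long heapBytes) {
        long region = regionSizeBytes(heapBytes);
        return (objectBytes + region - 1) / region;
    }

    public static void main(String[] args) {
        long heap = 1024 * MB; // -Xms1G -Xmx1G
        System.out.println("region size (MB): " + regionSizeBytes(heap) / MB);
        System.out.println("4 MB buffer humongous: " + isHumongous(4 * MB, heap));
        System.out.println("regions for 4 MB: " + regionsNeeded(4 * MB, heap));
        System.out.println("regions for 60 MB: " + regionsNeeded(60 * MB, heap));
    }
}
```

Under these assumptions a 1 GB heap has about 1024 regions of 1 MB each, so every buffer above 512 KB bypasses the young generation and needs a run of contiguous free regions, which gets harder to find as the heap fills.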

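Right after a GC finishes, a large batch of threads arrives, so the JVM has to collect again and again. Occasionally a user thread completes in between, but the next thread that asks for memory fails with an OOM right away, and the rest follow one after another.
Objects in the old generation cannot be reclaimed in time because they are still referenced from the young generation, while new objects keep entering the young generation and dying at the next GC. Mixed GC cannot keep up, memory keeps overflowing, and the service finally crashes.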

Old-generation reclamation

At the beginning, memory can still be reclaimed normally

At the end, it can no longer be reclaimed

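At this rate an overflow is inevitable, but why can the memory not be reclaimed? Is it a memory leak? The report does not flag one, so the details still need analysis.
Preliminary conclusion: there are 30 threads, each holding the large objects produced by resize. Depending on how far each thread has progressed, some have reached resize() and grown by 60 MB while others have not reached resize() yet, for a total of about 950 MB. The OOM heap snapshot shows that each thread holds a different amount of memory at a different stage, and the OOM log shows the thread stacks failing at different locations.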
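
The figures above also give a rough ceiling on safe concurrency: 1024 MB / 60 MB is about 17 requests, which lines up with concurrency 10 being fine, 20 degrading, and 30 crashing. The original notes do not describe a fix, so the sketch below is only one possible mitigation, not what the service does: a hypothetical ResizeGate that uses a Semaphore to cap how many resize calls run at once. The class and method names are invented, and the real ceiling is lower than 17 because the rest of the service also needs heap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ResizeGate {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    ResizeGate(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Upper bound on concurrent resizes if each one needs perRequestBytes of heap. */
    static int maxConcurrent(long heapBytes, long perRequestBytes) {
        return (int) Math.max(1, heapBytes / perRequestBytes);
    }

    /** Runs the resize work once a permit is free; other callers wait instead of allocating. */
    void run(Runnable resizeWork) throws InterruptedException {
        permits.acquire();
        try {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            resizeWork.run();
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    int peakInFlight() {
        return peak.get();
    }

    /** Fires `tasks` simulated resizes through a gate with `limit` permits; returns the observed peak. */
    static int demoPeak(int limit, int tasks) {
        ResizeGate gate = new ResizeGate(limit);
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.submit(() -> {
                    try {
                        gate.run(() -> sleepQuietly(20)); // placeholder for ImageUtil.resize(...)
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return gate.peakInFlight();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("ceiling: " + maxConcurrent(1024L << 20, 60L << 20));
        System.out.println("peak in flight with limit 4: " + demoPeak(4, 30));
    }
}
```

A gate like this trades latency for stability: excess requests queue for a permit instead of all decoding images at once, so it would not explain or remove the underlying per-request cost of decoding each upload twice.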