IT数码 购物 网址 头条 软件 日历 阅读 图书馆
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
图片批量下载器
↓批量下载图片,美女图库↓
图片自动播放器
↓图片自动播放器↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁
 
   -> 人工智能 -> Color Map Optimization for 3D Reconstruction with Consumer Depth Cameras 论文解读 -> 正文阅读

[人工智能]Color Map Optimization for 3D Reconstruction with Consumer Depth Cameras 论文解读

输入

  • color/depth:输入RGB-D相机手持生成的连续图像(.jpg/.png)

  • 3d点云文件(.ply)

  • 相机轨迹:由KinectFusion生成(.log)

  • 相机内参:

    • 数据格式
      { f x 0 c x 0 f y c y 0 0 1 } \left\{ \begin{matrix} fx & 0 & cx \\ 0 & fy & cy \\ 0 & 0 & 1 \end{matrix} \right \} ????fx00?0fy0?cxcy1?????

    • fx = 525.0; fy = 525.0; // default focal length
      cx = 319.5; cy = 239.5; // default optical center
      

数据处理

  • 相机是手持的,会导致运动模糊
    • 进行模糊评估,从输入图像序列中选取一个子集
    • 选定一帧后,在该帧后的 ( t ? , t + ) (t-,t+) (t?,t+)时间间隔内选取模糊程度最低的一帧
      • 实际中使用 ( t ? , t + ) = ( 1 , 5 ) s (t-,t+)=(1,5)s (t?,t+)=(1,5)s
    • 关键帧表示为 I i I_i Ii?
      • 将mesh网格渲染到 I i I_i Ii?的图像平面上
      • 使用距离 I i I_i Ii?最近的深度图算出相机pose
      • 通过比较每个顶点的深度值和相应的深度进行可见性测试,由此可以确定关键帧 I i I_i Ii?对应的顶点子集
      • 这个顶点子集进一步通过投影点到图像边缘的距离和深度不连续性进行滤除。保留的顶点表示为 P i {P_i} Pi?
        • 实际中阈值为9个像素

Camera Pose Optimization

论文解读

整体思路

  • 输入:重建好的三维点云和一组关键帧,以及每一张关键帧相应的相机姿态
  • 原因:重建过程中估计的相机姿态存在一定的误差,直接进行纹理映射会产生重影模糊等问题
  • 优化过程
    • 首先根据点云3D坐标和关键帧的相机位姿,将3D顶点其投影到2D平面,得到若干张投影图
    • 根据每个顶点投影后的uv坐标获取到关键帧相应位置的像素值
    • 每个顶点在每个可见该顶点的关键帧中都可以得到一个对应的像素值 Γ i ( p , T i ) \Gamma_i(p, T_i) Γi?(p,Ti?),将这些像素值求平均就是 C§
  • 目标:让 C§ Γ i ( p , T i ) \Gamma_i(p, T_i) Γi?(p,Ti?)尽可能的接近。

目标函数设置

  • 输入
    • 关键帧集合 { I i } \{I_i\} {Ii?}
    • 关键帧对应的顶点子集 { P i } \{P_i\} {Pi?}
  • 优化物:位姿矩阵 { T i } \{T_i\} {Ti?}
  • notes:使用齐次坐标
    • P i ? P 3 P_i \subset \Bbb P^3 Pi??P3
    • T i T_i Ti? 4 × 4 4\times4 4×4的矩阵
    • 齐次坐标(homogeneous coordinates)
      • 在投影空间中,两条平行线会相交于无限远处的一点,在 ( X , Y ) 的基础上增加一个新的分量w,变成 ( x , y , w ) X = x / w , Y = y / w
  • 最终目标:
    • ? p ∈ P \forall p \in P ?pP,最大化p在所有相关联图像中 I p = { I i : p ∈ P } I_p = \{I_i : p \in P\} Ip?={Ii?:pP}的颜色一致性

刚体变换(Rigid transformation)

这里 { T i } \{T_i\} {Ti?}主要是考虑刚体变换(如旋转,平移),非刚体矫正在下一部分进行考虑

目标函数

目标函数为:
E ( C , T ) = ∑ i ∑ p ∈ P ( C ( p ) ? Γ i ( p , T i ) ) 2 E(C,T)=\sum_i\sum_{p\in P}(C(p)-\Gamma_i(p,T_i))^2 E(C,T)=i?pP?(C(p)?Γi?(p,Ti?))2

  • Γ i ( p , T i ) \Gamma_i(p,T_i) Γi?(p,Ti?)是点p在图像 I i I_i Ii?对应投影点的颜色值

  • C§是集合 { Γ i ( p , T i ) } I i ∈ I p \{\Gamma_i(p,T_i)\}_{I_i\in I_p} {Γi?(p,Ti?)}Ii?Ip?? (所有和点p相关的纹理图像的平均值)

  • 计算: Γ i ( u ( g ( p , T i ) ) ) \Gamma_i(u(g(p,T_i))) Γi?(u(g(p,Ti?)))

    • 刚体变换 g

    • 投影变换 u

    • 颜色变换 Γ i \Gamma_i Γi?
      g ( p , T i ) = T i p u ( g x , g y , g z , g w ) = ( g x f x g z + c x , g y f y g z + c y ) T g(p,T_i)= T_i p \\ u(g_x,g_y,g_z,g_w)=(\frac{g_xf_x}{g_z}+c_x,\frac{g_yf_y}{g_z}+c_y)^T g(p,Ti?)=Ti?pu(gx?,gy?,gz?,gw?)=(gz?gx?fx??+cx?,gz?gy?fy??+cy?)T

  • 可以推导出最终目标函数和残差项
    E ( C , T ) = ∑ i ∑ p ∈ P r i , p 2 r i , p = C ( p ) ? Γ i ( u ( g ( p , T i ) ) ) E( C, T)=\sum_i\sum_{ p\in P}r_{i,p}^2 \\ r_{i,p}= C( p)-\Gamma_i( u( g( p, T_i))) E(C,T)=i?pP?ri,p2?ri,p?=C(p)?Γi?(u(g(p,Ti?)))

高斯牛顿法求解

  • 原因:求解非线性最小二乘问题

  • 需要优化的参数整合为: x 0 = [ C 0 , T 0 ] x^0=[C^0,T^0] x0=[C0,T0]

    • T i 0 T_i^0 Ti0?是距离 I i I_i Ii?最近的深度图提供的相机姿态
    • C 0 ( p ) C^0(p) C0(p) { Γ i ( p , T i ) } I i ∈ I p \{\Gamma_i(p,T_i)\}_{I_i \in I_p} {Γi?(p,Ti?)}Ii?Ip??的平均值
  • 求解过程

    • 每次迭代都对参数x进行更新
      x k + 1 = x k + Δ x x^{k+1}=x^k+\Delta x xk+1=xk+Δx

    • 求解 Δ x \Delta x Δx
      J r T J t Δ x = ? J r T r J_r^T J_t \Delta x = -J_r^T r JrT?Jt?Δx=?JrT?r

      • r为残差项
      • J为雅各比矩阵(一阶导数以一定形式排列成的矩阵)

      r = [ r i , p ( x ) ∣ x = x k ] ( i , p ) , J r = [ ? r i , p ( x ) ∣ x = x k ] ( i , p ) r=[r_{i, p}( x)\mid _{ x= x^k}]_{(i, p)},\\ J_ r=[\nabla r_{i, p}( x)\mid _{ x= x^k}]_{(i, p)} r=[ri,p?(x)x=xk?](i,p)?,Jr?=[?ri,p?(x)x=xk?](i,p)?

    • J r J_r Jr?

      • 在每次迭代中,根据目标函数的定义计算残差项r

      • $ r_{i, p} $关于 C和 T j ∣ j ≠ i T_j\mid _{j\neq i} Tj?j?=i?的偏导都是平凡解

        • T i T_i Ti?局部线性化近似为 T i k T_i^k Tik?
          T i ≈ { 1 ? γ i β i α i γ i 1 ? α i a i ? β i α i 1 c i 0 0 0 1 } T i k T_i \approx \left\{ \begin{matrix} 1 & -\gamma_i & \beta_i & \alpha_i\\ \gamma_i & 1 & -\alpha_i & a_i\\ -\beta_i & \alpha_i & 1 & c_i \\ 0 & 0 & 0 & 1 \end{matrix} \right\}T_i^k Ti?????????1γi??βi?0??γi?1αi?0?βi??αi?10?αi?ai?ci?1?????????Tik?

        • ξ i = ( α i , β i , γ i , a i , b i , c i ) T \xi_i = (\alpha_i , \beta_i, \gamma_i, a_i, b_i, c_i )^T ξi?=(αi?,βi?,γi?,ai?,bi?,ci?)T

        • 由链式法则,可得:
          ? r i , p ( ξ i ) ∣ x = x k = ? ? ? ξ i ( Γ i ° u ° g ) ∣ x = x k = ? ? Γ i ( u ) J u ( g ) J g ( ξ i ) ∣ x = x k \nabla r_{i, p}(\xi_i)\mid _{ x= x^k}=-\frac{\partial}{\partial \xi_i}(\Gamma_i \circ u \circ g)\mid _{ x= x^k} \\ =-\nabla \Gamma_i( u) J_ u( g) J_ g(\xi_i)\mid _{ x= x^k} ?ri,p?(ξi?)x=xk?=??ξi???(Γi?°u°g)x=xk?=??Γi?(u)Ju?(g)Jg?(ξi?)x=xk?

交替优化法求解

  • 目标:高斯牛顿法计算复杂度过高,十分耗时,因此作者提出交替优化法
  • 过程
    • 当T固定时,非线性最小二乘问题变成了线性最小二乘问题(计算p在各个与其相关的图片上的投影点的颜色平均值)
      • C ( p ) = 1 n P ∑ Γ i ( p , T i ) C(p)=\frac{1}{n_ P}\sum \Gamma_i( p, T_i) C(p)=nP?1?Γi?(p,Ti?)
      • n P n_P nP?是与vertex p相关的图片个数
    • 当C固定时,目标函数分解为多个独立的目标函数
      • E i ( T ) = ∑ p ∈ P i r i , p 2 E_i( T)=\sum_{ p \in P_i}r_{i, p}^2 Ei?(T)=pPi??ri,p2?
      • E i ( T ) E_i(T) Ei?(T) 只和 ξ i \xi_i ξi?的六个变量有关,在每次迭代中,使用高斯牛顿法更新这六个变量
      • 每个 T i T_i Ti?都是相互独立的,因此可以并行计算

源码解析

注:本文只选取核心实现代码,均来源于open3d开源c++源代码,网址链接

雅各比矩阵具体求解实现方法

文中写得比较含糊其辞,个人认为这位大佬的解答推导很全面,不做更多解读

ColorMapUtils.cpp

std::vector<geometry::Image> CreateDepthBoundaryMasks(// 深度边缘检测
        const std::vector<geometry::Image>& images_depth,
        double depth_threshold_for_discontinuity_check, // 视差非连续区检查阈值?查看RigidOptimizer.h
        int half_dilation_kernel_size_for_discontinuity_map) {// 不连续映射的半扩张核大小?查看RigidOptimizer.h
        // 根据阈值判断保留哪些像素点(深度梯度变化过大,就扔掉)
        // 与上一参数相关,当发现边缘时,附近的点可以ignore,该阈值为half kernel
    auto n_images = images_depth.size();
    std::vector<geometry::Image> masks;
    for (size_t i = 0; i < n_images; i++) {
        utility::LogDebug("[MakeDepthMasks] geometry::Image {:d}/{:d}", i,
                          n_images);
        masks.push_back(*images_depth[i].CreateDepthBoundaryMask(
                depth_threshold_for_discontinuity_check,
                half_dilation_kernel_size_for_discontinuity_map));
    }
    return masks;
}

std::tuple<std::vector<std::vector<int>>, std::vector<std::vector<int>>>
CreateVertexAndImageVisibility(
        const geometry::TriangleMesh& mesh,
        const std::vector<geometry::Image>& images_depth,
        const std::vector<geometry::Image>& images_mask,
        const camera::PinholeCameraTrajectory& camera_trajectory,
        double maximum_allowable_depth,// 像素点深度阈值,大于某个阈值的会被ignore(例如背景)
        double depth_threshold_for_visibility_check) {// rgbd图像像素点深度和3d模型深度差别过大会被ignore
    size_t n_camera = camera_trajectory.parameters_.size();
    size_t n_vertex = mesh.vertices_.size();
    
    std::vector<std::vector<int>> visibility_image_to_vertex;// 相机在某位姿条件下可看到的像素点集合
    visibility_image_to_vertex.resize(n_camera);
    
    std::vector<std::vector<int>> visibility_vertex_to_image;//可以看到对应像素点的相机位姿集合
    visibility_vertex_to_image.resize(n_vertex);

#pragma omp parallel for schedule(static) \
        num_threads(utility::EstimateMaxThreads())// 并行计算
    for (int camera_id = 0; camera_id < int(n_camera); camera_id++) {
        for (int vertex_id = 0; vertex_id < int(n_vertex); vertex_id++) {
            Eigen::Vector3d X = mesh.vertices_[vertex_id];
            float u, v, d;
            std::tie(u, v, d) = Project3DPointAndGetUVDepth(
                    X, camera_trajectory.parameters_[camera_id]);// 获取该位姿下图片像素点所对应的三维空间坐标
            int u_d = int(round(u));
            int v_d = int(round(v));
            if (d < 0.0 ||
                !images_depth[camera_id].TestImageBoundary(u_d, v_d)) {// 深度值不满足阈值
                continue;
            }
            float d_sensor =
                    *images_depth[camera_id].PointerAt<float>(u_d, v_d);
            if (d_sensor > maximum_allowable_depth) {// 深度值过大(例如,背景)
                continue;
            }

            if (*images_mask[camera_id].PointerAt<uint8_t>(u_d, v_d) == 255) {// 判断点是否在边缘,在边缘的点会在相机位姿变化下产生过大的颜色变化
                continue;
            }
            if (std::fabs(d - d_sensor) >=
                depth_threshold_for_visibility_check) {// 深度错误,跳过
                continue;
            }
            visibility_image_to_vertex[camera_id].push_back(vertex_id);
            #pragma omp critical(CreateVertexAndImageVisibility)
            { visibility_vertex_to_image[vertex_id].push_back(camera_id); }
        }
    }

    for (int camera_id = 0; camera_id < int(n_camera); camera_id++) {
        size_t n_visible_vertex = visibility_image_to_vertex[camera_id].size();
        utility::LogDebug(
                "[cam {:d}]: {:d}/{:d} ({:.5f}%) vertices are visible",
                camera_id, n_visible_vertex, n_vertex,
                double(n_visible_vertex) / n_vertex * 100);
    }

    return std::make_tuple(visibility_vertex_to_image,
                           visibility_image_to_vertex);
}

void SetProxyIntensityForVertex(
        const geometry::TriangleMesh& mesh,
        const std::vector<geometry::Image>& images_gray,
        const utility::optional<std::vector<ImageWarpingField>>& warping_fields,// 图像非刚性畸变域,nongrigid部分使用
        const camera::PinholeCameraTrajectory& camera_trajectory,
        const std::vector<std::vector<int>>& visibility_vertex_to_image,
        std::vector<double>& proxy_intensity,
        int image_boundary_margin) {
    auto n_vertex = mesh.vertices_.size();
    proxy_intensity.resize(n_vertex);

#pragma omp parallel for schedule(static) \
        num_threads(utility::EstimateMaxThreads())
    for (int i = 0; i < int(n_vertex); i++) {
        proxy_intensity[i] = 0.0;
        float sum = 0.0;
        for (size_t iter = 0; iter < visibility_vertex_to_image[i].size();
             iter++) {// 找到能看到对应三维坐标的图像序号(相机位姿)
            int j = visibility_vertex_to_image[i][iter];
            float gray;
            bool valid = false;
            if (warping_fields.has_value()) {
                std::tie(valid, gray) = QueryImageIntensity<float>(
                        images_gray[j], warping_fields.value()[j],
                        mesh.vertices_[i], camera_trajectory.parameters_[j],
                        utility::nullopt, image_boundary_margin);
            } else {
                std::tie(valid, gray) = QueryImageIntensity<float>(
                        images_gray[j], utility::nullopt, mesh.vertices_[i],
                        camera_trajectory.parameters_[j], utility::nullopt,
                        image_boundary_margin);
            }

            if (valid) {
                sum += 1.0;
                proxy_intensity[i] += gray;
            }
        }
        if (sum > 0) {
            proxy_intensity[i] /= sum;// 求所有图像中的色彩均值,论文公式13
        }
    }
}

RigidOptimizer.cpp

static void ComputeJacobianAndResidualRigid(// 计算雅各比和残差项
        int row,
        Eigen::Vector6d& J_r,
        double& r,
        double& w,
        const geometry::TriangleMesh& mesh,// mesh
        const std::vector<double>& proxy_intensity,// 各像素颜色代表
        const geometry::Image& images_gray,// 灰度图
        const geometry::Image& images_dx,
        const geometry::Image& images_dy,
        const Eigen::Matrix4d& intrinsic,// 相机内参 4x4矩阵
        const Eigen::Matrix4d& extrinsic,// 相机当前位姿 4x4矩阵
        const std::vector<int>& visibility_image_to_vertex,
        const int image_boundary_margin) {
    J_r.setZero();
    r = 0;
    int vid = visibility_image_to_vertex[row];// 找到想处理的点
    Eigen::Vector3d x = mesh.vertices_[vid];
    Eigen::Vector4d g = extrinsic * Eigen::Vector4d(x(0), x(1), x(2), 1); // 刚体变化后坐标(公式2)
    Eigen::Vector4d uv = intrinsic * g;
    double u = uv(0) / uv(2);
    double v = uv(1) / uv(2);
    if (!images_gray.TestImageBoundary(u, v, image_boundary_margin)) return;
    bool valid;
    double gray, dIdx, dIdy;

    std::tie(valid, gray) = images_gray.FloatValueAt(u, v);
    std::tie(valid, dIdx) = images_dx.FloatValueAt(u, v);
    std::tie(valid, dIdy) = images_dy.FloatValueAt(u, v);
    if (gray == -1.0) return;
    // 开始求在对应图像上的投影
    double invz = 1. / g(2);
    // 直接通过链式法则求解J
    // 求解dud参数,公式3直接求导
    double v0 = dIdx * intrinsic(0, 0) * invz;
    double v1 = dIdy * intrinsic(1, 1) * invz;
    double v2 = -(v0 * g(0) + v1 * g(1)) * invz;
    // u将3D点映射到2D,u输出的是一个2D坐标,是一个2x1的矩阵,因此dudα也是一个2x1的矩阵,因此dIdu * dudα是一个常数。
    // 因为我们有6个要优化的参数x={α,β,γ,a,b,c}此会得到一个mx6的雅可比矩阵,且矩阵中的每个元素都是一个常数。
    J_r(0) = (-g(2) * v1 + g(1) * v2);//1x2矩阵 dot 2x1矩阵=常数
    J_r(1) = (g(2) * v0 - g(0) * v2);
    J_r(2) = (-g(1) * v0 + g(0) * v1);
    J_r(3) = v0;// duda
    J_r(4) = v1;// dudb
    J_r(5) = v2;// dudc
    r = (gray - proxy_intensity[vid]);// 似乎和论文中正好相反
    w = 1.0;  // Dummy
}

std::pair<geometry::TriangleMesh, camera::PinholeCameraTrajectory>
RunRigidOptimizer(const geometry::TriangleMesh& mesh,// mesh
                  const std::vector<geometry::RGBDImage>& images_rgbd,
                  const camera::PinholeCameraTrajectory& camera_trajectory,// 相机轨迹
                  const RigidOptimizerOption& option) {
    // 以下参数优化阶段会进行变化
    geometry::TriangleMesh opt_mesh = mesh;
    camera::PinholeCameraTrajectory opt_camera_trajectory = camera_trajectory;

    // 以下参数优化阶段不会改动
    std::vector<geometry::Image> images_gray;
    std::vector<geometry::Image> images_dx;
    std::vector<geometry::Image> images_dy;
    std::vector<geometry::Image> images_color;
    std::vector<geometry::Image> images_depth;
    std::vector<geometry::Image> images_mask;
    std::vector<std::vector<int>> visibility_vertex_to_image;
    std::vector<std::vector<int>> visibility_image_to_vertex;

    // ...这里忽略debug相关操作

    std::vector<double> proxy_intensity;
    int total_num_ = 0;
    int n_camera = int(opt_camera_trajectory.parameters_.size());
    // 获取像素点代理颜色值
    // 图像畸变域在rigid情况下为null
    SetProxyIntensityForVertex(opt_mesh, images_gray, utility::nullopt,
                               opt_camera_trajectory,
                               visibility_vertex_to_image, proxy_intensity,
                               option.image_boundary_margin_);
    for (int itr = 0; itr < option.maximum_iteration_; itr++) {
        utility::LogDebug("[Iteration {:04d}] ", itr + 1);
        double residual = 0.0;
        total_num_ = 0;
#pragma omp parallel for schedule(static) \
        num_threads(utility::EstimateMaxThreads())
        for (int c = 0; c < n_camera; c++) {// 开始公式10的计算
            Eigen::Matrix4d pose;
            pose = opt_camera_trajectory.parameters_[c].extrinsic_;// 获取相机外参

            auto intrinsic = opt_camera_trajectory.parameters_[c]
                                     .intrinsic_.intrinsic_matrix_;// 相机内参,求解3d点云在图像中的投影
            auto extrinsic = opt_camera_trajectory.parameters_[c].extrinsic_;// 相机当前位姿
            Eigen::Matrix4d intr = Eigen::Matrix4d::Zero();// 变为四维矩阵
            intr.block<3, 3>(0, 0) = intrinsic;
            intr(3, 3) = 1.0;// 最后一行最后一列数据设为1.0

            auto f_lambda = [&](int i, Eigen::Vector6d& J_r, double& r,
                                double& w) {// [&] c++11 的lambda表达式
                w = 1.0;  // Dummy.
                ComputeJacobianAndResidualRigid(
                        i, J_r, r, w, opt_mesh, proxy_intensity, images_gray[c],
                        images_dx[c], images_dy[c], intr, extrinsic,
                        visibility_image_to_vertex[c],
                        option.image_boundary_margin_);
            };
            Eigen::Matrix6d JTJ;
            Eigen::Vector6d JTr;
            double r2;
            std::tie(JTJ, JTr, r2) =
                    utility::ComputeJTJandJTr<Eigen::Matrix6d, Eigen::Vector6d>(
                            f_lambda, int(visibility_image_to_vertex[c].size()),
                            false);
            // 公式7
            bool is_success;
            Eigen::Matrix4d delta;
            std::tie(is_success, delta) =
                    utility::SolveJacobianSystemAndObtainExtrinsicMatrix(JTJ,    
            // 得到最终相机位姿优化后结果                                                             JTr);
            pose = delta * pose;
            opt_camera_trajectory.parameters_[c].extrinsic_ = pose;
            // 并行求解平均残差,不断优化
#pragma omp critical(RunRigidOptimizer)
            {
                residual += r2;
                total_num_ += int(visibility_image_to_vertex[c].size());
            }
        }

        //... 忽略文件保存以及输出部分

    return std::make_pair(opt_mesh, opt_camera_trajectory);
}

Non-Rigid Correction

  • 目的:修复不准确的mesh和镜头畸变导致的误差

整体思路

在Rigid的基础上,添加一个非刚体矫正(用变形函数F表示),重新通过链式法则进行优化

目标函数

对每一张图片 I i I_i Ii?

  • 非刚体校正可以用一个变形函数 F i F_i Fi?表示
    F i ( v i , l ) = v i , l + f i , l F_i( v_{i,l})= v_{i,l}+ f_{i,l} Fi?(vi,l?)=vi,l?+fi,l?

    • v i , l ∈ V i v_{i,l} \in V_i vi,l?Vi?,表示变形的控制范围
    • f i , l f_{i,l} fi,l?是一个二维向量
  • 图像平面上的所有投影点u通过非刚体矫正后变为
    F i ( u ) = u + ∑ i θ l ( u ) F_i( u)= u+\sum_i \theta_l( u) Fi?(u)=u+i?θl?(u)

    • θ l \theta_l θl?为双线性插值函数
  • 推导可得 F i ( u ) F_i( u) Fi?(u) { F i ( v i , l ) } v i , l ∈ V i \{ F_i( v_{i,l})\}_{ v_{i,l}\in V_i} {Fi?(vi,l?)}vi,l?Vi??的线性组合

  • V i V_i Vi?是一个 20 × 16 20\times16 20×16的网格( 21 × 17 21\times17 21×17个控制点)。之

  • 之前的目标函数通过非刚体变换后变为
    E c ( C , T , F ) = ∑ i ∑ p ∈ P i r i , p 2 E_c( C, T, F)=\sum_i\sum_{ p\in P_i}r_{i, p}^2 Ec?(C,T,F)=i?pPi??ri,p2?

  • 残差项为
    r i , p = C ( p ) ? Γ i ( F i ( u ( g ( p , T i ) ) ) ) r_{i, p}=C( p)-\Gamma_i( F_i( u( g( p, T_i)))) ri,p?=C(p)?Γi?(Fi?(u(g(p,Ti?))))

  • 为了防止校正函数漂移,加入 L 2 L^2 L2正则项:
    E r ( F ) = ∑ i ∑ l f i , l T E_r( F)=\sum_i\sum_l f_{i,l}^T Er?(F)=i?l?fi,lT?

  • 则完整的目标函数为:
    E ( C , T , F ) = E c ( C , T , F ) + λ E r ( F ) E( C, T, F)=E_c( C, T, F)+\lambda Er( F) E(C,T,F)=Ec?(C,T,F)+λEr(F)

  • 实际中作者使用 λ = 0.1 \lambda=0.1 λ=0.1

源码解析

注:本文只选取核心实现代码,均来源于open3d开源c++源代码,网址链接

NonRigidOptimizer.cpp

// 详见公式17,与rigid相比多出了F函数的求解,在此基础上进行代码理解
static void ComputeJacobianAndResidualNonRigid(
        int row,
        Eigen::Vector14d& J_r,
        double& r,
        Eigen::Vector14i& pattern,
        const geometry::TriangleMesh& mesh,
        const std::vector<double>& proxy_intensity,
        const geometry::Image& images_gray,
        const geometry::Image& images_dx,
        const geometry::Image& images_dy,
        const ImageWarpingField& warping_fields,
        const Eigen::Matrix4d& intrinsic,
        const Eigen::Matrix4d& extrinsic,
        const std::vector<int>& visibility_image_to_vertex,
        const int image_boundary_margin) {
    J_r.setZero();
    pattern.setZero();
    r = 0;
    int anchor_w = warping_fields.anchor_w_;
    double anchor_step = warping_fields.anchor_step_;
    int vid = visibility_image_to_vertex[row];
    Eigen::Vector3d V = mesh.vertices_[vid];
    Eigen::Vector4d G = extrinsic * Eigen::Vector4d(V(0), V(1), V(2), 1);
    Eigen::Vector4d L = intrinsic * G;
    double u = L(0) / L(2);
    double v = L(1) / L(2);
    if (!images_gray.TestImageBoundary(u, v, image_boundary_margin)) {
        return;
    }
    int ii = (int)(u / anchor_step);
    int jj = (int)(v / anchor_step);
    if (ii >= warping_fields.anchor_w_ - 1 ||
        jj >= warping_fields.anchor_h_ - 1) {
        return;
    }
    double p = (u - ii * anchor_step) / anchor_step;// 根据step进行warping
    double q = (v - jj * anchor_step) / anchor_step;
    Eigen::Vector2d grids[4] = {// 4x2矩阵
            warping_fields.QueryFlow(ii, jj),
            warping_fields.QueryFlow(ii, jj + 1),
            warping_fields.QueryFlow(ii + 1, jj),
            warping_fields.QueryFlow(ii + 1, jj + 1),
    };
    Eigen::Vector2d uuvv = (1 - p) * (1 - q) * grids[0] +
                           (1 - p) * (q)*grids[1] + (p) * (1 - q) * grids[2] +
                           (p) * (q)*grids[3];
    double uu = uuvv(0);
    double vv = uuvv(1);
    if (!images_gray.TestImageBoundary(uu, vv, image_boundary_margin)) {
        return;
    }
    bool valid;
    double gray, dIdfx, dIdfy;
    std::tie(valid, gray) = images_gray.FloatValueAt(uu, vv);
    std::tie(valid, dIdfx) = images_dx.FloatValueAt(uu, vv);
    std::tie(valid, dIdfy) = images_dy.FloatValueAt(uu, vv);
    Eigen::Vector2d dIdf(dIdfx, dIdfy);
    Eigen::Vector2d dfdx =
            ((grids[2] - grids[0]) * (1 - q) + (grids[3] - grids[1]) * q) /
            anchor_step;// x变化,ii
    Eigen::Vector2d dfdy =
            ((grids[1] - grids[0]) * (1 - p) + (grids[3] - grids[2]) * p) /
            anchor_step;// y变化,jj
    // 求解delta r(公式20的上一个公式)
    // dIdx=dIdf*dfdx
    double dIdx = dIdf.dot(dfdx);// df/dx-->JF(u)
    double dIdy = dIdf.dot(dfdy);
    double invz = 1. / G(2);
    double v0 = dIdx * intrinsic(0, 0) * invz;
    double v1 = dIdy * intrinsic(1, 1) * invz;
    double v2 = -(v0 * G(0) + v1 * G(1)) * invz;
    // 等同于rigid部分,求解delta r(thetai)
    J_r(0) = -G(2) * v1 + G(1) * v2;
    J_r(1) = G(2) * v0 - G(0) * v2;
    J_r(2) = -G(1) * v0 + G(0) * v1;
    J_r(3) = v0;
    J_r(4) = v1;
    J_r(5) = v2;
    // 基于nonrigid,求解delta r(fi,l)
    J_r(6) = dIdf(0) * (1 - p) * (1 - q);
    J_r(7) = dIdf(1) * (1 - p) * (1 - q);
    J_r(8) = dIdf(0) * (1 - p) * (q);
    J_r(9) = dIdf(1) * (1 - p) * (q);
    J_r(10) = dIdf(0) * (p) * (1 - q);
    J_r(11) = dIdf(1) * (p) * (1 - q);
    J_r(12) = dIdf(0) * (p) * (q);
    J_r(13) = dIdf(1) * (p) * (q);
    pattern(0) = 0;
    pattern(1) = 1;
    pattern(2) = 2;
    pattern(3) = 3;
    pattern(4) = 4;
    pattern(5) = 5;
    pattern(6) = 6 + (ii + jj * anchor_w) * 2;
    pattern(7) = 6 + (ii + jj * anchor_w) * 2 + 1;
    pattern(8) = 6 + (ii + (jj + 1) * anchor_w) * 2;
    pattern(9) = 6 + (ii + (jj + 1) * anchor_w) * 2 + 1;
    pattern(10) = 6 + ((ii + 1) + jj * anchor_w) * 2;
    pattern(11) = 6 + ((ii + 1) + jj * anchor_w) * 2 + 1;
    pattern(12) = 6 + ((ii + 1) + (jj + 1) * anchor_w) * 2;
    pattern(13) = 6 + ((ii + 1) + (jj + 1) * anchor_w) * 2 + 1;
    r = (gray - proxy_intensity[vid]);
}

std::pair<geometry::TriangleMesh, camera::PinholeCameraTrajectory>
RunNonRigidOptimizer(const geometry::TriangleMesh& mesh,
                     const std::vector<geometry::RGBDImage>& images_rgbd,
                     const camera::PinholeCameraTrajectory& camera_trajectory,
                     const NonRigidOptimizerOption& option) {
    // The following properties will change during optimization.
    geometry::TriangleMesh opt_mesh = mesh;
    camera::PinholeCameraTrajectory opt_camera_trajectory = camera_trajectory;
    std::vector<ImageWarpingField> warping_fields;

    // ...和Rigid基本近似,删除
    // 在这里只保留核心部分
    
    warping_fields = CreateWarpingFields(images_gray,
                                         option.number_of_vertical_anchors_);
    warping_fields_init = CreateWarpingFields(
            images_gray, option.number_of_vertical_anchors_);
    
    // ...和Rigid基本近似,删除
    // 在这里只保留核心部分
    
#pragma omp parallel for schedule(static) \
        num_threads(utility::EstimateMaxThreads())
        for (int c = 0; c < n_camera; c++) {
            int nonrigidval = warping_fields[c].anchor_w_ *
                              warping_fields[c].anchor_h_ * 2;
            double rr_reg = 0.0;

            Eigen::Matrix4d pose;
            pose = opt_camera_trajectory.parameters_[c].extrinsic_;

            auto intrinsic = opt_camera_trajectory.parameters_[c]
                                     .intrinsic_.intrinsic_matrix_;
            auto extrinsic = opt_camera_trajectory.parameters_[c].extrinsic_;
            Eigen::Matrix4d intr = Eigen::Matrix4d::Zero();
            intr.block<3, 3>(0, 0) = intrinsic;
            intr(3, 3) = 1.0;

            auto f_lambda = [&](int i, Eigen::Vector14d& J_r, double& r,
                                Eigen::Vector14i& pattern) {
                ComputeJacobianAndResidualNonRigid(
                        i, J_r, r, pattern, opt_mesh, proxy_intensity,
                        images_gray[c], images_dx[c], images_dy[c],
                        warping_fields[c], intr, extrinsic,
                        visibility_image_to_vertex[c],
                        option.image_boundary_margin_);
            };
            Eigen::MatrixXd JTJ;
            Eigen::VectorXd JTr;
            double r2;
            std::tie(JTJ, JTr, r2) =
                    ComputeJTJandJTrNonRigid<Eigen::Vector14d, Eigen::Vector14i,
                                             Eigen::MatrixXd, Eigen::VectorXd>(
                            f_lambda, int(visibility_image_to_vertex[c].size()),
                            nonrigidval, false);

            double weight = option.non_rigid_anchor_point_weight_ *
                            visibility_image_to_vertex[c].size() / n_vertex;
            for (int j = 0; j < nonrigidval; j++) {
                double r = weight * (warping_fields[c].flow_(j) -
                                     warping_fields_init[c].flow_(j));// r求解已不同于rigid
                JTJ(6 + j, 6 + j) += weight * weight;
                JTr(6 + j) += weight * r;
                rr_reg += r * r;
            }

            bool success;
            Eigen::VectorXd result;
            std::tie(success, result) = utility::SolveLinearSystemPSD(
                    JTJ, -JTr, /*prefer_sparse=*/false,
                    /*check_symmetric=*/false,
                    /*check_det=*/false, /*check_psd=*/false);
            Eigen::Vector6d result_pose;
            result_pose << result.block(0, 0, 6, 1);
            auto delta = utility::TransformVector6dToMatrix4d(result_pose);
            pose = delta * pose;

            for (int j = 0; j < nonrigidval; j++) {
                warping_fields[c].flow_(j) += result(6 + j);
            }// 更新warping fields
            opt_camera_trajectory.parameters_[c].extrinsic_ = pose;
#pragma omp critical(RunRigidOptimizer)
            {
                residual += r2;
                total_num_ += int(visibility_image_to_vertex[c].size());
            }
        }

        //... 忽略文件保存以及输出部分

    return std::make_pair(opt_mesh, opt_camera_trajectory);
}

python demo运行

import open3d as o3d
import os
import sys
import re

def load_fountain_dataset():
	def get_file_list(path, extension=None):
		def sorted_alphanum(file_list_ordered):
			convert = lambda text: int(text) if text.isdigit() else text
			alphanum_key = lambda key: [
                convert(c) for c in re.split('([0-9]+)', key)
            ]
			return sorted(file_list_ordered, key=alphanum_key)
		
		if extension is None:
			file_list = [
                path + f
                for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))
            ]
		else:
			file_list = [
                path + f
                for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f)) and
                os.path.splitext(f)[1] == extension
            ]
		file_list = sorted_alphanum(file_list)
		return file_list
		
	path = "./data/fountain/"
	depth_image_path = get_file_list(os.path.join(path, "depth/"),extension=".png")
	color_image_path = get_file_list(os.path.join(path, "image/"),extension=".jpg")# 也可以使用png格式
	assert (len(depth_image_path) == len(color_image_path))
	
	rgbd_images = []
	for i in range(len(depth_image_path)):
		depth = o3d.io.read_image(os.path.join(depth_image_path[i]))
		color = o3d.io.read_image(os.path.join(color_image_path[i]))
		rgbd_image = o3d.geometry.RGBDImage.create_from_color_and_depth(
			color, depth, convert_rgb_to_intensity=False)
		rgbd_images.append(rgbd_image)

	camera_trajectory = o3d.io.read_pinhole_camera_trajectory(
        os.path.join(path, "scene/key.log"))
	mesh = o3d.io.read_triangle_mesh(
		os.path.join(path, "scene", "integrated.ply"))
	return mesh, rgbd_images, camera_trajectory

# 读取数据集
mesh, rgbd_images, camera_trajectory = load_fountain_dataset()

# 可以在这里设置相机内参为非默认值
for i in range(len(camera_trajectory.parameters)):
    camera_trajectory.parameters[i].intrinsic.intrinsic_matrix=[[4509.06,0,837.755],
                                                                [0,4509.06,557.454],
                                                                [0,0,1]]
print(camera_trajectory.parameters[0].intrinsic.intrinsic_matrix)

mesh_optimized = o3d.pipelines.color_map.run_rigid_optimizer(
    mesh, rgbd_images, camera_trajectory,
    o3d.pipelines.color_map.RigidOptimizerOption(maximum_iteration=0))# 最终结果可视化,rigid和nonrigid的结果可视化同理,在这里篇幅有限删去
o3d.visualization.draw_geometries([mesh_optimized],
                                  zoom=0.5399,
                                  front=[0.0665, -0.1107, -0.9916],
                                  lookat=[0.7353, 0.6537, 1.0521],
                                  up=[0.0136, -0.9936, 0.1118])
is_ci=True
# Run rigid optimization.
maximum_iteration = 100 if is_ci else 300
with o3d.utility.VerbosityContextManager(
        o3d.utility.VerbosityLevel.Debug) as cm:
    mesh_optimized = o3d.pipelines.color_map.run_rigid_optimizer(
        mesh, rgbd_images, camera_trajectory,
        o3d.pipelines.color_map.RigidOptimizerOption(
            maximum_iteration=maximum_iteration))

# Run non-rigid optimization.
maximum_iteration = 100 if is_ci else 300
with o3d.utility.VerbosityContextManager(
        o3d.utility.VerbosityLevel.Debug) as cm:
    mesh_optimized = o3d.pipelines.color_map.run_non_rigid_optimizer(
        mesh, rgbd_images, camera_trajectory,
        o3d.pipelines.color_map.NonRigidOptimizerOption(
            maximum_iteration=maximum_iteration))

参考

感谢国内外各位大佬的笔记和数据集以及open3d文档说明

  1. csdn某大佬的note
  2. 这位大佬的关于雅各布矩阵的推导求解
  3. Open3D官方文档
  4. Zhou大佬的官方主页,可以搜寻到数据集和论文
  5. 数据集和Log文件格式说明
  人工智能 最新文章
2022吴恩达机器学习课程——第二课(神经网
第十五章 规则学习
FixMatch: Simplifying Semi-Supervised Le
数据挖掘Java——Kmeans算法的实现
大脑皮层的分割方法
【翻译】GPT-3是如何工作的
论文笔记:TEACHTEXT: CrossModal Generaliz
python从零学(六)
详解Python 3.x 导入(import)
【答读者问27】backtrader不支持最新版本的
上一篇文章      下一篇文章      查看所有文章
加:2021-11-23 12:20:39  更:2021-11-23 12:23:08 
 
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁

360图书馆 购物 三丰科技 阅读网 日历 万年历 2025年1日历 -2025/1/11 4:48:17-

图片自动播放器
↓图片自动播放器↓
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
图片批量下载器
↓批量下载图片,美女图库↓
  网站联系: qq:121756557 email:121756557@qq.com  IT数码