Summary
During the integration of an IMU Preintegration Factor into a Visual-Inertial Odometry (VIO) system using Ceres Solver, a critical failure occurred in the manifold optimization. The system failed to converge, and the state estimates drifted aggressively. The investigation revealed a fundamental mismatch between the Local Parameterization (Manifold) implementation and the way the Cost Function (Jacobians) was being computed. Specifically, the Evaluate function was treating raw parameters as if they were in the tangent space, while simultaneously applying an exponential map that ignored the underlying manifold structure.
Root Cause
The failure stems from an inconsistency between how the state is stored and how it is updated. In Ceres, when using LocalParameterization (now deprecated in favor of Manifold, but still common), the optimizer computes its step in the tangent space (Lie algebra), while the Evaluate function always receives the manifold state (Lie group element) in its ambient representation.
The core issues identified were:
- Double Exponential Mapping: The Evaluate function was manually applying Sophus::SO3d::exp(omegai) to the input parameters. With a correctly configured LocalParameterization, the optimizer's delta updates are already mapped onto the manifold by Plus, so parameters[0] arrives as a manifold point. Applying exp() again inside Evaluate assumes that parameters[0] is a vector in $\mathfrak{so}(3)$. That assumption produces incorrect gradient directions, most visibly when the optimization step is large or the parameterization is inconsistent.
- Incorrect Jacobian Calculation: For manifold-based optimization, the Jacobian that reaches the solver must be taken with respect to the tangent-space increment $\delta$, not with respect to the raw parameter vector treated as if it were Euclidean. Ceres obtains it by multiplying the Jacobian written in Evaluate by the Jacobian of Plus supplied by the parameterization, so the two must be derived under the same convention. The code computed residuals from the manifold elements but did not provide a consistent $\frac{\partial \text{residual}}{\partial \delta}$ relationship, which the Gauss-Newton and Levenberg-Marquardt algorithms require.
- Parameter Block Mismatch: The ImuIntegFactor expected a specific structure, but the Plus operation in PoseSO3LocalParameterization was not mathematically consistent with the composition rule for $SE(3)$ or $SO(3) \times \mathbb{R}^3$ used in the factor.
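The mismatch between a stored manifold point and a tangent vector can be illustrated without any library. The sketch below is a minimal, standard-library-only illustration and not the project's code: the {w, x, y, z} storage order, the right-multiplicative update, and the names ExpQuat, Multiply, Plus, and MisreadAsRotationVector are all assumptions made for the example.

```cpp
#include <array>
#include <cmath>

// Quaternion stored as {w, x, y, z}; this ordering is an assumption of the sketch.
using Quat = std::array<double, 4>;
using Vec3 = std::array<double, 3>;

// Exponential map from a rotation vector in so(3) to a unit quaternion.
Quat ExpQuat(const Vec3& omega) {
  const double theta = std::sqrt(omega[0] * omega[0] + omega[1] * omega[1] +
                                 omega[2] * omega[2]);
  // Small-angle fallback: sin(theta / 2) / theta tends to 1/2.
  const double k = theta < 1e-10 ? 0.5 : std::sin(0.5 * theta) / theta;
  return {std::cos(0.5 * theta), k * omega[0], k * omega[1], k * omega[2]};
}

// Hamilton product a * b.
Quat Multiply(const Quat& a, const Quat& b) {
  return {a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
          a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
          a[0] * b[2] - a[1] * b[3] + a[2] * b[0] + a[3] * b[1],
          a[0] * b[3] + a[1] * b[2] - a[2] * b[1] + a[3] * b[0]};
}

// Manifold update x (+) delta = x * Exp(delta): the tangent increment is
// mapped onto the group here, once, and the result is what Evaluate receives.
Quat Plus(const Quat& x, const Vec3& delta) {
  return Multiply(x, ExpQuat(delta));
}

// The buggy pattern: the first three stored values are read as if they were
// a rotation vector and pushed through the exponential map a second time.
Quat MisreadAsRotationVector(const Quat& stored) {
  return ExpQuat({stored[0], stored[1], stored[2]});
}
```

Starting from the identity and applying Plus with a quarter turn about z yields a unit quaternion for that rotation, whereas MisreadAsRotationVector on the same stored block returns a rotation about x. The block layout and the code that reads it have to agree, otherwise the residual is evaluated at the wrong state.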
Why This Happens in Real Systems
In complex robotics frameworks, this happens due to Abstraction Leaks.
- Mathematical Complexity: VIO requires managing $SO(3)$ (rotations), $\mathbb{R}^3$ (translation, velocity, and IMU biases), and $\mathfrak{so}(3)$ (rotation increments in the tangent space) simultaneously.
- Library Misuse: Developers often treat Ceres as a “black box” for least-squares, forgetting that when moving from Euclidean space to Lie Groups, the standard addition $\mathbf{x} + \delta$ is replaced by the group composition $\mathbf{x} \oplus \delta$.
- Implicit Assumptions: A developer might write a CostFunction assuming the input is always a Lie Algebra element, while the optimizer is actually passing a Lie Group element that has been updated via a LocalParameterization.
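The replacement of addition by composition can be written out for a single pose block. The formulas below are a sketch under stated assumptions: a right-multiplicative rotation update, a pose in $SO(3) \times \mathbb{R}^3$, and the symbols $\delta_\theta$, $\delta_p$, $J$, $r$, and $\delta^\star$ introduced here for illustration. Other conventions (left-multiplicative, $SE(3)$) work equally well as long as the factor and the parameterization share them.

$$
\begin{aligned}
\mathbf{x} \oplus \delta &= \big(R\,\mathrm{Exp}(\delta_\theta),\ \mathbf{p} + \delta_p\big), \qquad \delta = (\delta_\theta, \delta_p) \in \mathbb{R}^6 \\
J &= \left.\frac{\partial\, r(\mathbf{x} \oplus \delta)}{\partial \delta}\right|_{\delta = 0} \\
J^\top J\,\delta^\star &= -J^\top r(\mathbf{x}) \\
\mathbf{x} &\leftarrow \mathbf{x} \oplus \delta^\star
\end{aligned}
$$

The Gauss-Newton step $\delta^\star$ lives in the tangent space, and only the last line touches the stored state, which is why the residual code never needs to apply the exponential map to the parameter block itself.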
Real-World Impact
- Divergence in SLAM/VIO: The pose estimate diverges almost immediately, rendering the navigation system useless.
- Unstable Covariance: The information matrix (Hessian approximation) becomes ill-conditioned, leading to physically impossible uncertainty estimates.
- Computational Waste: The optimizer may spend hundreds of iterations attempting to correct for a “gradient” that is actually a mathematical artifact of the incorrect parameterization, leading to high CPU/latency spikes.
Example or Code
The problematic pattern in the Evaluate function is highlighted below. The error is applying the exponential map to parameters that the optimizer expects to be treated as manifold points, without providing the correct tangent-space Jacobian.
    // INCORRECT PATTERN
    bool ImuIntegFactor::Evaluate(double const* const* parameters,
                                  double* residuals,
                                  double** jacobians) const {
      // parameters[0] is the manifold state (e.g., rotation vector or quaternion)
      // The developer incorrectly assumes they must "exp" it to get the rotation
      Eigen::Vector3d omegai(parameters[0][0], parameters[0][1], parameters[0][2]);
      Sophus::SO3d Ri = Sophus::SO3d::exp(omegai);

      // ... computation of residuals ...

      // ERROR: If jacobians is not NULL, the developer MUST provide
      // the derivative with respect to the tangent space (the delta),
      // not the derivative with respect to the raw parameters.
      if (jacobians) {
        // Calculating d_residual / d_parameters
        // instead of d_residual / d_tangent_space
      }
      return true;
    }
How Senior Engineers Fix It
Senior engineers follow the Lie Theory approach strictly:
- Define the Manifold: Ensure the LocalParameterization (or Manifold) correctly implements Plus together with its Jacobian: ComputeJacobian for LocalParameterization, or Plus, PlusJacobian, Minus, and MinusJacobian for Manifold.
- Tangent Space Jacobians: Always derive the Jacobian with respect to the tangent space increment $\delta$, and keep the Jacobian written in the Evaluate function consistent with the Plus Jacobian of the parameterization.
  - If the state is $\mathbf{x}$ and the residual is $f(\mathbf{x})$, we need $\frac{\partial f(\mathbf{x} \oplus \delta)}{\partial \delta} \big|_{\delta=0}$.
- Use Sophus/Eigen Properly: Use the adjoint representation $\text{Ad}_R$ to transform Jacobians between different coordinate frames (e.g., from body frame to world frame).
- Standardize Parameter Blocks: Ensure that a single parameter block (like pose) has a single, consistent manifold definition that matches all factors using that block.
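A finite-difference check along the tangent directions is a cheap way to validate a hand-derived Jacobian before it goes into a factor. The sketch below is a standard-library-only illustration, not the project's code: it assumes the toy residual $f(R) = R\mathbf{v}$ and the right-multiplicative update $R \oplus \delta = R\,\mathrm{Exp}(\delta)$, for which the analytic Jacobian is $-R[\mathbf{v}]_\times$. All function names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

Mat3 Multiply(const Mat3& a, const Mat3& b) {
  Mat3 c{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j)
      for (int k = 0; k < 3; ++k) c[i][j] += a[i][k] * b[k][j];
  return c;
}

Vec3 Apply(const Mat3& m, const Vec3& v) {
  Vec3 out{};
  for (int i = 0; i < 3; ++i)
    for (int k = 0; k < 3; ++k) out[i] += m[i][k] * v[k];
  return out;
}

// Skew-symmetric matrix [v]_x, so that Apply(Skew(v), u) equals v x u.
Mat3 Skew(const Vec3& v) {
  Mat3 s{};
  s[0] = {0.0, -v[2], v[1]};
  s[1] = {v[2], 0.0, -v[0]};
  s[2] = {-v[1], v[0], 0.0};
  return s;
}

// Rodrigues formula: Exp(w) = I + a [w]_x + b [w]_x^2.
Mat3 Exp(const Vec3& w) {
  const double t2 = w[0] * w[0] + w[1] * w[1] + w[2] * w[2];
  const double t = std::sqrt(t2);
  // Taylor expansions near zero avoid dividing by a tiny angle.
  const double a = t2 < 1e-8 ? 1.0 - t2 / 6.0 : std::sin(t) / t;
  const double b = t2 < 1e-8 ? 0.5 - t2 / 24.0 : (1.0 - std::cos(t)) / t2;
  const Mat3 k = Skew(w);
  const Mat3 k2 = Multiply(k, k);
  Mat3 r{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j)
      r[i][j] = (i == j ? 1.0 : 0.0) + a * k[i][j] + b * k2[i][j];
  return r;
}

// Analytic Jacobian of f(R) = R v under R (+) delta = R Exp(delta): -R [v]_x.
Mat3 AnalyticJacobian(const Mat3& R, const Vec3& v) {
  Mat3 j = Multiply(R, Skew(v));
  for (auto& row : j)
    for (double& e : row) e = -e;
  return j;
}

// Central differences of f(R (+) delta) along each tangent direction.
Mat3 NumericJacobian(const Mat3& R, const Vec3& v, double h) {
  Mat3 j{};
  for (int k = 0; k < 3; ++k) {
    Vec3 d{};
    d[k] = h;
    const Vec3 fp = Apply(Multiply(R, Exp(d)), v);
    d[k] = -h;
    const Vec3 fm = Apply(Multiply(R, Exp(d)), v);
    for (int i = 0; i < 3; ++i) j[i][k] = (fp[i] - fm[i]) / (2.0 * h);
  }
  return j;
}

// Largest element-wise difference between two matrices.
double MaxAbsDiff(const Mat3& a, const Mat3& b) {
  double m = 0.0;
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j) m = std::fmax(m, std::fabs(a[i][j] - b[i][j]));
  return m;
}
```

Comparing AnalyticJacobian against NumericJacobian with a step of about 1e-6 at a few random states should give agreement to several digits. A large discrepancy usually means the derivation used a different perturbation side or an additive update. The same check applies to a real factor by perturbing through the parameterization's own Plus instead of the Exp defined here.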
Why Juniors Miss It
- Euclidean Bias: Juniors are trained in standard calculus where $\mathbf{x}_{new} = \mathbf{x} + \Delta \mathbf{x}$. They struggle to realize that in $SO(3)$, the “addition” is a non-linear manifold operation.
- Black-Box Optimization: They often treat the jacobians argument in Ceres as a place to put “the derivative of the formula,” without realizing the formula must be derived relative to the tangent space defined by the LocalParameterization.
- Ignoring the Adjoint: They often forget that a Jacobian derived for a perturbation in one frame (for example, a left-multiplied, world-frame increment) is not the Jacobian for a perturbation in another (a right-multiplied, body-frame increment), and that converting between the two requires the Adjoint matrix.
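The role of the adjoint can be made explicit for $SO(3)$. The identities below are a short sketch rather than a full derivation; the labels $J_{\text{left}}$ and $J_{\text{right}}$ and the toy residual $f(R) = R\mathbf{v}$ are introduced here for illustration.

$$
\begin{aligned}
R\,\mathrm{Exp}(\delta) &= \mathrm{Exp}(\mathrm{Ad}_R\,\delta)\,R, \qquad \mathrm{Ad}_R = R \ \text{ for } SO(3) \\
J_{\text{right}} &= J_{\text{left}}\,\mathrm{Ad}_R \\
f(R) = R\mathbf{v}: \quad J_{\text{left}} &= -[R\mathbf{v}]_\times, \qquad J_{\text{right}} = -[R\mathbf{v}]_\times R = -R[\mathbf{v}]_\times
\end{aligned}
$$

The last equality uses $[R\mathbf{v}]_\times = R[\mathbf{v}]_\times R^\top$. A Jacobian derived for a left (world-frame) perturbation therefore has to be multiplied by $\mathrm{Ad}_R$ before it is used with a parameterization whose Plus multiplies on the right.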