SGAD-SLAM: Splatting Gaussians at Adjusted Depth for Better Radiance Fields in RGBD SLAM
Abstract: 3D Gaussian Splatting (3DGS) has made remarkable progress in RGBD SLAM. Current methods usually use 3D Gaussians or view-tied 3D Gaussians to represent radiance fields in tracking and mapping. However, these Gaussians are either too flexible or too limited in movements, resulting in slow convergence or limited rendering quality. To resolve this issue, we adopt pixel-aligned Gaussians but allow each Gaussian to adjust its position along its ray to maximize the rendering quality, even if Gaussians are simplified to improve system scalability. To speed up the tracking, we model the depth distribution around each pixel as a Gaussian distribution, and then use these distributions to align each frame to the 3D scene quickly. We report our evaluations on widely used benchmarks, justify our designs, and show advantages over the latest methods in view rendering, camera tracking, runtime, and storage complexity. Please see our project page for code and videos at https://machineperceptionlab.github.io/SGAD-SLAM-Project .
Explain it Like I'm 14
What is this paper about?
This paper introduces a faster, more memory-friendly way to build 3D maps from a video that also has depth (called RGB-D). The system is named SGAD-SLAM. It helps a camera figure out where it is (tracking) while it builds a 3D model of the scene (mapping). The big idea is to represent the scene using lots of tiny, soft "blobs" of color and transparency (called Gaussians) that are tied to image pixels but can slide a little closer or farther along the camera's line of sight. This makes the 3D model look better and run faster, even in large spaces.
What questions are the researchers trying to answer?
They focus on three simple questions:
- How can we make 3D maps that look good from different viewpoints without slowing down the system?
- How can we keep tracking the camera's position accurately and quickly, even when the scene is large or the images are noisy?
- How can we use less memory so the method scales to big spaces?
How does their method work? (Explained with everyday ideas)
To understand the method, think of two jobs happening at the same time: mapping and tracking.
- Mapping (drawing the world): Imagine painting a 3D scene with millions of tiny, soft, colored dots (Gaussians). Each dot starts at a pixel in the camera image and sits on an invisible "string" (a ray) that runs from the camera, through that pixel, into the scene. SGAD-SLAM lets each dot slide a bit forward or backward along its string to find the best spot that makes the rendered picture match the real photo. This "sliding" is called an adjusted depth offset. Because the dots stay tied to pixels and only move along their own string, the system needs to keep far fewer dots in memory at once (just those for the current and nearby frames), which makes it much more scalable.
- Tracking (knowing where the camera is): Think of the depth image as a set of bumps and shapes. Around each point, the method summarizes the local shape as a little soft 3D "puff" (a Gaussian) that captures the neighborhood's orientation and spread. To find the camera pose for a new frame, the system aligns these puffs from the current frame to a global set of puffs built from earlier frames, like matching puzzle pieces by their shapes, not just by exact points. This shape-matching (a robust form of ICP) is fast and resistant to noise. If the scene is tricky (fast motion or little texture), they first do a quick coarse alignment using the rendered dots from the previous frame to get a good starting position.
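The "dot on a string" idea from the mapping step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera, depth measured along the camera z-axis, and one offset per pixel; the function name `backproject_adjusted` and the variable `offset` are hypothetical.

```python
import numpy as np

def backproject_adjusted(depth, offset, K):
    """Back-project every pixel to a 3D point at its adjusted depth.

    depth  : (H, W) sensor depth along the camera z-axis
    offset : (H, W) per-pixel depth offset (hypothetical name)
    K      : (3, 3) pinhole intrinsics
    Returns (H, W, 3) Gaussian centers in the camera frame.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))            # pixel coordinates
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T                            # ray directions with z = 1
    return rays * (depth + offset)[..., None]                  # scale each ray by its adjusted depth


K = np.array([[500.0, 0.0, 2.0], [0.0, 500.0, 1.5], [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 2.0)
offset = np.full((3, 4), 0.05)      # slide every dot 0.05 depth units farther

base = backproject_adjusted(depth, np.zeros_like(depth), K)
moved = backproject_adjusted(depth, offset, K)

# Re-project the moved centers with the same intrinsics.
proj = moved @ K.T
proj = proj[..., :2] / proj[..., 2:3]
```

Because the offset only rescales the ray through each pixel, the moved centers re-project onto the same pixels they started from; the dots change depth but never leave their strings.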
Two more practical choices make it efficient:
- Simplified dots: They use spherical "blobs" (simpler Gaussians) with just color, size, opacity, and a 1-number depth offset. This saves memory compared to full, stretched ellipsoids.
- Scale normalization: They normalize sizes so frames with different depth ranges still match well, preventing scale mismatches from confusing the tracker.
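To get a feel for the memory saving from simplified dots, here is a rough per-dot count of stored numbers. The "full" column follows the original 3DGS parameterization with degree-3 spherical harmonics; the "simplified" column assumes a plain 3-float RGB color and that the dot's center is not stored because it follows from the pixel, the depth, and the offset. Both are assumptions for illustration, not figures reported in the paper.

```python
# Floats stored per primitive in the original 3DGS formulation:
# position, anisotropic scale, rotation quaternion, opacity, and
# degree-3 spherical harmonics (16 coefficients per RGB channel).
full_3dgs = {"position": 3, "scale": 3, "rotation": 4, "opacity": 1, "sh_color": 3 * 16}

# Simplified pixel-aligned primitive as described in the list above.
# A 3-float RGB color is an assumption; the center is derived, not stored.
simplified = {"color": 3, "size": 1, "opacity": 1, "depth_offset": 1}

n_full = sum(full_3dgs.values())
n_simple = sum(simplified.values())
print(n_full, n_simple, round(n_full / n_simple, 1))
```

Under these assumptions a simplified dot needs roughly a tenth of the numbers a full ellipsoid does, before counting the further saving from keeping only a few frames' dots in memory.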
What did they find, and why does it matter?
Across several well-known test sets (Replica, TUM-RGBD, ScanNet, and ScanNet++), SGAD-SLAM:
- Renders views more accurately: It produces sharper, more faithful images (higher PSNR and SSIM, lower LPIPS) than recent methods, even those that also use Gaussian splatting.
- Tracks the camera more precisely: It often achieves the best or near-best camera accuracy (lower ATE RMSE), sometimes outperforming methods that rely on extra pre-trained "loop closure" detectors.
- Runs faster and scales better: Because it only optimizes dots for a few frames at a time, it uses memory more efficiently and can process large scenes. Tracking is especially fast thanks to the shape-based alignment. It also parallelizes well across multiple GPUs.
- Is robust to noise: Since it matches overall shapes (not just individual points) and allows each pixel's dot to slide along its ray, it stays stable even when depth data is imperfect.
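Two of the scores named above are easy to compute by hand. The sketch below shows PSNR (higher is better) and a simplified ATE RMSE (lower is better). It assumes image values in [0, 1] and that the estimated and ground-truth trajectories are already in the same coordinate frame; benchmark protocols usually align the trajectories first, and that step is omitted here. The function names are illustrative.

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images with values in [0, max_val]."""
    diff = np.asarray(rendered, float) - np.asarray(reference, float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def ate_rmse(est_positions, gt_positions):
    """RMSE of per-frame position errors between two (N, 3) trajectories.

    Assumes both trajectories are already expressed in the same frame.
    """
    diff = np.asarray(est_positions, float) - np.asarray(gt_positions, float)
    err = np.linalg.norm(diff, axis=1)            # one distance per frame
    return float(np.sqrt(np.mean(err ** 2)))

reference = np.zeros((4, 4, 3))
rendered = reference + 0.1                        # every pixel off by 0.1
gt = np.zeros((5, 3))
est = gt + np.array([0.03, 0.0, 0.04])            # every pose off by 0.05

image_score = psnr(rendered, reference)
traj_error = ate_rmse(est, gt)
```

A uniform error of 0.1 on a [0, 1] image gives a PSNR of about 20 dB, and a constant 0.05 position error gives an ATE RMSE of 0.05 in the same units as the trajectory.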
These results matter because SLAM systems are the backbone of robotics, AR, and VR. Better rendering improves how virtual and real worlds blend; faster, more accurate tracking makes robots and headsets more reliable; and lower memory use enables larger, more complex spaces.
What's the bigger impact?
- For AR/VR: More realistic scenes and smoother motion mean more convincing experiences.
- For robots: Faster, robust tracking and mapping help robots navigate and understand large spaces without heavy hardware.
- For 3D capture: Creators can scan bigger areas and get better-looking reconstructions more quickly.
In short, the paper shows that letting each pixel's "soft dot" slide a little along its viewing line, while keeping the representation simple, strikes a sweet spot: high visual quality, accurate camera tracking, fast performance, and the ability to handle large scenes.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
The paper introduces a strong approach but leaves several aspects under-specified or unexplored. The following concrete gaps can guide future research:
- Ambiguity in depth-offset parameterization: the text alternates between a per-Gaussian offset and a single frame-wise offset δ_i. Clarify which is used and quantify the trade-offs (rendering fidelity, memory, and convergence) between per-pixel vs. per-frame offsets.
- Restricted motion model: allowing Gaussians to move only along viewing rays cannot correct lateral/parallax errors from miscalibrated depth or multi-view inconsistencies. Assess failure modes and explore lightweight in-plane or local 3D displacements.
- Simplified spherical Gaussians without anisotropy or view-dependent appearance: omitting ellipsoidal covariances and spherical harmonics limits modeling of slanted surfaces and specularities. Evaluate on glossy/reflective/translucent scenes and study low-cost SH/anisotropy additions.
- No local densification: removing densification may miss fine/thin structures and high-frequency details. Benchmark on scenes with thin geometry (e.g., wires, chair legs) and test adaptive, budgeted densification along rays.
- Occlusion ordering and surface consistency: depth-offset adjustments may move Gaussians across surfaces and violate visibility. Introduce occlusion-aware regularizers or cross-view consistency constraints and analyze failure cases.
- Dependence on depth inpainting/"ground-truth" depth for initialization: robustness to real sensor artifacts (holes, edge bleeding, multipath, rolling shutter, per-frame scale bias) is not thoroughly evaluated. Test realistic noise/bias models and large missing regions.
- Tracking degeneracies in low-structure scenes: geometry-only GICP can be ill-posed in planar/corridor settings. Add degeneracy detection, regularization, or complementary cues (photometric/semantic) and quantify failure rates.
- Scale normalization details: the normalization is critical but the exact formulation is not fully specified. Provide the formula, invariance properties, and sensitivity analysis for reproducibility.
- Convergence basin and initialization: quantify how far the pose can be from ground truth for reliable convergence; report success rates and runtime overhead of the rendering-based initialization vs. alternatives.
- Global map growth and scalability of the tracking set T: memory and search costs for long sequences/large spaces are not characterized. Define pruning, curation, and hierarchical/voxel hashing strategies; evaluate on hour-long or building-scale sequences.
- Lack of loop closure: performance over very long loops is untested. Develop loop-closure mechanisms compatible with geometry-only distributions (without data-driven place recognition) and evaluate drift correction.
- Consistency of appearance across frames: optimizing only per-frame Gaussians (with limited neighbor coupling) risks color/geometry seams across the scene. Add global photometric constraints or lightweight cross-frame bundle adjustment and measure seam artifacts.
- Unified scene representation at inference: the total number of per-frame Gaussians can be extremely large (hundreds of millions). Specify how a compact, globally queryable radiance field is assembled, stored, and streamed post-mapping.
- Real-time constraints: single-GPU mapping runs at ~0.89 s/frame (non-real-time). Explore accuracy-speed trade-offs, scheduling (e.g., keyframes), or model compression to reach 30 fps mapping on commodity hardware.
- Robustness beyond random depth noise: current tests inject random pixel noise. Evaluate systematic biases (depth-scale errors), spatially correlated noise, temporal outliers, and sensor-specific artifacts across devices.
- Parameter sensitivity: the impact of R (downsampling), K_c (neighbors), NN(i) (neighbor frames), and the three loss weights is not studied. Provide ablations or auto-tuning strategies.
- Generalization to outdoor and large depth ranges: all benchmarks are indoor. Assess performance with sunlight interference, large-scale depth variation, and low-texture outdoor scenes.
- Dynamic and non-rigid scenes: no mechanism to detect/handle moving objects is provided. Investigate dynamic masking, multi-body tracking, or residual explanations to extend to dynamic RGB-D SLAM.
- Camera model and photometric variability: assumptions of pinhole, constant exposure/white balance, and accurate intrinsics are implicit. Evaluate robustness to lens distortion, auto-exposure, rolling shutter, and radiometric changes.
- Theoretical properties of modified GICP: the proposed point-to-surface correspondence with scale normalization deviates from standard GICP; provide convergence analysis, conditions for well-posedness, and behavior under outliers.
- Visibility and correspondence selection in tracking: criteria for non-overlap when adding to T, outlier rejection, and nearest-neighbor search strategy are not fully detailed. Specify thresholds and their effect on accuracy/speed.
- Novel view synthesis at large baselines: gains are shown, but degradation vs. viewpoint distance is not analyzed. Benchmark extrapolation performance and relate it to the restricted motion and simplified appearance model.
- Effect of removing multi-scale anti-aliasing: spherical Gaussians without explicit multi-scale/LODs may alias at high zoom or far distances. Study anti-aliasing or mip-like strategies compatible with the simplified model.
- Reproducibility and missing baseline results: some baselines are absent on ScanNet due to implementation issues; key implementation details (e.g., scale normalization, correspondence policy) are sparse. Provide full code, configs, and protocols for fair comparison.
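The real-time gap mentioned in the list above can be made concrete with a little arithmetic. The sketch uses only the two figures quoted there (about 0.89 s per frame on a single GPU, and a 30 fps target); the variable names are illustrative.

```python
seconds_per_frame = 0.89     # single-GPU mapping time quoted in the list above
target_fps = 30.0            # real-time target quoted in the list above

current_fps = 1.0 / seconds_per_frame
required_speedup = target_fps * seconds_per_frame
frame_budget_ms = 1000.0 / target_fps

print(f"current: {current_fps:.2f} fps")
print(f"required speedup: {required_speedup:.1f}x")
print(f"per-frame budget at target: {frame_budget_ms:.1f} ms")
```

In other words, mapping currently runs at a little over 1 fps, and reaching 30 fps would require roughly a 27x reduction in per-frame mapping time, or processing only a subset of frames such as keyframes.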
Practical Applications
Overview
SGAD-SLAM introduces a scalable RGBD SLAM system that combines pixel-aligned Gaussians with learnable depth offsets (movement along each pixel's viewing ray) and a fast, geometry-similarity tracking strategy (GICP with point-to-surface correspondences and scale normalization). This yields state-of-the-art rendering quality, robust and efficient camera tracking, lower memory footprints, and better scalability to large scenes, without reliance on pre-trained loop-closure priors. Below are concrete applications, organized by deployment horizon, with sectors, potential tools/workflows, and key assumptions or dependencies that influence feasibility.
Immediate Applications
These can be piloted or deployed now with commodity RGBD sensors (e.g., RealSense, Azure Kinect, iPhone/iPad LiDAR) and a single GPU or modest multi-GPU setup.
- Robotics: real-time indoor navigation and mapping
- Use cases: warehouse AMRs, service robots, cleaning robots, inventory robots; live mapping and odometry in texture-poor corridors, offices, and retail floor spaces.
- Sectors: robotics, logistics, retail.
- Tools/products/workflows: ROS/ROS2 node for SGAD-SLAM; drop-in replacement for TSDF/NeRF-based mapping; plug-in GICP tracker (point-to-surface + scale normalization) for existing RGBD odometry stacks.
- Assumptions/dependencies: RGBD sensor availability; largely static or quasi-static scenes; known intrinsics and time-synchronized RGB-depth; moderate GPU (desktop or Jetson-class); minimal specular/transparent surfaces.
- AR/VR/XR: room-scale capture for occlusion and realistic passthrough
- Use cases: instant occlusion meshes and photorealistic radiance fields for headsets; persistent anchors; mixed-reality object placement.
- Sectors: AR/VR, gaming, media.
- Tools/products/workflows: Unity/Unreal plugin to stream pixel-aligned Gaussians and surfaces; cloud-side SGAD-SLAM mapping with on-device rendering; export to mesh via Marching Cubes for physics and occlusion.
- Assumptions/dependencies: RGBD-capable device or tethered RGBD capture; network uplink for cloud mapping (if not on-device); relatively static scene during capture.
- AEC and Facility Management: rapid as-built capture and updates
- Use cases: scanning floors, corridors, mechanical rooms; quick update cycles for facility records; construction progress snapshots.
- Sectors: AEC, facility ops.
- Tools/products/workflows: handheld RGBD scanning → SGAD-SLAM mapping → export textured meshes and radiance fields; multi-GPU parallelization for large buildings; BIM alignment pipeline.
- Assumptions/dependencies: indoor RGBD coverage (range limits vs LiDAR); controlled operator motion to avoid severe rolling shutter; need for global alignment to survey control if metric absolute accuracy is required.
- Real estate and retail digitization: fast 3D tours and store planograms
- Use cases: walkable virtual tours; store-layout verification; fixture and signage updates.
- Sectors: real estate, retail.
- Tools/products/workflows: mobile scanning app using SGAD-SLAM; cloud pipeline to produce novel-view renderings and navigable meshes; CMS integration for 3D tours.
- Assumptions/dependencies: indoor RGBD capture; privacy and consent flows for scanned spaces; post-processing for compression/hosting.
- VFX and on-set previsualization: photorealistic scene capture
- Use cases: quick set digitization for camera blocking and lighting previews; accurate occlusion for virtual production.
- Sectors: film/TV, media.
- Tools/products/workflows: on-set RGBD sweep → SGAD-SLAM mapping → radiance field previews in Unreal; export to DCC tools; high-fidelity view synthesis for previz.
- Assumptions/dependencies: controlled lighting helps; scene static during sweep; GPU workstation or small render farm.
- Inspection and maintenance (indoor): close-range asset capture
- Use cases: telecom closets, process skids, lab spaces; condition documentation.
- Sectors: manufacturing, utilities (indoor), pharma.
- Tools/products/workflows: technician handheld capture → SGAD-SLAM → mesh + radiance field repository; change-over-time comparisons using repeated captures.
- Assumptions/dependencies: RGBD range suits close quarters; static or limited motion scenes; safety protocols for scanning in operational environments.
- Research and education: a fast, robust SLAM baseline
- Use cases: benchmark reproduction; ablations; teaching advanced SLAM with radiance fields.
- Sectors: academia, R&D labs.
- Tools/products/workflows: open-source SGAD-SLAM; notebooks/CLI to turn RGBD logs into 3DGS + meshes; integration in evaluation pipelines (Replica, TUM-RGBD, ScanNet/++).
- Assumptions/dependencies: availability of datasets; standard GPU; adherence to benchmark protocols.
- SLAM/odometry component upgrade: enhanced GICP tracker
- Use cases: replacing noise-sensitive point-to-point ICP with point-to-surface + scale-normalized geometry alignment for RGBD odometry.
- Sectors: robotics, mapping software.
- Tools/products/workflows: standalone C++/CUDA module for GICP variant; bindings for Open3D/PCL; ROS2 node.
- Assumptions/dependencies: access to depth and calibrated intrinsics; local geometry distributions computed per-frame.
- Consumer daily use: quick home scans for interior design
- Use cases: furniture placement, DIY projects, VR home walkthroughs.
- Sectors: consumer apps, e-commerce.
- Tools/products/workflows: mobile app for iPhone/iPad LiDAR or Android depth; SGAD-SLAM cloud mapping; AR try-before-you-buy with occlusion-accurate meshes.
- Assumptions/dependencies: mobile energy/GPU constraints (cloud offload often required); privacy/compliance for home data.
Long-Term Applications
These require further research, engineering, or scaling, but the paper's innovations directly enable or de-risk them.
- Multi-user, multi-robot collaborative mapping at building scale
- Vision: teams of robots or users map disjoint areas concurrently; per-frame Gaussians enable sharded, parallel optimization across nodes/GPUs; later fuse into a consistent global map.
- Sectors: robotics, security, smart buildings.
- Tools/products/workflows: distributed SGAD-SLAM service; map-merging with global optimization/loop closure; edge-cloud coordination.
- Assumptions/dependencies: robust distributed data association (loop closures, place recognition); clock/time sync; bandwidth for partial map streaming; consistency under long-term drift.
- Persistent AR at campus/city blocks with photorealistic occlusion
- Vision: shared, persistent radiance-field maps spanning large indoor complexes, accessible across sessions/devices.
- Sectors: AR platform providers, location-based entertainment.
- Tools/products/workflows: continuous capture and background re-optimization; versioned map services; streaming of pixel-aligned Gaussians on demand.
- Assumptions/dependencies: scalable storage/compute; privacy-by-design (access control, redaction); handling scene changes and maintenance updates.
- On-device, real-time mobile deployment
- Vision: run SGAD-SLAM entirely on headsets/phones using simplified spherical Gaussians and optimized kernels.
- Sectors: AR/VR, mobile.
- Tools/products/workflows: kernel fusion, quantization, tiling, and memory pooling; use of mobile NPUs/GPUs; adaptive quality modes.
- Assumptions/dependencies: further algorithmic acceleration; thermal/battery limits; mobile driver support for fast splatting.
- Dynamic and deformable scene modeling (4D SGAD-SLAM)
- Vision: extend pixel-aligned Gaussians with per-ray time-varying offsets/opacity to capture moving people/objects and nonrigid deformations.
- Sectors: robotics (people-aware navigation), telepresence, sports analytics.
- Tools/products/workflows: motion segmentation + dynamic/static factorization; temporal regularizers; multi-hypothesis rendering for occlusion handling.
- Assumptions/dependencies: robust segmentation/association; reduced artifacts at object boundaries; computational budget.
- Cross-sensor fusion: LiDAR/stereo + RGBD for large, mixed-range scenes
- Vision: combine SGAD-SLAM with LiDAR to cover long-range and outdoor/indoor transitions while retaining photorealistic rendering.
- Sectors: industrial inspection, digital twins, autonomous systems (indoor/outdoor).
- Tools/products/workflows: calibration and joint optimization of heterogeneous geometry distributions; per-sensor confidence weighting; unified splatting for appearance.
- Assumptions/dependencies: precise extrinsics; handling rolling shutter and timing offsets; robust reflectivity handling.
- Telepresence and 3D communications using Gaussian streaming
- Vision: low-latency streaming of pixel-aligned Gaussians (instead of full meshes) for immersive remote walkthroughs.
- Sectors: enterprise collaboration, real estate, remote assistance.
- Tools/products/workflows: rate-controlled Gaussian transmission; server-side novel-view synthesis; client-side cache and culling.
- Assumptions/dependencies: network QoS; compression standards for 3D Gaussian data; graceful degradation strategies.
- Semantic, task-driven mapping
- Vision: attach semantic labels/uncertainty to pixel-aligned Gaussians for downstream tasks (navigation, inventory, safety checks).
- Sectors: robotics, AEC/FM, retail.
- Tools/products/workflows: lightweight semantic heads over SGAD-SLAM; class-specific priors; hierarchical map layers (geometry, appearance, semantics).
- Assumptions/dependencies: training data for semantics; robust label propagation under viewpoint change; compute overhead vs real-time needs.
- Policy and standardization: privacy-aware digital twins and 3D map formats
- Vision: consistent governance around large-scale indoor scans; standard formats/APIs for Gaussian-based maps and redaction/retention policies.
- Sectors: public policy, enterprise IT, standards bodies.
- Tools/products/workflows: data minimization via per-frame optimization (less global state residency); consent tracking; standard file/stream formats for 3D Gaussian representations.
- Assumptions/dependencies: cross-industry alignment; integration with identity/access; legal frameworks for shared 3D spaces.
- Automated quality assurance for construction and manufacturing
- Vision: compare SGAD-SLAM reconstructions against design specs; detect deviations and missing elements in near real-time.
- Sectors: AEC, manufacturing.
- Tools/products/workflows: alignment to CAD/BIM; deviation heatmaps; automated reporting.
- Assumptions/dependencies: high-precision calibration and scale; handling reflective/occluding machinery; acceptance criteria for tolerances.
- Content creation pipelines for games and digital assets
- Vision: creator tools that convert scans to game-ready assets leveraging SGAD-SLAM radiance fields and meshes with material proxies.
- Sectors: gaming, digital content.
- Tools/products/workflows: asset exporter (3DGS → mesh/PBR textures); level-of-detail generation; engine-specific importers.
- Assumptions/dependencies: domain-specific material capture; runtime budgets for in-engine rendering; licensing of scanned environments.
Notes on Feasibility Across Applications
- Strengths to leverage:
- High-fidelity rendering with simplified Gaussians and per-ray depth offsets (better PSNR/SSIM, robust novel views).
- Fast, robust tracking via geometry similarity (GICP with point-to-surface + scale normalization), especially in low-texture scenes.
- Scalability: only a small, per-frame subset of Gaussians is optimized; supports multi-GPU parallelism.
- Robustness to noisy depth due to Gaussian depth modeling and offset learning.
- Typical dependencies/assumptions:
- Valid RGBD input with known intrinsics and synchronized streams.
- Mostly static scenes during capture; dynamics require future extensions.
- Adequate GPU for real-time or near-real-time mapping; cloud offload if mobile.
- Depth sensors' operational constraints (range, sunlight, reflective/transparent surfaces).
- For very large spaces, loop closure/global optimization improves long-term consistency (future integration).
These applications align with SGAD-SLAM's demonstrated performance gains and architectural choices (pixel-aligned Gaussians at adjusted depth and fast geometry-based tracking), translating benchmark improvements into concrete value in products, services, and workflows.
Glossary
- 3D Gaussian Splatting (3DGS): An explicit scene representation using 3D Gaussian primitives rendered via splatting for efficient differentiable rendering. Example: "3D Gaussian Splatting (3DGS) has made remarkable progress in RGBD SLAM."
- Adjusted depth: A per-pixel depth modified by an offset to reposition Gaussians along the viewing ray. Example: "Pixel-aligned Gaussians at adjusted depth."
- ATE RMSE: Root mean square of the Absolute Trajectory Error; a standard metric for camera pose accuracy. Example: "ATE RMSE"
- Back-projected 3D point: A 3D point obtained by projecting a pixel with known depth into 3D space using camera intrinsics. Example: "back-projected 3D point"
- COLMAP: A structure-from-motion/multi-view stereo pipeline commonly used to recover camera poses and sparse/dense reconstructions. Example: "COLMAP~\cite{schoenberger2016mvs}"
- Covariance matrix: A matrix capturing local geometric variation around a point, used here to parameterize Gaussian shape/uncertainty. Example: "covariance matrix"
- Densification: The process of adding more primitives (e.g., Gaussians) locally to increase detail in the representation. Example: "local densification process"
- Differentiable splatting: A rendering operation that blends projected Gaussians in a differentiable manner, enabling gradient-based optimization. Example: "differentiable splatting operation"
- Ellipsoid Gaussians: Anisotropic 3D Gaussian primitives with full covariance (orientation and different axis variances), as used in standard 3DGS. Example: "ellipsoid Gaussians"
- F1-score: The harmonic mean of precision and recall used to evaluate reconstruction quality. Example: "F1-score"
- Gaussian distribution: The normal distribution used to model depth or local 3D geometry around points. Example: "Gaussian distribution"
- Generalized ICP (GICP): A registration algorithm that aligns point sets by modeling the local structure around each point with a covariance (Gaussian) and minimizing a covariance-weighted distance between corresponding points. Example: "Generalized ICP (GICP)"
- Geometry similarity: A criterion for aligning frames by matching local geometric distributions instead of raw color. Example: "geometry similarity"
- LPIPS: Learned Perceptual Image Patch Similarity; a perceptual metric for image similarity. Example: "LPIPS"
- Loop closure: Detecting revisits to previously seen places to correct pose drift via global optimization. Example: "loop closure"
- Marching Cubes: An algorithm for extracting iso-surfaces (meshes) from volumetric fields. Example: "Marching Cubes~\cite{Lorensen87marchingcubes}"
- Multi-view stereo (MVS): Techniques that recover dense depth and geometry from multiple overlapping images via photometric consistency, typically with camera poses supplied by structure-from-motion. Example: "multi-view stereo (MVS)"
- NeRF: Neural Radiance Fields; a neural function that models scene density and color for view synthesis. Example: "NeRF~\cite{mildenhall2020nerf}"
- NetVLAD: A deep global image descriptor used for place recognition and loop detection. Example: "NetVLAD models~\cite{Arandjelovic16}"
- Novel view synthesis: Rendering images from camera viewpoints not present in the training data. Example: "novel view synthesis"
- Opacity: The per-Gaussian alpha/visibility parameter controlling contribution during splatting. Example: "opacity ()"
- Pixel-aligned Gaussians: Gaussians associated with individual pixels and aligned along their camera rays. Example: "pixel-aligned Gaussians"
- Point-to-point distance: The standard ICP correspondence metric measuring Euclidean distances between matched points. Example: "point-to-point distance"
- Point-to-surface distance: A correspondence metric measuring distance from a point to a local surface defined by a normal (smallest-variance direction). Example: "point-to-surface distance"
- Pose graph optimization: A global optimization over a graph of camera poses and constraints (e.g., loops) to reduce drift. Example: "pose graph optimization"
- PSNR: Peak Signal-to-Noise Ratio; an image fidelity metric used for rendering evaluation. Example: "PSNR"
- Radiance field: A function that maps 3D positions and viewing directions to emitted/reflective color and density. Example: "radiance field"
- Ray tracing-based rendering: Rendering that integrates samples along camera rays through the radiance field; common in NeRF. Example: "ray tracing-based rendering"
- RGBD odometry: Frame-to-frame camera motion estimation using both color and depth data. Example: "RGBD odometry~\cite{colorptreg_odo}"
- RGBD SLAM: Simultaneous Localization and Mapping using RGB-D input sequences. Example: "RGBD SLAM jointly estimates camera poses and geometry"
- Scale Normalization: Normalizing Gaussian scales to mitigate depth-range variation across frames for robust matching. Example: "Scale Normalization."
- SSIM: Structural Similarity Index; an image quality metric measuring structural fidelity. Example: "SSIM"
- SVD: Singular Value Decomposition; used to extract scales and rotations from covariance matrices. Example: "SVD~\cite{Segal2009GeneralizedICP}"
- View frustum: The pyramidal volume visible to a camera; constrains where primitives may reside. Example: "View Frustum"
- View-tied Gaussians: Gaussians anchored to pixels at fixed depths in a specific view, limiting their movement across rays. Example: "view-tied Gaussians"
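Several glossary entries (covariance matrix, SVD, point-to-surface distance) fit together in one small computation. The NumPy sketch below fits a Gaussian to a point's neighborhood, takes the smallest-variance direction as the surface normal, and compares point-to-surface with point-to-point distance. It is a minimal illustration, not the paper's tracker: the neighborhood size `k`, the brute-force neighbor search, and all function names are assumptions.

```python
import numpy as np

def local_gaussian(points, center_idx, k=8):
    """Fit a Gaussian (mean, covariance) to the k nearest neighbors of one point."""
    d = np.linalg.norm(points - points[center_idx], axis=1)
    nbrs = points[np.argsort(d)[:k]]          # brute-force nearest neighbors
    mean = nbrs.mean(axis=0)
    cov = np.cov(nbrs.T)                      # (3, 3) covariance of the neighborhood
    return mean, cov

def surface_normal(cov):
    """Direction of smallest variance, taken from the SVD of the covariance."""
    U, S, _ = np.linalg.svd(cov)              # singular values are in descending order
    return U[:, -1]

def point_to_surface(p, q, normal):
    """Distance from p to the plane through q with the given unit normal."""
    return abs(np.dot(p - q, normal))

# Noisy samples of the plane z = 0.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 200),
                       rng.uniform(-1, 1, 200),
                       rng.normal(0, 1e-3, 200)])

mean, cov = local_gaussian(pts, center_idx=0, k=20)
n = surface_normal(cov)

query = mean + np.array([0.3, -0.2, 0.05])    # shifted within the plane, lifted 0.05 off it
d_surface = point_to_surface(query, mean, n)
d_point = np.linalg.norm(query - mean)
```

The point-to-surface distance stays close to the 0.05 lift and ignores the in-plane shift, while the point-to-point distance is dominated by that shift. This is why matching against local surfaces is less sensitive to where exactly the depth samples fall.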