
Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection

Published 5 Apr 2026 in cs.CV | (2604.03972v1)

Abstract: 3D shape anomaly detection is a crucial task for industrial inspection and geometric analysis. Existing deep learning approaches typically learn representations of normal shapes and identify anomalies via out-of-distribution feature detection or decoder-based reconstruction. They often fail to generalize across diverse anomaly types and scales, such as global geometric errors (e.g., planar shifts, angle misalignments), and are sensitive to noisy or incomplete local points during training. To address these limitations, we propose a hierarchical point-patch anomaly scoring network that jointly models regional part features and local point features for robust anomaly reasoning. An adaptive patchification module integrates self-supervised decomposition to capture complex structural deviations. Beyond evaluations on public benchmarks (Anomaly-ShapeNet and Real3D-AD), we release an industrial test set with real CAD models exhibiting planar, angular, and structural defects. Experiments on public and industrial datasets show superior AUC-ROC and AUC-PR performance, including over 40% point-level improvement on the new industrial anomaly type and average object-level gains of 7% on Real3D-AD and 4% on Anomaly-ShapeNet, demonstrating strong robustness and generalization.

Summary

  • The paper introduces a hierarchical network that fuses patch-level and point-level features to achieve precise per-point anomaly detection.
  • It employs multi-scale patchification with negative augmentation and an adaptive patch codebook to robustly capture both local defects and global structural deviations.
  • Experimental results demonstrate significant improvements, including over 40% point-level AUC-ROC gains on industrial datasets compared to state-of-the-art methods.

Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for Robust 3D Shape Anomaly Detection

Introduction

The detection and precise localization of 3D shape anomalies in point clouds underpins automated industrial inspection and geometric quality assurance. Capturing both fine-grained local defects and large-scale global structural deviations remains a persistent challenge, especially given diverse anomaly scales, noisy or incomplete points, and the limited transferability of learned shape priors. To address these limitations, "Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection" (2604.03972) proposes a scale-aware anomaly scoring network that combines hierarchical part (region) features with point-level features. The framework uses multi-scale patch decomposition, a memory-efficient patch codebook, and patch–point cross-attention fusion to generalize across challenging real-world and synthetic anomaly regimes.

Figure 1: Patch feature fusion for 3D shape anomaly detection in point clouds, visualizing patch-level cosine similarity heatmaps and point-wise anomaly localization.

Methodology

Multi-Scale Patchification and Negative Augmentation

The pipeline begins by applying negative augmentation to normal point clouds to simulate a spectrum of pseudo-anomalous structures. The augmentation includes local Gaussian deformations, random holes, planar cut-offs, and bulging via normal-based displacements, which are intended to mimic the defects encountered in manufacturing and geometric analysis. Both normal and pseudo-anomalous shapes are then patchified at three scales (e.g., 8/32/64 or 32/64/192 patches), so that both localized defects and macro-structural displacements are covered.
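The augmentation and patchification steps above can be illustrated with a minimal NumPy sketch. It assumes patches are formed by farthest point sampling followed by nearest-center assignment, and it replaces the normal-based displacement with a displacement along a random direction. The function names (`farthest_point_sampling`, `patchify`, `gaussian_bump`) and all parameter values are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def farthest_point_sampling(points, k, rng):
    """Pick k well-spread center indices from an (N, 3) array."""
    n = points.shape[0]
    centers = [int(rng.integers(n))]               # random starting point (assumption)
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))                 # point farthest from all chosen centers
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(centers)

def patchify(points, k, rng):
    """Assign every point to its nearest FPS center; returns (N,) patch labels."""
    centers = points[farthest_point_sampling(points, k, rng)]
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def gaussian_bump(points, rng, radius=0.3, amplitude=0.1):
    """Pseudo-anomaly: displace points near a random seed with a Gaussian falloff."""
    seed = points[rng.integers(points.shape[0])]
    r = np.linalg.norm(points - seed, axis=1)
    weight = amplitude * np.exp(-(r / radius) ** 2)
    direction = rng.normal(size=3)                 # stand-in for the surface normal
    direction /= np.linalg.norm(direction)
    return points + weight[:, None] * direction

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(2048, 3))     # toy "normal" shape
augmented = gaussian_bump(cloud, rng)
multi_scale_labels = {k: patchify(augmented, k, rng) for k in (8, 32, 64)}
```

Each entry of `multi_scale_labels` partitions the same cloud at a different granularity, which is the property the multi-scale design relies on: coarse patches respond to large displacements, fine patches to small local defects.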

Patch Feature Extraction and Codebook Structuring

Patch features are computed using a pre-trained 3D Minkowski UNet encoder. Distinctively, for each patch, all internal point coordinates are averaged (relative to the patch centroid), and the aggregated representation is passed through the encoder to yield a robust, translation-invariant descriptor. Normal patch features populate the adaptive codebook using a similarity-based, thresholded update strategy; redundancy is minimized via cosine thresholding, and entries are position-invariant, enabling efficient storage and retrieval.
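A thresholded codebook update of the kind described above might look like the following sketch. It assumes features are L2-normalized and that a new feature is stored only when its highest cosine similarity to existing entries falls below a threshold. The name `update_codebook` and the value 0.9 are illustrative assumptions, and the encoder is not modeled here.

```python
import numpy as np

def update_codebook(codebook, features, threshold=0.9):
    """Append each feature unless an existing entry is already similar enough.

    codebook: (M, d) array of unit-norm entries (M may be 0).
    features: (P, d) array of new normal-patch features.
    """
    entries = [row for row in codebook]
    for f in features:
        f = f / np.linalg.norm(f)                  # cosine similarity via unit vectors
        if entries and (np.stack(entries) @ f).max() >= threshold:
            continue                               # redundant: a close match is stored
        entries.append(f)
    if not entries:
        return np.empty((0, features.shape[1]))
    return np.stack(entries)

features = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
codebook = update_codebook(np.empty((0, 2)), features)
```

In this toy run the second feature is nearly parallel to the first and is skipped, so the codebook ends up with two entries instead of three.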

Shape Anomaly Scoring via Hierarchical Fusion

At test time, the input shape is patchified, and patch features are extracted and matched against the codebook via maximum cosine similarity. At the optimal scale (selected by an $\mathrm{argmax}$ over the summed similarities), the patch similarity scores serve as modulation coefficients that guide point-wise feature fusion.
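The matching and scale-selection step can be sketched as follows, assuming one codebook per scale and following the text literally by taking the argmax of the summed per-patch similarities. Whether the sum is normalized by patch count is not stated here, so none is applied. The names `patch_similarities` and `select_scale` are hypothetical.

```python
import numpy as np

def patch_similarities(patch_feats, codebook):
    """Maximum cosine similarity of each patch feature to any codebook entry."""
    a = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    b = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)
    return (a @ b.T).max(axis=1)

def select_scale(feats_per_scale, codebook_per_scale):
    """Return the scale whose summed patch similarities are largest, with its scores."""
    sims = {s: patch_similarities(f, codebook_per_scale[s])
            for s, f in feats_per_scale.items()}
    best = max(sims, key=lambda s: sims[s].sum())  # argmax over summed similarities
    return best, sims[best]

identity = np.eye(2)
feats = {
    "coarse": np.array([[1.0, 0.0], [0.0, 2.0]]),   # aligned with codebook entries
    "fine": np.array([[1.0, 1.0], [1.0, -1.0]]),    # 45 degrees off every entry
}
best_scale, best_sims = select_scale(feats, {"coarse": identity, "fine": identity})
```

The returned `best_sims` vector is what the text calls the modulation coefficients: one similarity per patch at the selected scale.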

A RoPE-based multi-head cross-attention module fuses normal patch codebook features with test point features, embedding geometric priors across levels. The modulation network adjusts the contributions of each patch–point correspondence, based on patch-level similarity deltas, yielding per-point anomaly-aware embeddings.

Figure 2: Overview of the proposed shape anomaly detection framework, including patchification, codebook querying, cross-attention fusion, and anomaly prediction.
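To make the fusion step concrete, here is a deliberately simplified single-head sketch in which point features act as queries and patch features as keys and values. It omits RoPE, multiple heads, and learned projections, and it replaces the learned modulation network with a fixed per-patch weight vector passed in by the caller. All of these are simplifying assumptions, and `modulated_cross_attention` is a hypothetical name.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def modulated_cross_attention(point_feats, patch_feats, modulation):
    """Fuse patch features into point features.

    point_feats: (N, d) queries; patch_feats: (P, d) keys and values;
    modulation: (P,) per-patch weights standing in for the modulation network.
    Returns the fused (N, d) features and the (N, P) attention map.
    """
    d = point_feats.shape[1]
    attn = softmax(point_feats @ patch_feats.T / np.sqrt(d), axis=1)
    context = (attn * modulation[None, :]) @ patch_feats   # reweight each patch's share
    return point_feats + context, attn                     # residual connection

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 4))
patches = rng.normal(size=(3, 4))
unchanged, attention = modulated_cross_attention(points, patches, np.zeros(3))
fused, _ = modulated_cross_attention(points, patches, np.ones(3))
```

With all modulation weights at zero the patch context vanishes and the point features pass through unchanged, which shows how patch-level scores can gate the influence of each patch on each point.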

Finally, a lightweight MLP predicts per-point anomaly vectors, which are normalized to produce scalar anomaly scores.
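One plausible reading of this final step is that each predicted vector is reduced to its magnitude and the magnitudes are rescaled to a common range. The sketch below assumes an L2 norm followed by min-max scaling. Both choices, and the name `point_scores`, are assumptions rather than details confirmed by the text.

```python
import numpy as np

def point_scores(anomaly_vectors, eps=1e-12):
    """Collapse (N, 3) per-point vectors to scalar scores in [0, 1]."""
    magnitude = np.linalg.norm(anomaly_vectors, axis=1)
    span = magnitude.max() - magnitude.min()
    return (magnitude - magnitude.min()) / (span + eps)    # eps avoids division by zero

vectors = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.4], [0.0, 1.0, 0.0]])
scores = point_scores(vectors)
```

Here the three vectors have magnitudes 0, 0.5, and 1, so the third point receives the highest score.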

Experimental Results

Benchmark and Industrial Evaluation

The model is validated on Anomaly-ShapeNet, Real3D-AD, and a custom industrial test set of real CAD models with structural defects (planar/gear misalignment, crack-like anomalies). Both object-level and point-level AUC-ROC and AUC-PR are reported. On Anomaly-ShapeNet, the method attains a 4% mean object-level AUC-ROC improvement over the best published baselines. On Real3D-AD, it yields an average gain of roughly 7% in object-level AUC-ROC over the most competitive alternative.
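For readers unfamiliar with the metric, AUC-ROC equals the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one. The sketch below computes it by direct pairwise comparison, which is quadratic in memory and meant only for small examples. It is a generic illustration, not the evaluation code behind the reported numbers.

```python
import numpy as np

def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (anomalous, normal) pairs ranked correctly.

    labels: 1 for anomalous, 0 for normal; ties count as half a correct pair.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one anomalous and one normal sample")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

point_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The same function serves both granularities: at point level the inputs are per-point labels and scores pooled over the test set, and at object level there is one label and one aggregated score per shape.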

On the newly introduced industrial test set, designed to stress-test generalization and structural sensitivity, the model improves point-level AUC-ROC by more than 40% over other state-of-the-art frameworks such as PO3AD and R3D-AD.

Figure 3: Qualitative comparison on three datasets, including real industrial, Real3D-AD, and Anomaly-ShapeNet, showcasing localization of diverse defect types.

Ablation studies demonstrate:

  • Cross-attention with RoPE is essential; removal reduces object-level AUC-ROC by ~10%.
  • Mean point coordinate feature extraction provides superior patch descriptors compared to direct pooling of UNet features or mean feature pooling.
  • Multi-scale patchification outperforms alternative partitioning strategies such as semantic part segmentation or uniform grids, particularly in capturing cross-scale anomalies.
  • Training with negative augmentation is critical for bridging the domain gap between normal and anomalous samples.

Figure 4: AUC-ROC sensitivity to voxel size, patch number, and patch size, at both object- and point-level granularity.

Figure 5: t-SNE visualization of point feature distributions, illustrating superior cluster separation and discrimination with patch-guided fusion.

Performance is competitive in computational cost, with total inference times nearly identical to prior art, and codebook management does not introduce prohibitive overhead.

Theoretical and Practical Implications

This method fills a critical gap in geometric anomaly detection by enabling joint, scalable modeling of both regional part features and high-resolution point-wise features using a compact, reusable codebook architecture. The resulting framework is inherently robust to scale variation, surface noise, and incomplete data, properties central to deployment in real industrial inspection pipelines.

The adaptive codebook strategy suggests new avenues for integrating explicit geometric memory into 3D vision models, analogous to prototype memory in OOD/novelty tasks, while the point–patch fusion paradigm could generalize to other structural reasoning tasks (e.g., segmentation, hierarchical completion, or defect propagation modeling).

Limitations and Future Directions

While the model demonstrates strong empirical and practical advantages, its reliance on random Farthest Point Sampling and geometric clustering makes it partially sensitive to sampling noise. Extension towards more semantically consistent patch generation and integration of topology-aware or functionally meaningful part cues (e.g., using mesh connectivity or part segmentation priors) is a promising trajectory. Expansion towards fully unstructured or in-the-wild industrial point cloud datasets is warranted for additional validation.

Conclusion

Hierarchical Point-Patch Fusion with Adaptive Patch Codebook (2604.03972) establishes a robust, efficient, and generalizable solution for 3D shape anomaly detection, with state-of-the-art results on both public and challenging custom datasets. Its fusion-centric approach and codebook memory mechanisms raise the bar for explicit scale-awareness and transferability in geometric anomaly reasoning, setting the stage for further advances in industrial visual inspection and 3D scene interpretation.
