BEACON Navigates Occlusion Challenges

BEACON tackles occlusion in robot navigation by predicting Bird's-Eye View (BEV) affordance heatmaps instead of grounding instructions in image space, yielding significant accuracy gains over image-space methods.

Mar 11 at 11:08 AM · 1 min read
[Figure: Diagram illustrating BEACON's Bird's-Eye View (BEV) affordance heatmap prediction process for robot navigation.]

Existing language-conditioned robot navigation methods falter when target locations are occluded, a common scenario in dynamic environments. These systems typically ground instructions in 2D image space, limiting their perception to visible pixels. The researchers behind BEACON propose a novel approach to overcome this fundamental limitation.

Bridging Vision-Language to Bird's-Eye View

BEACON re-imagines robot navigation by predicting an ego-centric Bird's-Eye View (BEV) affordance heatmap. This strategy inherently includes occluded areas within a bounded local region, a critical departure from image-space reasoning. By injecting spatial cues into a Vision-Language Model (VLM) and fusing its output with depth-derived BEV features, BEACON generates a more comprehensive spatial understanding.
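The depth-to-BEV step described above can be sketched with a standard pinhole back-projection followed by grid discretization. This is a minimal illustration, not BEACON's actual code: the function names, camera intrinsics, grid size, and the simple weighted-sum fusion of the VLM heatmap with BEV features are all assumptions for demonstration.

```python
import numpy as np

def depth_to_bev(depth, fx, cx, grid_size=8, cell_m=0.25):
    """Project a depth image into an ego-centric BEV occupancy grid.

    Hypothetical helper: back-projects each pixel with pinhole
    intrinsics (fx, cx), then bins points into a grid_size x grid_size
    grid of cell_m-metre cells (rows = forward, cols = lateral).
    """
    h, w = depth.shape
    us, _ = np.meshgrid(np.arange(w), np.arange(h))
    z = depth                      # forward distance in metres
    x = (us - cx) * z / fx         # lateral offset in metres
    row = np.round(z / cell_m).astype(int)
    col = np.round(x / cell_m).astype(int) + grid_size // 2
    bev = np.zeros((grid_size, grid_size), dtype=np.float32)
    valid = (z > 0) & (row >= 0) & (row < grid_size) \
        & (col >= 0) & (col < grid_size)
    bev[row[valid], col[valid]] = 1.0
    return bev

def fuse(bev_feat, vlm_heatmap, alpha=0.5):
    """Toy late fusion of depth-derived BEV features with a
    VLM-predicted affordance heatmap on the same grid (assumed
    weighting scheme, not the paper's fusion module)."""
    return alpha * bev_feat + (1.0 - alpha) * vlm_heatmap
```

Because the grid covers a bounded local region around the robot, cells behind visible obstacles still exist in the output, which is what lets a BEV heatmap express affordances at occluded locations.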

Occlusion-Aware Navigation Performance Boost

The efficacy of BEACON is demonstrated through rigorous testing on an occlusion-aware dataset built in the Habitat simulator. On validation subsets with occluded target locations, BEACON achieves 22.74 percentage points higher accuracy on average across geodesic thresholds than state-of-the-art image-space baselines. This gain positions BEACON well for navigation in complex, real-world scenarios where targets are frequently hidden from view.
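The metric reported above can be computed as the fraction of predictions whose geodesic distance to the target falls under each of several thresholds, averaged into one score. A minimal sketch follows; the threshold values here are illustrative assumptions, not the paper's exact evaluation settings.

```python
import numpy as np

def success_at_thresholds(geodesic_errors, thresholds=(0.25, 0.5, 1.0, 2.0)):
    """Accuracy at each geodesic threshold (metres) plus their average.

    geodesic_errors: geodesic distances (along the navigable map, not
    straight-line) between predicted and ground-truth target locations.
    Threshold choices are hypothetical.
    """
    errs = np.asarray(geodesic_errors, dtype=float)
    per_t = {t: float((errs <= t).mean()) for t in thresholds}
    avg = sum(per_t.values()) / len(per_t)
    return per_t, avg
```

Averaging over thresholds rewards methods that are accurate at both coarse and fine tolerances, so a single lucky threshold cannot dominate the comparison.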