Visual target navigation is a critical task within the realm of embodied intelligence.
Existing end-to-end and modular approaches often suffer from high computational
demands, difficulty with online updates, and limited generalization, which restrict their
deployment on resource-constrained devices. To overcome these challenges, we introduce a
knowledge-driven, lightweight image instance navigation framework, Dual Graph Navigation
(DGN). DGN constructs an External Knowledge Graph (EKG) from a limited dataset to
capture prior correlations between objects, which guide the exploration
process. During exploration, DGN builds an Internal Knowledge Graph (IKG) using an
instance-aware module; the IKG records explored regions and is combined with the EKG to determine
the next navigation target. The EKG is dynamically updated based on the outcomes of IKG
exploration, continually enhancing the system's adaptability to the current environment.
Additionally, DGN's plug-and-play modular design supports independent training and
flexible replacement of target recognition, keypoint extraction, and path planning
algorithms, reducing training and deployment costs while improving adaptability across
diverse environments. We deploy DGN on three types of real-world robot platforms
(including edge devices without CUDA support) and simulation environments (AI2THOR,
Habitat), and our experimental results demonstrate that it operates stably, achieving
state-of-the-art performance on the ProcTHOR-10K dataset.
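The interplay between the two graphs described above can be illustrated with a minimal sketch. It is not the DGN implementation: the class names, the default weight, the update rule (a simple moving average toward 1 or 0), and the scoring function are all assumptions chosen for illustration. The sketch only shows the general pattern of a prior graph guiding target selection and being corrected by exploration outcomes.

```python
from collections import defaultdict


class ExternalKnowledgeGraph:
    """Category-level correlation priors (hypothetical structure, not DGN's own)."""

    def __init__(self, prior_pairs, default=0.1):
        self.default = default  # assumed weight for category pairs with no prior
        self.weights = defaultdict(dict)
        for (a, b), w in prior_pairs.items():
            self.weights[a][b] = w
            self.weights[b][a] = w

    def correlation(self, a, b):
        if a == b:
            return 1.0
        return self.weights[a].get(b, self.default)

    def update(self, a, b, found, rate=0.2):
        # Assumed rule: move the weight toward 1.0 when the goal was found
        # near the explored object, and toward 0.0 when it was not.
        old = self.correlation(a, b)
        new = old + rate * ((1.0 if found else 0.0) - old)
        self.weights[a][b] = new
        self.weights[b][a] = new


class InternalKnowledgeGraph:
    """Instances observed so far and which of them have been explored."""

    def __init__(self):
        self.instances = {}  # instance id -> category
        self.explored = set()

    def add_instance(self, instance_id, category):
        self.instances[instance_id] = category

    def mark_explored(self, instance_id):
        self.explored.add(instance_id)

    def unexplored(self):
        return [(i, c) for i, c in self.instances.items() if i not in self.explored]


def next_target(ekg, ikg, goal_category):
    """Pick the unexplored instance whose category correlates most with the goal."""
    candidates = ikg.unexplored()
    if not candidates:
        return None
    return max(candidates, key=lambda ic: ekg.correlation(goal_category, ic[1]))[0]
```

In this sketch, a failed search near a table lowers the mug-table weight, so later decisions in the same environment rely less on that prior.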
Videos
DGN in action
DGN deployed in different embodiments and environments.
Overview
Understanding the performance of DGN
Comparison between DGN and mainstream visual navigation methods. Mainstream visual
navigation methods (Other methods) rely on RGBD input or are driven by language, and
construct semantic maps with tightly coupled perception modules, which results in
storage-heavy deployments. Our method (DGN) uses only RGBD input and constructs an
Internal Knowledge Graph (IKG) using a plug-and-play instance-aware module. The IKG
records the semantic information and topological relationships of instances, while the
External Knowledge Graph (EKG) is dynamically updated with object category correlations.
This enables efficient navigation that is deployable on low-power, low-computation edge
devices.
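To make the plug-and-play idea concrete, the following sketch shows one way a swappable instance-aware module could feed an instance graph. All names here (`Detection`, `InstanceDetector`, `InstanceGraph`, `FixedDetector`, `observe`) are hypothetical, and linking instances that appear in the same frame is an assumed stand-in for the topological relationships mentioned above, not the method DGN uses.

```python
from dataclasses import dataclass, field
from typing import Any, List, Protocol


@dataclass
class Detection:
    instance_id: str
    category: str


class InstanceDetector(Protocol):
    """Any object with a matching detect() can be plugged in (hypothetical interface)."""

    def detect(self, frame: Any) -> List[Detection]: ...


@dataclass
class InstanceGraph:
    nodes: dict = field(default_factory=dict)  # instance id -> category
    edges: set = field(default_factory=set)    # undirected (id, id) pairs

    def record(self, detections: List[Detection]) -> None:
        for d in detections:
            self.nodes[d.instance_id] = d.category
        # Assumption: instances seen in the same frame are linked.
        ids = sorted(d.instance_id for d in detections)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                self.edges.add((a, b))


class FixedDetector:
    """Stand-in detector that returns canned detections, for illustration only."""

    def __init__(self, detections: List[Detection]):
        self._detections = detections

    def detect(self, frame: Any) -> List[Detection]:
        return list(self._detections)


def observe(detector: InstanceDetector, graph: InstanceGraph, frame: Any) -> None:
    """Run whichever detector was supplied and store its output in the graph."""
    graph.record(detector.detect(frame))
```

Because `observe` depends only on the `detect` method, a different recognition model could be substituted without touching the graph code.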