A Benchmark for End-to-end Navigation in Unstructured Robotic Environments
Most existing end-to-end (E2E) autonomous driving algorithms are designed for standard vehicles in structured traffic scenarios.
However, these methods rarely explore robot navigation in unstructured environments such as:
- Auxiliary roads
- Campus paths
- Indoor areas
This repository introduces the FreeWorld Dataset, the first dataset specifically targeting E2E robot navigation in unstructured scenarios.
The dataset is built through two complementary pipelines:
- Real-world data collection using mobile robots
- Synthetic data generation with a Unity-based simulator
We also provide benchmark baselines by fine-tuning two efficient E2E autonomous driving models:
- VAD
- LAW
- 📀 First dataset for unstructured robot navigation
- 🤖 Supports both real and synthetic data pipelines
- 🧪 Benchmark results show improved performance after fine-tuning on FreeWorld
- 🚚 Targeted towards logistics and service robots in real-world unstructured environments
By releasing both the dataset and baseline models, this project serves as a foundation for advancing vision-based E2E navigation technology.
Our research paper detailing the FreeAD system, dataset, and experimental results is available on arXiv.
We modified some APIs from the nuScenes dataset to make them more flexible and to support a wider variety of data and map scenarios. The modified code is kept as a local copy named FreeWorld. The FreeWorld Dataset (Real Part) is available for access.
The FreeAskWorld Dataset (Virtual Part) is available for access.
We conducted fine-tuning experiments on both virtual and real datasets to evaluate our models. Three dataset configurations were considered:
- Virtual-only (V)
- Real-only (R)
- Combined virtual + real (V+R)
| Dataset | Fine-Tuning Setup |
|---|---|
| V | Fine-tuned for 3 epochs on the virtual dataset. |
| R | Fine-tuned for 1 epoch on the real dataset. |
| V+R | First fine-tuned for 3 epochs on the virtual dataset, followed by 1 epoch on the real dataset. |
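As a rough illustration, the three configurations in the table above can be written down as ordered lists of training phases. This is only a sketch: the `Phase` class, the `SCHEDULES` mapping, and the helper functions are hypothetical names introduced here and are not part of the FreeAD training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One fine-tuning phase (hypothetical helper, not from the repository)."""
    dataset: str  # "virtual" or "real"
    epochs: int


# Epoch counts are taken from the table above; the structure itself is illustrative.
SCHEDULES = {
    "V": [Phase("virtual", 3)],
    "R": [Phase("real", 1)],
    "V+R": [Phase("virtual", 3), Phase("real", 1)],
}


def total_epochs(config: str) -> int:
    """Sum the epochs over all phases of a configuration."""
    return sum(phase.epochs for phase in SCHEDULES[config])


def describe(config: str) -> str:
    """Render the phases of a configuration in the order they are run."""
    return " -> ".join(
        f"{phase.epochs} epoch(s) on {phase.dataset}" for phase in SCHEDULES[config]
    )


if __name__ == "__main__":
    for name in SCHEDULES:
        print(f"{name}: {describe(name)} (total {total_epochs(name)})")
```

Keeping the phases in an ordered list makes the V+R case explicit: the virtual phase always runs before the real one.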
Fine-tuning was performed in two stages:
| Dataset | Stage 1 | Stage 2 |
|---|---|---|
| V | Fine-tuned for 3 epochs on the virtual dataset | Fine-tuned for 1 epoch |
| R | Fine-tuned for 1 epoch on the real dataset | Fine-tuned for 1 epoch |
| V+R | Fine-tuned for 3 epochs on the virtual dataset and 1 epoch on the real dataset | Fine-tuned for 1 epoch on each dataset |
The models are available on Hugging Face.
This table presents a comparison between VAD-Tiny and VAD-Base using the Boundary + Divider map modeling strategy on the Full Warehouse map. This map structure closely aligns with the nuScenes map definition, providing a comprehensive evaluation in an open-loop scenario.
| Method | L2 (m) 1s ↓ | L2 (m) 2s ↓ | L2 (m) 3s ↓ | L2 (m) Avg. ↓ | AP Divider ↑ | AP Boundary ↑ | FPS | Collision (%) ↓ |
|---|---|---|---|---|---|---|---|---|
| VAD-Tiny | 1.772 | 3.291 | 5.008 | 3.357 | 0.004 | 0.000 | 7.6 | 0.00 |
| VAD-Base | 3.296 | 5.779 | 8.429 | 5.835 | 0.001 | 0.000 | 4.6 | 0.00 |
Note: AP Divider and AP Boundary are computed with a threshold of 1.5.
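For readers unfamiliar with the open-loop L2 metric, the sketch below shows one common way to compute it: the Euclidean distance between the planned and ground-truth waypoints at the 1 s, 2 s, and 3 s horizons. The function names and the 2 Hz waypoint rate are assumptions made for illustration; the exact convention used by the evaluation code in this repository may differ (for example, some codebases average over all waypoints up to each horizon). The last part checks that the Avg. column in the table above matches the mean of the three horizon values.

```python
import math


def l2_at_horizons(pred, gt, hz=2.0, horizons_s=(1, 2, 3)):
    """Displacement error (m) at each horizon.

    pred, gt: sequences of (x, y) waypoints in metres, where the first
    waypoint is 1/hz seconds ahead. The 2 Hz default is an assumption.
    """
    errors = {}
    for t in horizons_s:
        idx = int(round(t * hz)) - 1  # index of the waypoint at time t
        (px, py), (gx, gy) = pred[idx], gt[idx]
        errors[t] = math.hypot(px - gx, py - gy)
    return errors


def average_l2(errors):
    """Mean of the per-horizon errors."""
    return sum(errors.values()) / len(errors)


if __name__ == "__main__":
    # Per-horizon values copied from the table above.
    vad_tiny = {1: 1.772, 2: 3.291, 3: 5.008}
    vad_base = {1: 3.296, 2: 5.779, 3: 8.429}
    print(round(average_l2(vad_tiny), 3))  # 3.357
    print(round(average_l2(vad_base), 3))  # 5.835
```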
- Open-loop planning results on the FreeWorld Dataset (Real Part) and the FreeAskWorld Dataset (Virtual Part)
FreeAskWorldDataset: helps you use the FreeAskWorld Dataset through a nuScenes-like API.
FreeWorldDataset: helps you use the FreeWorld Dataset through a nuScenes-like API.
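To give a feel for what "nuScenes-like" means in practice, here is a minimal, self-contained sketch of the token-linked access pattern used by nuScenes: records live in tables, are looked up by `token`, and samples within a scene are chained through `first_sample_token` and `next`. `ToyTokenDB` and the sample records are hypothetical stand-ins written for this example. They are not the FreeWorldDataset or FreeAskWorldDataset classes, whose actual constructors and methods should be taken from their own code.

```python
class ToyTokenDB:
    """Hypothetical in-memory stand-in for a nuScenes-style database."""

    def __init__(self, tables):
        # Index every table by the "token" field of its records.
        self._index = {
            name: {record["token"]: record for record in records}
            for name, records in tables.items()
        }

    def get(self, table_name, token):
        """Look up one record by table name and token."""
        return self._index[table_name][token]


def iter_samples(db, scene_token):
    """Walk a scene's samples in order by following the `next` links."""
    token = db.get("scene", scene_token)["first_sample_token"]
    while token:  # an empty string marks the end of the chain
        sample = db.get("sample", token)
        yield sample
        token = sample["next"]


if __name__ == "__main__":
    # Made-up records that only mimic the nuScenes field names.
    db = ToyTokenDB({
        "scene": [{"token": "scene-0", "first_sample_token": "s0"}],
        "sample": [
            {"token": "s0", "timestamp": 0, "next": "s1"},
            {"token": "s1", "timestamp": 500000, "next": "s2"},
            {"token": "s2", "timestamp": 1000000, "next": ""},
        ],
    })
    for sample in iter_samples(db, "scene-0"):
        print(sample["token"], sample["timestamp"])
```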
Note: We found that VAD overestimated the predicted distance of map objects, and its 3D box detection performance was only average.
All code in this repository is under the Apache License 2.0.
FreeAD is based on the following projects: VAD, LAW and MapTR. Many thanks for their excellent contributions to the community.


