Benchmarks
This section defines the detection and trajectory prediction tasks, evaluation metrics, and experimental configurations used throughout EagleVision.
3D Detection Task
The 3D detection task localizes racing vehicles in the LiDAR coordinate frame using oriented 3D bounding boxes.
Only one semantic class (Car) is considered.
Evaluation range
- x ∈ [-60, 60] m
- y ∈ [-60, 60] m
- z ∈ [-2, 4] m
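A range filter of this kind could be sketched as follows. This is a minimal illustration, not the EagleVision implementation: it assumes the range is applied to box centers, that bounds are inclusive, and that a box is a tuple `(x, y, z, l, w, h, yaw)`; the names `EVAL_RANGE`, `in_eval_range`, and `filter_boxes` are hypothetical.

```python
# Evaluation range in the LiDAR frame, in meters: (min, max) for x, y, z.
EVAL_RANGE = ((-60.0, 60.0), (-60.0, 60.0), (-2.0, 4.0))


def in_eval_range(center):
    """Return True if an (x, y, z) center lies inside the evaluation range.

    Assumption: bounds are inclusive on both ends.
    """
    return all(lo <= c <= hi for c, (lo, hi) in zip(center, EVAL_RANGE))


def filter_boxes(boxes):
    """Keep boxes whose center is in range.

    Assumption: each box is (x, y, z, l, w, h, yaw), so the first
    three entries are the center coordinates.
    """
    return [b for b in boxes if in_eval_range(b[:3])]
```

For example, a box centered at (61, 0, 0) would be dropped because its x coordinate exceeds 60 m, while one centered at (10, -60, 4) would be kept.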
Detection Metrics
- AP: Average Precision computed over center-distance thresholds {0.25, 0.5, 1.0, 2.0} meters.
- ATE: Average Translation Error (center distance).
- ASE: Average Scale Error.
- AOE: Average Orientation Error (yaw).
- NDS: reduced nuScenes-style detection score (velocity and attribute error terms omitted).
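The three error metrics above are averages of per-match errors over true-positive detections. The sketch below shows one plausible per-match form, assuming the nuScenes conventions: translation error as Euclidean center distance, scale error as 1 − IoU of the two boxes after aligning center and orientation, and orientation error as the smallest absolute yaw difference in radians. This section does not state whether EagleVision measures distance in 2D or 3D, so the function simply uses whatever dimensions are passed in; all function names are hypothetical.

```python
import math


def translation_error(gt_center, pred_center):
    """Euclidean distance between box centers (meters)."""
    return math.dist(gt_center, pred_center)


def scale_error(gt_size, pred_size):
    """1 - IoU of two boxes that share center and orientation.

    With aligned boxes, the intersection is the product of the
    per-axis minimum extents.
    """
    inter = math.prod(min(g, p) for g, p in zip(gt_size, pred_size))
    union = math.prod(gt_size) + math.prod(pred_size) - inter
    return 1.0 - inter / union


def orientation_error(gt_yaw, pred_yaw):
    """Smallest absolute yaw difference, in radians, within [0, pi]."""
    diff = (pred_yaw - gt_yaw + math.pi) % (2.0 * math.pi) - math.pi
    return abs(diff)
```

ATE, ASE, and AOE would then be the means of these values over all matched prediction/ground-truth pairs.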
Reduced NDS
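NDS = (1/10) * ( 5*AP + 3 - ATE - ASE - AOE )
The constant 3 is the sum of the three (1 - error) terms, so the score is equivalently (1/10) * ( 5*AP + (1 - ATE) + (1 - ASE) + (1 - AOE) ).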
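The formula can be checked directly against the reported numbers. The snippet below is a literal transcription of it; the function name `reduced_nds` is hypothetical, and no clipping of the error terms is applied because the formula as written has none.

```python
def reduced_nds(ap, ate, ase, aoe):
    """Reduced NDS: (1/10) * (5*AP + 3 - ATE - ASE - AOE)."""
    return (5.0 * ap + 3.0 - ate - ase - aoe) / 10.0


# Values for the R (Scratch) setup from the A2RL Real results table.
nds_r = reduced_nds(ap=0.843, ate=0.1796, ase=0.0716, aoe=0.0372)
print(round(nds_r, 5))
```

With the R (Scratch) values this gives (4.215 + 3 − 0.1796 − 0.0716 − 0.0372) / 10 = 0.69266, which agrees with the NDS reported for that setup.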
Detection Transfer Protocol
We denote A2RL Real as R, Indy as I, Simulator as S, and Waymo pretraining as W. For all experiments, the best checkpoint is selected according to validation NDS.
- R (Scratch): Training on R from random initialization.
- W → R: Waymo-pretrained model finetuned on R.
- W → (R+S)1:1: Joint finetuning on R and S with equal sampling ratio.
- W → S10 → R: Pretraining on S for 10 epochs, then finetuning on R.
- W → (R + 0.1S): Finetuning on R augmented with 10% simulator samples.
- W → (R + 0.1I): Finetuning on R augmented with 10% Indy samples.
- W → I10 → R: Pretraining on I for 10 epochs before finetuning on R.
- W → (S+I)30 → R: Joint pretraining on S and I for 30 epochs, then finetuning on R.
- W → I5 → S8 → R: Two-stage pretraining (I for 5 epochs, then S for 8 epochs) before final finetuning on R.
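One way to make these setups machine-readable is to describe each as an initialization plus an ordered list of training stages. The schema below is purely illustrative: `Stage`, `Setup`, and `SETUPS` are hypothetical names, the sampling weights are one reading of the notation, and epoch counts are recorded only where the setup name gives them (read from suffixes such as `S10` or `I5`).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    mix: Dict[str, float]         # dataset code -> relative sampling weight
    epochs: Optional[int] = None  # None where no epoch count is stated


@dataclass(frozen=True)
class Setup:
    init: str                     # "random" or "W" (Waymo-pretrained weights)
    stages: Tuple[Stage, ...]     # stages run in order; the last one uses R


# A few of the setups listed above, encoded under the assumptions stated.
SETUPS = {
    "R": Setup("random", (Stage({"R": 1.0}),)),
    "W → R": Setup("W", (Stage({"R": 1.0}),)),
    "W → (R+S)1:1": Setup("W", (Stage({"R": 1.0, "S": 1.0}),)),
    "W → S10 → R": Setup("W", (Stage({"S": 1.0}, 10), Stage({"R": 1.0}))),
    "W → I5 → S8 → R": Setup(
        "W",
        (Stage({"I": 1.0}, 5), Stage({"S": 1.0}, 8), Stage({"R": 1.0})),
    ),
}
```

Keeping the stages in a tuple preserves their order, which is the part of the notation that distinguishes sequential pretraining (`W → S10 → R`) from joint finetuning (`W → (R+S)1:1`).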
Detection Results (A2RL Real)
| Setup | AP ↑ | ATE ↓ | ASE ↓ | AOE ↓ | NDS ↑ |
|---|---|---|---|---|---|
| R | 0.843 | 0.1796 | 0.0716 | 0.0372 | 0.69266 |
| W → R | 0.890 | 0.1702 | 0.0333 | 0.0231 | 0.72234 |
| W → (R+S)1:1 | 0.847 | 0.1918 | 0.0348 | 0.0254 | 0.69830 |
| W → S10 → R | 0.879 | 0.1580 | 0.0294 | 0.0208 | 0.71868 |
| W → (R + 0.1S) | 0.873 | 0.1589 | 0.0310 | 0.0209 | 0.71542 |
| W → (R + 0.1I) | 0.883 | 0.1666 | 0.0413 | 0.0213 | 0.71858 |
| W → I10 → R | 0.895 | 0.1547 | 0.0320 | 0.0274 | 0.72609 |
| W → (S+I)30 → R | 0.876 | 0.1654 | 0.0337 | 0.0186 | 0.71623 |
| W → I5 → S8 → R | 0.878 | 0.1632 | 0.0263 | 0.0302 | 0.71703 |
Trajectory Prediction Results (ADE/FDE, lower is better)
| Dataset | Train ADE ↓ | Train FDE ↓ | Validation ADE ↓ | Validation FDE ↓ | Test ADE ↓ | Test FDE ↓ | R ADE ↓ | R FDE ↓ |
|---|---|---|---|---|---|---|---|---|
| Iv5 | 0.502 | 0.905 | 1.152 | 2.326 | 0.774 | 1.511 | 0.478 | 0.947 |
| Iv3 | 0.672 | 1.175 | 1.624 | 3.254 | 0.862 | 1.611 | 1.567 | 4.205 |
| I | 0.378 | 0.743 | 0.621 | 1.276 | 1.611 | 3.177 | 0.852 | 1.389 |
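ADE and FDE are not defined in this section. Under their usual definitions, ADE is the mean Euclidean distance between predicted and ground-truth positions over all future time steps, and FDE is that distance at the final step. The sketch below assumes a single predicted trajectory per agent and positions given as `(x, y)` tuples; the function name `ade_fde` is hypothetical.

```python
import math


def ade_fde(pred, gt):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, gt: equal-length sequences of (x, y) positions, one per
    future time step.
    """
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and equally long")
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]


pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
gt = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, gt)  # per-step distances are 0, 1, 2
```

Dataset-level numbers such as those in the table would then be averages of these per-trajectory values over all evaluated agents.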
Cross-dataset Trajectory Prediction on R
Models are trained on R and on I, respectively, and evaluated on R. ADE and FDE are reported; lower is better.
| Dataset | Train ADE ↓ | Train FDE ↓ | Validation ADE ↓ | Validation FDE ↓ | Test ADE ↓ | Test FDE ↓ |
|---|---|---|---|---|---|---|
| R | 0.266 | 0.414 | 0.214 | 0.302 | 0.484 | 1.24 |
| I | 0.502 | 0.905 | 1.152 | 2.326 | 0.478 | 0.947 |