diff --git a/--gui b/--gui
new file mode 100644
index 000000000..e69de29bb
diff --git a/--num-robots b/--num-robots
new file mode 100644
index 000000000..e69de29bb
diff --git a/--sim b/--sim
new file mode 100644
index 000000000..e69de29bb
diff --git a/--stress-iterations b/--stress-iterations
new file mode 100644
index 000000000..e69de29bb
diff --git a/--trajectory-types b/--trajectory-types
new file mode 100644
index 000000000..e69de29bb
diff --git a/-v b/-v
new file mode 100644
index 000000000..fa52f4143
--- /dev/null
+++ b/-v
@@ -0,0 +1,11 @@
+access control disabled, clients can connect from any host
+============================= test session starts ==============================
+platform linux -- Python 3.12.13, pytest-9.0.3, pluggy-1.6.0 -- /usr/local/bin/python3.12
+cachedir: /tmp/.pytest_cache
+rootdir: /home/pranavkumara/Desktop/AirStack/tests
+configfile: pytest.ini
+plugins: dependency-0.6.1, timeout-2.4.0
+collecting ... collected 0 items
+
+- generated xml file: /home/pranavkumara/Desktop/AirStack/tests/results/2026-05-28_14-14-04/results.xml -
+============================ no tests ran in 0.00s =============================
diff --git a/.agents/skills/run-system-tests/SKILL.md b/.agents/skills/run-system-tests/SKILL.md
index 453bbc953..f9b41b727 100644
--- a/.agents/skills/run-system-tests/SKILL.md
+++ b/.agents/skills/run-system-tests/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: run-system-tests
-description: Run, interpret, and extend AirStack's pytest system test suite (build_packages, build_docker, liveliness, sensors, takeoff_hover_land), trigger runs via /pytest PR comments, and read metrics.json regression reports. Use for invoking tests, debugging failures from results.xml/metrics.json, or adding a new system test.
+description: Run, interpret, and extend AirStack's pytest system test suite (build_packages, build_docker, liveliness, sensors, takeoff_hover_land, autonomy), trigger runs via /pytest PR comments, and read metrics.json regression reports. Use for invoking tests, debugging failures from results.xml/metrics.json, or adding a new system test.
 license: Apache-2.0
 metadata:
   author: AirLab CMU
@@ -14,7 +14,7 @@ metadata:
 Use this skill when you need to:
 
 - Invoke the pytest system tests locally (via `airstack test`) or on CI (via `/pytest` PR comment or `workflow_dispatch`)
-- Diagnose a failing system test — interpret `results.xml`, per-test logs, and `metrics.json` from `tests/results/<timestamp>/`
+- Diagnose a failing system test — interpret `summary.txt`, `results.xml`, and `metrics.json` from `tests/results/<timestamp>/`
 - Compare metrics against a baseline run (`parse_metrics.py --baseline`) to confirm a regression or improvement
 - Add a new system test to `tests/`: pick the right mark, wire up `airstack_env` parametrization, and record metrics with `MetricsRecorder`
 
@@ -24,7 +24,7 @@ This skill is about the **test harness itself** — pytest marks, fixtures, the
 
 The suite lives at `tests/` (repo root) and is fully pytest-based. Configuration is in `tests/pytest.ini` and shared infrastructure in `tests/conftest.py`.
 
-- **`tests/system/`** — Docker stack integration tests. Marks: `build_docker`, `build_packages`, `liveliness`, `sensors`, `takeoff_hover_land`.
+- **`tests/system/`** — Docker stack integration tests. Marks: `build_docker`, `build_packages`, `liveliness`, `sensors`, `takeoff_hover_land`, `autonomy`.
 - **`tests/robot/`** and **`tests/sim/`** — Hermetic **unit** tests (`@pytest.mark.unit`). These are **thin proxy files** that re-export tests from each ROS 2 package's own `test/` directory (co-located with the source, the ROS 2 / colcon convention). The proxy pattern keeps test source next to the code it tests while making tests discoverable by `pytest tests/`.
 
 ### Unit tests vs system tests
@@ -55,6 +55,7 @@ For details on the proxy pattern and adding new unit tests, see the
 | `tests/system/test_liveliness.py` | `liveliness` | Stack bring-up: containers Running, `/clock` readiness, tmux panes, sentinel ROS 2 nodes, compute, infra-only `test_stable` | Docker daemon, NVIDIA GPU + `nvidia-container-toolkit`, sim license / Omniverse creds |
 | `tests/system/test_sensors.py` | `sensors` | Topic Hz (Isaac: batched on sim + robot; LiDAR `echo-once` + cloud sanity), RTF, `test_sensor_streams_stable` | Docker daemon, NVIDIA GPU + `nvidia-container-toolkit`, sim license / Omniverse creds |
 | `tests/system/test_takeoff_hover_land.py` | `takeoff_hover_land` | 4-phase flight chain per `(sim, num_robots, iteration, velocity)`: `test_px4_ready` → `test_takeoff` → `test_hover` → `test_landing`. Records altitude error, overshoot, hover stability, landing accuracy, odometry drift | Docker daemon, NVIDIA GPU, sim license |
+| `tests/system/test_fixed_trajectory.py` | `autonomy` | 4-phase flight chain per `(sim, num_robots, iteration, trajectory_type)`: `test_px4_ready` → `test_takeoff` → `test_fixed_trajectory` → `test_landing`. Records cross-track error, path RMSE, trajectory success/time for Circle/Figure8/Racetrack/Line | Docker daemon, NVIDIA GPU, sim license |
 
 The marks are declared in `tests/pytest.ini`. **Do not invent new marks ad-hoc** — register any new mark there or pytest will warn about unknown marks.
 
@@ -63,10 +64,10 @@ The marks are declared in `tests/pytest.ini`. **Do not invent new marks ad-hoc**
 `conftest.py` enforces a deterministic global order so cheap-and-fast-failing tests surface first:
 
 ```
-system.test_build_docker → system.test_build_packages → system.test_liveliness → system.test_sensors → system.test_takeoff_hover_land
+system.test_build_docker → system.test_build_packages → system.test_liveliness → system.test_sensors → system.test_takeoff_hover_land → system.test_fixed_trajectory
 ```
 
-Within `system.test_takeoff_hover_land`, items are re-sorted to `(airstack_env, velocity, phase)` so each `(sim, robots, iter)` env brings the stack up once and the drone goes ground → air → ground per velocity before pytest moves to the next velocity.
+Within `system.test_takeoff_hover_land`, items are re-sorted to `(airstack_env, velocity, phase)` so each `(sim, robots, iter)` env brings the stack up once and the drone goes ground → air → ground per velocity before pytest moves to the next velocity. `system.test_fixed_trajectory` is re-sorted the same way by `(airstack_env, trajectory_type, phase)`.
 
 ### Isaac Sim (`sensors`): why Hz is batched and LiDAR uses `echo --once`
 
@@ -219,17 +220,17 @@ Every run (local or CI) produces a fresh timestamped directory under `tests/resu
 
 ```
 tests/results/2025-04-21_14-30-00/
+├── summary.txt        # Human-readable key metrics — open this first
 ├── results.xml        # JUnit XML — durations + pass/fail per test
-├── metrics.json       # Custom metrics keyed by test_node_id → metric_key
-└── logs/
-    ├── system.test_build_docker.TestDockerBuilds.test_build_robot_desktop.log
-    ├── system.test_sensors.TestSensors.test_sensor_streams_stable[msairsim-rob#1-iter0].log
-    ├── system.test_liveliness.TestLiveliness.test_stable[msairsim-rob#1-iter0].log
-    ├── airstack_env.system.test_liveliness.TestLiveliness.test_robot_containers_running[...].log
-    └── ...
+└── metrics.json       # Custom metrics keyed by test_node_id → metric_key
 ```
 
-**One log file per test execution**, plus separate `airstack_env.*.log` files for fixture narration (the `up`/`down` of each parametrize tuple). The fixture log file is named to track the rewritten test ID so it lands next to the triggering test.
+There is **no `logs/` subdirectory**. Live output streams to the terminal during
+the run (pytest `log_cli`), and each subprocess's combined stdout/stderr is held
+in memory so a failed assertion can include the tail of the last command's output
+inline. `summary.txt` is written once at session end by
+`run_summary.write_summary()`, so the key metrics land in one place without
+digging through raw output.
 
 ### `metrics.json` structure
 
@@ -345,7 +346,7 @@ Conventions:
 
 ### 6. Fixture extension
 
-If multiple tests need the same setup, add a fixture in `conftest.py` (not in your test file) so it's available repo-wide. Mirror the `airstack_env` pattern: yield a dict, narrate via `logger_to(log)`, record any setup/teardown timing as metrics.
+If multiple tests need the same setup, add a fixture in `conftest.py` (not in your test file) so it's available repo-wide. Mirror the `airstack_env` pattern: yield a dict, log progress via the shared `logger` (output streams to the terminal via `log_cli`), and record any setup/teardown timing as metrics.
 
 ## Common Pitfalls
 
@@ -356,7 +357,7 @@ If multiple tests need the same setup, add a fixture in `conftest.py` (not in yo
 - **Not capturing metrics in a new test**. If a test fails silently (no metric recorded) the regression report has nothing to compare. Always record at least one scalar via `MetricsRecorder` so the test shows up in `metrics.json`.
 - **Letting parametrize cardinality explode**. Defaults `--sim msairsim,isaacsim --num-robots 1,3` with `--stress-iterations 3` multiply stack bring-ups for each selected mark (`liveliness`, `sensors`, `takeoff_hover_land`, …) — expensive. Override locally to a single tuple while iterating.
 - **Hardcoded container names**. Always use `find_container`, `get_robot_containers`, or `wait_for_container` — replica suffixes (`-1`, `-2`, `-3`) and compose project prefixes change.
-- **Asserting on stdout instead of using `read_log_tail`**. The conftest tees subprocess output to per-test log files; assertions should reference those logs (`f"airstack up failed:\n{read_log_tail()}"`) so failures attach the relevant context to the JUnit XML.
+- **Asserting on stdout instead of using `read_log_tail`**. The conftest captures each subprocess's combined stdout/stderr in memory; assertions should reference it via `read_log_tail()` (`f"airstack up failed:\n{read_log_tail()}"`) so failures attach the relevant context to the JUnit XML.
 - **Trying to SSH into a CI runner mid-job**. Workers are ephemeral OpenStack VMs destroyed within ~30s of job completion. Re-running the job creates a fresh VM. For genuine debugging on the runner, see `.github/orchestrator/README.md` (also exposed at `tests/ci-cd-orchestrator.md`) — but in 99% of cases, reproduce locally with `airstack test`.
 - **Forgetting to register a new mark**. Adding `@pytest.mark.my_new_mark` without updating `tests/pytest.ini` produces "PytestUnknownMarkWarning" and makes `-m my_new_mark` fail to filter as expected.
 
diff --git a/.env b/.env
index 82cc01ccb..770ca77e6 100644
--- a/.env
+++ b/.env
@@ -12,7 +12,7 @@ PROJECT_NAME="airstack"
 # If you've run ./airstack.sh setup, then this will auto-generate from the git commit hash every time a change is made 
 # to a Dockerfile or docker-compose.yaml file. Otherwise this can also be set explicitly to make a release version.
 # auto-generated from git commit hash
-VERSION="0.19.0-alpha.3"
+VERSION="0.19.0-alpha.4"
 # Choose "dev" or "prebuilt". "dev" is for mounted code that must be built live. "prebuilt" is for built ros_ws baked into the image
 DOCKER_IMAGE_BUILD_MODE="dev"  
 # Where to push and pull images from. Can replace with your docker hub username if using docker hub.
@@ -53,4 +53,4 @@ DEBUG_RVIZ="false"  # "true" or "false". If true, launches RViz alongside the ro
 
 # offboard API streaming out. this is so that ports don't conflict for multi-agent FCU communication. 
 OFFBOARD_BASE_PORT=14540
-ONBOARD_BASE_PORT=14580
+ONBOARD_BASE_PORT=14580
\ No newline at end of file
diff --git a/AGENTS.md b/AGENTS.md
index b53069e75..0a6b86015 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -196,7 +196,7 @@ docker exec airstack-robot-desktop-1 bash -c "ros2 topic echo <topic_name> --onc
    - Verify module behavior in isolation
    - Test with synthetic data
    - Located in module's `test/` directory
-   - **Run in the robot container** with `colcon test` (after `bws`), not via `airstack test -m unit`. The root [`tests/`](tests/) suite does **not** register a `unit` pytest mark; `airstack test -m <mark>` only selects marks declared in [`tests/pytest.ini`](tests/pytest.ini) (`build_docker`, `build_packages`, `liveliness`, `sensors`, `takeoff_hover_land`).
+   - **Run in the robot container** with `colcon test` (after `bws`) for the full ROS 2 build + test. The same co-located test source is re-exported to the root [`tests/`](tests/) suite via thin proxies (see Unit tests below), so `airstack test -m unit` runs it too. Marks are declared in [`tests/pytest.ini`](tests/pytest.ini) (`unit`, `build_docker`, `build_packages`, `liveliness`, `sensors`, `takeoff_hover_land`, `autonomy`).
 
    ```bash
    docker exec airstack-robot-desktop-1 bash -c "sws && colcon test --packages-select natnet_ros2 --event-handlers console_direct+"
@@ -221,8 +221,9 @@ Pytest-based system tests live under [`tests/system/`](tests/system/). They brin
 | [`tests/system/test_liveliness.py`](tests/system/test_liveliness.py) | `liveliness` | Stack bring-up: containers, ``/clock`` readiness, tmux, sentinel ROS 2 nodes, compute, infra-only stability poll | Docker, GPU, sim license |
 | [`tests/system/test_sensors.py`](tests/system/test_sensors.py) | `sensors` | Topic Hz (Isaac: batched sim + robot ``ros2 topic hz``; filtered LiDAR ``echo-once`` + validation script), RTF, sensor stability time-series | Docker, GPU, sim license |
 | [`tests/system/test_takeoff_hover_land.py`](tests/system/test_takeoff_hover_land.py) | `takeoff_hover_land` | 4-phase flight chain (PX4 ready → takeoff → hover → land) per (sim, num_robots, iter, velocity) | Docker, GPU, sim license |
+| [`tests/system/test_fixed_trajectory.py`](tests/system/test_fixed_trajectory.py) | `autonomy` | 4-phase flight chain (PX4 ready → takeoff → execute Circle/Figure8/Racetrack/Line trajectory → land) per (sim, num_robots, iter, trajectory_type); records cross-track error and path RMSE | Docker, GPU, sim license |
 
-Shared fixtures, the `airstack_env` parametrized fixture, and `MetricsRecorder` live in [`tests/conftest.py`](tests/conftest.py). Each run produces a timestamped directory under `tests/results/<timestamp>/` with `results.xml`, `metrics.json`, and per-test logs. [`tests/parse_metrics.py`](tests/parse_metrics.py) generates a markdown report (single-run or diff-vs-baseline; exits 1 on regression).
+Shared fixtures, the `airstack_env` parametrized fixture, and `MetricsRecorder` live in [`tests/conftest.py`](tests/conftest.py). Each run produces a timestamped directory under `tests/results/<timestamp>/` with `summary.txt`, `results.xml`, and `metrics.json` (no per-test log files — live output streams to the terminal via `log_cli`). [`tests/parse_metrics.py`](tests/parse_metrics.py) generates a markdown report (single-run or diff-vs-baseline; exits 1 on regression).
 
 **Run via the CLI** (containerized runner — no local Python needed):
 
@@ -232,6 +233,7 @@ airstack test -m "build_docker or build_packages" -v
 airstack test -m liveliness --sim msairsim --num-robots 1 --stress-iterations 1 -v
 airstack test -m sensors --sim isaacsim --num-robots 1 --stress-iterations 1 -v
 airstack test -m takeoff_hover_land --sim msairsim --takeoff-velocities 0.5,1,2 -v
+airstack test -m autonomy --sim msairsim --trajectory-types Circle,Figure8,Racetrack,Line -v
 ```
 
 Full reference: [`tests/README.md`](tests/README.md) — including **liveliness vs
diff --git a/docs/development/intermediate/testing/fixed_trajectory_testing.md b/docs/development/intermediate/testing/fixed_trajectory_testing.md
new file mode 100644
index 000000000..8d5ca8491
--- /dev/null
+++ b/docs/development/intermediate/testing/fixed_trajectory_testing.md
@@ -0,0 +1,511 @@
+# Fixed-Trajectory Path-Tracker Benchmark
+
+This guide documents the **fixed-trajectory evaluation test suite** (`tests/system/test_fixed_trajectory.py`): why it exists, how it is implemented, how to run it, how to interpret results, and how to use it to **compare path trackers** without rewriting tests.
+
+For the broader system-test suite, see [`tests/README.md`](../../../../tests/README.md).
+
+---
+
+## Purpose
+
+AirStack's local controls stack separates **reference-path generation**, **path tracking**, and **low-level control**:
+
+```mermaid
+flowchart LR
+    FT[FixedTrajectoryTask] --> TL[trajectory_library]
+    TL --> TC[trajectory_controller<br/>path tracker]
+    TC -->|~/tracking_point| PID[pid_controller]
+    PID --> FC[Flight computer / PX4]
+    FC --> ODOM[local_position/odom]
+    ODOM --> TC
+```
+
+The benchmark harness holds the **reference trajectory** and **flight procedure** constant so maintainers can:
+
+- **Swap or retune path trackers** and measure the same metrics every time.
+- **Compare execution time** — how long does a standard pattern take in sim-time?
+- **Compare tracking error** — mean/max cross-track error and path RMSE against a known ideal path.
+- **Detect regressions** — action timeouts, stalls, or catastrophic drift via `trajectory_success` and assertion thresholds.
+
+Today the default tracker is the **sphere-intersection pure-pursuit** implementation in `trajectory_controller` + `trajectory_library`. The downstream `pid_controller` is held fixed so changes isolate tracker behavior. A different tracker can replace the `trajectory_controller` node (or its parameters) in launch; the pytest module does not need to change as long as `FixedTrajectoryTask` and odom topics remain the same.
+
+---
+
+## What gets tested
+
+### Test module
+
+| Item | Value |
+| ---- | ----- |
+| File | [`tests/system/test_fixed_trajectory.py`](../../../../tests/system/test_fixed_trajectory.py) |
+| Pytest mark | `autonomy` |
+| Class | `TestFixedTrajectory` |
+| Timeout | 2400 s per test class invocation |
+
+### Parametrization
+
+Each run sweeps:
+
+```
+(sim, num_robots, iteration, trajectory_type)
+```
+
+| Parameter | CLI flag | Default |
+| --------- | -------- | ------- |
+| Simulator | `--sim` | `msairsim,isaacsim` |
+| Robot count | `--num-robots` | `1,3` |
+| Repeat count | `--stress-iterations` | `1` |
+| Trajectory type | `--trajectory-types` | `Circle,Figure8,Racetrack,Line` |
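+
+As a rough illustration of how quickly the sweep grows, the sketch below enumerates the tuples implied by the defaults in the table above. It is not the harness's own parametrization code: `expand_sweep` is a hypothetical helper, and the values are simply copied from the table.
+
+```python
+from itertools import product
+
+
+def expand_sweep(sims, robot_counts, iterations, trajectory_types):
+    """Return every (sim, num_robots, iteration, trajectory_type) tuple."""
+    return list(product(sims, robot_counts, range(iterations), trajectory_types))
+
+
+# Values copied from the defaults table; the helper itself is illustrative only.
+defaults = expand_sweep(
+    ["msairsim", "isaacsim"],
+    [1, 3],
+    1,
+    ["Circle", "Figure8", "Racetrack", "Line"],
+)
+pinned = expand_sweep(["isaacsim"], [1], 1, ["Circle"])
+
+print(len(defaults), len(pinned))  # 16 1
+```
+
+Each tuple is a full takeoff, trajectory, and landing cycle, so the size of this list is a reasonable proxy for run time.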
+
+!!! tip "Pin your sweep for local runs"
+    The defaults sweep 2 simulators × 2 robot counts × 4 trajectory types (16 flight chains per iteration) and can run for hours. For development, always set explicit values:
+
+    ```bash
+    airstack test -m autonomy \
+      --sim isaacsim \
+      --num-robots 1 \
+      --stress-iterations 1 \
+      --trajectory-types Circle \
+      -v
+    ```
+
+### Four-phase flight chain
+
+For every `(sim, num_robots, iteration, trajectory_type)` tuple the drone runs:
+
+| Phase | Test | Action | Pass criteria |
+| ----- | ---- | ------ | ------------- |
+| 1 | `test_px4_ready` | Wait for MAVROS + odom | All robots connected and publishing within 300 s wall-clock |
+| 2 | `test_takeoff` | `TakeoffTask` to 10 m @ 1 m/s | Steady-state altitude within ±10% of target |
+| 3 | `test_fixed_trajectory` | `FixedTrajectoryTask` | Cross-track mean &lt; 5 m; records success + timing |
+| 4 | `test_landing` | `LandTask` @ 1 m/s | Final altitude &lt; 0.5 m |
+
+```mermaid
+stateDiagram-v2
+    [*] --> PX4Ready
+    PX4Ready --> Takeoff
+    Takeoff --> ExecuteTrajectory
+    ExecuteTrajectory --> Land : always, even on trajectory failure
+    Land --> [*]
+    Takeoff --> Poisoned : takeoff fails
+    Land --> Poisoned : landing fails
+    Poisoned --> [*] : skip remaining types in env
+```
+
+**Chain guard:** a failure in phase 3 (`test_fixed_trajectory`) does **not** poison the environment — landing always runs so the drone returns to the ground before the next trajectory type. Failures in takeoff or landing **do** poison the env and skip subsequent trajectory types for that `(sim, num_robots, iteration)`.
+
+Phase 1 (`test_px4_ready`) runs once per env regardless of how many trajectory types are swept.
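+
+The chain-guard rule can be sketched as a small loop. This is a simplified model rather than the code in `conftest.py`: `run_env` and the `run_phase` callback are hypothetical names, results are tracked per trajectory type instead of per pytest item, and skipping the landing after a failed takeoff is an assumption.
+
+```python
+def run_env(trajectory_types, run_phase):
+    """Run the per-type flight chain for one (sim, num_robots, iteration) env.
+
+    run_phase(phase, trajectory_type) -> bool is a hypothetical callback
+    that returns True when the phase passes.
+    """
+    results = {}
+    poisoned = False
+    for ttype in trajectory_types:
+        if poisoned:
+            results[ttype] = "skipped"
+            continue
+        if not run_phase("takeoff", ttype):
+            poisoned = True  # takeoff failure poisons the env
+            results[ttype] = "failed"
+            continue
+        trajectory_ok = run_phase("fixed_trajectory", ttype)
+        landed = run_phase("landing", ttype)  # runs even if the trajectory failed
+        if not landed:
+            poisoned = True  # landing failure poisons the env
+        results[ttype] = "passed" if trajectory_ok and landed else "failed"
+    return results
+
+
+def figure8_fails(phase, ttype):
+    """Fake phase runner: only the Figure8 trajectory phase fails."""
+    return not (phase == "fixed_trajectory" and ttype == "Figure8")
+
+
+print(run_env(["Circle", "Figure8", "Line"], figure8_fails))
+# {'Circle': 'passed', 'Figure8': 'failed', 'Line': 'passed'}
+```
+
+A failed Figure8 trajectory still lands and leaves `Line` runnable, whereas a failed landing would mark every later type as skipped.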
+
+---
+
+## Reference trajectories
+
+The test uses the same patterns as the `FixedTrajectoryTask` action server in `trajectory_controller` (`fixed_trajectory_task.cpp`). Default parameters are defined in `TRAJECTORY_CONFIGS` inside `test_fixed_trajectory.py` and must stay in sync with the C++ generators.
+
+| Type | Parameters | Approx. path length | Expected sim-time* |
+| ---- | ---------- | ------------------- | ------------------ |
+| **Circle** | radius=10 m, velocity=2 m/s | ~63 m loop + return segments | **~45–50 s** |
+| **Figure8** | length=15 m, width=8 m, v=2 m/s, max_accel=1 m/s² | ~100+ m | **~50–70 s** |
+| **Racetrack** | length=30 m, width=10 m, v=3 m/s, turn_v=1.5 m/s | ~80+ m | **~30–50 s** |
+| **Line** | length=20 m, v=2 m/s, max_accel=1 m/s² | 20 m | **~12–15 s** |
+
+\*Sim-time from odom timestamps; wall-clock varies with sim real-time factor (RTF).
+
+### Circle geometry (ideal path)
+
+Python `_ideal_circle()` mirrors `generate_circle()` in C++:
+
+- Start at origin, move to `(radius, 0, 0)`.
+- Trace the circle in 10° steps.
+- Return to `(radius, 0, 0)` then origin.
+
+The trajectory is defined in **`base_link`** at dispatch; the test transforms it to **world frame** using the robot pose snapshot (see below).
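+
+A minimal sketch of the waypoint list described above, assuming the circle is centred on the origin and traced counter-clockwise (neither is stated here, so check both against `generate_circle()`). `ideal_circle` is an illustrative stand-in, not the test's `_ideal_circle()`.
+
+```python
+import math
+
+
+def ideal_circle(radius, step_deg=10):
+    """Origin -> (radius, 0, 0) -> circle in step_deg steps -> (radius, 0, 0) -> origin."""
+    points = [(0.0, 0.0, 0.0), (radius, 0.0, 0.0)]
+    for deg in range(step_deg, 360, step_deg):
+        angle = math.radians(deg)
+        # Assumption: centre at the origin, counter-clockwise direction.
+        points.append((radius * math.cos(angle), radius * math.sin(angle), 0.0))
+    points.append((radius, 0.0, 0.0))  # close the loop
+    points.append((0.0, 0.0, 0.0))     # return to the start
+    return points
+
+
+waypoints = ideal_circle(10.0)
+print(len(waypoints))  # 39
+```
+
+With a 10 m radius the loop itself is about 63 m, and the two straight segments to and from the origin add another 20 m.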
+
+---
+
+## Metrics
+
+All metrics are recorded per robot as `robot_N.<key>` in `tests/results/<timestamp>/metrics.json` and rolled up into `summary.txt`.
+
+### Flight metrics
+
+| Key | Unit | Better | Description |
+| --- | ---- | ------ | ----------- |
+| `ready_duration_sys_s` | s | lower | Wall-clock time until PX4/MAVROS ready |
+| `takeoff_duration_sim_s` | s | lower | Sim-time from first motion to 95% of 10 m target |
+| `altitude_error_m` | m | lower | Signed steady-state altitude error after takeoff |
+| `overshoot_m` | m | lower | Unsigned overshoot above 10 m |
+| `trajectory_success` | — | **higher** | `1.0` if action returned `success: true`, else `0.0` |
+| `trajectory_execution_time_sim_s` | s | lower | Sim-time from action dispatch to completion |
+| `cross_track_error_mean_m` | m | lower | Mean 2-D lateral distance to nearest ideal point |
+| `cross_track_error_max_m` | m | lower | Worst 2-D lateral deviation |
+| `path_rmse_m` | m | lower | 2-D RMSE against ideal polyline |
+| `land_duration_sim_s` | s | lower | Sim-time from descending through 80% of peak altitude until altitude is &lt; 0.5 m |
+| `final_altitude_m` | m | lower | Altitude when landing action completes |
+
+### How to read metrics when comparing trackers
+
+| Observation | Likely meaning |
+| ----------- | -------------- |
+| High `cross_track_error_max_m`, moderate mean | Turn/corner lag (common on Circle) |
+| High mean and max | Tracker not keeping up or wrong frame |
+| Long `trajectory_execution_time_sim_s` at same velocity | Virtual time stalling behind the robot |
+| `trajectory_success = 0` | Action timed out or aborted — fix before interpreting error |
+| Good mean, bad max | Occasional spikes — check sphere intersection on curves |
+
+### Observed baseline (Circle, Isaac Sim, 10 headless runs)
+
+Validated on branch `pkumaraTrajectoryTesting` — see `tests/results/2026-06-05_18-26-52/summary.txt`:
+
+| Metric | Typical value |
+| ------ | ------------- |
+| Tests | 40 passed / 0 failed (10 iter × 4 phases) |
+| `trajectory_success` | `1.0` (success on every run) |
+| `trajectory_execution_time_sim_s` | ~46 s |
+| `cross_track_error_mean_m` | ~0.98 m |
+| `cross_track_error_max_m` | ~5.0 m |
+| `path_rmse_m` | ~1.55 m |
+| `final_altitude_m` | &lt; 0.05 m |
+
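+The assertion tolerance is **`CROSS_TRACK_TOLERANCE_M = 5.0`** in `test_fixed_trajectory.py`, applied to `cross_track_error_mean_m`. It is intentionally loose while the default tracker matures: the observed mean is about 0.98 m, while the observed max sits right at 5 m. Tighten this constant as tracking improves.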
+
+---
+
+## Cross-track error algorithm
+
+The test measures **end-to-end** tracking (tracker + PID + sim physics), not the tracker in isolation.
+
+### Steps
+
+1. **Snapshot pose** — immediately before sending `FixedTrajectoryTask`, read one odom sample: `(x₀, y₀, z₀, yaw₀)`.
+2. **Build ideal path** — generate waypoints in `base_link` using the same equations as C++ (`_ideal_circle`, `_ideal_figure8`, etc.).
+3. **Transform to world** — rotate by `yaw₀` and translate by `(x₀, y₀, z₀)`.
+4. **Capture odom** — background `ros2 topic echo --csv` on `/robot_N/interface/mavros/local_position/odom` for the action duration (timeout 180 s).
+5. **Compute error** — for each odom sample, find the nearest ideal waypoint in **XY**; record distance statistics.
+
+Altitude is not part of cross-track error (these patterns are flat; altitude is checked at takeoff).
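+
+Steps 3 and 5 can be sketched in a few lines. This is an illustration under stated assumptions, not the test's implementation: `to_world` and `cross_track_stats` are hypothetical names, the error is the distance to the nearest waypoint (not to the nearest point on a segment), and RMSE is taken over those same per-sample distances.
+
+```python
+import math
+
+
+def to_world(path_xy, x0, y0, yaw0):
+    """Rotate base_link XY points by yaw0, then translate by (x0, y0)."""
+    c, s = math.cos(yaw0), math.sin(yaw0)
+    return [(x0 + c * x - s * y, y0 + s * x + c * y) for x, y in path_xy]
+
+
+def cross_track_stats(odom_xy, ideal_xy):
+    """Per-sample distance to the nearest ideal waypoint, summarised."""
+    errors = [
+        min(math.hypot(ox - ix, oy - iy) for ix, iy in ideal_xy)
+        for ox, oy in odom_xy
+    ]
+    n = len(errors)
+    return {
+        "mean": sum(errors) / n,
+        "max": max(errors),
+        "rmse": math.sqrt(sum(e * e for e in errors) / n),
+    }
+
+
+# A 2 m straight path in base_link, dispatched at (5, 5) facing +Y in world.
+ideal = to_world([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], 5.0, 5.0, math.pi / 2)
+stats = cross_track_stats([(5.0, 5.0), (5.5, 6.0), (5.0, 7.0)], ideal)
+print({k: round(v, 3) for k, v in stats.items()})
+# {'mean': 0.167, 'max': 0.5, 'rmse': 0.289}
+```
+
+Because the nearest-waypoint distance depends on waypoint spacing, a coarse ideal path inflates the error even for a perfectly tracked flight; a denser ideal path reduces that bias.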
+
+### Why world-frame alignment matters
+
+`FixedTrajectoryTask` publishes the path in `base_link` relative to the robot at dispatch. Without transforming the ideal path to world frame, odom (world-fixed) would be compared against the wrong reference and error would be meaningless.
+
+---
+
+## Results pipeline
+
+Every `airstack test` run writes:
+
+```
+tests/results/<YYYY-MM-DD_HH-MM-SS>/
+├── summary.txt      ← open this first (human-readable)
+├── results.xml      ← JUnit pass/fail + durations
+└── metrics.json     ← structured metrics for diff tools
+```
+
+| Artifact | Producer | Use |
+| -------- | -------- | --- |
+| `summary.txt` | `tests/run_summary.py` (auto at session end via `conftest.py`) | Quick pass/fail + key numbers per trajectory type |
+| `results.xml` | pytest `--junitxml` | CI, phase wall times |
+| `metrics.json` | `MetricsRecorder` in `conftest.py` | Regression diffs |
+
+### Regenerate or inspect
+
+```bash
+# Latest run
+LATEST=$(ls -1t tests/results/ | head -1)
+
+# Human summary
+cat "tests/results/$LATEST/summary.txt"
+
+# Regenerate summary manually
+python3 tests/run_summary.py "tests/results/$LATEST/"
+
+# Markdown table of all metrics
+python3 tests/parse_metrics.py --current "tests/results/$LATEST/"
+
+# Compare two tracker configs (set NEW and OLD to the run directory names first)
+python3 tests/parse_metrics.py \
+  --current  "tests/results/$NEW/" \
+  --baseline "tests/results/$OLD/" \
+  --threshold 20 \
+  --output report.md
+```
+
+`parse_metrics.py` exits **1** when any metric regresses beyond the threshold percentage.
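+
+One plausible reading of "regresses beyond the threshold percentage" is a signed percent change that accounts for whether lower or higher is better. The sketch below shows that interpretation only; `regressed` is a hypothetical function, and the actual rule in `parse_metrics.py` may differ (for example in how it treats a zero baseline).
+
+```python
+def regressed(baseline, current, threshold_pct, higher_is_better=False):
+    """Return True if `current` is worse than `baseline` by more than threshold_pct."""
+    if baseline == 0:
+        return False  # assumption: percent change is undefined, so do not flag
+    change_pct = (current - baseline) / abs(baseline) * 100.0
+    if higher_is_better:
+        change_pct = -change_pct  # a drop counts as the regression
+    return change_pct > threshold_pct
+
+
+# cross_track_error_mean_m: lower is better
+print(regressed(0.98, 1.30, 20))  # True
+print(regressed(0.98, 1.10, 20))  # False
+# trajectory_success: higher is better
+print(regressed(1.0, 0.0, 20, higher_is_better=True))  # True
+```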
+
+---
+
+## Running tests (complete CLI reference)
+
+### Prerequisites
+
+```bash
+cd /path/to/AirStack
+airstack setup
+```
+
+Required:
+
+- Docker daemon (user in `docker` group)
+- NVIDIA GPU + `nvidia-container-toolkit` for sim tests
+- Isaac Sim: `simulation/isaac-sim/docker/omni_pass.env` configured
+
+### Primary interface
+
+```bash
+airstack test [pytest options]
+```
+
+All arguments are forwarded to pytest inside the containerized test runner (`tests/docker/`).
+
+### Rebuild after C++ changes
+
+```bash
+airstack test -m build_packages -v
+```
+
+After modifying `trajectory_controller`, `trajectory_library`, or launch params, always run this before any flight tests.
+
+### Fixed-trajectory commands
+
+```bash
+# Quick Circle regression (recommended smoke test)
+airstack test -m "build_packages or autonomy" \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  -v
+
+# All four trajectory types, ms-airsim
+airstack test -m autonomy \
+  --sim msairsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle,Figure8,Racetrack,Line \
+  -v
+
+# Stress: 10 iterations (statistical stability)
+airstack test -m autonomy \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 10 \
+  --trajectory-types Circle \
+  -v
+
+# Visual debug (sim GUI)
+airstack test -m autonomy \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  --gui \
+  -v
+
+# Run only the trajectory phase (debugging)
+airstack test -m autonomy \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  -k test_fixed_trajectory \
+  -v
+```
+
+### Global CLI options
+
+| Option | Default | Description |
+| ------ | ------- | ----------- |
+| `--sim` | `msairsim,isaacsim` | Comma-separated sim targets |
+| `--num-robots` | `1,3` | Comma-separated robot counts |
+| `--stress-iterations` | `1` | Repeat count per `(sim, num_robots)` |
+| `--trajectory-types` | `Circle,Figure8,Racetrack,Line` | Trajectory sweep |
+| `--gui` | off | Show simulator windows |
+| `-v` | — | Verbose pytest |
+| `-k EXPR` | — | Filter test names |
+
+### Direct pytest (local Python env)
+
+For faster iteration when editing test code:
+
+```bash
+export AIRSTACK_ROOT="$(pwd)"
+pip install -r tests/requirements.txt
+
+pytest tests/ -m autonomy \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  -v
+```
+
+### CI: `/pytest` PR comment
+
+Core contributors can trigger runs by commenting on the PR:
+
+```
+/pytest -m "build_packages or autonomy" --sim isaacsim --num-robots 1 --stress-iterations 1 --trajectory-types Circle -v
+```
+
+The workflow auto-prepends `build_packages` when not already specified.
+
+---
+
+## Comparing path trackers
+
+### What to change
+
+| Layer | Location | Examples |
+| ----- | -------- | -------- |
+| Tracker params | `robot/ros_ws/src/local/local_bringup/launch/local.launch.xml` (or `local_droan_cpu.launch.xml`) | `sphere_radius`, `look_ahead_time`, `search_ahead_factor`, `min_virtual_tracking_velocity` |
+| Tracker implementation | Replace or fork `trajectory_controller` node | Alternative pure-pursuit, different intersection logic |
+| Low-level control | Swap `pid_controller` for `attitude_controller` in launch | Changes end-to-end error, not tracker-only |
+
+Key `trajectory_controller` parameters today:
+
+| Param | Current value | Role |
+| ----- | ------------- | ---- |
+| `sphere_radius` | `2.0` | Lookahead sphere radius (m) |
+| `look_ahead_time` | `1.0` | Look-ahead horizon for local planner feed |
+| `virtual_tracking_ahead_time` | `0.5` | Virtual tracking search window |
+| `min_virtual_tracking_velocity` | `0.5` | Below this, time-advance mode instead of sphere mode |
+| `search_ahead_factor` | `1.5` | Multiplier on sphere radius when searching intersection |
+
+### Recommended A/B workflow
+
+```bash
+# 1. Baseline run
+airstack test -m "build_packages or autonomy" \
+  --sim isaacsim --num-robots 1 --stress-iterations 5 \
+  --trajectory-types Circle -v
+BASELINE=$(ls -1t tests/results/ | head -1)
+
+# 2. Edit tracker params in local.launch.xml, rebuild
+airstack test -m build_packages -v
+
+# 3. Candidate run
+airstack test -m autonomy \
+  --sim isaacsim --num-robots 1 --stress-iterations 5 \
+  --trajectory-types Circle -v
+CURRENT=$(ls -1t tests/results/ | head -1)
+
+# 4. Diff
+python3 tests/parse_metrics.py \
+  --current  "tests/results/$CURRENT/" \
+  --baseline "tests/results/$BASELINE/" \
+  --threshold 20
+```
+
+Focus on: `cross_track_error_mean_m`, `cross_track_error_max_m`, `path_rmse_m`, `trajectory_execution_time_sim_s`, `trajectory_success`.
+
+---
+
+## Path tracker bug fixes (this PR)
+
+The benchmark exposed failures in the default sphere-intersection tracker. Fixes included:
+
+### 1. Wrong first-segment sphere test (`trajectory_library.cpp`)
+
+`get_waypoint_sphere_intersection()` checked whether the **end** of the first segment was inside the sphere, not the **interpolated point at `initial_time`**. On curved paths the robot's projection often lies mid-segment, causing false "no intersection" results.
+
+**Fix:** interpolate `wp_start` to `initial_time`, then test distance from that point to the sphere center.
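+
+The difference between the two checks is easy to see on a single segment. The Python sketch below is only an illustration of the idea: the real code is C++ in `trajectory_library.cpp`, and the `(time, (x, y, z))` waypoint tuples and function names here are hypothetical.
+
+```python
+import math
+
+
+def interpolate(wp_a, wp_b, t):
+    """Linearly interpolate position between two (time, xyz) waypoints."""
+    (ta, pa), (tb, pb) = wp_a, wp_b
+    u = (t - ta) / (tb - ta)
+    return tuple(a + u * (b - a) for a, b in zip(pa, pb))
+
+
+def start_inside_sphere(wp_a, wp_b, initial_time, center, radius):
+    """Fixed check: test the point at initial_time, not the segment end."""
+    start = interpolate(wp_a, wp_b, initial_time)
+    return math.dist(start, center) <= radius
+
+
+seg_start = (0.0, (0.0, 0.0, 0.0))
+seg_end = (5.0, (10.0, 0.0, 0.0))
+robot = (2.0, 0.5, 0.0)  # sphere centre, close to the path mid-segment
+
+print(start_inside_sphere(seg_start, seg_end, 1.0, robot, 2.0))  # True
+print(math.dist(seg_end[1], robot) <= 2.0)  # False: the old end-of-segment test
+```
+
+With the robot projecting onto the middle of a long segment, the end-of-segment test misses the sphere while the interpolated point is well inside it.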
+
+### 2. Controller stall (`trajectory_controller.cpp`)
+
+When intersection failed, `virtual_time` could freeze and the tracking point collapsed onto the robot — the drone **stalled on closed loops** (Circle).
+
+**Fixes:**
+
+- Fallback to `get_waypoint_distance_ahead()` when sphere intersection fails.
+- On `AHEAD NOT VALID`, advance `virtual_time` by `time_multiplier × elapsed_sim_time`.
+- Throttled `WARN` instead of per-tick logging.
+
+### 3. Missing waypoint times on merge (`trajectory_library.cpp`)
+
+Merging into an empty trajectory previously left the waypoint times unset. `Trajectory::merge()` now calls `generate_waypoint_times()` in that case.
+
+### 4. Parameter tuning
+
+`sphere_radius` increased from `1.0` → `2.0` in both `local.launch.xml` and `local_droan_cpu.launch.xml`.
+
+---
+
+## Manual stack usage (without pytest)
+
+To fly a fixed trajectory interactively:
+
+```bash
+cd /path/to/AirStack
+
+# Bring up Isaac Sim + robot (1 robot, headless)
+COMPOSE_PROFILES=isaac-sim NUM_ROBOTS=1 airstack up
+
+# Takeoff (optional — or use RViz task panel)
+docker exec -it airstack-robot-desktop-1 bash -c '
+  source /opt/ros/jazzy/setup.bash &&
+  source /root/AirStack/robot/ros_ws/install/setup.bash &&
+  ros2 action send_goal /robot_1/tasks/takeoff task_msgs/action/TakeoffTask \
+    "{target_altitude_m: 10.0, velocity_m_s: 1.0}"
+'
+
+# Circle trajectory
+docker exec -it airstack-robot-desktop-1 bash -c '
+  source /opt/ros/jazzy/setup.bash &&
+  source /root/AirStack/robot/ros_ws/install/setup.bash &&
+  ros2 action send_goal --feedback /robot_1/tasks/fixed_trajectory \
+    task_msgs/action/FixedTrajectoryTask \
+    "{trajectory_spec: {type: Circle, attributes: [{key: radius, value: \"10.0\"}, {key: velocity, value: \"2.0\"}]}, loop: false}"
+'
+
+# Land
+docker exec -it airstack-robot-desktop-1 bash -c '
+  source /opt/ros/jazzy/setup.bash &&
+  source /root/AirStack/robot/ros_ws/install/setup.bash &&
+  ros2 action send_goal /robot_1/tasks/land task_msgs/action/LandTask \
+    "{velocity_m_s: 1.0}"
+'
+
+airstack down
+```
+
+Action server: `/{robot_name}/tasks/fixed_trajectory` — see also [Tasks and Task Executors](../../../robot/autonomy/tasks.md).
+
+---
+
+## Troubleshooting
+
+| Symptom | Likely cause | Fix |
+| ------- | ------------ | --- |
+| Sentinel nodes missing | Workspace not built in container | `-m "build_packages or autonomy"` |
+| PX4 ready timeout | Sim not running, GPU issue | Check `nvidia-smi`, Isaac `omni_pass.env` |
+| `trajectory_success = 0` | Tracker stall or timeout | Check the `trajectory_controller` logs; verify the tracker bug fixes are applied |
+| Cross-track error >> 5 m | Wrong tracker params or frame bug | Compare launch params; check world-frame transform |
+| Tests run for hours | Default `--sim` and `--num-robots` sweep | Pin `--sim isaacsim --num-robots 1 --stress-iterations 1` |
+| Unknown mark warning `autonomy` | Mark not in `pytest.ini` | Harmless; filter still works |
+
+---
+
+## Source file reference
+
+| File | Role |
+| ---- | ---- |
+| [`tests/system/test_fixed_trajectory.py`](../../../../tests/system/test_fixed_trajectory.py) | Test module, ideal paths, metrics |
+| [`tests/conftest.py`](../../../../tests/conftest.py) | Fixtures, `--trajectory-types`, summary hook, collection order |
+| [`tests/run_summary.py`](../../../../tests/run_summary.py) | `summary.txt` generator |
+| [`tests/parse_metrics.py`](../../../../tests/parse_metrics.py) | Markdown reports + regression diff |
+| [`tests/pytest.ini`](../../../../tests/pytest.ini) | Registered marks |
+| [`robot/.../fixed_trajectory_task.cpp`](../../../../robot/ros_ws/src/local/controls/trajectory_controller/src/fixed_trajectory_task.cpp) | C++ reference path generators |
+| [`robot/.../trajectory_controller.cpp`](../../../../robot/ros_ws/src/local/controls/trajectory_controller/src/trajectory_controller.cpp) | Pure-pursuit path tracker |
+| [`robot/.../trajectory_library.cpp`](../../../../robot/ros_ws/src/local/planners/trajectory_library/src/trajectory_library.cpp) | Trajectory math, sphere intersection |
+| [`robot/.../local.launch.xml`](../../../../robot/ros_ws/src/local/local_bringup/launch/local.launch.xml) | Tracker + PID params |
+
+---
+
+## Related documentation
+
+- [System tests overview (`tests/README.md`)](../../../../tests/README.md)
+- [Trajectory Controller README](../../../../robot/ros_ws/src/local/controls/trajectory_controller/README.md)
+- [Tasks and Task Executors](../../../robot/autonomy/tasks.md)
+- [CI/CD orchestrator](../../../../tests/ci-cd-orchestrator.md)
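Reviewer note on the baseline-versus-candidate workflow in the guide above: the sketch below shows one way a percentage threshold such as `--threshold 20` could be applied to lower-is-better metrics like `cross_track_error_mean_m` and `path_rmse_m`. It is not the logic of `tests/parse_metrics.py`. The function names, the percent-change interpretation of the threshold, and the metric values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def percent_change(baseline: float, current: float) -> Optional[float]:
    """Relative change in percent; None when the baseline is zero."""
    if baseline == 0:
        return None
    return (current - baseline) / abs(baseline) * 100.0


def flag_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    threshold_pct: float,
) -> List[str]:
    """Return names of lower-is-better metrics that grew by more than threshold_pct."""
    flagged = []
    for name, base_value in baseline.items():
        if name not in current:
            continue  # metric missing from the candidate run; nothing to compare
        change = percent_change(base_value, current[name])
        if change is not None and change > threshold_pct:
            flagged.append(name)
    return flagged


# Illustrative values only, not measured results.
baseline_metrics = {"cross_track_error_mean_m": 0.50, "path_rmse_m": 1.00}
candidate_metrics = {"cross_track_error_mean_m": 0.65, "path_rmse_m": 1.10}
regressions = flag_regressions(baseline_metrics, candidate_metrics, threshold_pct=20.0)
```

A higher-is-better metric such as `trajectory_success` would need the sign of the comparison flipped, so it is left out of this sketch.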
diff --git a/docs/development/intermediate/testing/index.md b/docs/development/intermediate/testing/index.md
index 0769e2b09..4fd0ec033 100644
--- a/docs/development/intermediate/testing/index.md
+++ b/docs/development/intermediate/testing/index.md
@@ -1,6 +1,6 @@
 # Testing
 
-AirStack uses three complementary test layers, each with a distinct scope and
+AirStack uses several complementary test layers, each with a distinct scope and
 hardware requirement:
 
 | Layer | Where | Mark / Tool | Hardware |
@@ -34,13 +34,17 @@ Full Docker-stack integration tests. The canonical reference is
 
 | Mark | Module | Role |
 |---|---|---|
+| `build_docker` | `system/test_build_docker.py` | Docker image builds |
+| `build_packages` | `system/test_build_packages.py` | `colcon build` inside containers |
 | `liveliness` | `system/test_liveliness.py` | Containers, `/clock` readiness, tmux, sentinel ROS 2 nodes, compute, infra-only stability poll |
 | `sensors` | `system/test_sensors.py` | Sim + robot stereo/depth Hz, filtered LiDAR (`echo --once` + validation script on Isaac), sim RTF, sensor stability time-series |
-| `takeoff_hover_land` | `system/test_takeoff_hover_land.py` | Four-phase flight chain per configuration |
+| `takeoff_hover_land` | `system/test_takeoff_hover_land.py` | Four-phase flight chain per configuration (PX4 readiness → takeoff → hover → land) |
+| `autonomy` | `system/test_fixed_trajectory.py` | Fixed-pattern path-tracker benchmark (takeoff → trajectory → land) |
 
-Collection order is defined in `tests/conftest.py` (`liveliness` before `sensors`
-before `takeoff_hover_land`). Each mark's test class uses **class-scoped**
-`airstack_env`, so combining marks with `and` runs multiple full stack bring-ups
+Collection order is defined in `tests/conftest.py` (unit tests first, then
+`build_docker` → `build_packages` → `liveliness` → `sensors` → `takeoff_hover_land`
+→ `autonomy`). Each mark's test class uses a **class-scoped**
+`airstack_env`, so combining marks with **`or`** runs multiple full stack bring-ups
 per `(sim, num_robots, iteration)` — see *Bring-up scope* in `tests/README.md`.
 
 **Isaac Sim:** the `sensors` implementation batches `ros2 topic hz` on sim and
@@ -48,6 +52,22 @@ robot paths and avoids `hz` on filtered `PointCloud2`; pytest enables `ENABLE_LI
 for the multi-drone Pegasus script. Details: **`tests/README.md`** → *Isaac Sim and
 the sensors mark*.
 
+### Fixed-trajectory path-tracker benchmark
+
+For the full guide — purpose, metrics, CLI, comparing trackers, bug fixes, and
+baselines — see **[Fixed-Trajectory Path-Tracker Benchmark](fixed_trajectory_testing.md)**.
+
+Quick smoke test:
+
+```bash
+airstack test -m "build_packages or autonomy" \
+  --sim isaacsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  -v
+```
+
 ## Other testing docs
 
 - [Unit Testing](unit_testing.md) — `@pytest.mark.unit`, proxy pattern, CI workflow
diff --git a/git-hooks/docker-versioning/update-docker-image-tag_BACKUP_3660135.pre-commit b/git-hooks/docker-versioning/update-docker-image-tag_BACKUP_3660135.pre-commit
deleted file mode 100755
index 43c8e1d00..000000000
--- a/git-hooks/docker-versioning/update-docker-image-tag_BACKUP_3660135.pre-commit
+++ /dev/null
@@ -1,150 +0,0 @@
-<<<<<<< HEAD
-<<<<<<< HEAD
-
-=======
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-=======
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-#!/bin/bash
-# Pre-commit hook to update VERSION in .env file with git commit hash
-# when Dockerfile is modified or docker-compose.yaml has changes under build: key
-
-# Check if any Dockerfile files are staged for commit
-DOCKERFILE_CHANGED=$(git diff --cached --name-only | grep -E 'Dockerfile$')
-<<<<<<< HEAD
-<<<<<<< HEAD
-=======
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-
-# Check if docker-compose.yaml has changes under build: key
-COMPOSE_BUILD_CHANGED=false
-COMPOSE_FILES=$(git diff --cached --name-only | grep -E 'docker-compose\.yaml$')
-
-if [ -n "$COMPOSE_FILES" ]; then
-    for file in $COMPOSE_FILES; do
-        # Get the diff for the docker-compose.yaml file
-        DIFF_OUTPUT=$(git diff --cached "$file")
-<<<<<<< HEAD
-        
-=======
-
-# Check if docker-compose.yaml has changes under build: key
-COMPOSE_BUILD_CHANGED=false
-COMPOSE_FILES=$(git diff --cached --name-only | grep -E 'docker-compose\.yaml$')
-
-if [ -n "$COMPOSE_FILES" ]; then
-    for file in $COMPOSE_FILES; do
-        # Get the diff for the docker-compose.yaml file
-        DIFF_OUTPUT=$(git diff --cached "$file")
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        # Check if any lines with changes (+ or -) contain build: or are indented under a build: section
-        # This regex looks for lines that:
-        # 1. Start with + or - (indicating changes)
-        # 2. Either contain "build:" directly, or
-        # 3. Are indented lines that could be under a build: section
-        if echo "$DIFF_OUTPUT" | grep -E '^[+-].*build:' > /dev/null; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-<<<<<<< HEAD
-        
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        # More sophisticated check: look for changes in build context
-        # Extract the full diff and check for build-related changes
-        BUILD_SECTION_CHANGED=$(echo "$DIFF_OUTPUT" | awk '
-            /^[+-].*build:/ { in_build=1; print; next }
-            /^[+-]/ && in_build && /^[+-][[:space:]]+/ { print; next }
-            /^[+-]/ && !/^[+-][[:space:]]/ { in_build=0 }
-            /^[+-].*build:/ { print }
-        ')
-<<<<<<< HEAD
-        
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-=======
-
-        # Check if any lines with changes (+ or -) contain build: or are indented under a build: section
-        # This regex looks for lines that:
-        # 1. Start with + or - (indicating changes)
-        # 2. Either contain "build:" directly, or
-        # 3. Are indented lines that could be under a build: section
-        if echo "$DIFF_OUTPUT" | grep -E '^[+-].*build:' > /dev/null; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-
-        # More sophisticated check: look for changes in build context
-        # Extract the full diff and check for build-related changes
-        BUILD_SECTION_CHANGED=$(echo "$DIFF_OUTPUT" | awk '
-            /^[+-].*build:/ { in_build=1; print; next }
-            /^[+-]/ && in_build && /^[+-][[:space:]]+/ { print; next }
-            /^[+-]/ && !/^[+-][[:space:]]/ { in_build=0 }
-            /^[+-].*build:/ { print }
-        ')
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        if [ -n "$BUILD_SECTION_CHANGED" ]; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-    done
-fi
-
-on_gh_pages=$([ "$(git rev-parse --abbrev-ref HEAD)" = "gh-pages" ] && echo true || echo false)
-
-# Trigger update if we're not on gh-pages and either Dockerfile changed or build section in docker-compose.yaml changed
-if [ "$on_gh_pages" = false ] && ([ -n "$DOCKERFILE_CHANGED" ] || [ "$COMPOSE_BUILD_CHANGED" = true ]); then
-    if [ -n "$DOCKERFILE_CHANGED" ]; then
-        echo "Dockerfile changed. Updating VERSION in .env file..."
-    fi
-    if [ "$COMPOSE_BUILD_CHANGED" = true ]; then
-        echo "docker-compose.yaml build configuration changed. Updating VERSION in .env file..."
-    fi
-<<<<<<< HEAD
-<<<<<<< HEAD
-    
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-    # Get the current commit hash (short version)
-    COMMIT_HASH=$(git rev-parse --short HEAD)
-
-    # Update the VERSION in .env file
-    if [ -f ".env" ]; then
-        # Check if VERSION line exists
-        if grep -q "^VERSION=" .env; then
-            # Replace the existing VERSION line and ensure comment is above it
-            # First, remove any existing auto-generated comment
-            sed -i '/^# auto-generated from git commit hash$/d' .env
-            # Add the comment above the VERSION line
-            sed -i '/^VERSION=/i\# auto-generated from git commit hash' .env
-            # Update the VERSION value
-            sed -i "s/^VERSION=.*$/VERSION=\"$COMMIT_HASH\"/" .env
-            echo "Updated VERSION to $COMMIT_HASH in .env file"
-
-            # Stage the modified .env file for commit
-            git add .env
-        else
-            echo "Error: VERSION line not found in .env file"
-            exit 1
-        fi
-    else
-        echo "Error: .env file not found"
-        exit 1
-    fi
-fi
-<<<<<<< HEAD
-<<<<<<< HEAD
-exit 0
-=======
-=======
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-exit 0
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
diff --git a/git-hooks/docker-versioning/update-docker-image-tag_BASE_3660135.pre-commit b/git-hooks/docker-versioning/update-docker-image-tag_BASE_3660135.pre-commit
deleted file mode 100644
index a7bd36a2b..000000000
--- a/git-hooks/docker-versioning/update-docker-image-tag_BASE_3660135.pre-commit
+++ /dev/null
@@ -1,42 +0,0 @@
-#!/bin/bash
-
-# Pre-commit hook to update VERSION in .env file with git commit hash
-# when Dockerfile or docker-compose.yaml files are modified
-
-# Check if any Dockerfile or docker-compose.yaml files are staged for commit
-DOCKER_FILES_CHANGED=$(git diff --cached --name-only | grep -E 'Dockerfile|docker-compose\.yaml$')
-
-on_gh_pages=$([ "$(git rev-parse --abbrev-ref HEAD)" = "gh-pages" ] && echo true || echo false)
-
-if [ "$on_gh_pages" = false ] && [ -n "$DOCKER_FILES_CHANGED" ]; then
-    echo "Docker-related files changed. Updating VERSION in .env file..."
-    
-    # Get the current commit hash (short version)
-    COMMIT_HASH=$(git rev-parse --short HEAD)
-    
-    # Update the VERSION in .env file
-    if [ -f ".env" ]; then
-        # Check if VERSION line exists
-        if grep -q "^VERSION=" .env; then
-            # Replace the existing VERSION line and ensure comment is above it
-            # First, remove any existing auto-generated comment
-            sed -i '/^# auto-generated from git commit hash$/d' .env
-            # Add the comment above the VERSION line
-            sed -i '/^VERSION=/i\# auto-generated from git commit hash' .env
-            # Update the VERSION value
-            sed -i "s/^VERSION=.*$/VERSION=\"$COMMIT_HASH\"/" .env
-            echo "Updated VERSION to $COMMIT_HASH in .env file"
-            
-            # Stage the modified .env file for commit
-            git add .env
-        else
-            echo "Error: VERSION line not found in .env file"
-            exit 1
-        fi
-    else
-        echo "Error: .env file not found"
-        exit 1
-    fi
-fi
-
-exit 0
diff --git a/git-hooks/docker-versioning/update-docker-image-tag_LOCAL_3660135.pre-commit b/git-hooks/docker-versioning/update-docker-image-tag_LOCAL_3660135.pre-commit
deleted file mode 100644
index 8cddc2b65..000000000
--- a/git-hooks/docker-versioning/update-docker-image-tag_LOCAL_3660135.pre-commit
+++ /dev/null
@@ -1,114 +0,0 @@
-<<<<<<< HEAD
-
-=======
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-#!/bin/bash
-# Pre-commit hook to update VERSION in .env file with git commit hash
-# when Dockerfile is modified or docker-compose.yaml has changes under build: key
-
-# Check if any Dockerfile files are staged for commit
-DOCKERFILE_CHANGED=$(git diff --cached --name-only | grep -E 'Dockerfile$')
-<<<<<<< HEAD
-
-# Check if docker-compose.yaml has changes under build: key
-COMPOSE_BUILD_CHANGED=false
-COMPOSE_FILES=$(git diff --cached --name-only | grep -E 'docker-compose\.yaml$')
-
-if [ -n "$COMPOSE_FILES" ]; then
-    for file in $COMPOSE_FILES; do
-        # Get the diff for the docker-compose.yaml file
-        DIFF_OUTPUT=$(git diff --cached "$file")
-        
-=======
-
-# Check if docker-compose.yaml has changes under build: key
-COMPOSE_BUILD_CHANGED=false
-COMPOSE_FILES=$(git diff --cached --name-only | grep -E 'docker-compose\.yaml$')
-
-if [ -n "$COMPOSE_FILES" ]; then
-    for file in $COMPOSE_FILES; do
-        # Get the diff for the docker-compose.yaml file
-        DIFF_OUTPUT=$(git diff --cached "$file")
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        # Check if any lines with changes (+ or -) contain build: or are indented under a build: section
-        # This regex looks for lines that:
-        # 1. Start with + or - (indicating changes)
-        # 2. Either contain "build:" directly, or
-        # 3. Are indented lines that could be under a build: section
-        if echo "$DIFF_OUTPUT" | grep -E '^[+-].*build:' > /dev/null; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-<<<<<<< HEAD
-        
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        # More sophisticated check: look for changes in build context
-        # Extract the full diff and check for build-related changes
-        BUILD_SECTION_CHANGED=$(echo "$DIFF_OUTPUT" | awk '
-            /^[+-].*build:/ { in_build=1; print; next }
-            /^[+-]/ && in_build && /^[+-][[:space:]]+/ { print; next }
-            /^[+-]/ && !/^[+-][[:space:]]/ { in_build=0 }
-            /^[+-].*build:/ { print }
-        ')
-<<<<<<< HEAD
-        
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-        if [ -n "$BUILD_SECTION_CHANGED" ]; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-    done
-fi
-
-on_gh_pages=$([ "$(git rev-parse --abbrev-ref HEAD)" = "gh-pages" ] && echo true || echo false)
-
-# Trigger update if we're not on gh-pages and either Dockerfile changed or build section in docker-compose.yaml changed
-if [ "$on_gh_pages" = false ] && ([ -n "$DOCKERFILE_CHANGED" ] || [ "$COMPOSE_BUILD_CHANGED" = true ]); then
-    if [ -n "$DOCKERFILE_CHANGED" ]; then
-        echo "Dockerfile changed. Updating VERSION in .env file..."
-    fi
-    if [ "$COMPOSE_BUILD_CHANGED" = true ]; then
-        echo "docker-compose.yaml build configuration changed. Updating VERSION in .env file..."
-    fi
-<<<<<<< HEAD
-    
-=======
-
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
-    # Get the current commit hash (short version)
-    COMMIT_HASH=$(git rev-parse --short HEAD)
-
-    # Update the VERSION in .env file
-    if [ -f ".env" ]; then
-        # Check if VERSION line exists
-        if grep -q "^VERSION=" .env; then
-            # Replace the existing VERSION line and ensure comment is above it
-            # First, remove any existing auto-generated comment
-            sed -i '/^# auto-generated from git commit hash$/d' .env
-            # Add the comment above the VERSION line
-            sed -i '/^VERSION=/i\# auto-generated from git commit hash' .env
-            # Update the VERSION value
-            sed -i "s/^VERSION=.*$/VERSION=\"$COMMIT_HASH\"/" .env
-            echo "Updated VERSION to $COMMIT_HASH in .env file"
-
-            # Stage the modified .env file for commit
-            git add .env
-        else
-            echo "Error: VERSION line not found in .env file"
-            exit 1
-        fi
-    else
-        echo "Error: .env file not found"
-        exit 1
-    fi
-fi
-<<<<<<< HEAD
-exit 0
-=======
-exit 0
->>>>>>> 9181923d... Only update image tag if docker-compose.yaml has a change to the 'build:' key
diff --git a/git-hooks/docker-versioning/update-docker-image-tag_REMOTE_3660135.pre-commit b/git-hooks/docker-versioning/update-docker-image-tag_REMOTE_3660135.pre-commit
deleted file mode 100644
index 656e19706..000000000
--- a/git-hooks/docker-versioning/update-docker-image-tag_REMOTE_3660135.pre-commit
+++ /dev/null
@@ -1,81 +0,0 @@
-#!/bin/bash
-# Pre-commit hook to update VERSION in .env file with git commit hash
-# when Dockerfile is modified or docker-compose.yaml has changes under build: key
-
-# Check if any Dockerfile files are staged for commit
-DOCKERFILE_CHANGED=$(git diff --cached --name-only | grep -E 'Dockerfile$')
-
-# Check if docker-compose.yaml has changes under build: key
-COMPOSE_BUILD_CHANGED=false
-COMPOSE_FILES=$(git diff --cached --name-only | grep -E 'docker-compose\.yaml$')
-
-if [ -n "$COMPOSE_FILES" ]; then
-    for file in $COMPOSE_FILES; do
-        # Get the diff for the docker-compose.yaml file
-        DIFF_OUTPUT=$(git diff --cached "$file")
-
-        # Check if any lines with changes (+ or -) contain build: or are indented under a build: section
-        # This regex looks for lines that:
-        # 1. Start with + or - (indicating changes)
-        # 2. Either contain "build:" directly, or
-        # 3. Are indented lines that could be under a build: section
-        if echo "$DIFF_OUTPUT" | grep -E '^[+-].*build:' > /dev/null; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-
-        # More sophisticated check: look for changes in build context
-        # Extract the full diff and check for build-related changes
-        BUILD_SECTION_CHANGED=$(echo "$DIFF_OUTPUT" | awk '
-            /^[+-].*build:/ { in_build=1; print; next }
-            /^[+-]/ && in_build && /^[+-][[:space:]]+/ { print; next }
-            /^[+-]/ && !/^[+-][[:space:]]/ { in_build=0 }
-            /^[+-].*build:/ { print }
-        ')
-
-        if [ -n "$BUILD_SECTION_CHANGED" ]; then
-            COMPOSE_BUILD_CHANGED=true
-            break
-        fi
-    done
-fi
-
-on_gh_pages=$([ "$(git rev-parse --abbrev-ref HEAD)" = "gh-pages" ] && echo true || echo false)
-
-# Trigger update if we're not on gh-pages and either Dockerfile changed or build section in docker-compose.yaml changed
-if [ "$on_gh_pages" = false ] && ([ -n "$DOCKERFILE_CHANGED" ] || [ "$COMPOSE_BUILD_CHANGED" = true ]); then
-    if [ -n "$DOCKERFILE_CHANGED" ]; then
-        echo "Dockerfile changed. Updating VERSION in .env file..."
-    fi
-    if [ "$COMPOSE_BUILD_CHANGED" = true ]; then
-        echo "docker-compose.yaml build configuration changed. Updating VERSION in .env file..."
-    fi
-
-    # Get the current commit hash (short version)
-    COMMIT_HASH=$(git rev-parse --short HEAD)
-
-    # Update the VERSION in .env file
-    if [ -f ".env" ]; then
-        # Check if VERSION line exists
-        if grep -q "^VERSION=" .env; then
-            # Replace the existing VERSION line and ensure comment is above it
-            # First, remove any existing auto-generated comment
-            sed -i '/^# auto-generated from git commit hash$/d' .env
-            # Add the comment above the VERSION line
-            sed -i '/^VERSION=/i\# auto-generated from git commit hash' .env
-            # Update the VERSION value
-            sed -i "s/^VERSION=.*$/VERSION=\"$COMMIT_HASH\"/" .env
-            echo "Updated VERSION to $COMMIT_HASH in .env file"
-
-            # Stage the modified .env file for commit
-            git add .env
-        else
-            echo "Error: VERSION line not found in .env file"
-            exit 1
-        fi
-    else
-        echo "Error: .env file not found"
-        exit 1
-    fi
-fi
-exit 0
diff --git a/mkdocs.yml b/mkdocs.yml
index 4dc57b961..80e516ded 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -69,6 +69,7 @@ nav:
               - Overview: docs/development/intermediate/testing/index.md
               - Unit Testing: docs/development/intermediate/testing/unit_testing.md
               - System Tests: tests/README.md
+              - Fixed-Trajectory Benchmark: docs/development/intermediate/testing/fixed_trajectory_testing.md
               - CI/CD Orchestrator: tests/ci-cd-orchestrator.md
           - Frame Conventions: docs/development/intermediate/frame_conventions.md
           - Docker Build Profiles: docs/development/intermediate/docker-build-profiles.md
diff --git a/robot/ros_ws/src/local/controls/trajectory_controller/src/trajectory_controller.cpp b/robot/ros_ws/src/local/controls/trajectory_controller/src/trajectory_controller.cpp
index ffc3fac2b..1dd389fe3 100644
--- a/robot/ros_ws/src/local/controls/trajectory_controller/src/trajectory_controller.cpp
+++ b/robot/ros_ws/src/local/controls/trajectory_controller/src/trajectory_controller.cpp
@@ -302,7 +302,18 @@ void TrajectoryControlNode::timer_callback() {
                     virtual_time, search_ahead_factor * get_sphere_radius(closest_ahead_wp.velocity().length()),
                     prev_vtp_time + look_ahead_time, robot_point, get_sphere_radius(closest_ahead_wp.velocity().length()),
                     min_virtual_tracking_velocity, &vtp_wp, &end_wp);
-                if (vtp_valid) current_virtual_ahead_time = vtp_wp.get_time() - virtual_time;
+                if (vtp_valid) {
+                    current_virtual_ahead_time = vtp_wp.get_time() - virtual_time;
+                } else {
+                    // Keep the tracking point ahead when sphere intersection fails so the
+                    // controller does not collapse onto the robot projection and stall.
+                    Waypoint ahead_wp(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+                    if (trajectory->get_waypoint_distance_ahead(
+                            virtual_time, get_sphere_radius(closest_ahead_wp.velocity().length()),
+                            &ahead_wp)) {
+                        current_virtual_ahead_time = ahead_wp.get_time() - virtual_time;
+                    }
+                }
 
                 // visualization
                 if (vtp_valid)
@@ -320,14 +331,17 @@ void TrajectoryControlNode::timer_callback() {
                     .add_sphere(target_frame, now,
                                 end_wp.get_x(), end_wp.get_y(), end_wp.get_z(), 0.025f)
                     .set_color(0.f, 0.f, 1.f);
-            } else{
-                RCLCPP_INFO(this->get_logger(), "AHEAD NOT VALID");
-		
-	        markers
-  		    .add_sphere(target_frame, now, robot_point.x(),
-		  	        robot_point.y(), robot_point.z(), 1.)
-		    .set_color(1.f, 0.f, 0.f, 0.7f);
-	    }
+            } else {
+                RCLCPP_WARN_THROTTLE(this->get_logger(), *this->get_clock(), 5000,
+                                     "AHEAD NOT VALID — advancing virtual_time by elapsed sim time");
+                virtual_time = std::min(trajectory->get_duration(),
+                                        virtual_time + time_multiplier * execute_elapsed);
+
+                markers
+                    .add_sphere(target_frame, now, robot_point.x(), robot_point.y(), robot_point.z(),
+                                1.)
+                    .set_color(1.f, 0.f, 0.f, 0.7f);
+            }
         } else {
             if (new_rewind) {
                 float before = virtual_time;
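Reviewer note on the two hunks above: the fallback order and the clamped time advance can be summarised in a small Python sketch. The function names and the use of `None` for "lookup failed" are hypothetical simplifications. The real controller works with `Waypoint` objects and boolean return values.

```python
from typing import Optional


def select_ahead_time(
    virtual_time: float,
    sphere_hit_time: Optional[float],
    distance_ahead_time: Optional[float],
    previous_ahead_time: float,
) -> float:
    """Pick the look-ahead offset, preferring the sphere intersection."""
    if sphere_hit_time is not None:
        # Normal case: the sphere intersection produced a tracking point.
        return sphere_hit_time - virtual_time
    if distance_ahead_time is not None:
        # Fallback: a point a fixed distance ahead along the trajectory.
        return distance_ahead_time - virtual_time
    # Neither lookup succeeded: keep the previous offset unchanged.
    return previous_ahead_time


def advance_virtual_time(
    virtual_time: float,
    duration: float,
    time_multiplier: float,
    elapsed: float,
) -> float:
    """Move virtual_time forward by the scaled elapsed time, never past the end."""
    return min(duration, virtual_time + time_multiplier * elapsed)
```

Clamping to the trajectory duration mirrors the `std::min(trajectory->get_duration(), ...)` call in the diff, so a long stall cannot push `virtual_time` beyond the final waypoint.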
diff --git a/robot/ros_ws/src/local/local_bringup/launch/local.launch.xml b/robot/ros_ws/src/local/local_bringup/launch/local.launch.xml
index fd46b9282..f80e09400 100644
--- a/robot/ros_ws/src/local/local_bringup/launch/local.launch.xml
+++ b/robot/ros_ws/src/local/local_bringup/launch/local.launch.xml
@@ -112,7 +112,7 @@
         <param name="look_ahead_time" value="1.0" />
         <param name="virtual_tracking_ahead_time" value="0.5" />
         <param name="min_virtual_tracking_velocity" value="0.5" />
-        <param name="sphere_radius" value="1.0" />
+        <param name="sphere_radius" value="2.0" />
         <param name="ff_min_velocity" value="0." />
         <param name="search_ahead_factor" value="1.5" />
         <param name="transition_velocity_scale" value="1.0" />
diff --git a/robot/ros_ws/src/local/local_bringup/launch/local_droan_cpu.launch.xml b/robot/ros_ws/src/local/local_bringup/launch/local_droan_cpu.launch.xml
index 045f6a08e..ca1285738 100644
--- a/robot/ros_ws/src/local/local_bringup/launch/local_droan_cpu.launch.xml
+++ b/robot/ros_ws/src/local/local_bringup/launch/local_droan_cpu.launch.xml
@@ -97,7 +97,7 @@
         <param name="look_ahead_time" value="1.0" />
         <param name="virtual_tracking_ahead_time" value="0.5" />
         <param name="min_virtual_tracking_velocity" value="0.5" />
-        <param name="sphere_radius" value="1.0" />
+        <param name="sphere_radius" value="2.0" />
         <param name="ff_min_velocity" value="0." />
         <param name="search_ahead_factor" value="1.5" />
         <param name="transition_velocity_scale" value="1.0" />
diff --git a/robot/ros_ws/src/local/planners/trajectory_library/src/trajectory_library.cpp b/robot/ros_ws/src/local/planners/trajectory_library/src/trajectory_library.cpp
index 5f35a379a..f3ac7bd60 100644
--- a/robot/ros_ws/src/local/planners/trajectory_library/src/trajectory_library.cpp
+++ b/robot/ros_ws/src/local/planners/trajectory_library/src/trajectory_library.cpp
@@ -390,6 +390,7 @@ bool Trajectory::merge(Trajectory traj, double min_time) {
     if (waypoints.size() == 0) {
         waypoints.insert(waypoints.end(), transformed_traj.waypoints.begin(),
                          transformed_traj.waypoints.end());
+        generate_waypoint_times();
         return true;
     }
 
@@ -501,15 +502,15 @@ bool Trajectory::get_waypoint_sphere_intersection(double initial_time, double ah
         Waypoint wp_end = waypoints[i];
 	last_waypoint_index = i;
 
-	// if the very first waypoint we check isn't within the sphere, then return not found
-	if(i == 1 && wp_start.position().distance(sphere_center) > sphere_radius)
-	  return false;
-
         // handle the case that the initial_time is between waypoint i-1 and waypoint i
         if (wp_start.get_time() < initial_time)
             wp_start = wp_start.interpolate(wp_end, (initial_time - wp_start.get_time()) /
                                                         (wp_end.get_time() - wp_start.get_time()));
 
+        // if the first segment we check starts outside the sphere, there is no intersection
+        if (i == 1 && wp_start.position().distance(sphere_center) > sphere_radius)
+            return false;
+
         // sphere line intersection equations:
         // http://www.ambrsoft.com/TrigoCalc/Sphere/SpherLineIntersection_.htm
         double x1 = wp_start.get_x();
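Reviewer note on the reordered check in this hunk: the minimal Python sketch below shows why testing the start point after interpolation matters. The `Waypoint` dataclass and `first_segment_starts_inside` helper are hypothetical stand-ins, not the `trajectory_library` API, and the sketch assumes linear interpolation in time between waypoints.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Waypoint:
    """Hypothetical stand-in: a timestamped 3D point."""
    t: float
    x: float
    y: float
    z: float


def interpolate(a: Waypoint, b: Waypoint, alpha: float) -> Waypoint:
    """Linearly blend a toward b; alpha=0 gives a, alpha=1 gives b."""
    return Waypoint(
        a.t + alpha * (b.t - a.t),
        a.x + alpha * (b.x - a.x),
        a.y + alpha * (b.y - a.y),
        a.z + alpha * (b.z - a.z),
    )


def first_segment_starts_inside(
    wp_start: Waypoint,
    wp_end: Waypoint,
    initial_time: float,
    center: Tuple[float, float, float],
    radius: float,
    interpolate_first: bool = True,
) -> bool:
    """Check whether the first segment's start point lies within the sphere.

    interpolate_first=True follows the new order (interpolate, then test);
    False reproduces the old order (test the raw segment start).
    """
    if interpolate_first and wp_start.t < initial_time:
        alpha = (initial_time - wp_start.t) / (wp_end.t - wp_start.t)
        wp_start = interpolate(wp_start, wp_end, alpha)
    distance = math.dist((wp_start.x, wp_start.y, wp_start.z), center)
    return distance <= radius


# A 10 m segment traversed in 10 s; the sphere is centred on the segment midpoint.
seg_start = Waypoint(t=0.0, x=0.0, y=0.0, z=0.0)
seg_end = Waypoint(t=10.0, x=10.0, y=0.0, z=0.0)
sphere_center = (5.0, 0.0, 0.0)

old_result = first_segment_starts_inside(
    seg_start, seg_end, 5.0, sphere_center, 2.0, interpolate_first=False)
new_result = first_segment_starts_inside(
    seg_start, seg_end, 5.0, sphere_center, 2.0, interpolate_first=True)
```

With `initial_time` at the segment midpoint, the old order measures from the raw start 5 m away and rejects the segment. The new order measures from the interpolated point at the sphere centre and accepts it.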
diff --git a/tests/README.md b/tests/README.md
index f10942ff7..b9c88999d 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -23,6 +23,7 @@ Shared fixtures live in `tests/conftest.py`. Use `airstack test -m unit -v` for
 | [`system/test_liveliness.py`](system/test_liveliness.py) | `liveliness` | Stack bring-up: container Running state, ``/clock`` readiness, tmux panes, sentinel ROS 2 nodes, compute snapshot, infra-only ``test_stable`` (tmux + nodes + compute) | Docker daemon, GPU, sim license |
 | [`system/test_sensors.py`](system/test_sensors.py) | `sensors` | After liveliness in collection order: sim + robot stereo/depth Hz (**Isaac:** batched ``ros2 topic hz`` to avoid bridge overload; **ms-airsim:** single batch), filtered LiDAR via ``echo --once`` + cloud sanity (isaacsim), sim RTF, ``test_sensor_streams_stable`` | Docker daemon, GPU, sim license |
 | [`system/test_takeoff_hover_land.py`](system/test_takeoff_hover_land.py) | `takeoff_hover_land` | End-to-end flight: PX4 readiness gate, takeoff to 10 m, hover stability, land — one chain per (sim, num_robots, iteration, velocity) | Docker daemon, GPU, sim license |
+| [`system/test_fixed_trajectory.py`](system/test_fixed_trajectory.py) | `autonomy` | Fixed-pattern trajectory evaluation: takeoff, execute a trajectory (Circle, Figure8, Racetrack, Line), record path deviation metrics, land — one chain per (sim, num_robots, iteration, trajectory_type) | Docker daemon, GPU, sim license |
 
 ### Unit tests (`tests/robot/`, `tests/sim/`)
 
@@ -43,11 +44,11 @@ See [Unit Testing Guide](../docs/development/intermediate/testing/unit_testing.m
 and the `add-unit-tests` agent skill for full details.
 
 Marks can be combined with pytest logic:
-`-m unit`, `-m "build_docker or build_packages"`, `-m liveliness`, `-m sensors`, `-m takeoff_hover_land`, or e.g. `-m "liveliness or sensors"` (see **Bring-up scope** below).
+`-m unit`, `-m "build_docker or build_packages"`, `-m liveliness`, `-m sensors`, `-m takeoff_hover_land`, `-m autonomy`, or e.g. `-m "liveliness or sensors"` (see **Bring-up scope** below).
 
 ### Bring-up scope (`airstack_env`)
 
-`airstack_env` is **class-scoped** and parametrized per `(sim, num_robots, iteration)`. Each test **class** that uses it (`TestLiveliness`, `TestSensors`, `TestTakeoffHoverLand`, …) performs its **own** ``airstack up`` / ``airstack down`` for that parametrization. Selecting both classes (for example, ``-m "liveliness or sensors"``) runs **two** full stack cycles per tuple (liveliness class, then sensors class). Collection order (see ``conftest.py``) runs **liveliness before sensors** when both are selected. To save wall time, run ``-m liveliness`` or ``-m sensors`` alone when one suite is enough.
+`airstack_env` is **class-scoped** and parametrized per `(sim, num_robots, iteration)`. Each test **class** that uses it (`TestLiveliness`, `TestSensors`, `TestTakeoffHoverLand`, `TestFixedTrajectory`, …) performs its **own** ``airstack up`` / ``airstack down`` for that parametrization. Selecting both classes (for example, ``-m "liveliness or sensors"``) runs **two** full stack cycles per tuple (liveliness class, then sensors class). Collection order (see ``conftest.py``) runs **liveliness before sensors** when both are selected. To save wall time, run ``-m liveliness`` or ``-m sensors`` alone when one suite is enough.
 
 ---
 
@@ -93,35 +94,23 @@ Writes custom metrics to `tests/results/<timestamp>/metrics.json` after each `re
 
 ### Output files
 
-Every test run produces a timestamped directory. **Per-test logs** — for each
-pytest function, `pytest_runtest_setup` in `conftest.py` attaches the shared
-logger to `logs/test_<module>.<Class>.<test>[<param-id>].log` (param ids are
-rewritten for readability, e.g. `msairsim-rob#1-iter0`; see
-`pytest_collection_modifyitems`).
-
-**`airstack_env.<…>.log`** — the class-scoped `airstack_env` fixture wraps
-`airstack up` / `airstack down` in `logger_to("airstack_env." + <current nodeid>)`
-(see `conftest.py`). So you get an extra file whose name is the word
-`airstack_env.` plus the **node id of whichever test was running when the
-fixture first ran** for that class. For `TestLiveliness` that is almost always
-`test_robot_containers_running` (first test in the class), not `test_stable`.
-That file holds compose / `airstack` subprocess output; each test still has its
-own log for assertions and `docker exec` / `ros2` lines.
+Every test run produces a timestamped directory containing only `summary.txt`,
+`results.xml`, and `metrics.json` — there is **no** `logs/` subdirectory and no
+per-test log files are written under the run directory.
 
 ```
 tests/results/
 └── 2025-04-21_14-30-00/
+    ├── summary.txt        # Human-readable key metrics — open this first
     ├── results.xml        # JUnit XML — test durations and pass/fail status
-    ├── metrics.json       # Custom metrics (image sizes, Hz, compute, timing)
-    └── logs/
-        ├── system.test_build_docker.TestDockerBuilds.test_build_robot_desktop.log
-        ├── airstack_env.system.test_liveliness.TestLiveliness.test_robot_containers_running[msairsim-rob#1-iter0].log
-        ├── system.test_liveliness.TestLiveliness.test_robot_containers_running[msairsim-rob#1-iter0].log
-        ├── system.test_liveliness.TestLiveliness.test_stable[msairsim-rob#1-iter0].log
-        ├── system.test_sensors.TestSensors.test_sensor_streams_stable[msairsim-rob#1-iter0].log
-        └── ...            # More per-test logs; another airstack_env.* per class using the fixture
+    └── metrics.json       # Custom metrics (image sizes, Hz, compute, timing)
 ```
 
+Live test output goes to the terminal (pytest `log_cli`). On failure, assertion
+messages include the tail of the most recent subprocess output, which
+`read_log_tail` returns from an in-memory buffer filled by the relevant
+`docker` / `ros2` command.
+
 ---
 
 ## Running Tests
@@ -175,7 +164,7 @@ can reach the host X server; it is a no-op when `DISPLAY` is not set.
 ### Prerequisites
 
 - Docker daemon running with your user in the `docker` group
-- NVIDIA drivers + `nvidia-container-toolkit` for liveliness, sensors, and takeoff_hover_land tests
+- NVIDIA drivers + `nvidia-container-toolkit` for liveliness, sensors, takeoff_hover_land, and autonomy tests
 - `airstack setup` completed (adds `airstack` to `PATH`)
 
 ### Direct pytest (for development / debugging)
@@ -282,6 +271,87 @@ airstack test -m takeoff_hover_land \
 
 ---
 
+## Fixed Trajectory Tests (`system/test_fixed_trajectory.py`)
+
+!!! note "Detailed guide"
+    For the full path-tracker benchmark documentation — architecture, metrics, CLI reference, comparing trackers, bug fixes, and baselines — see **[Fixed-Trajectory Path-Tracker Benchmark](../docs/development/intermediate/testing/fixed_trajectory_testing.md)**.
+
+`TestFixedTrajectory` runs a **4-phase flight chain** for every combination of
+`(sim, num_robots, iteration, trajectory_type)`. For each trajectory type the drone
+takes off, executes the pattern, then lands — regardless of whether the trajectory
+phase passes or fails (a trajectory failure does not skip landing).
+
+Supported trajectory types: `Circle`, `Figure8`, `Racetrack`, `Line` (same patterns as
+the `fixed_trajectory_task` ROS 2 action server in `trajectory_controller`).
+
+### Phase order
+
+| Phase | Test | What happens |
+| ----- | ---- | ------------ |
+| 1 | `test_px4_ready` | Waits for MAVROS + PX4 EKF ready; once per env |
+| 2 | `test_takeoff` | Takeoff to 10 m at 1 m/s; asserts altitude within 10 % |
+| 3 | `test_fixed_trajectory` | Sends `FixedTrajectoryTask`; captures odom; asserts cross-track error |
+| 4 | `test_landing` | Sends `LandTask`; asserts final altitude < 0.5 m |
+
+A failure in `test_fixed_trajectory` does **not** poison the chain — `test_landing` always
+runs so the drone returns to the ground before the next trajectory type starts.
+
+### Recorded metrics
+
+| Metric key | Unit | Description |
+| ---------- | ---- | ----------- |
+| `ready_duration_sys_s` | s | Wall-clock time from test start until PX4 ready |
+| `takeoff_duration_sim_s` | s | Sim-time from first motion to 95 % of target altitude |
+| `altitude_error_m` | m | Signed steady-state altitude error after takeoff |
+| `overshoot_m` | m | Unsigned transient overshoot above target |
+| `trajectory_success` | — | 1.0 if action returned `success: true`, 0.0 otherwise (`higher_is_better`) |
+| `trajectory_execution_time_sim_s` | s | Sim-time elapsed from action dispatch to completion |
+| `cross_track_error_mean_m` | m | Mean 2-D lateral distance from nearest ideal-path point |
+| `cross_track_error_max_m` | m | Worst-case lateral deviation |
+| `path_rmse_m` | m | 2-D RMSE against the ideal path |
+| `final_altitude_m` | m | Altitude at landing action completion |
+| `land_duration_sim_s` | s | Sim-time from 80 % peak descent to < 0.5 m |
+
+
+Metrics are reported in a single file, `summary.txt`, which is generated automatically once the run completes.
+
+### Default trajectory parameters
+
+| Type | Parameters |
+| ---- | ---------- |
+| Circle | radius=10 m, velocity=2 m/s |
+| Figure8 | length=15 m, width=8 m, height=0 m, velocity=2 m/s, max_acceleration=1 m/s² |
+| Racetrack | length=30 m, width=10 m, height=0 m, velocity=3 m/s, turn_velocity=1.5 m/s, max_acceleration=1 m/s² |
+| Line | length=20 m, height=0 m, velocity=2 m/s, max_acceleration=1 m/s² |
+
+### Running fixed trajectory tests
+
+```bash
+# All four trajectory types; ms-airsim; 1 robot
+airstack test -m autonomy \
+  --sim msairsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle,Figure8,Racetrack,Line \
+  -v
+
+# Circle only (quick check of the known failure case)
+airstack test -m autonomy \
+  --sim msairsim \
+  --num-robots 1 \
+  --stress-iterations 1 \
+  --trajectory-types Circle \
+  -v
+```
+
+### CLI option reference (trajectory-specific)
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--trajectory-types` | `Circle,Figure8,Racetrack,Line` | Comma-separated trajectory types to sweep |
+
+---
+
 ## Metrics Reporting (`parse_metrics.py`)
 
 [`parse_metrics.py`](parse_metrics.py) reads `results.xml` and `metrics.json` from a run directory and produces a markdown report. It has two modes:
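
Reviewer note: the README hunk above describes `cross_track_error_mean_m`, `cross_track_error_max_m`, and `path_rmse_m` as 2-D distances to the nearest ideal-path point. A rough sketch of that idea follows, assuming a densely sampled circular reference path and brute-force nearest-point search. The helper names and the sampling density are hypothetical, and the actual test may compute these metrics differently.

```python
import math


def ideal_circle(radius, n=720):
    """Densely sample a circle in the XY plane (hypothetical reference path)."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def cross_track_metrics(odom_xy, ideal_xy):
    """Distance from each odometry sample to its nearest reference point."""
    errors = [min(math.hypot(x - ix, y - iy) for ix, iy in ideal_xy)
              for x, y in odom_xy]
    return {
        "cross_track_error_mean_m": sum(errors) / len(errors),
        "cross_track_error_max_m": max(errors),
        "path_rmse_m": math.sqrt(sum(e * e for e in errors) / len(errors)),
    }
```

Brute-force search costs O(samples × reference points), which is acceptable for a single short flight. A spatial index would only matter for much longer logs.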
diff --git a/tests/conftest.py b/tests/conftest.py
index 31fd29076..2a51c569a 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -88,8 +88,9 @@ def colcon_test_robot_command(workspace="robot"):
 # `pytest tests/` and `airstack test -m unit` discover them without any
 # sys.path manipulation here.  Each proxy file sets up its own paths.
 RUN_DIR = None
-LOGS_DIR = None
 ROS_DISTRO_SETUP = "/opt/ros/jazzy/setup.bash"
+_LAST_CMD_OUTPUT: dict[str, str] = {}
+_DEFAULT_LOG_KEY = "_last"
 
 # Track the currently-running pytest item so current_log() and current_test_id()
 # can pick up the parametrize id without tests having to pass `request` around.
@@ -99,8 +100,6 @@ def colcon_test_robot_command(workspace="robot"):
 
 logger = logging.getLogger("airstack")
 logger.setLevel(logging.INFO)
-_LOG_FORMAT = logging.Formatter("[%(asctime)s] %(levelname)s %(message)s", "%H:%M:%S")
-_test_log_handler = None
 
 
 # ── pytest config / hooks ──────────────────────────────────────────────────
@@ -122,36 +121,42 @@ def pytest_addoption(parser):
     parser.addoption("--takeoff-velocities", default="0.5",
                      help="Comma-separated takeoff/land velocities (m/s) to "
                           "sweep in test_takeoff_hover_land. Default: 0.5,1,2")
+    parser.addoption("--trajectory-types", default="Circle,Figure8,Racetrack,Line",
+                     help="Comma-separated fixed trajectory types to sweep in "
+                          "test_fixed_trajectory. Default: Circle,Figure8,Racetrack,Line")
 
 
 def pytest_configure(config):
-    global RUN_DIR, LOGS_DIR
+    global RUN_DIR
     timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
     results_root = Path(AIRSTACK_ROOT) / "tests" / "results"
     RUN_DIR = results_root / timestamp
-    LOGS_DIR = RUN_DIR / "logs"
-    LOGS_DIR.mkdir(parents=True, exist_ok=True)
+    RUN_DIR.mkdir(parents=True, exist_ok=True)
     config.option.xmlpath = str(RUN_DIR / "results.xml")
 
 
 def pytest_runtest_setup(item):
-    global _CURRENT_ITEM, _test_log_handler
+    global _CURRENT_ITEM
     _CURRENT_ITEM = item
-    log_path = LOGS_DIR / f"{current_log()}.log"
-    _test_log_handler = logging.FileHandler(log_path)
-    _test_log_handler.setFormatter(_LOG_FORMAT)
-    logger.addHandler(_test_log_handler)
 
 
 def pytest_runtest_teardown(item):
-    global _CURRENT_ITEM, _test_log_handler
-    if _test_log_handler is not None:
-        logger.removeHandler(_test_log_handler)
-        _test_log_handler.close()
-        _test_log_handler = None
+    global _CURRENT_ITEM
     _CURRENT_ITEM = None
 
 
+def pytest_sessionfinish(session, exitstatus):
+    """Write summary.txt with key metrics so users don't need to dig through logs."""
+    if RUN_DIR is None:
+        return
+    try:
+        from run_summary import write_summary
+        summary_path = write_summary(RUN_DIR)
+        logger.info("Wrote run summary to %s", summary_path)
+    except Exception as exc:
+        logger.warning("Failed to write run summary: %s", exc)
+
+
 @pytest.hookimpl(hookwrapper=True)
 def pytest_runtest_makereport(item, call):
     """Attach phase reports to the item so fixtures can inspect pass/fail."""
@@ -162,21 +167,8 @@ def pytest_runtest_makereport(item, call):
 
 @contextmanager
 def logger_to(log_name):
-    """Temporarily route `logger` to a different file. Suspends any handlers
-    already attached so narration isn't duplicated across files."""
-    existing = list(logger.handlers)
-    for h in existing:
-        logger.removeHandler(h)
-    fh = logging.FileHandler(LOGS_DIR / f"{log_name}.log")
-    fh.setFormatter(_LOG_FORMAT)
-    logger.addHandler(fh)
-    try:
-        yield
-    finally:
-        logger.removeHandler(fh)
-        fh.close()
-        for h in existing:
-            logger.addHandler(h)
+    """No-op kept for fixture call sites; output goes to pytest log_cli only."""
+    yield
 
 
 def pytest_generate_tests(metafunc):
@@ -209,6 +201,7 @@ def pytest_generate_tests(metafunc):
     "system.test_liveliness",
     "system.test_sensors",
     "system.test_takeoff_hover_land",
+    "system.test_fixed_trajectory",
 ]
 
 # Within test_takeoff_hover_land, each (env, velocity) runs phases in this chain order.
@@ -219,6 +212,20 @@ def pytest_generate_tests(metafunc):
     "test_landing",
 ]
 
+# Within test_fixed_trajectory, each (env, trajectory_type) runs phases in this order.
+_FIXED_TRAJ_PHASE_ORDER = [
+    "test_px4_ready",
+    "test_takeoff",
+    "test_fixed_trajectory",
+    "test_landing",
+]
+
+# Maps module name → phase order list for per-module chain sorting.
+_MODULE_PHASE_ORDERS = {
+    "system.test_takeoff_hover_land": _AUTONOMY_PHASE_ORDER,
+    "system.test_fixed_trajectory": _FIXED_TRAJ_PHASE_ORDER,
+}
+
 
 def _rank(name, order):
     """Index of `name` in `order`; `len(order)` if unknown (i.e., sort last)."""
@@ -242,32 +249,34 @@ def pytest_collection_modifyitems(items):
     #    order intact, so pytest's default file/class order survives.
     items.sort(key=_module_key)
 
-    # 2. Within test_takeoff_hover_land: sort by (airstack_env, velocity, phase) so each
-    #    (sim, robots, iter) env brings up the stack once and the drone goes
-    #    ground→air→ground per velocity.
-    def phase(item):
-        if getattr(item.module, "__name__", "") != "system.test_takeoff_hover_land":
-            return None
-        name = item.originalname or item.name.split("[", 1)[0]
-        return _rank(name, _AUTONOMY_PHASE_ORDER)
+    # 2. Within each parametrized autonomy-style module, sort by
+    #    (airstack_env, secondary_param, phase) so each env brings up the stack
+    #    once and the drone goes ground→air→ground per secondary parameter.
+    for mod_name, phase_order in _MODULE_PHASE_ORDERS.items():
+        def _phase(item, _order=phase_order, _mod=mod_name):
+            if getattr(item.module, "__name__", "") != _mod:
+                return None
+            name = item.originalname or item.name.split("[", 1)[0]
+            return _rank(name, _order)
+
+        def _sort_key(item, _mod=mod_name):
+            cs = getattr(item, "callspec", None)
+            env = cs.params.get("airstack_env", ()) if cs else ()
+            # Tagged sweep key (velocity or trajectory type): the tag keeps str and
+            # float from being compared when an item has no `velocity` param.
+            has_vel = bool(cs) and "velocity" in cs.params
+            secondary = ((1, float(cs.params["velocity"])) if has_vel
+                         else (0, str(cs.params.get("trajectory_type", "")) if cs else ""))
+            return (env, secondary, _phase(item))
 
-    def sort_key(item):
-        cs = getattr(item, "callspec", None)
-        env = cs.params.get("airstack_env", ()) if cs else ()
-        vel = float(cs.params.get("velocity", 0.0)) if cs else 0.0
-        return (env, vel, phase(item))
-
-    slots = [(i, it) for i, it in enumerate(items) if phase(it) is not None]
-    if slots:
-        sorted_items = sorted((it for _, it in slots), key=sort_key)
-        for (i, _), new_item in zip(slots, sorted_items):
-            items[i] = new_item
-
-    # 3. Rewrite bracketed test IDs into a consistent hierarchy: sim > robots >
-    # velocity > iteration. Bypasses pytest's own concatenation (which would
-    # otherwise order by reverse-parametrize-call order). Keeps pytest console,
-    # JUnit XML, and metrics.json all in the same natural order without
-    # refactoring the parametrize structure.
+        slots = [(i, it) for i, it in enumerate(items) if _phase(it) is not None]
+        if slots:
+            sorted_items = sorted((it for _, it in slots), key=_sort_key)
+            for (i, _), new_item in zip(slots, sorted_items):
+                items[i] = new_item
+
+    # 3. Rewrite bracketed test IDs into a consistent hierarchy:
+    #    sim > robots > secondary param > iteration.
     for item in items:
         cs = getattr(item, "callspec", None)
         if cs is None:
@@ -279,6 +288,8 @@ def sort_key(item):
             parts.append(f"{sim}-rob#{n}")
         if "velocity" in cs.params:
             parts.append(f"v{cs.params['velocity']}")
+        if "trajectory_type" in cs.params:
+            parts.append(f"traj{cs.params['trajectory_type']}")
         if env:
             parts.append(f"iter{i}")
         if not parts:
@@ -310,31 +321,26 @@ def current_log():
 
 
 def read_log_tail(log_name=None, lines=50):
-    log_name = log_name or current_log()
-    if not log_name:
+    """Return the tail of the most recent subprocess output for this context."""
+    key = log_name or _DEFAULT_LOG_KEY
+    text = _LAST_CMD_OUTPUT.get(key) or _LAST_CMD_OUTPUT.get(_DEFAULT_LOG_KEY, "")
+    if not text:
         return ""
-    log_path = LOGS_DIR / f"{log_name}.log"
-    if log_path.exists():
-        all_lines = log_path.read_text().splitlines()
-        return "\n".join(all_lines[-lines:])
-    return ""
+    return "\n".join(text.splitlines()[-lines:])
 
 
 def _run_teed(cmd_list, timeout, log_name=None, env=None, cwd=None):
-    """Run a subprocess, teeing stdout+stderr live to the log file and
-    capturing them for parsing."""
-    log_name = log_name or current_log()
-    if not log_name:
-        return subprocess.run(cmd_list, capture_output=True, text=True,
-                              timeout=timeout, env=env, cwd=cwd)
-    log_path = LOGS_DIR / f"{log_name}.log"
+    """Run a subprocess and capture stdout+stderr for parsing and failure messages."""
     quoted = " ".join(shlex.quote(a) for a in cmd_list)
-    with open(log_path, "a") as f:
-        f.write(f"\n$ {quoted}\n")
-    shell_cmd = f"set -o pipefail; {quoted} 2>&1 | tee -a {shlex.quote(str(log_path))}"
-    return subprocess.run(["bash", "-c", shell_cmd],
-                          capture_output=True, text=True,
-                          timeout=timeout, env=env, cwd=cwd)
+    logger.info("$ %s", quoted)
+    result = subprocess.run(
+        cmd_list, capture_output=True, text=True, timeout=timeout, env=env, cwd=cwd,
+    )
+    combined = (result.stdout or "") + (result.stderr or "")
+    key = log_name or _DEFAULT_LOG_KEY
+    _LAST_CMD_OUTPUT[key] = combined
+    _LAST_CMD_OUTPUT[_DEFAULT_LOG_KEY] = combined
+    return result
 
 
 def docker_exec(container, cmd, timeout=60, log_name=None):
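
Reviewer note: the `pytest_collection_modifyitems` hunk above reorders only the items of one module while leaving every other item at its original index. A minimal, framework-free sketch of that "sort a subset in place" pattern follows. The function name and the string items are illustrative assumptions, not part of `conftest.py`.

```python
def sort_subset_in_place(items, selected, key):
    """Sort only the items matching `selected`; all other positions are untouched."""
    slots = [i for i, it in enumerate(items) if selected(it)]
    ordered = sorted((items[i] for i in slots), key=key)
    # Write the sorted items back into the same index slots they came from.
    for i, new_item in zip(slots, ordered):
        items[i] = new_item
    return items
```

Because unselected items never move, running this once per module (as the hunk does in its loop) cannot disturb the ordering already established for other modules.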
diff --git a/tests/pytest.ini b/tests/pytest.ini
index a664ccbf3..03fee8c32 100644
--- a/tests/pytest.ini
+++ b/tests/pytest.ini
@@ -6,6 +6,7 @@ markers =
     liveliness: Container and process health (Docker, tmux, sentinel ROS 2 nodes)
     sensors: Sim and robot sensor topic rates, LiDAR validation, sim RTF
     takeoff_hover_land: End-to-end takeoff / hover / land action tests
+    autonomy: Fixed-pattern trajectory path-tracker benchmark (test_fixed_trajectory.py)
 testpaths = .
 addopts = -v --durations=0
 cache_dir = /tmp/.pytest_cache
diff --git a/tests/run_summary.py b/tests/run_summary.py
new file mode 100644
index 000000000..733cecf03
--- /dev/null
+++ b/tests/run_summary.py
@@ -0,0 +1,360 @@
+"""Write a human-readable summary.txt for each test run.
+
+Called automatically at pytest session end (see conftest.py). Users can also
+regenerate manually:
+
+    python3 tests/run_summary.py tests/results/<run-dir>/
+"""
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import statistics
+import xml.etree.ElementTree as ET
+from pathlib import Path
+
+PARAM_RE = re.compile(r"\[(.+)\]$")
+ITER_RE = re.compile(r"-iter\d+$")
+ROBOT_METRIC_RE = re.compile(r"^robot_\d+\.(.+)$")
+
+# Ordered (metric_key, label) groups per test module. Only scalar metrics with
+# a numeric "value" field are emitted.
+FLIGHT_METRICS = [
+    ("ready_duration_sys_s", "PX4 ready time"),
+    ("takeoff_duration_sim_s", "Takeoff duration"),
+    ("altitude_error_m", "Altitude error after takeoff"),
+    ("overshoot_m", "Takeoff overshoot"),
+    ("trajectory_success", "Trajectory success"),
+    ("trajectory_execution_time_sim_s", "Trajectory duration"),
+    ("cross_track_error_mean_m", "Cross-track error (mean)"),
+    ("cross_track_error_max_m", "Cross-track error (max)"),
+    ("path_rmse_m", "Path RMSE"),
+    ("hover_duration_sim_s", "Hover duration"),
+    ("hover_altitude_error_m", "Hover altitude error"),
+    ("land_duration_sim_s", "Landing duration"),
+    ("final_altitude_m", "Final altitude"),
+]
+
+LIVELINESS_METRICS = [
+    ("sim_ready_duration_s", "Sim ready time"),
+    ("sensors_sim_ready_duration_s", "Sensors sim ready time"),
+]
+
+# Some metrics were recorded with wrong units before METRIC_UNITS was updated.
+UNIT_OVERRIDES = {
+    "ready_duration_sys_s": "s",
+    "airstack_up_duration_s": "s",
+    "airstack_down_duration_s": "s",
+}
+
+PHASE_ORDER = {
+    "test_px4_ready": 0,
+    "test_takeoff": 1,
+    "test_fixed_trajectory": 2,
+    "test_hover": 2,
+    "test_landing": 3,
+    "test_land": 3,
+}
+
+
+def _parse_results_xml(path: Path) -> tuple[dict[str, str], dict[str, float]]:
+    """Return ({full_test_name: status}, {full_test_name: wall_time_s})."""
+    if not path.exists():
+        return {}, {}
+    statuses: dict[str, str] = {}
+    durations: dict[str, float] = {}
+    for tc in ET.parse(path).iter("testcase"):
+        full = f"{tc.get('classname')}.{tc.get('name')}"
+        if tc.find("failure") is not None or tc.find("error") is not None:
+            statuses[full] = "FAILED"
+        elif tc.find("skipped") is not None:
+            statuses[full] = "SKIPPED"
+        else:
+            statuses[full] = "PASSED"
+        if tc.get("time"):
+            try:
+                durations[full] = float(tc.get("time"))
+            except ValueError:
+                pass
+    return statuses, durations
+
+
+def _load_metrics(path: Path) -> dict:
+    if not path.exists():
+        return {}
+    return json.loads(path.read_text())
+
+
+def _param_id(test_name: str) -> str:
+    m = PARAM_RE.search(test_name)
+    return m.group(1) if m else test_name
+
+
+def _module_name(test_name: str) -> str:
+    return test_name.split(".", 1)[0]
+
+
+def _phase_name(test_name: str) -> str:
+    """test_fixed_trajectory.TestFixedTrajectory.test_takeoff[...] -> test_takeoff"""
+    parts = test_name.split(".")
+    if len(parts) >= 3:
+        phase = parts[2]
+        return phase.split("[", 1)[0]
+    return test_name
+
+
+def _base_param_id(param: str) -> str:
+    """isaacsim-rob#1-trajCircle-iter3 -> isaacsim-rob#1-trajCircle"""
+    return ITER_RE.sub("", param)
+
+
+def _format_scalar(key: str, value: float | int, unit: str) -> str:
+    if key == "trajectory_success":
+        if value == 1.0:
+            return "yes"
+        if value == 0.0:
+            return "no"
+    text = f"{value:g}"
+    return f"{text} {unit}".strip() if unit else text
+
+
+def _format_value(key: str, entry: dict) -> str:
+    value = entry.get("value")
+    unit = UNIT_OVERRIDES.get(key, entry.get("unit", ""))
+    if isinstance(value, (int, float)):
+        return _format_scalar(key, value, unit)
+    if value is None:
+        return "n/a"
+    return str(value)
+
+
+def _format_aggregated(key: str, values: list[float], unit: str) -> str:
+    if not values:
+        return "n/a"
+    if key == "trajectory_success":
+        passed = sum(1 for v in values if v >= 1.0)
+        return f"{passed}/{len(values)} passed"
+    mean = statistics.mean(values)
+    if len(values) == 1:
+        return _format_scalar(key, round(mean, 3), unit)
+    std = statistics.pstdev(values)
+    base = _format_scalar(key, round(mean, 3), unit)
+    return f"{base} ± {std:.3g} {unit}".strip() if unit else f"{base} ± {std:.3g} (n={len(values)})"
+
+
+def _collect_scalar_metrics(metrics_blob: dict) -> dict[str, dict]:
+    """Flatten per-test metrics.json into {metric_key: entry}."""
+    out: dict[str, dict] = {}
+    for key, entry in metrics_blob.items():
+        if not isinstance(entry, dict) or "value" not in entry:
+            continue
+        m = ROBOT_METRIC_RE.match(key)
+        metric_key = m.group(1) if m else key
+        out[metric_key] = entry
+    return out
+
+
+def _aggregate_metrics(
+    test_names: list[str],
+    metrics: dict,
+    schema: list[tuple[str, str]],
+) -> dict[str, list[float]]:
+    """Collect numeric metric values across all test phases / iterations."""
+    buckets: dict[str, list[float]] = {key: [] for key, _ in schema}
+    for name in test_names:
+        for metric_key, entry in _collect_scalar_metrics(metrics.get(name, {})).items():
+            if metric_key not in buckets:
+                continue
+            value = entry.get("value")
+            if isinstance(value, (int, float)):
+                buckets[metric_key].append(float(value))
+    return buckets
+
+
+def _chain_title(module: str, param: str) -> str:
+    if module == "test_fixed_trajectory":
+        traj = re.search(r"traj(\w+)", param)
+        traj_label = traj.group(1) if traj else "trajectory"
+        sim = param.split("-", 1)[0]
+        robots = re.search(r"rob#(\d+)", param)
+        n_robots = robots.group(1) if robots else "?"
+        return f"{traj_label} | {sim} | {n_robots} robot(s)"
+    if module == "test_takeoff_hover_land":
+        vel = re.search(r"v([\d.]+)", param)
+        vel_label = f"{vel.group(1)} m/s" if vel else param
+        sim = param.split("-", 1)[0]
+        return f"takeoff-hover-land @ {vel_label} | {sim}"
+    return param
+
+
+def _metric_schema(module: str) -> list[tuple[str, str]]:
+    if module in ("test_fixed_trajectory", "test_takeoff_hover_land"):
+        if module == "test_takeoff_hover_land":
+            return [m for m in FLIGHT_METRICS if m[0] != "trajectory_success"
+                    and not m[0].startswith("cross_track")
+                    and m[0] != "path_rmse_m"
+                    and m[0] != "trajectory_execution_time_sim_s"]
+        return FLIGHT_METRICS
+    if module in ("test_liveliness", "test_sensors"):
+        return LIVELINESS_METRICS
+    return []
+
+
+def _group_tests(
+    metrics: dict,
+    statuses: dict[str, str],
+    durations: dict[str, float],
+) -> dict[tuple[str, str], list[str]]:
+    """Group full test names by (module, base_param_id) across stress iterations."""
+    groups: dict[tuple[str, str], list[str]] = {}
+    all_names = set(metrics) | set(statuses)
+    for name in sorted(all_names):
+        module = _module_name(name)
+        param = _base_param_id(_param_id(name))
+        groups.setdefault((module, param), []).append(name)
+    for names in groups.values():
+        names.sort(key=lambda n: (
+            int(ITER_RE.search(_param_id(n)).group(0).replace("-iter", ""))
+            if ITER_RE.search(_param_id(n)) else 0,
+            PHASE_ORDER.get(_phase_name(n), 99),
+        ))
+    return groups
+
+
+def _iteration_count(test_names: list[str]) -> int:
+    iters = set()
+    for name in test_names:
+        m = ITER_RE.search(_param_id(name))
+        if m:
+            iters.add(m.group(0))
+    return len(iters) or 1
+
+
+def _chain_status(test_names: list[str], statuses: dict[str, str]) -> str:
+    n_iter = _iteration_count(test_names)
+    if n_iter > 1:
+        landing_phases = [n for n in test_names if _phase_name(n) in ("test_landing", "test_land")]
+        check = landing_phases or test_names
+        passed = sum(1 for n in check if statuses.get(n) == "PASSED")
+        total = len(check)
+        return f"{passed}/{total} flight cycles passed ({n_iter} iterations)"
+    if any(statuses.get(n) == "FAILED" for n in test_names):
+        return "FAILED"
+    if test_names and all(statuses.get(n) == "PASSED" for n in test_names):
+        return "PASSED"
+    if any(statuses.get(n) == "SKIPPED" for n in test_names):
+        return "SKIPPED"
+    return "UNKNOWN"
+
+
+def build_summary_lines(run_dir: Path) -> list[str]:
+    metrics_path = run_dir / "metrics.json"
+    results_path = run_dir / "results.xml"
+    statuses, durations = _parse_results_xml(results_path)
+    metrics = _load_metrics(metrics_path)
+
+    passed = sum(1 for s in statuses.values() if s == "PASSED")
+    failed = sum(1 for s in statuses.values() if s == "FAILED")
+    skipped = sum(1 for s in statuses.values() if s == "SKIPPED")
+    total = len(statuses)
+
+    lines = [
+        "AirStack Test Run Summary",
+        f"Run directory: {run_dir.name}",
+        f"Overall: {passed} passed, {failed} failed, {skipped} skipped ({total} tests)",
+        "",
+    ]
+
+    groups = _group_tests(metrics, statuses, durations)
+    if not groups:
+        lines.append("No metrics or test results recorded for this run.")
+        return lines
+
+    for (module, param), test_names in sorted(groups.items()):
+        title = _chain_title(module, param)
+        chain_status = _chain_status(test_names, statuses)
+        lines.append(f"── {title} ──")
+        lines.append(f"Result: {chain_status}")
+        lines.append("")
+
+        schema = _metric_schema(module)
+        aggregated = _aggregate_metrics(test_names, metrics, schema)
+        n_iter = _iteration_count(test_names)
+        emitted = False
+        for metric_key, label in schema:
+            values = aggregated.get(metric_key, [])
+            if not values:
+                continue
+            unit = UNIT_OVERRIDES.get(
+                metric_key,
+                next(
+                    (e.get("unit", "") for name in test_names
+                     for k, e in _collect_scalar_metrics(metrics.get(name, {})).items()
+                     if k == metric_key and isinstance(e, dict)),
+                    "",
+                ),
+            )
+            if n_iter > 1:
+                lines.append(f"{label}: {_format_aggregated(metric_key, values, unit)}")
+            else:
+                entry = {"value": values[-1], "unit": unit}
+                lines.append(f"{label}: {_format_value(metric_key, entry)}")
+            emitted = True
+
+        if not emitted:
+            lines.append("(no key metrics recorded)")
+
+        if n_iter > 1:
+            lines.append("")
+            lines.append(f"Aggregated over {n_iter} stress iterations (mean ± stddev).")
+
+        # Phase wall times help debugging without opening results.xml.
+        phase_wall: dict[str, list[float]] = {}
+        for name in test_names:
+            phase = _phase_name(name)
+            wall = durations.get(name)
+            if wall is not None:
+                phase_wall.setdefault(phase, []).append(wall)
+        if phase_wall:
+            lines.append("")
+            lines.append("Phase wall times:")
+            for phase, walls in sorted(phase_wall.items(), key=lambda x: PHASE_ORDER.get(x[0], 99)):
+                if n_iter > 1 and len(walls) > 1:
+                    mean = statistics.mean(walls)
+                    std = statistics.pstdev(walls)
+                    lines.append(f"  {phase}: {mean:.1f}s ± {std:.1f}s (n={len(walls)})")
+                else:
+                    status = statuses.get(
+                        next((n for n in test_names if _phase_name(n) == phase), ""),
+                        "?",
+                    )
+                    lines.append(f"  {phase}: {walls[0]:.1f}s ({status})")
+
+        lines.append("")
+
+    # Trim trailing blank line
+    if lines and lines[-1] == "":
+        lines.pop()
+    return lines
+
+
+def write_summary(run_dir: Path) -> Path:
+    run_dir = Path(run_dir)
+    out_path = run_dir / "summary.txt"
+    lines = build_summary_lines(run_dir)
+    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
+    return out_path
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description="Generate summary.txt for a test run")
+    parser.add_argument("run_dir", type=Path, help="Path to tests/results/<timestamp>/")
+    args = parser.parse_args()
+    out = write_summary(args.run_dir)
+    print(out.read_text(encoding="utf-8"))
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/tests/system/test_fixed_trajectory.py b/tests/system/test_fixed_trajectory.py
new file mode 100644
index 000000000..169574cf2
--- /dev/null
+++ b/tests/system/test_fixed_trajectory.py
@@ -0,0 +1,678 @@
+"""Fixed-trajectory performance tests.
+
+Per (sim, num_robots, iter, trajectory_type): ready → takeoff → execute trajectory → land.
+
+The drone takes off to TARGET_ALTITUDE_M, executes one fixed-pattern trajectory
+(Circle, Figure8, Racetrack, or Line), then lands. Odometry is captured throughout
+the trajectory phase and compared against an ideal reference path (generated in Python
+from the same equations as fixed_trajectory_task.cpp) to measure cross-track error.
+
+Each trajectory type is an independent full-cycle test so failures in one type do not
+prevent the remaining types from running — the drone always returns to the ground at
+the end of each cycle via the landing phase.
+"""
+
+import math
+import statistics
+import subprocess
+import time
+from concurrent.futures import ThreadPoolExecutor
+from io import StringIO
+from pathlib import Path
+
+import pandas as pd
+import pytest
+
+from conftest import (
+    ROS_DISTRO_SETUP,
+    current_test_id,
+    get_metrics,
+    get_robot_containers,
+    logger,
+    ros2_exec,
+)
+
+# ── constants ─────────────────────────────────────────────────────────────
+
+TARGET_ALTITUDE_M = 10.0
+PX4_READY_TIMEOUT_S = 300.0
+PX4_POLL_INTERVAL_S = 2.0
+TAKEOFF_MOTION_THRESHOLD_M = 0.3   # z rise above starting z to count as "moving"
+SETTLING_WINDOW_S = 1.0            # trailing window for steady-state altitude check
+MAX_GT_MATCH_AGE_S = 0.1
+
+# Cross-track tolerance is intentionally loose: we know the circle trajectory
+# currently fails, so the assertion documents the failure without blocking landing.
+CROSS_TRACK_TOLERANCE_M = 5.0
+
+# Generous timeout covers the full trajectory execution at low velocity in slow sims.
+TRAJ_EXEC_TIMEOUT_S = 180.0
+
+# Odom CSV schema (ros2 topic echo --csv flattens all primitives in declaration order).
+ODOM_SCHEMA = (
+    ["header.stamp.sec", "header.stamp.nanosec",
+     "header.frame_id", "child_frame_id",
+     "pose.pose.position.x", "pose.pose.position.y", "pose.pose.position.z",
+     "pose.pose.orientation.x", "pose.pose.orientation.y",
+     "pose.pose.orientation.z", "pose.pose.orientation.w"]
+    + [f"pose.covariance[{i}]" for i in range(36)]
+    + ["twist.twist.linear.x", "twist.twist.linear.y", "twist.twist.linear.z",
+       "twist.twist.angular.x", "twist.twist.angular.y", "twist.twist.angular.z"]
+    + [f"twist.covariance[{i}]" for i in range(36)]
+)
+
+METRIC_UNITS = {
+    "ready_duration_sys_s": "s",
+    "trajectory_execution_time_sim_s": "s",
+    "takeoff_duration_sim_s": "s",
+    "land_duration_sim_s": "s",
+    "trajectory_success": "",
+    # Everything else defaults to "m".
+}
+
+# ── default trajectory parameters ────────────────────────────────────────
+
+# These mirror the attributes consumed by fixed_trajectory_task.cpp.
+# frame_id is omitted — the action server defaults to "base_link".
+TRAJECTORY_CONFIGS: dict[str, dict[str, str]] = {
+    "Circle": {
+        "radius": "10.0",
+        "velocity": "2.0",
+    },
+    "Figure8": {
+        "length": "15.0",
+        "width": "8.0",
+        "height": "0.0",
+        "velocity": "2.0",
+        "max_acceleration": "1.0",
+    },
+    "Racetrack": {
+        "length": "30.0",
+        "width": "10.0",
+        "height": "0.0",
+        "velocity": "3.0",
+        "turn_velocity": "1.5",
+        "max_acceleration": "1.0",
+    },
+    "Line": {
+        "length": "20.0",
+        "height": "0.0",
+        "velocity": "2.0",
+        "max_acceleration": "1.0",
+    },
+}
+
+
+# ── pytest hooks ──────────────────────────────────────────────────────────
+
+def pytest_generate_tests(metafunc):
+    """Parametrize tests that request `trajectory_type` from --trajectory-types."""
+    if "trajectory_type" in metafunc.fixturenames:
+        raw = metafunc.config.getoption("--trajectory-types")
+        types = [t.strip() for t in raw.split(",") if t.strip()]
+        metafunc.parametrize("trajectory_type", types, ids=types)
+
+
+# ── ideal-path generators (Python mirrors of fixed_trajectory_task.cpp) ──
+
+def _ideal_circle(radius: float) -> list[tuple[float, float, float]]:
+    """Circle waypoints in base_link frame matching generate_circle() in C++."""
+    pts: list[tuple[float, float, float]] = []
+    pts.append((0.0, 0.0, 0.0))
+    pts.append((radius, 0.0, 0.0))
+    angle = 0.0
+    step = 10.0 * math.pi / 180.0
+    while angle < 2.0 * math.pi:
+        pts.append((radius * math.cos(angle), radius * math.sin(angle), 0.0))
+        angle += step
+    pts.append((radius, 0.0, 0.0))
+    pts.append((0.0, 0.0, 0.0))
+    return pts
+
+
+def _ideal_figure8(length: float, width: float, height: float) -> list[tuple[float, float, float]]:
+    """Figure-8 waypoints in base_link frame matching generate_figure8() in C++."""
+    n = 600
+    pts: list[tuple[float, float, float]] = []
+    for i in range(n - 1):
+        t = 2.0 * math.pi * i / n
+        x = math.cos(t) * length - length
+        y = math.cos(t) * math.sin(t) * 2.0 * width
+        pts.append((x, y, height))
+    return pts
+
+
+def _ideal_racetrack(length: float, width: float, height: float) -> list[tuple[float, float, float]]:
+    """Racetrack waypoints in base_link frame matching generate_racetrack() in C++."""
+    sl = length - width
+    pts: list[tuple[float, float, float]] = []
+
+    for i in range(80):
+        x = sl * i / 79.0
+        pts.append((x, 0.0, height))
+
+    turn_n = 48
+    for i in range(1, turn_n + 1):
+        t = -math.pi / 2.0 + math.pi * i / (turn_n + 1)
+        x = width / 2.0 * math.cos(t) + sl
+        y = width / 2.0 * math.sin(t) + width / 2.0
+        pts.append((x, y, height))
+
+    for i in range(80):
+        x = sl * (1.0 - i / 79.0)
+        pts.append((x, width, height))
+
+    for i in range(1, turn_n + 1):
+        t = math.pi / 2.0 + math.pi * i / (turn_n + 1)
+        x = width / 2.0 * math.cos(t)
+        y = width / 2.0 * math.sin(t) + width / 2.0
+        pts.append((x, y, height))
+
+    return pts
+
+
+def _ideal_line(length: float, height: float) -> list[tuple[float, float, float]]:
+    """Line waypoints in base_link frame matching generate_line() in C++.
+
+    C++ iterates `y` from 0 down to -length in steps of 0.5 and sets x = -y,
+    so the drone moves along +x from 0 to length.
+    """
+    pts: list[tuple[float, float, float]] = []
+    y = 0.0
+    while y > -length:
+        pts.append((-y, 0.0, height))
+        y -= 0.5
+    return pts
+
+
+def _generate_ideal_path(traj_type: str, config: dict[str, str]) -> list[tuple[float, float, float]]:
+    """Dispatch to the correct ideal-path generator."""
+    if traj_type == "Circle":
+        return _ideal_circle(float(config["radius"]))
+    if traj_type == "Figure8":
+        return _ideal_figure8(float(config["length"]), float(config["width"]),
+                               float(config.get("height", "0")))
+    if traj_type == "Racetrack":
+        return _ideal_racetrack(float(config["length"]), float(config["width"]),
+                                 float(config.get("height", "0")))
+    if traj_type == "Line":
+        return _ideal_line(float(config["length"]), float(config.get("height", "0")))
+    return []
+
+
+# ── geometry helpers ───────────────────────────────────────────────────────
+
+def _quat_to_yaw(qx: float, qy: float, qz: float, qw: float) -> float:
+    """Extract yaw (heading) from a unit quaternion."""
+    return math.atan2(2.0 * (qw * qz + qx * qy),
+                      1.0 - 2.0 * (qy * qy + qz * qz))
+
+
+def _transform_to_world(
+    base_link_pts: list[tuple[float, float, float]],
+    x0: float, y0: float, z0: float, yaw0: float,
+) -> list[tuple[float, float, float]]:
+    """Rotate+translate base_link-frame points into the world frame.
+
+    The trajectory controller publishes the trajectory in base_link at the
+    moment of dispatch, so the reference frame origin is (x0, y0, z0) with
+    heading yaw0.
+    """
+    cos_y = math.cos(yaw0)
+    sin_y = math.sin(yaw0)
+    world_pts: list[tuple[float, float, float]] = []
+    for lx, ly, lz in base_link_pts:
+        wx = x0 + lx * cos_y - ly * sin_y
+        wy = y0 + lx * sin_y + ly * cos_y
+        wz = z0 + lz
+        world_pts.append((wx, wy, wz))
+    return world_pts
+
+
+# ── metric computations ───────────────────────────────────────────────────
+
+def _cross_track_metrics(
+    odom_rows: list[dict],
+    ideal_world_pts: list[tuple[float, float, float]],
+) -> dict:
+    """Cross-track error statistics: mean, max, and RMSE against ideal path.
+
+    Error is the XY distance from each odom sample to the nearest ideal waypoint
+    (trajectories are flat; altitude hold is covered by the takeoff/hover tests).
+    """
+    if not odom_rows or not ideal_world_pts:
+        return {}
+
+    ideal_xy = [(px, py) for px, py, _ in ideal_world_pts]
+    sq_dists: list[float] = []
+    for row in odom_rows:
+        ox = row["pose.pose.position.x"]
+        oy = row["pose.pose.position.y"]
+        sq_dists.append(min((ox - px) ** 2 + (oy - py) ** 2 for px, py in ideal_xy))
+
+    dists = [math.sqrt(d) for d in sq_dists]
+    return {
+        "cross_track_error_mean_m": round(statistics.mean(dists), 3),
+        "cross_track_error_max_m": round(max(dists), 3),
+        "path_rmse_m": round(math.sqrt(statistics.mean(sq_dists)), 3),
+    }
+
+
+def _takeoff_metrics(odom: list[dict], target: float, velocity: float) -> dict:
+    """Settled-altitude error, overshoot, and rise duration from takeoff odom samples."""
+    zs = [r["pose.pose.position.z"] for r in odom]
+    ts = [_stamp(r) for r in odom]
+    peak = max(zs)
+    cutoff = ts[-1] - SETTLING_WINDOW_S
+    settled = [z for z, t in zip(zs, ts) if t >= cutoff]
+    out: dict = {
+        "altitude_error_m": round(statistics.mean(settled) - target, 3),
+        "overshoot_m": round(max(0.0, peak - target), 3),
+    }
+    z0 = zs[0]
+    first_motion = next((i for i, z in enumerate(zs) if z > z0 + TAKEOFF_MOTION_THRESHOLD_M), None)
+    first_at_target = next((i for i, z in enumerate(zs) if z >= target * 0.95), None)
+    if first_motion is not None and first_at_target is not None and first_at_target > first_motion:
+        out["takeoff_duration_sim_s"] = round(ts[first_at_target] - ts[first_motion], 3)
+    return out
+
+
+def _landing_metrics(odom: list[dict]) -> dict:
+    """Final altitude and landing duration from landing odom samples."""
+    zs = [r["pose.pose.position.z"] for r in odom]
+    ts = [_stamp(r) for r in odom]
+    out: dict = {"final_altitude_m": round(zs[-1], 3)}
+    peak = max(zs)
+    first_descent = next((i for i, z in enumerate(zs) if z < peak * 0.8), None)
+    first_at_ground = next((i for i, z in enumerate(zs) if z < 0.5), None)
+    if first_descent is not None and first_at_ground is not None and first_at_ground > first_descent:
+        out["land_duration_sim_s"] = round(ts[first_at_ground] - ts[first_descent], 3)
+    return out
+
+
+def _record(robot_n: int, metrics_dict: dict) -> None:
+    """Record per-robot scalar metrics; unit inferred from the METRIC_UNITS table."""
+    m = get_metrics()
+    tid = current_test_id()
+    for key, value in metrics_dict.items():
+        if value is None:
+            continue
+        unit = METRIC_UNITS.get(key, "m")
+        direction = "higher_is_better" if key == "trajectory_success" else "lower_is_better"
+        m.record(tid, f"robot_{robot_n}.{key}", value, unit=unit, direction=direction)
+
+
+# ── CSV / subprocess helpers ───────────────────────────────────────────────
+
+def _stamp(row: dict) -> float:
+    """Sim-time seconds from a parsed odometry CSV row."""
+    return row["header.stamp.sec"] + row["header.stamp.nanosec"] * 1e-9
+
+
+def _start_csv_stream(
+    container: str, topic: str, domain: int, setup_bash: str,
+    duration_s: float, out_path: str,
+) -> tuple:
+    """Start a background `ros2 topic echo --csv` stream that writes to out_path.
+
+    Returns (popen, file_handle, err_file_handle). Caller must close both
+    handles after the process terminates (see _finish_captures).
+    """
+    cmd = (
+        f"source {ROS_DISTRO_SETUP} && source {setup_bash} && "
+        f"export ROS_DOMAIN_ID={domain} && "
+        f"timeout {int(duration_s)} ros2 topic echo --csv {topic}"
+    )
+    f = open(out_path, "w")
+    ef = open(out_path + ".err", "w")
+    try:
+        proc = subprocess.Popen(
+            ["docker", "exec", container, "bash", "-c", cmd],
+            stdout=f, stderr=ef,
+        )
+    except BaseException:
+        f.close()
+        ef.close()
+        raise
+    return proc, f, ef
+
+
+def _parse_csv(path: str, schema: list[str]) -> list[dict]:
+    """Read ros2 `--csv` output, dropping non-CSV lines that ros2 emits to stdout."""
+    with open(path) as fh:
+        good = [line for line in fh if line.count(",") >= len(schema) - 1]
+    if not good:
+        return []
+    df = pd.read_csv(StringIO("".join(good)), header=None, names=schema)
+    return df.to_dict("records")
+
+
+def _start_captures(
+    robot_container: str, setup_bash: str, domain: int, duration_s: float, tag: str,
+) -> dict:
+    """Start odom CSV stream for one robot. Returns a handle for _finish_captures."""
+    odom_path = f"/tmp/traj_r{domain}_{tag}_odom.csv"
+    odom_proc, odom_fh, odom_ef = _start_csv_stream(
+        robot_container,
+        f"/robot_{domain}/interface/mavros/local_position/odom",
+        domain, setup_bash, duration_s, odom_path,
+    )
+    return {"duration_s": duration_s, "odom": (odom_proc, odom_fh, odom_ef, odom_path)}
+
+
+def _finish_captures(streams: dict) -> list[dict]:
+    """Terminate capture and return parsed odom rows."""
+    odom_proc, odom_fh, odom_ef, odom_path = streams["odom"]
+    try:
+        odom_proc.terminate()
+        try:
+            odom_proc.wait(timeout=5)
+        except subprocess.TimeoutExpired:
+            odom_proc.kill()
+            odom_proc.wait(timeout=5)
+    finally:
+        odom_fh.close()
+        odom_ef.close()
+    odom = _parse_csv(odom_path, ODOM_SCHEMA)
+    if not odom:
+        logger.warning(
+            "odom capture empty. stdout=%r stderr=%r",
+            Path(odom_path).read_text()[:500],
+            Path(odom_path + ".err").read_text()[:500],
+        )
+    return odom
+
+
+def _action_ok(stdout: str) -> bool:
+    return "success: true" in stdout
+
+
+def _action_message(stdout: str) -> str:
+    for line in stdout.splitlines():
+        s = line.strip()
+        if s.startswith("message:"):
+            return s[len("message:"):].strip().strip("'\"")
+    return "\n".join(stdout.strip().splitlines()[-5:])
+
+
+def _run_parallel(num_robots: int, fn) -> None:
+    """Run fn(n) for n=1..num_robots concurrently."""
+    if num_robots == 1:
+        fn(1)
+        return
+    with ThreadPoolExecutor(max_workers=num_robots) as ex:
+        list(ex.map(fn, range(1, num_robots + 1)))
+
+
+def _build_traj_goal(traj_type: str, config: dict[str, str]) -> str:
+    """Build the YAML goal string for a FixedTrajectoryTask action send_goal call."""
+    attrs = ", ".join(f"{{key: {k}, value: '{v}'}}" for k, v in config.items())
+    return f"{{trajectory_spec: {{type: {traj_type}, attributes: [{attrs}]}}, loop: false}}"
+
+
+# ── per-robot workers ──────────────────────────────────────────────────────
+
+def _takeoff_one_robot(n: int, robot_container: str, cfg: dict, target: float) -> None:
+    velocity = 1.0  # fixed takeoff velocity for trajectory tests
+    timeout = max(30.0, target / velocity + 15.0)
+    streams = _start_captures(robot_container, cfg["robot_setup_bash"],
+                               n, timeout + 5, "traj_takeoff")
+    goal = f"{{target_altitude_m: {target}, velocity_m_s: {velocity}}}"
+    result = ros2_exec(
+        robot_container,
+        f'ros2 action send_goal --feedback /robot_{n}/tasks/takeoff '
+        f'task_msgs/action/TakeoffTask "{goal}"',
+        domain_id=n, setup_bash=cfg["robot_setup_bash"],
+        timeout=int(timeout + 10),
+    )
+    odom = _finish_captures(streams)
+    if not _action_ok(result.stdout):
+        pytest.fail(f"robot_{n} takeoff failed: {_action_message(result.stdout)}")
+    if not odom:
+        pytest.fail(f"robot_{n} takeoff: no odom samples captured")
+    metrics = _takeoff_metrics(odom, target, velocity)
+    _record(n, metrics)
+    err = metrics.get("altitude_error_m", 0.0)
+    assert abs(err) <= target * 0.1, (
+        f"robot_{n} settled altitude {target + err:.2f}m differs from "
+        f"target {target:.1f}m by more than 10%"
+    )
+
+
+def _trajectory_one_robot(
+    n: int, robot_container: str, cfg: dict, traj_type: str,
+) -> None:
+    config = TRAJECTORY_CONFIGS[traj_type]
+    streams = _start_captures(robot_container, cfg["robot_setup_bash"],
+                               n, TRAJ_EXEC_TIMEOUT_S + 10, f"traj_{traj_type.lower()}")
+    goal = _build_traj_goal(traj_type, config)
+
+    # Snapshot the robot's world-frame pose immediately before dispatch so we can
+    # transform the base_link ideal path to world frame for metric computation.
+    odom_snap = ros2_exec(
+        robot_container,
+        f"timeout 5 ros2 topic echo --once --csv "
+        f"/robot_{n}/interface/mavros/local_position/odom",
+        domain_id=n, setup_bash=cfg["robot_setup_bash"], timeout=10,
+    )
+    x0, y0, z0, yaw0 = 0.0, 0.0, TARGET_ALTITUDE_M, 0.0
+    for line in odom_snap.stdout.splitlines():
+        parts = line.strip().split(",")
+        if len(parts) >= len(ODOM_SCHEMA):
+            try:
+                row = dict(zip(ODOM_SCHEMA, parts))
+                x0 = float(row["pose.pose.position.x"])
+                y0 = float(row["pose.pose.position.y"])
+                z0 = float(row["pose.pose.position.z"])
+                yaw0 = _quat_to_yaw(
+                    float(row["pose.pose.orientation.x"]),
+                    float(row["pose.pose.orientation.y"]),
+                    float(row["pose.pose.orientation.z"]),
+                    float(row["pose.pose.orientation.w"]),
+                )
+                break
+            except (ValueError, KeyError):
+                pass
+
+    t_start = time.monotonic()
+    result = ros2_exec(
+        robot_container,
+        f'ros2 action send_goal --feedback /robot_{n}/tasks/fixed_trajectory '
+        f'task_msgs/action/FixedTrajectoryTask "{goal}"',
+        domain_id=n, setup_bash=cfg["robot_setup_bash"],
+        timeout=int(TRAJ_EXEC_TIMEOUT_S + 15),
+    )
+    exec_time_s = round(time.monotonic() - t_start, 3)
+
+    odom = _finish_captures(streams)
+
+    success = _action_ok(result.stdout)
+    _record(n, {"trajectory_success": 1.0 if success else 0.0})
+
+    if not success:
+        logger.warning("robot_%d %s trajectory did not succeed: %s",
+                       n, traj_type, _action_message(result.stdout))
+
+    if odom:
+        ts = [_stamp(r) for r in odom]
+        exec_sim_s = round(ts[-1] - ts[0], 3) if len(ts) > 1 else exec_time_s
+        _record(n, {"trajectory_execution_time_sim_s": exec_sim_s})
+
+        ideal_base = _generate_ideal_path(traj_type, config)
+        if ideal_base:
+            ideal_world = _transform_to_world(ideal_base, x0, y0, z0, yaw0)
+            ct_metrics = _cross_track_metrics(odom, ideal_world)
+            _record(n, ct_metrics)
+
+            mean_err = ct_metrics.get("cross_track_error_mean_m")
+            if mean_err is not None:
+                assert mean_err < CROSS_TRACK_TOLERANCE_M, (
+                    f"robot_{n} {traj_type}: mean cross-track error {mean_err:.2f}m "
+                    f"exceeds tolerance {CROSS_TRACK_TOLERANCE_M:.1f}m"
+                )
+    else:
+        logger.warning("robot_%d %s: no odom samples captured", n, traj_type)
+
+
+def _landing_one_robot(n: int, robot_container: str, cfg: dict) -> None:
+    velocity = 1.0
+    timeout = max(30.0, TARGET_ALTITUDE_M / velocity + 15.0)
+    streams = _start_captures(robot_container, cfg["robot_setup_bash"],
+                               n, timeout + 5, "traj_land")
+    goal = f"{{velocity_m_s: {velocity}}}"
+    result = ros2_exec(
+        robot_container,
+        f'ros2 action send_goal --feedback /robot_{n}/tasks/land '
+        f'task_msgs/action/LandTask "{goal}"',
+        domain_id=n, setup_bash=cfg["robot_setup_bash"],
+        timeout=int(timeout + 10),
+    )
+    odom = _finish_captures(streams)
+    if not _action_ok(result.stdout):
+        pytest.fail(f"robot_{n} landing failed: {_action_message(result.stdout)}")
+    if not odom:
+        pytest.fail(f"robot_{n} landing: no odom samples captured")
+    metrics = _landing_metrics(odom)
+    _record(n, metrics)
+    final = metrics.get("final_altitude_m", 1.0)
+    assert final < 0.5, f"robot_{n} final altitude {final:.2f}m >= 0.5m"
+
+
+# ── test class ─────────────────────────────────────────────────────────────
+
+@pytest.mark.autonomy
+@pytest.mark.timeout(2400)
+class TestFixedTrajectory:
+    """Full takeoff → fixed trajectory → land evaluation suite.
+
+    Parametrized over trajectory_type (from --trajectory-types).
+    Each trajectory type runs as an independent flight cycle so a failure on
+    one type does not prevent other types from being evaluated.
+    """
+
+    @pytest.fixture(scope="session")
+    def _failed_envs(self):
+        return set()
+
+    @pytest.fixture(scope="session")
+    def _ready_envs(self):
+        return set()
+
+    @pytest.fixture(autouse=True)
+    def _chain_guard(self, request, airstack_env, _failed_envs):
+        """Skip tests whose env was poisoned by an earlier failure.
+
+        Trajectory execution failures do NOT poison the env — landing always
+        runs after a successful takeoff, keeping the drone from being stranded.
+        Takeoff or landing failures do poison subsequent tests in the same env.
+        """
+        env_id = (airstack_env["sim"], airstack_env["num_robots"],
+                  airstack_env["iteration"])
+        if env_id in _failed_envs:
+            pytest.skip(f"earlier fixed-trajectory test failed in {env_id}")
+        yield
+        rep = getattr(request.node, "_rep_call", None)
+        if rep is not None and rep.failed:
+            if "test_fixed_trajectory" not in request.node.name:
+                _failed_envs.add(env_id)
+
+    @pytest.mark.dependency(name="ftraj_ready")
+    def test_px4_ready(self, airstack_env, trajectory_type, _ready_envs):
+        """Wait until MAVROS is connected and local_position/odom is publishing.
+
+        Runs only once per (sim, num_robots, iteration) env regardless of how
+        many trajectory types are being tested.
+        """
+        env_id = (airstack_env["sim"], airstack_env["num_robots"],
+                  airstack_env["iteration"])
+        if env_id in _ready_envs:
+            logger.info("px4_ready already confirmed for %s; skipping", env_id)
+            return
+
+        cfg = airstack_env["cfg"]
+        robot_container = get_robot_containers(airstack_env["robot_pattern"])[0]
+        num_robots = airstack_env["num_robots"]
+
+        started = time.time()
+        connected: set[int] = set()
+        pending = list(range(1, num_robots + 1))
+        ready_at: dict[int, float] = {}
+        deadline = started + PX4_READY_TIMEOUT_S
+
+        while pending and time.time() < deadline:
+            for n in list(pending):
+                if n not in connected:
+                    r = ros2_exec(
+                        robot_container,
+                        f"timeout 5 ros2 topic echo --once --csv "
+                        f"--field connected /robot_{n}/interface/mavros/state",
+                        domain_id=n, setup_bash=cfg["robot_setup_bash"], timeout=10,
+                    )
+                    if any(line.strip() == "True" for line in r.stdout.splitlines()):
+                        connected.add(n)
+                    else:
+                        continue
+
+                r = ros2_exec(
+                    robot_container,
+                    f"timeout 5 ros2 topic echo --once "
+                    f"/robot_{n}/interface/mavros/local_position/odom",
+                    domain_id=n, setup_bash=cfg["robot_setup_bash"], timeout=10,
+                )
+                if r.returncode == 0 and "pose:" in r.stdout:
+                    ready_at[n] = round(time.time() - started, 2)
+                    pending.remove(n)
+
+            if pending:
+                logger.info("px4_ready: connected=%s pending=%s elapsed=%.0fs",
+                            sorted(connected), pending, time.time() - started)
+                time.sleep(PX4_POLL_INTERVAL_S)
+
+        if pending:
+            not_connected = [n for n in pending if n not in connected]
+            if not_connected:
+                pytest.fail(
+                    f"robots {sorted(not_connected)} never reported MAVROS connected=True "
+                    f"within {PX4_READY_TIMEOUT_S:.0f}s"
+                )
+            pytest.fail(
+                f"robots {sorted(pending)} connected but never published "
+                f"local_position/odom within {PX4_READY_TIMEOUT_S:.0f}s"
+            )
+
+        for n, dur in ready_at.items():
+            _record(n, {"ready_duration_sys_s": dur})
+        _ready_envs.add(env_id)
+
+    @pytest.mark.dependency(name="ftraj_takeoff", depends=["ftraj_ready"])
+    def test_takeoff(self, airstack_env, trajectory_type):
+        """Take off to TARGET_ALTITUDE_M at a fixed velocity of 1 m/s."""
+        cfg = airstack_env["cfg"]
+        robot_container = get_robot_containers(airstack_env["robot_pattern"])[0]
+        num_robots = airstack_env["num_robots"]
+        _run_parallel(
+            num_robots,
+            lambda n: _takeoff_one_robot(n, robot_container, cfg, TARGET_ALTITUDE_M),
+        )
+
+    @pytest.mark.dependency(name="ftraj_execute", depends=["ftraj_takeoff"])
+    def test_fixed_trajectory(self, airstack_env, trajectory_type):
+        """Send FixedTrajectoryTask, capture odom, compute and record path deviation."""
+        cfg = airstack_env["cfg"]
+        robot_container = get_robot_containers(airstack_env["robot_pattern"])[0]
+        num_robots = airstack_env["num_robots"]
+        _run_parallel(
+            num_robots,
+            lambda n: _trajectory_one_robot(n, robot_container, cfg, trajectory_type),
+        )
+
+    @pytest.mark.dependency(name="ftraj_land", depends=["ftraj_takeoff"])
+    def test_landing(self, airstack_env, trajectory_type):
+        """Land the drone; runs even when test_fixed_trajectory fails."""
+        cfg = airstack_env["cfg"]
+        robot_container = get_robot_containers(airstack_env["robot_pattern"])[0]
+        num_robots = airstack_env["num_robots"]
+        _run_parallel(
+            num_robots,
+            lambda n: _landing_one_robot(n, robot_container, cfg),
+        )
diff --git a/tests/test_sensors.py b/tests/test_sensors.py
new file mode 100644
index 000000000..f00facb33
--- /dev/null
+++ b/tests/test_sensors.py
@@ -0,0 +1,171 @@
+"""Sensor stream and LiDAR validation — runs after ``test_liveliness`` (see ``_MODULE_ORDER``).
+
+Uses the same ``airstack_env`` parametrization as liveliness. With ``class``-scoped
+fixtures this module performs its **own** stack bring-up when selected;
+selecting both classes together (for example ``-m "liveliness or sensors"``) therefore runs two bring-up cycles per
+``(sim, num_robots, iteration)``. Use ``-m sensors`` alone to exercise only sensor
+checks (still one full ``airstack up``).
+"""
+import time
+
+import pytest
+
+from conftest import current_test_id, get_metrics, logger, wait_for_first_message
+from sensor_probes import (
+    STABLE_HZ_DURATION_S,
+    STABLE_HZ_WINDOW,
+    check_lidar_filtered_cloud_sanity,
+    check_realtime_factor,
+    check_robot_filtered_lidar,
+    check_robot_stereo_hz,
+    check_sim_publishing,
+)
+from test_liveliness import _check_sentinel_nodes, _poll_until
+
+
+@pytest.mark.sensors
+@pytest.mark.timeout(1800)
+class TestSensors:
+
+    @pytest.mark.dependency(name="sensors_sim_ready")
+    def test_sim_clock_available(self, airstack_env):
+        """Wait for ``/clock`` on the sim container (same readiness gate as liveliness)."""
+        cfg = airstack_env["cfg"]
+        m = get_metrics()
+        tid = current_test_id()
+        start = airstack_env["up_started_at"]
+        if (
+            wait_for_first_message(
+                airstack_env["sim_container"],
+                "/clock",
+                domain_id=1,
+                setup_bash=cfg["sim_setup_bash"],
+                timeout=600,
+            )
+            is None
+        ):
+            m.record(tid, "sensors_sim_ready_duration_s", "timeout", unit="s")
+            pytest.fail("sim never published /clock within 600s")
+        m.record(tid, "sensors_sim_ready_duration_s", round(time.time() - start, 2), unit="s")
+
+    @pytest.mark.dependency(name="sensors_nodes", depends=["sensors_sim_ready"])
+    def test_sentinel_nodes_present(self, airstack_env):
+        """ROS 2 nodes required before trusting sensor pipelines."""
+        last_msg = [""]
+
+        def ready():
+            ok, msg = _check_sentinel_nodes(airstack_env)
+            last_msg[0] = msg
+            return ok
+
+        _poll_until(
+            ready,
+            timeout=300,
+            interval=5,
+            fail_msg=lambda: f"sentinel nodes not ready after 300s: {last_msg[0]}",
+        )
+
+    @pytest.mark.dependency(name="sensors_sim_hz", depends=["sensors_sim_ready"])
+    def test_sim_topic_publish_rates(self, airstack_env):
+        """Hz on sim ``/clock`` + stereo + depth (batched for Isaac)."""
+        ok, msg, _ = check_sim_publishing(airstack_env)
+        assert ok, msg
+
+    @pytest.mark.dependency(name="sensors_robot_stereo", depends=["sensors_sim_hz"])
+    def test_robot_stereo_bridge_rates(self, airstack_env):
+        """Hz on robot DDS for stereo + depth (bridge path)."""
+        ok, msg, _ = check_robot_stereo_hz(airstack_env)
+        assert ok, msg
+
+    @pytest.mark.dependency(name="sensors_robot_lidar", depends=["sensors_robot_stereo"])
+    def test_robot_filtered_lidar_stream(self, airstack_env):
+        """Filtered LiDAR ``echo --once`` per robot (isaacsim only; skipped elsewhere)."""
+        ok, msg, _ = check_robot_filtered_lidar(airstack_env)
+        assert ok, msg
+
+    @pytest.mark.dependency(name="sensors_rtf", depends=["sensors_sim_ready"])
+    def test_sim_clock_realtime_factor(self, airstack_env):
+        """RTF from ``/clock``; fails only if sim near-stalled (RTF < 0.1)."""
+        ok, msg, rtf = check_realtime_factor(airstack_env)
+        if rtf is not None:
+            get_metrics().record(
+                current_test_id(),
+                "sim.realtime_factor",
+                round(rtf, 3),
+                unit="",
+                direction="higher_is_better",
+            )
+        assert ok, msg
+
+    @pytest.mark.dependency(
+        name="sensors_lidar_cloud_sanity",
+        depends=["sensors_robot_lidar", "sensors_nodes"],
+    )
+    def test_lidar_filtered_cloud_sanity(self, airstack_env):
+        """Point cloud geometry vs ``near_range_m`` (isaacsim only)."""
+        ok, msg, _ = check_lidar_filtered_cloud_sanity(airstack_env)
+        assert ok, msg
+
+    @pytest.mark.dependency(
+        depends=[
+            "sensors_sim_hz",
+            "sensors_robot_stereo",
+            "sensors_robot_lidar",
+            "sensors_lidar_cloud_sanity",
+        ]
+    )
+    def test_sensor_streams_stable(self, airstack_env, request):
+        """Poll sensors over ``--stable-duration``: stereo/depth as Hz series, LiDAR as ``.received``."""
+        duration = request.config.getoption("--stable-duration")
+        interval = request.config.getoption("--stable-interval")
+        m = get_metrics()
+        tid = current_test_id()
+        series = {}
+        elapsed = 0
+        try:
+            while elapsed < duration:
+                time.sleep(interval)
+                elapsed += interval  # nominal time: excludes time spent in the checks below
+                ok_sim, msg_sim, rates_sim = check_sim_publishing(
+                    airstack_env,
+                    duration=STABLE_HZ_DURATION_S,
+                    window=STABLE_HZ_WINDOW,
+                )
+                ok_rsd, msg_rsd, rates_rsd = check_robot_stereo_hz(
+                    airstack_env,
+                    duration=STABLE_HZ_DURATION_S,
+                    window=STABLE_HZ_WINDOW,
+                )
+                ok_lidar, msg_lidar, rates_lidar = check_robot_filtered_lidar(
+                    airstack_env,
+                    duration=STABLE_HZ_DURATION_S,
+                    window=STABLE_HZ_WINDOW,
+                )
+                # Sim and robot Hz probes report the same topic strings, so prefix each key
+                # to keep the metrics.json time series from being merged or overwritten.
+                for prefix, rates_dict in (
+                    ("sim", rates_sim),
+                    ("robot", rates_rsd),
+                ):
+                    for topic, hz in rates_dict.items():
+                        tail = topic.lstrip("/").replace("/", ".") + ".hz"
+                        key = f"{prefix}.{tail}"
+                        series.setdefault(key, []).append(
+                            {"t": elapsed, "value": hz or 0.0}
+                        )
+                # LiDAR uses echo-once (placeholder 1.0); record as .received not .hz.
+                for topic, got in rates_lidar.items():
+                    tail = topic.lstrip("/").replace("/", ".") + ".received"
+                    key = f"lidar.{tail}"
+                    series.setdefault(key, []).append(
+                        {"t": elapsed, "value": 1.0 if got else 0.0}
+                    )
+                if not (ok_sim and ok_rsd and ok_lidar):
+                    pytest.fail(
+                        f"sensor instability at t={elapsed}s: sim_hz={msg_sim} | "
+                        f"robot_stereo_hz={msg_rsd} | robot_lidar={msg_lidar}"
+                    )
+        finally:
+            for key, samples in series.items():
+                if samples:
+                    m.record_list(tid, f"{key}_samples", samples)