feat: nav2 sensor fix OTA demo #58

Draft
bburda wants to merge 1 commit into main from feat/ota-nav2-sensor-fix
@bburda bburda commented Apr 26, 2026

Contributor

Description

Adds demos/ota_nav2_sensor_fix/ - an end-to-end OTA-over-SOVD demo (gateway plugin + FastAPI artifact server + 4 ROS 2 demo packages) that exercises the /updates lifecycle: it updates a broken lidar node, installs a new safety classifier, and uninstalls a deprecated package, all over HTTP and in a spec-compliant way.

OTA demo

  • Adds demos/ota_nav2_sensor_fix/ end-to-end OTA demo
  • Bundles a dev-grade ota_update_plugin C++ gateway plugin (UpdateProvider + GatewayPlugin)
  • Update / Install / Uninstall operations derived from SOVD ISO 17978-3 components metadata (updated_components / added_components / removed_components)
  • Minimal FastAPI artifact server + pack_artifact.py CLI for building tarballs and catalog entries
  • Two-service docker-compose.yml (gateway + update server); nav2 / Foxglove are bring-your-own (documented in README)
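
As a rough illustration of how an operation kind could be derived from the components metadata listed above, here is a minimal Python sketch. Only the three field names (updated_components, added_components, removed_components) come from this PR; the function name, the dict-shaped descriptor, and the "mixed" fallback are assumptions made for the example and do not describe the plugin's actual C++ logic.

```python
from typing import Mapping, Sequence


def classify_operation(descriptor: Mapping[str, Sequence[str]]) -> str:
    """Map a catalog entry's components metadata to an operation kind.

    Hypothetical helper: the field names come from the PR description,
    everything else is an illustrative assumption.
    """
    kinds = []
    if descriptor.get("updated_components"):
        kinds.append("update")
    if descriptor.get("added_components"):
        kinds.append("install")
    if descriptor.get("removed_components"):
        kinds.append("uninstall")

    if not kinds:
        raise ValueError("descriptor names no components")
    # Assumption: an entry touching more than one list is reported as "mixed".
    return kinds[0] if len(kinds) == 1 else "mixed"
```

Under these assumptions, an entry that only fills removed_components classifies as an uninstall, which matches how the third demo flow is described.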

Out of scope (deliberate, dev-grade positioning)

  • Artifact signing / verification
  • Atomic swap or A/B partition rollout
  • Persistent update state across gateway restarts
  • Fleet-wide staging
  • Audit logging
  • Automated health-gated rollback

Test plan / verification

Unit & integration tests (all clean):

  • pytest -v for pack_artifact.py (16 tests)
  • pytest -v for ota_update_server (5 tests)
  • colcon test for ota_update_plugin (24 GTest cases)
  • All four demo ROS 2 packages build clean under -Wall -Wextra -Wpedantic -Wshadow -Wconversion
  • build_artifacts.sh produces a 3-entry catalog + tarballs end-to-end

End-to-end smoke:

  • Plugin loads and registers as UpdateProvider (gateway logs: "Update backend provided by plugin")
  • Boot poll fetches /catalog and registers all 3 catalog entries
  • Update flow: PUT /updates/fixed_lidar_2_1_0/prepare && /execute kills broken_lidar_node and spawns fixed_lidar_node
  • Install flow: PUT /updates/obstacle_classifier_v2_1_0_0/prepare && /execute swaps files and spawns obstacle_classifier_node
  • Uninstall flow: PUT /updates/broken_lidar_legacy_remove/prepare && /execute returns status: completed and the legacy process is gone
  • tests/smoke_test_ota.sh - 25/25 pass on a fresh stack
  • tests/smoke_test_demo_narrative.sh - 8/8 pass on a fresh stack
  • trigger-update.sh, trigger-install.sh, trigger-uninstall.sh, check-demo.sh, stop-demo.sh exercised end-to-end
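
The prepare-then-execute sequence used by the three flows above can be sketched with the Python standard library. This reads the `/prepare && /execute` shorthand as two PUT requests on the same update id, which is an interpretation; the base URL and both function names are hypothetical, and the gateway may expose /updates under a prefix not shown in this PR.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def lifecycle_requests(base_url: str, update_id: str) -> list:
    """Build the two PUT requests (prepare, then execute) for one update id."""
    root = f"{base_url.rstrip('/')}/updates/{quote(update_id, safe='')}"
    return [Request(f"{root}/{step}", method="PUT") for step in ("prepare", "execute")]


def apply_update(base_url: str, update_id: str, timeout: float = 30.0) -> list:
    """Send prepare and execute in order and return the HTTP status codes.

    urlopen raises HTTPError on a 4xx/5xx response, so execute is never
    sent if prepare fails.
    """
    statuses = []
    for request in lifecycle_requests(base_url, update_id):
        with urlopen(request, timeout=timeout) as response:
            statuses.append(response.status)
    return statuses
```

For example, `apply_update("http://localhost:8080", "fixed_lidar_2_1_0")` would issue the update flow against a gateway assumed to listen on that address.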

Notes

@bburda bburda self-assigned this Apr 29, 2026
@bburda bburda changed the title feat: nav2 sensor fix OTA demo feat: nav2 sensor fix OTA demo + cross-demo script regressions May 26, 2026
@bburda bburda force-pushed the feat/ota-nav2-sensor-fix branch from 39a0925 to fff9581 Compare May 29, 2026 20:00
@bburda bburda changed the title feat: nav2 sensor fix OTA demo + cross-demo script regressions feat: nav2 sensor fix OTA demo May 29, 2026
@bburda bburda force-pushed the feat/ota-nav2-sensor-fix branch from fff9581 to 7a37cc6 Compare May 30, 2026 10:10
…pdate, publish-and-apply a hotfix

Dev-grade OTA demo on the ros2_medkit gateway's ota_update_plugin and the SOVD
/updates resource. An RB-Theron AMR runs Nav2 in an AWS small-warehouse world. A
regressing lidar update (broken_lidar_3_0_0) is auto-applied at boot; a few
metres into a mission the scan sensor develops a stuck sector, Nav2 can no
longer make progress, and navigate_to_pose aborts. Generic log/action-status
bridges surface that as SOVD faults on bt-navigator and controller-server with a
freeze-frame and an MCAP capture - the sensor node never reports itself.

The operator diagnoses over SOVD: downloads the MCAP, runs a suite of
health-check operations (lidar/localization/drivetrain/costmap) to confirm the
lidar is the cause, then publishes the forward hotfix (fixed_lidar_3_0_1) as a
hand-provided update descriptor via POST /updates, applies it (prepare +
execute), and clears the latched faults. A custom nav_to_pose behaviour tree
keeps the abort prompt. Foxglove panels + curl scripts drive the loop; smoke
tests cover the plugin, the SOVD envelope, the publish-then-apply flow, and the
single-publisher /scan remap. Runs on CycloneDDS (FastDDS segfaults
amcl/controller_server on Jazzy).
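
The publish step in the commit message above (a hand-provided descriptor sent via POST /updates) could look roughly like the following Python sketch. The descriptor shape is an assumption: only the update id fixed_lidar_3_0_1 and the updated_components field name appear in this PR, while the "id" key, the placeholder component name, and the base URL are hypothetical.

```python
import json
from urllib.request import Request


def publish_request(base_url: str, descriptor: dict) -> Request:
    """Build (but do not send) a POST /updates request carrying a JSON descriptor."""
    body = json.dumps(descriptor).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/updates",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical descriptor: the "id" key and the component name are placeholders.
hotfix_descriptor = {
    "id": "fixed_lidar_3_0_1",
    "updated_components": ["<lidar-component-id>"],
}
hotfix_request = publish_request("http://localhost:8080", hotfix_descriptor)
```

Passing `hotfix_request` to `urllib.request.urlopen` would send it; the prepare and execute calls would then follow on the published id.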
@bburda bburda force-pushed the feat/ota-nav2-sensor-fix branch from 7a37cc6 to 7a85e20 Compare July 4, 2026 14:40