Skip to content

Merge upstream OpenMS/streamlit-template into FLASHApp#95

Merged
t0mdavid-m merged 22 commits into
developfrom
merge/upstream-streamlit-template
Jun 24, 2026
Merged

Merge upstream OpenMS/streamlit-template into FLASHApp#95
t0mdavid-m merged 22 commits into
developfrom
merge/upstream-streamlit-template

Conversation

@t0mdavid-m

@t0mdavid-m t0mdavid-m commented Jun 24, 2026

Copy link
Copy Markdown
Member

Summary

Sync the 21 new upstream commits from OpenMS/streamlit-template since the last template merge (merge-base 6ca8e97, template PR #390). Most upstream changes were CI/Docker multi-arch work that FLASHApp had already implemented independently, so the net change is small and focused on the genuinely new template features.

Net change vs develop: 11 files, +366 / −20.

What's brought in

Area Change
Legal-links feature (new) get_legal_links() / DEFAULT_LEGAL_LINKS in common.py, Impressum/Privacy/Terms links in the sidebar footer, privacy-policy link wired into the GDPR consent banner (captcha_.py, gdpr_consent/*), and tests/test_legal_links.py
settings.json Kept FLASHApp version/repository-name; added the new legal_links block (OpenMS defaults)
src/workflow/CommandExecutor.py Adopted upstream's type-aware bool-flag handling (via the now-merged bool_param_paths_from_param_xml_ini) plus a value-based fallback. Also fixes a latent FLASHApp bug — checkbox booleans (persisted as Python True/False) were emitted as -k True/-k False instead of valueless TOPP flags
ParameterManager.py New bool_param_paths_from_param_xml_ini helper
README.md Documented the legal_links override
k8s/base/traefik-ingressroute.yaml, .gitignore Generic upstream hardening (traefik websecure + secure cookie; .venv/)

Conflict resolution decisions

All five conflicts were analyzed and adversarially verified before applying:

  • build-and-test.yml, Dockerfile.arm → kept FLASHApp's side (builds ghcr.io/openms/flashapp from the custom OpenMS fork + Vue component). All genuine upstream CI fixes were confirmed already present in FLASHApp; no streamlit-template identifiers leaked in. These resolve identical to develop (no diff).
  • Dockerfile_simple.armdropped. Orphaned template file that builds streamlit-template and is unreferenced by FLASHApp's full-only arm CI matrix.

Verification

  • 46/46 pytest pass in the ghcr.io/openms/flashapp:latest Docker image (pyopenms 3.5.0, real deps)
  • pylint errors-only clean on changed files
  • ✅ bool-flag command construction confirmed correct across all 18 input cases (key in/out of the .ini bool set)

Note for reviewers

ParameterManager.py also pulled in three currently-unreferenced helpers from upstream (bool_pairs_session_key, get_bool_param_pairs, _merge_bool_params_from_ini + two create_ini call sites) — dead code in upstream too. They were left as-merged to stay faithful to the sync; happy to strip them in a follow-up if preferred.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added configurable legal page links for Impressum, Privacy Policy, and Terms of Use.
    • Displayed legal links in the app sidebar and connected the consent banner to the configured privacy policy URL.
    • Improved HTTPS handling for the app’s sticky session behavior.
  • Bug Fixes

    • Legal link defaults now remain in place when custom settings are missing or empty.
    • Consent messaging now stays aligned with the configured privacy policy link.

hjn0415a and others added 22 commits May 19, 2026 17:01
Mirror FLASHApp's split-build / manifest-merge approach so both
linux/amd64 and linux/arm64 are published for the full and simple
variants. Existing `<ref>-full` / `<ref>-simple` / `latest` tags
become multi-arch manifests — k8s overlays, docker-compose users,
and direct `docker pull` callers transparently get the right arch.

Dockerfile.arm (delta from Dockerfile):
 - aarch64 miniforge installer
 - conditional THIRDPARTY/Linux/aarch64 copy (some OpenMS releases
   ship an empty/missing aarch64 dir)
 - pruned thirdparty PATH to tools that actually have ARM builds:
   LuciPHOr2, MSGFPlus, ThermoRawFileParser, Comet, Percolator, Sage

Dockerfile_simple.arm (delta from Dockerfile_simple):
 - aarch64 miniforge installer only — pyOpenMS ships aarch64 wheels
   on PyPI, so `pip install -r requirements.txt` works as-is

The shared docker/entrypoint.sh is reused as-is on ARM: its
apptainer/read-only-root handling is arch-neutral and worth keeping.
Base stays ubuntu:22.04 (Redis 6.0 predates the ARM64-COW-BUG
warning, so no `--ignore-warnings` flag needed).

Workflow changes (build-and-test.yml):
 - `build` renamed `build-amd64`; per-arch tags carry `-amd64`.
 - New `build-arm64` job runs on `ubuntu-24.04-arm`, builds the
   `.arm` Dockerfiles for both variants, ends with a pull-back +
   /_stcore/health probe on push events.
 - New `create-manifest` job stitches `<ref>-<variant>-amd64` +
   `<ref>-<variant>-arm64` into multi-arch `<ref>-<variant>` and
   `latest` manifests.
 - test-apptainer / test-nginx / test-traefik / publish-apptainer
   keep consuming the amd64 artifact only. SIF publishing stays
   amd64-only this iteration.
 - PRs build both arches (registry cache keeps warm runs cheap) but
   don't push; manifest creation also skipped on PRs.

Branch-protection note: the `build` required check is renamed to
`build-amd64`. Admins should update protected-branch rules and add
`build-arm64` / `create-manifest` if those should also be required.
Previously the apptainer/nginx/traefik integration tests only ran
against the amd64 artifact, so the arm64 image was validated solely
by its build succeeding plus a post-push /_stcore/health probe. Now
all three integration matrices fan out over arch=[amd64, arm64] with
a matrix-driven runs-on, exercising the read-only-root apptainer
contract and both kind-based ingress paths on a native ARM runner
too.

Changes:
 - `build-amd64` artifact renamed from
   `openms-streamlit-<variant>-image` to
   `openms-streamlit-<variant>-amd64-image` for symmetry.
 - `build-arm64` now also `load: true`'s the built image, retags to
   the kind-friendly `openms-streamlit:test`, saves it as a tar, and
   uploads it as `openms-streamlit-<variant>-arm64-image`. The
   post-push pull-back smoke test is removed — the new apptainer/
   nginx/traefik runs subsume it and avoid the slow GHCR pull.
 - `test-apptainer`, `test-nginx`, `test-traefik` matrices switched
   from `variant: [full, simple]` to an `include:` list with
   {variant, arch, runner} tuples; `runs-on: ${{ matrix.runner }}`
   selects `ubuntu-latest` for amd64 and `ubuntu-24.04-arm` for arm64.
   Artifact download names get `${{ matrix.arch }}` interpolated.
 - SIF upload at the tail of `test-apptainer` gated on
   `matrix.arch == 'amd64'`: arm64 still runs the full apptainer
   contract end-to-end, but only amd64 produces the SIF that
   `publish-apptainer` ships to GHCR (HPC SIF consumers are amd64).

Note on `publish-apptainer`: it stays on `needs: test-apptainer`,
which now waits for the arm64 matrix entries too — meaning an arm64
apptainer regression will block amd64 SIF publishing. Conservative
on purpose; happy to decouple via separate jobs if that turns out to
be too strict in practice.
The ARM build of `make -j4 TOPP` failed at the link step with

    /usr/bin/ld: /root/miniforge3/lib/libyaml-cpp.so.0.8: undefined
    reference to `std::ios_base_library_init()@GLIBCXX_3.4.32'

The conda-forge libyaml-cpp wheel for aarch64 is built against
GLIBCXX_3.4.32 (gcc 13+), but Ubuntu 22.04's system g++ ships with an
older libstdc++. Running cmake inside the mamba shell lets it discover
/root/miniforge3/lib first, so the conda-forge yaml-cpp gets linked
into every TOPP binary and breaks. amd64 happens to work because the
conda-forge amd64 yaml-cpp build is older.

Fix mirrors FLASHApp's Dockerfile.arm: configure OpenMS in two cmake
passes — pass 1 under plain `/bin/bash` with
`-DCMAKE_IGNORE_PREFIX_PATH=/root/miniforge3` so cmake resolves C++
deps from the system tree (libyaml-cpp from contrib, boost from apt,
etc.); pass 2 under `mamba run` with `-DPYOPENMS=ON` so the Python
bindings still find conda-forge Python / Cython / NumPy. The
IGNORE_PREFIX_PATH flag is repeated on pass 2 to keep the cached C++
link command unchanged.

Only Dockerfile.arm changes; Dockerfile (amd64) keeps its single-pass
cmake to avoid disturbing the working x86 path.
The two-pass cmake split from 1d73b67 runs pass 1 under
`SHELL ["/bin/bash", "-c"]`, but the only cmake on the image is the
one from `mamba install cmake` at /root/miniforge3/envs/streamlit-env/bin/cmake
— not on plain bash's PATH. Result: exit 127 (command not found) the
moment pass 1 invokes cmake.

FLASHApp.arm sidesteps this by installing cmake via apt; do the same
here (just append `cmake` to the existing apt-get install line). The
mamba cmake install stays, so pass 2 under the mamba shell continues
to use the conda-forge cmake exactly as it did before. Ubuntu 22.04
ships cmake 3.22, comfortably above OpenMS 3.5's 3.15 floor.
The previous fix (install cmake via apt) didn't actually help: OpenMS
3.5's CMakeLists.txt requires cmake >= 3.24, and Ubuntu 22.04's apt
cmake is 3.22.1, which fails configure with

    CMake Error at src/openms/extern/CMakeLists.txt:11 (cmake_minimum_required):
      CMake 3.24 or higher is required. You are running version 3.22.1

That's exactly why the existing x86 Dockerfile installs cmake via
mamba (the conda-forge build is 3.30+). FLASHApp.arm escapes this by
using ubuntu:24.04 (apt cmake 3.28); we stay on 22.04 to minimize
churn vs. the working x86 Dockerfile.

Fix: in pass 1, call the mamba-env cmake by its full path
`/root/miniforge3/envs/streamlit-env/bin/cmake`. The plain-bash SHELL
is still in effect, so cmake doesn't pick up any conda-forge
environment side effects, and CMAKE_IGNORE_PREFIX_PATH keeps it from
auto-discovering miniforge libraries during find_package. The cmake
binary itself runs against miniforge's libstdc++, but that's a
runtime detail of cmake — it doesn't leak into the configured
project's link command.

The apt cmake addition from f11bc99 is now redundant but harmless;
leaving it in place to keep this diff focused.
Two failures in the previous run (test-traefik full, test-nginx
simple) ended with the runner reporting "No space left on device"
while flushing its diagnostic log. ubuntu-latest starts with ~14 GB
free; downloading the full image artifact (5-8 GB), loading it into
docker (decompressed, larger), pulling kind's node image, then
loading the OCI tar into the kind cluster easily exceeds that budget.

Mirror the cleanup already used by `build-arm64`: drop the runner's
preinstalled dotnet / android SDK / ghc / hostedtoolcache to recover
~30 GB. Same step now runs at the top of test-apptainer, test-nginx,
and test-traefik on both amd64 (ubuntu-latest) and arm64
(ubuntu-24.04-arm) matrix entries — the arm runner is at least as
tight as amd64.
After the two-pass cmake configure landed in 5185c3e, the next
attempt got past `make -j4 TOPP` (the link error is fixed) but
failed fast in `make -j4 pyopenms` with:

    CMake Error: Not a file: /openms-build/CMakeFiles/VerifyGlobs.cmake
    CMake Error: Error processing file: /openms-build/CMakeFiles/VerifyGlobs.cmake
    make: *** [Makefile:11553: cmake_check_build_system] Error 1

`VerifyGlobs.cmake` is generated by cmake for `file(GLOB CONFIGURE_DEPENDS ...)`
targets and is consulted by `cmake_check_build_system` at the top of
every subsequent `make` invocation. The intermediate cleanup line

    RUN rm -rf src doc CMakeFiles

deleted it, which is fine on the x86 single-pass build (different
cmake codepath when PYOPENMS=ON is set in the initial configure, no
VerifyGlobs.cmake generated) but breaks the ARM two-pass build.

Stop deleting CMakeFiles/ between `make TOPP` and `make pyopenms`.
We still drop `src/` and `doc/` for disk savings; keeping CMakeFiles
costs only a few hundred MB on the intermediate layer.
eWaterCycle/setup-apptainer@v2 installs apptainer from the upstream
.deb asset on the GitHub release. Upstream apptainer only publishes
amd64 .debs (verified: every v1.3.x release lists only
`apptainer_<ver>_amd64.deb`, no _arm64 / _aarch64 variant). On the
ubuntu-24.04-arm runner the action's `apt-get install ./apptainer_*.deb`
fails with sudo exit code 100 because the package can't be resolved.

Building apptainer from source on the ARM runner would add ~15 minutes
and a maintenance surface (Go toolchain, suid configuration) for
limited value — HPC SIF consumers remain amd64. Revert test-apptainer
to amd64-only and document why. test-nginx and test-traefik still
exercise the ARM image via kind, which gives us functional ARM
coverage at the docker-runtime level even without apptainer.

Side cleanups now that arm64 is gone from this matrix:
- artifact name back to a literal `*-amd64-image` (no matrix.arch)
- SIF upload gate drops the `matrix.arch == 'amd64'` check
kind/kubectl/helm setup actions fail with "Cache directory
'/opt/hostedtoolcache' does not exist". Drop just dotnet/android/ghc
(~34 GB) and leave the tool cache in place.
curl exit-22 doesn't tell us whether the pod, service, or ingress is
the broken link. Dump pods/logs/ingress/controller logs on failure so
the next run surfaces the actual cause.
\`docker load\` + \`kind load docker-image\` keeps the image in both host
docker AND each kind node's containerd. With a 5-8 GB image and two
kind nodes that's ~25 GB of duplicated storage, which trips the
"no space left on device" error in kind's ctr import.

Switch to \`kind load image-archive\` so the tar streams directly into
each node, and rm the tar after to reclaim /tmp.
503s in test-nginx/test-traefik traced to two issues:

1. The prod overlay maps openms-streamlit ->
   ghcr.io/openms/streamlit-template:main-full, but the build job was
   re-tagging the local image as openms-streamlit:test. Rendered
   manifests pointed at the registry name; kind only had :test loaded;
   pods stayed ErrImagePull. Retag as :main-full so kind has exactly
   the ref the manifests use.

2. Three of the four pod specs declare imagePullPolicy: Always; the
   existing sed only rewrote IfNotPresent. With Always and no registry
   creds in kind, pods loop on ImagePullBackOff. Extend the sed to
   catch both.
Add ARM64 (aarch64) multi-architecture Docker build support
…fest

create-manifest fails with "ghcr.io/openms/streamlit-template:main-full-amd64
is a manifest list" because docker/build-push-action v5 adds a provenance
attestation by default, which buildx packs as a manifest list (image +
attestation entries). docker manifest create rejects manifest lists as
components.

build-arm64 already sets provenance: false for the same reason; mirror
that on the amd64 path so both per-arch tags are flat image manifests
that can be merged into the multi-arch manifest.
fix(ci): disable provenance on build-amd64 to keep pushes single-mani…
support flag parameters without valuesAllow flags to work with only key names, omitting values.
Add configurable legal links (Impressum, Privacy, Terms)
Sync the 21 new template commits since merge-base 6ca8e97 (template PR #390).
Most upstream changes were CI/Docker multi-arch work that FLASHApp had already
implemented independently; the genuinely new template features were brought in.

Conflicts resolved (analyzed + adversarially verified):
- settings.json: keep FLASHApp version/repository-name; add the new legal_links block.
- src/workflow/CommandExecutor.py: adopt upstream's type-aware bool-flag handling
  (via the now-merged bool_param_paths_from_param_xml_ini) plus a value-based
  fallback. This also fixes a latent FLASHApp bug: checkbox bools (persisted as
  Python True/False) were emitted as "-k True"/"-k False" instead of valueless flags.
- .github/workflows/build-and-test.yml and Dockerfile.arm: keep FLASHApp's side
  (builds ghcr.io/openms/flashapp from the custom OpenMS fork + Vue component). All
  genuine upstream CI fixes were already present in FLASHApp; no streamlit-template
  identifiers leaked in.
- README.md: keep FLASHApp content; document the new legal_links override.

New from upstream (clean auto-merges): the legal-links feature (get_legal_links in
common.py, captcha/consent privacy-policy link, tests/test_legal_links.py), traefik
websecure + secure-cookie hardening, and .gitignore .venv.

Dropped: Dockerfile_simple.arm (orphaned template file that builds streamlit-template
and is unreferenced by FLASHApp's full-only arm CI matrix).

Verified: 46/46 pytest pass in the flashapp Docker image (pyopenms 3.5.0); pylint
errors-only clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016Z9gnbqpmuU9z6BmoaXmuY
@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

The PR adds configurable legal page links (Impressum, Privacy, Terms) with sidebar rendering in the Streamlit UI, wires the privacy URL into the Klaro GDPR consent banner via captcha_control, introduces XML-based detection and session-state caching of TOPP boolean parameters for correct CLI flag emission, and updates the Traefik IngressRoute to accept HTTPS traffic with a secure sticky cookie.

Changes

Legal Links and GDPR Consent Privacy URL

Layer / File(s) Summary
Legal links contract, defaults, and docs
settings.json, src/common/common.py, README.md, .gitignore
settings.json gains a legal_links object with impressum, privacy, and terms URLs. common.py adds DEFAULT_LEGAL_LINKS constant and get_legal_links() that merges hardcoded defaults with settings overrides, ignoring falsy values. README documents the mechanism and GDPR banner reuse. .gitignore excludes .venv/.
GDPR consent banner privacy URL propagation
src/common/captcha_.py, src/common/common.py, gdpr_consent/src/main.ts
captcha_control gains an optional privacy_policy_url parameter; both call sites in common.py pass get_legal_links()["privacy"]. gdpr_consent/src/main.ts extends klaroConfig with an optional translations field and sets klaroConfig.translations.zz.privacyPolicyUrl from data.args['privacy_policy'] in onRender.
Sidebar legal links rendering and unit tests
src/common/common.py, tests/test_legal_links.py
common.py injects an HTML sidebar bar with Impressum/Privacy Policy/Terms anchors via st.markdown. tests/test_legal_links.py adds FakeSessionState, sys.modules mocking for Streamlit-free unit testing, and six test cases covering defaults, full overrides, partial overrides, and empty/None suppression.

TOPP ini-driven Boolean Parameter Detection

Layer / File(s) Summary
bool_param_paths_from_param_xml_ini and session-state caching
src/workflow/ParameterManager.py
Adds bool_param_paths_from_param_xml_ini that parses ParamXML .ini files, walks NODE/ITEM elements, and returns a set of colon-delimited short keys for type="bool" items. ParameterManager gains helpers to compute the session key, retrieve the cached set, and merge discovered pairs; create_ini calls _merge_bool_params_from_ini after detecting or writing an ini.
CommandExecutor ini-driven boolean flag emission
src/workflow/CommandExecutor.py
Imports bool_param_paths_from_param_xml_ini; run_topp gains optional tool_instance_name; loads params and computes topp_bool_flag_param_keys from the ini. Replaces prior string-based boolean detection with a combined ini-type and stored-value strategy that emits valueless -key flags only when the parameter is enabled.

Traefik IngressRoute HTTPS Configuration

Layer / File(s) Summary
IngressRoute HTTPS entry points and secure cookie
k8s/base/traefik-ingressroute.yaml
spec.entryPoints is expanded to include both web and websecure; the sticky affinity cookie gains secure: true to restrict transmission to HTTPS.

Poem

🐇 Hop, hop, the legal links are here,
Privacy banners made crystal clear!
Bool params sniffed from each .ini file,
HTTPS cookies sent with style.
From Klaro banners to Traefik's gate,
The warren's compliance is up-to-date! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.92% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the PR as merging upstream OpenMS/streamlit-template changes into FLASHApp.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch merge/upstream-streamlit-template

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/workflow/CommandExecutor.py (2)

329-338: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Apply bool flag handling to custom_params too.

The docstring says custom bool params are emitted as valueless flags, but this loop still emits -key True / -key False, which can break TOPP tools expecting flag-style booleans.

Proposed fix
             # Add custom parameters
             for k, v in custom_params.items():
+                is_bool_value = isinstance(v, bool) or (
+                    isinstance(v, str) and v.lower() in ("true", "false")
+                )
+                if (k in topp_bool_flag_param_keys and v != "") or is_bool_value:
+                    if isinstance(v, str):
+                        is_enabled = v.lower() == "true"
+                    else:
+                        is_enabled = bool(v)
+                    if is_enabled:
+                        command += [f"-{k}"]
+                    continue
+
                 command += [f"-{k}"]
                 
                 # Skip only empty strings (pass flag with no value)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/CommandExecutor.py` around lines 329 - 338, The custom_params
loop in CommandExecutor currently treats booleans like normal values, so it
emits "-key True/False" instead of a valueless flag as described by the
docstring. Update the parameter assembly logic in CommandExecutor so custom bool
values are handled the same way as other boolean params: add only the "-key"
flag and do not append a stringified value, while preserving the existing
handling for lists, empty strings, None, and numeric zero values.

268-297: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use tool_instance_name when loading persisted params.

The new argument is documented as the params.json key when it differs from tool, but Line 296 still checks params[tool]. Any caller passing tool_instance_name will silently drop those saved parameters.

Proposed fix
         params = self.parameter_manager.get_parameters_from_json()
+        params_key = tool_instance_name or tool
 
         topp_tool_ini_path = Path(self.parameter_manager.ini_dir, f"{tool}.ini")
@@
-            if tool in params.keys():
-                for k, v in params[tool].items():
+            if params_key in params:
+                for k, v in params[params_key].items():
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/CommandExecutor.py` around lines 268 - 297, Persisted parameter
loading in CommandExecutor is still keyed off tool instead of the new
tool_instance_name argument, so saved params are skipped when the instance name
differs. Update the parameter lookup inside the command-building logic to use
tool_instance_name consistently when reading from params, while keeping tool
only for executable/INI resolution, so any caller passing a custom instance name
gets the saved options applied.
🧹 Nitpick comments (1)
src/common/common.py (1)

828-840: 🔒 Security & Privacy | 🔵 Trivial | 💤 Low value

Consider escaping legal-link URLs before HTML interpolation.

links["impressum"|"privacy"|"terms"] are interpolated directly into an href attribute with unsafe_allow_html=True. These values are operator-controlled (settings.json/defaults), so this is not a user-facing XSS vector, but a URL containing a " would break the attribute and produce malformed markup. A quick html.escape(url, quote=True) makes rendering robust against config typos.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/common/common.py` around lines 828 - 840, The legal links rendering in
common.py interpolates operator-controlled URLs directly into the anchor href
attributes, which can break the HTML if a value contains quotes. Update the
get_legal_links()/st.markdown block to escape each URL with html.escape(...,
quote=True) before inserting links["impressum"], links["privacy"], and
links["terms"] into the markup so the footer remains well-formed even with
malformed settings.json values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/workflow/ParameterManager.py`:
- Around line 6-18: `bool_param_paths_from_param_xml_ini` currently parses
ParamXML with `xml.etree.ElementTree`, but it should use `defusedxml` to block
unsafe XML parsing while preserving the existing empty-set fallback on
parse/load failures. Update the import and parsing path in `ParameterManager.py`
to use the defused XML API, and add the missing `defusedxml` dependency to the
project manifest so the module is available at runtime.

---

Outside diff comments:
In `@src/workflow/CommandExecutor.py`:
- Around line 329-338: The custom_params loop in CommandExecutor currently
treats booleans like normal values, so it emits "-key True/False" instead of a
valueless flag as described by the docstring. Update the parameter assembly
logic in CommandExecutor so custom bool values are handled the same way as other
boolean params: add only the "-key" flag and do not append a stringified value,
while preserving the existing handling for lists, empty strings, None, and
numeric zero values.
- Around line 268-297: Persisted parameter loading in CommandExecutor is still
keyed off tool instead of the new tool_instance_name argument, so saved params
are skipped when the instance name differs. Update the parameter lookup inside
the command-building logic to use tool_instance_name consistently when reading
from params, while keeping tool only for executable/INI resolution, so any
caller passing a custom instance name gets the saved options applied.

---

Nitpick comments:
In `@src/common/common.py`:
- Around line 828-840: The legal links rendering in common.py interpolates
operator-controlled URLs directly into the anchor href attributes, which can
break the HTML if a value contains quotes. Update the
get_legal_links()/st.markdown block to escape each URL with html.escape(...,
quote=True) before inserting links["impressum"], links["privacy"], and
links["terms"] into the markup so the footer remains well-formed even with
malformed settings.json values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8e76fa14-9b25-45d1-a550-332b53316ffb

📥 Commits

Reviewing files that changed from the base of the PR and between 1bbf6ac and b9422d7.

⛔ Files ignored due to path filters (1)
  • gdpr_consent/dist/bundle.js is excluded by !**/dist/**
📒 Files selected for processing (10)
  • .gitignore
  • README.md
  • gdpr_consent/src/main.ts
  • k8s/base/traefik-ingressroute.yaml
  • settings.json
  • src/common/captcha_.py
  • src/common/common.py
  • src/workflow/CommandExecutor.py
  • src/workflow/ParameterManager.py
  • tests/test_legal_links.py

Comment on lines +6 to +18
import xml.etree.ElementTree as ET
from pathlib import Path


def bool_param_paths_from_param_xml_ini(ini_path: Path, tool_stem: str) -> set[str]:
"""
Return short parameter paths for every ``<ITEM type="bool">`` in a ParamXML .ini file.

Paths match the suffix after ``Tool:1:`` in pyOpenMS (e.g. ``algorithm:epd:masstrace_snr_filtering``).
"""
try:
root = ET.parse(ini_path).getroot()
except (ET.ParseError, OSError):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd -t f -i 'requirements.*\.txt|pyproject\.toml|setup\.py|environment\.ya?ml' . \
  | xargs -r rg -n 'defusedxml' || true

rg -n 'xml\.etree\.ElementTree|from defusedxml' src/workflow/ParameterManager.py

Repository: OpenMS/FLASHApp

Length of output: 190


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== manifests =="
git ls-files | rg '(^|/)(pyproject\.toml|requirements.*\.txt|setup\.py|environment\.ya?ml|Pipfile|poetry\.lock)$' || true

echo "== defusedxml references =="
rg -n 'defusedxml' . || true

echo "== ParameterManager snippet =="
sed -n '1,170p' src/workflow/ParameterManager.py | cat -n

Repository: OpenMS/FLASHApp

Length of output: 8526


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== requirements.txt =="
sed -n '1,220p' requirements.txt | cat -n

echo "== callers of bool_param_paths_from_param_xml_ini =="
rg -n 'bool_param_paths_from_param_xml_ini|_merge_bool_params_from_ini|create_ini\(' src/workflow/ParameterManager.py

echo "== ini path sources =="
rg -n 'self\.ini_dir|write_ini|ParamXMLFile\(\)\.load|\.ini' src/workflow/ParameterManager.py

Repository: OpenMS/FLASHApp

Length of output: 5228


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path('requirements.txt')
print("exists:", p.exists())
if p.exists():
    txt = p.read_text().splitlines()
    for i, line in enumerate(txt, 1):
        if 'defusedxml' in line.lower() or 'xml' in line.lower():
            print(f"{i}: {line}")
PY

Repository: OpenMS/FLASHApp

Length of output: 166


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== requirements.txt =="
sed -n '1,220p' requirements.txt | cat -n

echo "== callers of bool_param_paths_from_param_xml_ini =="
rg -n 'bool_param_paths_from_param_xml_ini|_merge_bool_params_from_ini|create_ini\(' src/workflow/ParameterManager.py

echo "== ini path sources =="
rg -n 'self\.ini_dir|write_ini|ParamXMLFile\(\)\.load|\.ini' src/workflow/ParameterManager.py

Repository: OpenMS/FLASHApp

Length of output: 5228


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path('requirements.txt')
print("exists:", p.exists())
if p.exists():
    for i, line in enumerate(p.read_text().splitlines(), 1):
        if 'defusedxml' in line.lower() or 'xml' in line.lower():
            print(f"{i}: {line}")
PY

Repository: OpenMS/FLASHApp

Length of output: 166


Use defusedxml for ParamXML ini parsing

src/workflow/ParameterManager.py should switch from xml.etree.ElementTree to defusedxml, and the dependency needs to be added to the project manifest as well — requirements.txt does not declare it. Keep the empty-set fallback for malformed/blocked XML.

🧰 Tools
🪛 Ruff (0.15.18)

[error] 17-17: Using xml to parse untrusted data is known to be vulnerable to XML attacks; use defusedxml equivalents

(S314)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/workflow/ParameterManager.py` around lines 6 - 18,
`bool_param_paths_from_param_xml_ini` currently parses ParamXML with
`xml.etree.ElementTree`, but it should use `defusedxml` to block unsafe XML
parsing while preserving the existing empty-set fallback on parse/load failures.
Update the import and parsing path in `ParameterManager.py` to use the defused
XML API, and add the missing `defusedxml` dependency to the project manifest so
the module is available at runtime.

Source: Linters/SAST tools

@t0mdavid-m t0mdavid-m merged commit 18a6510 into develop Jun 24, 2026
7 checks passed
@t0mdavid-m t0mdavid-m mentioned this pull request Jun 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants