Version of GausHitFinder with unfold/transform/fold by wddgit · Pull Request #18 · Framework-R-D/phlex-examples

wddgit · 2026-06-18T22:41:33Z

A new version of GausHitFinder implemented with a series of algorithms, unfold/transform/fold, replacing a parallel_for in the existing implementation. We are keeping both versions so we can use them for studies and prototype work.

This PR adds a unit test that runs both this version and the first Phlex GausHitFinder version and compares them against results obtained from running the art version of GausHitFinder under the art Framework. The unit test added in this PR does not run the art version, but there are text files containing output from an art executable which are used for the comparisons.

This PR includes some small data files of the order of a few Mbytes. We discussed this and the decision was to go ahead and put these into the phlex-examples repository. In the future, it would be better if we develop a way of handling similar data files that does not involve including them in code repositories.

Algorithm Implementation

New Design1 Algorithm: Introduced find_hits_with_gaussians_design1 with an unfold/transform/fold pattern to replace the parallel_for approach, while retaining the original implementation for comparative study
Design1 Components:
- New unfold_wire_vector_design1 iterator adapter for processing wire collections
- Configuration struct find_hits_with_gaussians_design1_cfg with hit-finding thresholds and fitting quality criteria
- Support for multiple charge computation methods (normalized-area formula vs. discrete Gaussian integration)
- Candidate-hit finding/merging via CandHitStandard, with fallback to long-pulse decomposition when fit quality thresholds are exceeded
- Optional hit filtering and sorting by peak amplitude with configurable cuts
Original Implementation Simplified: Removed post-processing (hit sorting and file printing) from find_hits_with_gaussians.cpp

Build System

Library Updates: find_hits_with_gaussians now compiles both the original and new Design1 implementation file
New PHLEX Module Wrappers: Added registration targets for find_hits_with_gaussians_design1_hof, find_hits_with_gaussians_cell_id_hof, and print_hits_to_file_hof
Test Infrastructure: CMake configuration updated to copy input data files (*.dat) and reference outputs (art_hits_*.txt) into build directory

Tests

Comparative Testing: Tests execute both the new Design1 and original GausHitFinder implementations against pre-computed reference outputs from the art Framework version
Test Coverage: Per-event comparisons (events 0–4) using diff between art_hits_${n}.txt reference files and generated hits_${n}.txt output
Dual Test Pipelines: Parallel test configurations for both implementations (test_find_hits_with_gaussians.jsonnet and test_find_hits_with_gaussians_design1.jsonnet) with explicit dependency wiring
Reference Data: Small data files (few megabytes) with pre-computed art executable outputs included in repository for validation

PHLEX Integration

New Registration Modules:
- register_find_hits_with_gaussians_design1.cpp: Three-stage pipeline (unfold → transform → fold) with runtime configuration
- register_find_hits_with_gaussians_cell_id.cpp: Provider for cell_id data
- register_print_hits_to_file.cpp: Wires hit printing into the framework
Configuration Management: Includes bounds checking for plane indices and constructs shared CandHitStandard, PeakFitterMrqdt, and HitFilterAlg instances from configuration
JSONnet Configurations: New and updated pipeline definitions specifying layer configuration, data sources, and module parameters

Refactoring

Hit Output Function: Replaced print_hits() with print_hits_to_file() using updated signature (cell_id instead of filename), with hits sorted by PeakTime then PeakAmplitude
Wire Source Simplification: Removed atomic counter-based filename generation in favor of direct cell_id mapping (wires_<id.number()>.dat)

Documentation

Updated README.md to clarify the unfold/transform/fold design pattern and the role of Design1 as a prototype variant
Noted that identical output results were observed for both phlex versions, matching the art reference implementation

coderabbitai · 2026-06-18T22:41:48Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ef5c671a-e515-4478-a1ca-034679fb2795

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a design1 variant of the Gaussian hit-finder using an unfold/transform/fold pipeline pattern. Refactors print_hits_to_file to use cell_id-based deterministic filenames and moves sorting out of find_hits_with_gaussians. Adds PHLEX registration modules for cell_id, design1 pipeline, and print_hits_to_file. Expands CMake test harness with per-event diff comparisons for both configurations.

Changes

Gauss hit-finder design1 + print-hits refactor

Layer / File(s)	Summary
`print_hits_to_file` refactor and `wires_source` determinism fix `migration/gauss_hit_finder/print_hits_to_file.hpp`, `migration/gauss_hit_finder/print_hits_to_file.cpp`, `migration/gauss_hit_finder/wires_source.cpp`, `migration/gauss_hit_finder/find_hits_with_gaussians.cpp`	`print_hits_to_file` signature changes to `(int cell_id, std::vector<recob::Hit> const&)` inside `namespace examples`, replacing the old `(hits, filename)` global. `wires_source` drops the atomic counter and uses `id.number()` directly. Inline sort-and-print is removed from `find_hits_with_gaussians`.
design1 algorithm contract and implementation `migration/gauss_hit_finder/find_hits_with_gaussians_design1.hpp`, `migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp`	New header declares `unfold_wire_vector_design1`, `find_hits_with_gaussians_design1_cfg`, and `find`/`fold` function signatures. Implementation adds a ROOT-independent `Gaus` evaluator, the wire iterator adapter, the full parallel ROI hit-finding/multi-Gaussian-fitting/filtering logic, and `fold_hits_into_vector_design1`.
PHLEX provider registrations `migration/gauss_hit_finder/register_find_hits_with_gaussians_cell_id.cpp`, `migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp`, `migration/gauss_hit_finder/register_print_hits_to_file.cpp`	Three new PHLEX registration modules: a `cell_id` provider exposing `data_cell_index` as `cell_info`; the design1 three-stage unfold/transform/fold pipeline with config validation and shared `PeakFitterMrqdt`/`HitFilterAlg` objects; and a `print_hits_to_file` observer consuming `cell_info` and hit products.
CMake targets, JSONnet configs, and test data `migration/gauss_hit_finder/CMakeLists.txt`, `migration/gauss_hit_finder/test_find_hits_with_gaussians.jsonnet`, `migration/gauss_hit_finder/test_find_hits_with_gaussians_design1.jsonnet`, `migration/gauss_hit_finder/art_hits_1.txt`, `migration/gauss_hit_finder/README.md`	CMake adds design1 and print-hits sources to the library, defines three new `MODULE` targets, copies fixture files, runs `phlex` for both JSONnet configs, and adds per-event `diff` tests with dependency wiring. Both JSONnet configs wire sources and modules for their respective pipelines. `art_hits_1.txt` adds expected output for event 1. README updated to reflect current state.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    note right of PHLEX: design1 pipeline
    PHLEX->>unfold_wire_vector_design1: register (layer_unfold_input → spill)
    unfold_wire_vector_design1->>find_hits_with_gaussians_design1: recob::Wire (unlimited concurrency)
    find_hits_with_gaussians_design1->>CandHitStandard: find/merge ROI candidates
    find_hits_with_gaussians_design1->>PeakFitterMrqdt: multi-Gaussian fit
    find_hits_with_gaussians_design1->>HitFilterAlg: optional hit filtering
    find_hits_with_gaussians_design1->>fold_hits_into_vector_design1: tbb::concurrent_vector hits
    fold_hits_into_vector_design1->>PHLEX: std::vector hits product (serial)
  end
  rect rgba(255, 200, 150, 0.5)
    note right of PHLEX: print-hits observation
    PHLEX->>cell_id_provider: data_cell_index → cell_info int
    PHLEX->>print_hits_to_file: observe cell_info + hits product
    print_hits_to_file->>print_hits_to_file: sort copy, write hits_cell_id.txt
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely identifies the main change: introducing an unfold/transform/fold version of GausHitFinder to replace the parallel_for approach, which directly maps to the PR's core objective.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@migration/gauss_hit_finder/CMakeLists.txt`:
- Around line 71-80: The set_tests_properties call for
GausHitFinderTestWithUnfold inside the foreach loop is overwriting the DEPENDS
property on each iteration, keeping only the final dependency from the last loop
cycle. Replace the set_tests_properties call that modifies
GausHitFinderTestWithUnfold with a set_property call using the APPEND keyword so
that all dependencies from each loop iteration accumulate rather than getting
replaced, ensuring GausHitFinderCompare_0 through GausHitFinderCompare_4 are all
properly registered as dependencies.

In `@migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp`:
- Around line 312-350: The iterators HitsumStartItr and HitsumEndItr are being
constructed from unchecked floating-point bounds at lines 316-320, which can
result in undefined behavior if those bounds fall outside the valid ROI range.
Instead, compute the floating-point bounds (using peakMean, nsigmaADC,
peakWidth, and the overlap adjustment logic) first, clamp them to the valid
range between startT and endT before creating the iterators, and then construct
HitsumStartItr and HitsumEndItr from the clamped values. This ensures the
iterator offsets are always valid before they are dereferenced.

In `@migration/gauss_hit_finder/print_hits_to_file.cpp`:
- Around line 26-31: The lambda comparator in the std::sort call on the hits
collection is incomplete—when both PeakTime() and PeakAmplitude() values are
equal between two hits, the function returns implicitly without a tiebreaker,
leaving the ordering unspecified. Add a third-level comparison after the
PeakAmplitude check in the lambda function to handle the case when both fields
match, using another field from recob::Hit (such as channel number or wire
identifier) as a tiebreaker to guarantee fully deterministic sort order across
all possible input datasets.

In `@migration/gauss_hit_finder/README.md`:
- Around line 4-6: In the opening description of the README.md file, correct the
subject-verb agreement error in the phrase that mentions the framework
conversion. Change "algorithms that works" to "algorithms that work" since
"algorithms" is plural and requires the plural form of the verb. This appears in
the section describing the conversion from the art framework to the phlex
framework.

In `@migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp`:
- Around line 49-63: The cand_hit_standard_vec is sized based on the count of
configuration keys rather than the actual plane index values, which causes
problems when plane indices are sparse or non-sequential. Modify the sizing
logic to accommodate the actual maximum plane index encountered in
finder_configs before populating the vector. Either calculate the maximum plane
value first and resize the vector to max_plane + 1, or alternatively replace the
vector with a std::map or std::unordered_map that maps plane numbers to
hit_finder objects to naturally handle sparse plane numbering without leaving
null entries. This ensures that line 147 in find_hits_with_gaussians_design1.cpp
can safely dereference cand_hit_standard.at(plane) for any valid plane in the
configuration.
- Around line 34-45: The configuration values being loaded into the
find_hits_with_gaussians_design1_cfg struct, particularly long_pulse_width_vec,
long_max_hits_vec, and area_norms_vec, are used as divisors downstream but are
not validated at load time. Add validation checks immediately after the cfg
struct initialization to ensure that all divisor-like configuration values are
positive and non-zero, throwing an appropriate exception or error if any invalid
values are found to prevent divide-by-zero or invalid scaling operations later
in find_hits_with_gaussians_design1.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 1cfe3d0d-43e3-433d-9a38-b39a87b2b470

📥 Commits

Reviewing files that changed from the base of the PR and between 506b462 and 880a2cf.

⛔ Files ignored due to path filters (5)

migration/gauss_hit_finder/wires_0.dat is excluded by !**/*.dat
migration/gauss_hit_finder/wires_1.dat is excluded by !**/*.dat
migration/gauss_hit_finder/wires_2.dat is excluded by !**/*.dat
migration/gauss_hit_finder/wires_3.dat is excluded by !**/*.dat
migration/gauss_hit_finder/wires_4.dat is excluded by !**/*.dat

📒 Files selected for processing (19)

migration/gauss_hit_finder/CMakeLists.txt
migration/gauss_hit_finder/README.md
migration/gauss_hit_finder/art_hits_0.txt
migration/gauss_hit_finder/art_hits_1.txt
migration/gauss_hit_finder/art_hits_2.txt
migration/gauss_hit_finder/art_hits_3.txt
migration/gauss_hit_finder/art_hits_4.txt
migration/gauss_hit_finder/find_hits_with_gaussians.cpp
migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp
migration/gauss_hit_finder/find_hits_with_gaussians_design1.hpp
migration/gauss_hit_finder/print_hits.cpp
migration/gauss_hit_finder/print_hits_to_file.cpp
migration/gauss_hit_finder/print_hits_to_file.hpp
migration/gauss_hit_finder/register_find_hits_with_gaussians_cell_id.cpp
migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp
migration/gauss_hit_finder/register_print_hits_to_file.cpp
migration/gauss_hit_finder/test_find_hits_with_gaussians.jsonnet
migration/gauss_hit_finder/test_find_hits_with_gaussians_design1.jsonnet
migration/gauss_hit_finder/wires_source.cpp

💤 Files with no reviewable changes (2)

migration/gauss_hit_finder/print_hits.cpp
migration/gauss_hit_finder/find_hits_with_gaussians.cpp

📜 Review details

🧰 Additional context used

🪛 Clang (14.0.6)

migration/gauss_hit_finder/register_find_hits_with_gaussians_cell_id.cpp

[warning] 9-9: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 9-9: all parameters should be named in a function

(readability-named-parameter)

migration/gauss_hit_finder/register_print_hits_to_file.cpp

[warning] 14-14: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 14-14: all parameters should be named in a function

(readability-named-parameter)

migration/gauss_hit_finder/find_hits_with_gaussians_design1.hpp

[warning] 54-54: function 'initial_value' should be marked [[nodiscard]]

(modernize-use-nodiscard)

[warning] 54-54: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 56-56: function 'predicate' should be marked [[nodiscard]]

(modernize-use-nodiscard)

[warning] 56-56: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 58-58: function 'unfold' should be marked [[nodiscard]]

(modernize-use-nodiscard)

[warning] 58-58: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 67-67: constructor does not initialize these fields: filter_hits, long_max_hits_vec, long_pulse_width_vec, max_multi_hit, area_method, area_norms_vec, chi2_ndf, pulse_height_cuts, pulse_width_cuts, pulse_ratio_cuts

(cppcoreguidelines-pro-type-member-init)

[warning] 83-83: use a trailing return type for this function

(modernize-use-trailing-return-type)

migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp

[warning] 25-25: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 25-25: parameter name 'x' is too short, expected at least 3 characters

(readability-identifier-length)

[warning] 27-27: statement should be inside braces

(readability-braces-around-statements)

[warning] 31-31: statement should be inside braces

(readability-braces-around-statements)

[warning] 34-34: statement should be inside braces

(readability-braces-around-statements)

[warning] 42-42: constructor does not initialize these fields: begin_, end_

(cppcoreguidelines-pro-type-member-init)

[warning] 51-51: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 56-56: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 59-59: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 65-65: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 457-457: if with identical then and else branches

(bugprone-branch-clone)

[note] 459-459: else branch starts here

(clang)

migration/gauss_hit_finder/print_hits_to_file.cpp

[warning] 16-16: 2 adjacent parameters of 'print_hits_to_file' of similar type are easily swapped by mistake

(bugprone-easily-swappable-parameters)

[note] 16-16: the first parameter in the range is 'cell_id'

(clang)

[note] 17-17: the last parameter in the range is 'input_hits'

(clang)

[note] 17-17: 'int' and 'const int &' parameters accept and bind the same kind of values

(clang)

[warning] 19-19: variable 'hits' is not initialized

(cppcoreguidelines-init-variables)

migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp

[warning] 22-22: use a trailing return type for this function

(modernize-use-trailing-return-type)

[warning] 22-22: all parameters should be named in a function

(readability-named-parameter)

[warning] 47-47: variable 'cand_hit_standard_vec' is not initialized

(cppcoreguidelines-init-variables)

🪛 Cppcheck (2.21.0)

migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp

[style] 218-218: The function 'StartTick' is never used.

(unusedFunction)

[style] 222-222: The function 'EndTick' is never used.

(unusedFunction)

[style] 230-230: The function 'SigmaPeakTime' is never used.

(unusedFunction)

[style] 242-242: The function 'SigmaPeakAmplitude' is never used.

(unusedFunction)

[style] 246-246: The function 'ROISummedADC' is never used.

(unusedFunction)

[style] 250-250: The function 'HitSummedADC' is never used.

(unusedFunction)

[style] 254-254: The function 'Integral' is never used.

(unusedFunction)

[style] 258-258: The function 'SigmaIntegral' is never used.

(unusedFunction)

[style] 262-262: The function 'Multiplicity' is never used.

(unusedFunction)

[style] 266-266: The function 'LocalIndex' is never used.

(unusedFunction)

[style] 270-270: The function 'GoodnessOfFit' is never used.

(unusedFunction)

[style] 274-274: The function 'DegreesOfFreedom' is never used.

(unusedFunction)

[style] 282-282: The function 'SignalType' is never used.

(unusedFunction)

[style] 300-300: The function 'PeakTimeMinusRMS' is never used.

(unusedFunction)

[style] 305-305: The function 'TimeDistanceAsRMS' is never used.

(unusedFunction)

[style] 35-35: The function 'isValidChannelID' is never used.

(unusedFunction)

[style] 215-215: The function 'NSignal' is never used.

(unusedFunction)

[style] 124-124: The function 'to_int' is never used.

(unusedFunction)

[style] 217-217: The function 'markValid' is never used.

(unusedFunction)

[style] 220-220: The function 'markInvalid' is never used.

(unusedFunction)

[style] 244-244: The function 'getInvalidID' is never used.

(unusedFunction)

[style] 294-294: The function 'asCryostatID' is never used.

(unusedFunction)

[style] 326-326: The function 'next' is never used.

(unusedFunction)

[style] 409-409: The function 'asTPCID' is never used.

(unusedFunction)

[style] 466-466: The function 'asPlaneID' is never used.

(unusedFunction)

[style] 220-220: The function 'set' is never used.

(unusedFunction)

[style] 51-51: The function 'initial_value' is never used.

(unusedFunction)

[style] 56-56: The function 'predicate' is never used.

(unusedFunction)

[style] 59-59: The function 'unfold' is never used.

(unusedFunction)

[style] 65-65: The function 'find_hits_with_gaussians_design1' is never used.

(unusedFunction)

[style] 464-464: The function 'fold_hits_into_vector_design1' is never used.

(unusedFunction)

migration/gauss_hit_finder/print_hits_to_file.cpp

[style] 35-35: The function 'isValidChannelID' is never used.

(unusedFunction)

[style] 16-16: The function 'print_hits_to_file' is never used.

(unusedFunction)

migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp

[style] 35-35: The function 'isValidChannelID' is never used.

(unusedFunction)

🪛 LanguageTool

migration/gauss_hit_finder/README.md

[style] ~106-~106: The expression “make sense” can be too colloquial for certain contexts. For a more formal tone, try using an alternative.
Context: ...lemented as a single phlex transform. Does it make sense to split it with unfolds and folds? Doe...

(WOULD_IT_MAKE_SENSE)

[style] ~107-~107: The expression “make sense” can be too colloquial for certain contexts. For a more formal tone, try using an alternative.
Context: ...nse to split it with unfolds and folds? Does it make sense to have multiple transforms? We don't k...

(WOULD_IT_MAKE_SENSE)

🔇 Additional comments (8)

migration/gauss_hit_finder/print_hits_to_file.hpp (1)

8-12: LGTM!

migration/gauss_hit_finder/wires_source.cpp (1)

26-33: LGTM!

migration/gauss_hit_finder/register_print_hits_to_file.cpp (1)

19-23: LGTM!

migration/gauss_hit_finder/test_find_hits_with_gaussians.jsonnet (1)

5-16: LGTM!

Also applies to: 64-68

migration/gauss_hit_finder/find_hits_with_gaussians_design1.hpp (1)

47-91: LGTM!

migration/gauss_hit_finder/register_find_hits_with_gaussians_cell_id.cpp (1)

9-16: LGTM!

migration/gauss_hit_finder/test_find_hits_with_gaussians_design1.jsonnet (1)

19-69: LGTM!

migration/gauss_hit_finder/art_hits_1.txt (1)

1-110: LGTM!

coderabbitai · 2026-06-18T22:51:03Z

+            std::vector<float>::const_iterator sumStartItr = range.begin() + startT;
+            std::vector<float>::const_iterator sumEndItr = range.begin() + endT;
+
+            //### limits for the sum of the Hit based on the gaussian peak and sigma
+            std::vector<float>::const_iterator HitsumStartItr =
+              range.begin() + peakMean - nsigmaADC * peakWidth;
+            std::vector<float>::const_iterator HitsumEndItr =
+              range.begin() + peakMean + nsigmaADC * peakWidth;
+
+            if (nGausForFit > 1) {
+              if (numHits > 0) {
+                if ((peakMean - nsigmaADC * peakWidth) < (prevpeak + nsigmaADC * prevpeakSig)) {
+                  float difPeak = peakMean - prevpeak;
+                  float weightpeak = prevpeakSig / (prevpeakSig + peakWidth);
+                  HitsumStartItr = range.begin() + prevpeak + difPeak * weightpeak;
+                  newleft = prevpeak + difPeak * weightpeak;
+                }
+              }
+
+              if (numHits < nGausForFit - 1) {
+                if ((peakMean + nsigmaADC * peakWidth) > (nextpeak - nsigmaADC * nextpeakSig)) {
+                  float difPeak = nextpeak - peakMean;
+                  float weightpeak = peakWidth / (nextpeakSig + peakWidth);
+                  HitsumEndItr = range.begin() + peakMean + difPeak * weightpeak;
+                  newright = peakMean + difPeak * weightpeak;
+                }
+              }
+            }
+
+            //protection to avoid negative ranges
+            if (newright - newleft < 0)
+              continue;
+
+            //avoid ranges out of ROI if it happens
+            if (HitsumStartItr < sumStartItr)
+              HitsumStartItr = sumStartItr;
+
+            if (HitsumEndItr > sumEndItr)
+              HitsumEndItr = sumEndItr;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clamp hit-sum bounds before creating iterators.

Line 316 and Line 319 construct iterators from floating, unchecked bounds and only clamp afterward. If the computed bounds leave the ROI, this is undefined behavior before the guard logic runs.

Suggested fix

- std::vector<float>::const_iterator HitsumStartItr = - range.begin() + peakMean - nsigmaADC * peakWidth; - std::vector<float>::const_iterator HitsumEndItr = - range.begin() + peakMean + nsigmaADC * peakWidth; + int hitStartIdx = + static_cast<int>(std::floor(peakMean - nsigmaADC * peakWidth)); + int hitEndIdx = + static_cast<int>(std::ceil(peakMean + nsigmaADC * peakWidth)); + + hitStartIdx = std::clamp(hitStartIdx, startT, endT); + hitEndIdx = std::clamp(hitEndIdx, startT, endT); + + std::vector<float>::const_iterator HitsumStartItr = + range.begin() + hitStartIdx; + std::vector<float>::const_iterator HitsumEndItr = + range.begin() + hitEndIdx;

🧰 Tools

🪛 Cppcheck (2.21.0)

[style] 326-326: The function 'next' is never used.

(unusedFunction)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp` around lines 312 - 350, The iterators HitsumStartItr and HitsumEndItr are being constructed from unchecked floating-point bounds at lines 316-320, which can result in undefined behavior if those bounds fall outside the valid ROI range. Instead, compute the floating-point bounds (using peakMean, nsigmaADC, peakWidth, and the overlap adjustment logic) first, clamp them to the valid range between startT and endT before creating the iterators, and then construct HitsumStartItr and HitsumEndItr from the clamped values. This ensures the iterator offsets are always valid before they are dereferenced.

coderabbitai · 2026-06-18T22:51:03Z

+    std::sort(hits.begin(), hits.end(), [](const recob::Hit& a, const recob::Hit& b) {
+      if (a.PeakTime() != b.PeakTime()) {
+        return a.PeakTime() < b.PeakTime();
+      }
+      return a.PeakAmplitude() < b.PeakAmplitude();
+    });


🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify whether current fixtures contain (PeakTime, PeakAmplitude) collisions # with different full hit records (a predictor of unstable diff ordering). python - <<'PY' import glob from collections import defaultdict def parse_hits(path): with open(path, "r", encoding="utf-8") as f: lines = [ln.rstrip("\n") for ln in f] i = 0 while i < len(lines): if lines[i].startswith("i = "): block = lines[i+1:i+18] # 17 printed fields if len(block) == 17: key = (block[3], block[6]) # PeakTime, PeakAmplitude yield key, tuple(block) i += 18 else: i += 1 for path in sorted(glob.glob("migration/gauss_hit_finder/art_hits_*.txt")): by_key = defaultdict(set) for key, sig in parse_hits(path): by_key[key].add(sig) collisions = {k: v for k, v in by_key.items() if len(v) > 1} print(f"{path}: {len(collisions)} ambiguous key(s)") PY

Repository: Framework-R-D/phlex-examples

Length of output: 382

🏁 Script executed:

# Verify the file exists and examine the sorting code find . -name "print_hits_to_file.cpp" -type f

Repository: Framework-R-D/phlex-examples

Length of output: 124

🏁 Script executed:

# Check for recob::Hit definition to understand the available fields rg -t cpp "class Hit|struct Hit" -A 30 --max-count=5

Repository: Framework-R-D/phlex-examples

Length of output: 15916

🏁 Script executed:

# Verify the fixture files exist and understand their structure ls -la migration/gauss_hit_finder/art_hits_*.txt 2>/dev/null | head -5 head -50 migration/gauss_hit_finder/art_hits_0.txt 2>/dev/null

Repository: Framework-R-D/phlex-examples

Length of output: 882

🏁 Script executed:

cat -n migration/gauss_hit_finder/print_hits_to_file.cpp | head -80

Repository: Framework-R-D/phlex-examples

Length of output: 3444

🏁 Script executed:

# Check for the full signature of Hit methods to verify the patch references valid methods rg -A 3 "Channel|StartTick|EndTick|LocalIndex|View" migration/gauss_hit_finder/copied_from_larsoft_minor_edits/Hit.h

Repository: Framework-R-D/phlex-examples

Length of output: 1330

Complete the sort order to guarantee determinism regardless of data distribution

The current two-level sort by PeakTime and PeakAmplitude is theoretically incomplete—when both values tie, hit ordering becomes unspecified and could vary across runs. While the current fixture files show no collisions on these pairs, adding a complete comparison chain defensively ensures stable output diffs across all possible input datasets. The referenced fields exist on recob::Hit and the patch is correct.

Suggested patch

- std::sort(hits.begin(), hits.end(), [](const recob::Hit& a, const recob::Hit& b) { - if (a.PeakTime() != b.PeakTime()) { - return a.PeakTime() < b.PeakTime(); - } - return a.PeakAmplitude() < b.PeakAmplitude(); - }); + std::sort(hits.begin(), hits.end(), [](const recob::Hit& a, const recob::Hit& b) { + return std::tie(a.PeakTime(), + a.PeakAmplitude(), + a.Channel(), + a.StartTick(), + a.EndTick(), + a.LocalIndex(), + a.View()) + < std::tie(b.PeakTime(), + b.PeakAmplitude(), + b.Channel(), + b.StartTick(), + b.EndTick(), + b.LocalIndex(), + b.View()); + });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/gauss_hit_finder/print_hits_to_file.cpp` around lines 26 - 31, The lambda comparator in the std::sort call on the hits collection is incomplete—when both PeakTime() and PeakAmplitude() values are equal between two hits, the function returns implicitly without a tiebreaker, leaving the ordering unspecified. Add a third-level comparison after the PeakAmplitude check in the lambda function to handle the case when both fields match, using another field from recob::Hit (such as channel number or wire identifier) as a tiebreaker to guarantee fully deterministic sort order across all possible input datasets.

coderabbitai · 2026-06-18T22:51:03Z

+  examples::find_hits_with_gaussians_design1_cfg cfg = {
+    .filter_hits = config.get<bool>("filter_hits"),
+    .long_max_hits_vec = config.get<std::vector<int>>("long_max_hits_vec"),
+    .long_pulse_width_vec = config.get<std::vector<int>>("long_pulse_width_vec"),
+    .max_multi_hit = config.get<int>("max_multi_hit"),
+    .area_method = config.get<int>("area_method"),
+    .area_norms_vec = config.get<std::vector<double>>("area_norms_vec"),
+    .chi2_ndf = config.get<double>("chi2_ndf"),
+    .pulse_height_cuts = config.get<std::vector<float>>("pulse_height_cuts"),
+    .pulse_width_cuts = config.get<std::vector<float>>("pulse_width_cuts"),
+    .pulse_ratio_cuts = config.get<std::vector<float>>("pulse_ratio_cuts")
+  };


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate divisor-like config values at load time.

This block accepts values later used as divisors in find_hits_with_gaussians_design1 (e.g., long_pulse_width_vec, long_max_hits_vec, area_norms_vec). Zero/negative values can trigger divide-by-zero or invalid charge scaling.

Suggested fix

examples::find_hits_with_gaussians_design1_cfg cfg = { @@ }; + + auto validate_positive = [](auto const& values, char const* name) { + if (values.empty()) { + throw std::runtime_error(std::string{name} + " must not be empty"); + } + if (values.at(0) <= 0) { + throw std::runtime_error(std::string{name} + " must be > 0 for plane 0"); + } + }; + + validate_positive(cfg.long_max_hits_vec, "long_max_hits_vec"); + validate_positive(cfg.long_pulse_width_vec, "long_pulse_width_vec"); + if (cfg.area_method != 0) { + if (cfg.area_norms_vec.empty() || cfg.area_norms_vec.at(0) <= 0.0) { + throw std::runtime_error("area_norms_vec[0] must be > 0 when area_method != 0"); + } + }

🧰 Tools

🪛 Cppcheck (2.21.0)

[style] 35-35: The function 'isValidChannelID' is never used.

(unusedFunction)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp` around lines 34 - 45, The configuration values being loaded into the find_hits_with_gaussians_design1_cfg struct, particularly long_pulse_width_vec, long_max_hits_vec, and area_norms_vec, are used as divisors downstream but are not validated at load time. Add validation checks immediately after the cfg struct initialization to ensure that all divisor-like configuration values are positive and non-zero, throwing an appropriate exception or error if any invalid values are found to prevent divide-by-zero or invalid scaling operations later in find_hits_with_gaussians_design1.

coderabbitai · 2026-06-18T22:51:03Z

+  auto finder_configs = config.get<configuration>("cand_hit_standard_configs");
+  cand_hit_standard_vec.resize(finder_configs.keys().size());
+  for (auto const& key : finder_configs.keys()) {
+    auto finder_config = finder_configs.get<configuration>(key);
+    auto hit_finder = std::make_shared<examples::CandHitStandard>(
+        examples::CandHitStandardCfg{
+          .fRoiThreshold = finder_config.get<float>("roiThreshold")
+        });
+    unsigned int plane = finder_config.get<unsigned int>("Plane");
+    if (plane >= cand_hit_standard_vec.size()) {
+      std::cerr << "Error: plane number " << plane << " is out of range for cand_hit_standard_vec of size " << cand_hit_standard_vec.size() << std::endl;
+      throw std::runtime_error("Invalid plane number");
+    }
+    cand_hit_standard_vec[plane] = std::move(hit_finder);
+  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden plane-to-finder mapping before transform execution.

Line 50 sizes cand_hit_standard_vec by key count, but indexing is by Plane. Sparse or duplicate Plane values can leave null entries, and Line 147 in migration/gauss_hit_finder/find_hits_with_gaussians_design1.cpp dereferences cand_hit_standard.at(plane) unconditionally.

Suggested fix

+ // Build storage from max plane index, not number of keys. + unsigned int max_plane = 0; + for (auto const& key : finder_configs.keys()) { + auto const c = finder_configs.get<configuration>(key); + max_plane = std::max(max_plane, c.get<unsigned int>("Plane")); + } + cand_hit_standard_vec.clear(); + cand_hit_standard_vec.resize(max_plane + 1); for (auto const& key : finder_configs.keys()) { auto finder_config = finder_configs.get<configuration>(key); @@ unsigned int plane = finder_config.get<unsigned int>("Plane"); if (plane >= cand_hit_standard_vec.size()) { @@ } + if (cand_hit_standard_vec[plane]) { + throw std::runtime_error("Duplicate cand_hit_standard config for plane " + std::to_string(plane)); + } cand_hit_standard_vec[plane] = std::move(hit_finder); } + + for (std::size_t p = 0; p < cand_hit_standard_vec.size(); ++p) { + if (!cand_hit_standard_vec[p]) { + throw std::runtime_error("Missing cand_hit_standard config for plane " + std::to_string(p)); + } + }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@migration/gauss_hit_finder/register_find_hits_with_gaussians_design1.cpp` around lines 49 - 63, The cand_hit_standard_vec is sized based on the count of configuration keys rather than the actual plane index values, which causes problems when plane indices are sparse or non-sequential. Modify the sizing logic to accommodate the actual maximum plane index encountered in finder_configs before populating the vector. Either calculate the maximum plane value first and resize the vector to max_plane + 1, or alternatively replace the vector with a std::map or std::unordered_map that maps plane numbers to hit_finder objects to naturally handle sparse plane numbering without leaving null entries. This ensures that line 147 in find_hits_with_gaussians_design1.cpp can safely dereference cand_hit_standard.at(plane) for any valid plane in the configuration.

wddgit · 2026-06-26T15:52:18Z

I just force pushed some changes to the PR.

The PR is rebased on top of the latest version of phlex-examples. I also built it with the latest version of phlex and made some minor changes to get it to work with that.

There were problems with setting PHLEX_PLUGIN_PATH. It was only working before when I tested it because I had manually set that. That is fixed. Also in CMakelists.txt I fixed some dependencies between tests.

I also fixed a couple minor problems that Code Rabbit found. Most of those I ignored because I am trying to keep this as close as possible to the version in LArSoft and I didn't fix any issue that also exists there.

Tests pass and this ready for human review at this point. As far as I can see it can be merged as is and any future development we do in a subsequent PR.

wddgit · 2026-06-26T15:56:42Z

+set_tests_properties(
+  GausHitFinderTest
+  GausHitFinderTestWithUnfold
+  PROPERTIES ENVIRONMENT "PHLEX_PLUGIN_PATH=${CMAKE_CURRENT_BINARY_DIR}")


This properly sets up the tests to find plugins in phlex-examples. Is there a way to set up the tests to find plugins in phlex? I spent a while trying to figure out how to do that and gave up. The tests only need one small plugin from phlex and I just copied it and now phlex-examples has its own version of generate_layers. I'll fix this now or in a subsequent PR if someone can explain how...

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Version of GausHitFinder with unfold/transform/fold

7f595a4

wddgit force-pushed the gaussHitFinderStartingWithUnfold branch from 880a2cf to 7f595a4 Compare June 26, 2026 15:36

wddgit commented Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Version of GausHitFinder with unfold/transform/fold#18

Version of GausHitFinder with unfold/transform/fold#18
wddgit wants to merge 1 commit into
Framework-R-D:mainfrom
wddgit:gaussHitFinderStartingWithUnfold

wddgit commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 18, 2026

Uh oh!

coderabbitai Bot Jun 18, 2026

Uh oh!

Uh oh!

coderabbitai Bot Jun 18, 2026

Uh oh!

coderabbitai Bot Jun 18, 2026

Uh oh!

wddgit commented Jun 26, 2026

Uh oh!

wddgit Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

wddgit commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Algorithm Implementation

Build System

Tests

PHLEX Integration

Refactoring

Documentation

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

wddgit commented Jun 26, 2026

Uh oh!

wddgit Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

wddgit commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading