Fix multi-column ColumnIndex rebuild ordering via per-column radix sort + merge#914
Open
yihozhang wants to merge 4 commits into
Open
Fix multi-column ColumnIndex rebuild ordering via per-column radix sort + merge#914yihozhang wants to merge 4 commits into
yihozhang wants to merge 4 commits into
Conversation
Replace the global value sort plus per-value-group re-sort in the multi-column ColumnIndex rebuild with: a value-only radix sort of each column's (value, row) block (each block arrives RowId-ascending, so the stable sort yields (Value, RowId) order), followed by a merge of the sorted blocks that drops duplicate pairs. The single- and multi-column cases now share one code path, and no value group is ever re-sorted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Merging the per-column sorted blocks by folding a single growing accumulator against each block in turn is O(n*k) in the column count: the accumulator is re-copied on every merge. Replace it with a balanced (tournament) two-way merge -- merge adjacent runs pairwise, then merge the results pairwise, halving the run count each round -- which is O(n log k). For the common two-column case this is the same single merge as before; the win shows up on wide tables with many covered columns. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Merging this PR will not alter performance
Comparing Footnotes
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #914 +/- ##
==========================================
- Coverage 87.17% 86.59% -0.59%
==========================================
Files 88 89 +1
Lines 26106 26872 +766
==========================================
+ Hits 22759 23269 +510
- Misses 3347 3603 +256 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Member
|
Does this fix #911 ? Or different issue? |
Member
|
This I believe is instead of #899? |
The tournament merge allocated a fresh Vec per pairwise merge and a Vec<Vec<_>> per round. Instead, pack each round's merged runs contiguously into a second buffer of the same size and ping-pong the two buffers, reusing the moved-in `pairs` as one of them. The whole merge now uses a single extra allocation regardless of column count (k-1 allocations -> 1), with run boundaries tracked in a small inline array. merge2_into appends with a dedup window scoped to its own output so adjacent runs packed in one buffer are not merged together. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The multi-column
ColumnIndexrebuild (therebuild_indexcovering all of a table's rebuildable value columns) produced incorrect subsets. It collected(value, row_id)pairs column-major — all of column 0's pairs, then column 1's, etc. — and sorted them only by value. Within a single value's group the rows then came out as column 0's rows (RowId-ascending) followed by column 1's rows (RowId-ascending), i.e. not globally RowId-ascending.That violates the invariant that each key's subset is built from rows in ascending
RowIdorder (the contractadd_row_sortedand the offset consumers rely on), so a value present in more than one column could get a subset with out-of-order offsets. The single-column path was unaffected because its scan order is already RowId-ascending.Solution
Rebuild each key's subset from a properly
(Value, RowId)-sorted, de-duplicated pair list:(Value, RowId)— with no RowId passes.Single- and multi-column rebuilds now share one code path; the previous global-sort-plus-per-value-group-re-sort is gone.
Why this shape
RowIdpasses to the shared radix sort regressed the (hot) single-column path ~1–2.6%, because those passes are wasted work when pairs already arrive RowId-ordered. Sorting per-column and merging keeps the single-column case at its original cost.O(n log k)merge. A left-fold that merges one growing accumulator against each block in turn isO(n*k)in the column countk(the accumulator is re-copied every step). The balanced tournament merge halves the run count each round →O(n log k), which matters for wide tables. For the common two-column case it is the same single merge.Performance
Benchmarked with
scripts/bench.pyagainst the prior per-group-sort implementation across the standard suite (conv1d, luminal-llama, python_array, cykjson, eggcc-extraction). Net change is within the measurement noise floor (≈ ±1% run-to-run on these inputs), with no consistent regression — the single-column hot path is unchanged andeggcc/cykjsontrend slightly faster. TheO(n log k)merge improvement is asymptotic and applies to wide multi-column tables, which the current benchmark suite does not exercise.Testing
multi_column_column_index_rebuild_orders_each_value_by_rowasserts each value's subset matches RowId-ascending table scan order after rebuilds.core-relationsunit tests and the full egglog integration suite (723 tests) pass;cargo fmt --checkandclippy -D warningsclean.🤖 Generated with Claude Code