Tepid: only update indexes that have changed attributes#29

Closed
gburd wants to merge 15 commits into master from tepid
@gburd gburd commented Jul 4, 2026

No description provided.

gburd and others added 14 commits July 3, 2026 21:59
  - Hourly upstream sync from postgres/postgres (24x daily)
  - AI-powered PR reviews using AWS Bedrock Claude Sonnet 4.5
  - Multi-platform CI via existing Cirrus CI configuration
  - Cost tracking and comprehensive documentation

  Features:
  - Automatic issue creation on sync conflicts
  - PostgreSQL-specific code review prompts (C, SQL, docs, build)
  - Cost limits: $15/PR, $200/month
  - Inline PR comments with security/performance labels
  - Skip draft PRs to save costs

  Documentation:
  - .github/SETUP_SUMMARY.md - Quick setup overview
  - .github/QUICKSTART.md - 15-minute setup guide
  - .github/PRE_COMMIT_CHECKLIST.md - Verification checklist
  - .github/docs/ - Detailed guides for sync, AI review, Bedrock

  See .github/README.md for complete overview

Complete Phase 3: Windows builds + fix sync for CI/CD commits

Phase 3: Windows Dependency Build System
- Implement full build workflow (OpenSSL, zlib, libxml2)
- Smart caching by version hash (80% cost reduction)
- Dependency bundling with manifest generation
- Weekly auto-refresh + manual triggers
- PowerShell download helper script
- Comprehensive usage documentation

Sync Workflow Fix:
- Allow .github/ commits (CI/CD config) on master
- Detect and reject code commits outside .github/
- Merge upstream while preserving .github/ changes
- Create issues only for actual pristine violations

Documentation:
- Complete Windows build usage guide
- Update all status docs to 100% complete
- Phase 3 completion summary

All three CI/CD phases complete (100%):
✅ Hourly upstream sync with .github/ preservation
✅ AI-powered PR reviews via Bedrock Claude 4.5
✅ Windows dependency builds with smart caching

Cost: $40-60/month total
See .github/PHASE3_COMPLETE.md for details

Fix sync to allow 'dev setup' commits on master

The sync workflow was failing because the 'dev setup v19' commit
modifies files outside .github/. Updated workflows to recognize
commits with messages starting with 'dev setup' as allowed on master.

Changes:
- Detect 'dev setup' commits by message pattern (case-insensitive)
- Allow merge if commits are .github/ OR dev setup OR both
- Update merge messages to reflect preserved changes
- Document pristine master policy with examples

This allows personal development environment commits (IDE configs,
debugging tools, shell aliases, Nix configs, etc.) on master without
violating the pristine mirror policy.

Future dev environment updates should start with 'dev setup' in the
commit message to be automatically recognized and preserved.

See .github/docs/pristine-master-policy.md for complete policy
See .github/DEV_SETUP_FIX.md for fix summary
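
The allow rule described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual implementation: the `Commit` type and the function names are hypothetical, and it assumes a commit counts as ".github/ only" when every path it touches lives under `.github/`.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    # Hypothetical stand-in for what the workflow reads from `git log`.
    message: str
    paths: list = field(default_factory=list)


def is_dev_setup(commit):
    # Case-insensitive match on the start of the commit message.
    return commit.message.lstrip().lower().startswith("dev setup")


def is_github_only(commit):
    # Assumption: every touched path must sit under .github/.
    return bool(commit.paths) and all(p.startswith(".github/") for p in commit.paths)


def is_allowed_on_master(commit):
    # Allowed if .github/ only OR dev setup OR both.
    return is_dev_setup(commit) or is_github_only(commit)


def pristine_violations(commits):
    # Commits that would break the pristine-mirror policy.
    return [c for c in commits if not is_allowed_on_master(c)]
```

A sync would merge upstream only when `pristine_violations()` returns an empty list, and open an issue otherwise.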

Optimize CI/CD costs by skipping builds for pristine commits

Add cost optimization to Windows dependency builds to avoid expensive
builds when only pristine commits are pushed (dev setup commits or
.github/ configuration changes).

Changes:
- Add check-changes job to detect pristine-only pushes
- Skip Windows builds when all commits are dev setup or .github/ only
- Add comprehensive cost optimization documentation
- Update README with cost savings (~40% reduction)

Expected savings: ~$3-5/month on Windows builds, ~$40-47/month total
through combined optimizations.

Manual dispatch and scheduled builds always run regardless.

Review every PR (including drafts) with two jobs that authenticate to AWS
Bedrock (Claude Opus 4.8) via GitHub OIDC (vars.AWS_ROLE_ARN); no static
AWS credentials are stored in the repo.

- ocr-review: runs Alibaba Open Code Review through an ephemeral LiteLLM
  proxy bridging OCR's OpenAI protocol to Bedrock, and posts inline review
  comments. Uses output_config.effort=xhigh (Opus 4.8 adaptive thinking).
  Path-scoped rules (.github/ocr/rule.json) encode PostgreSQL community
  review standards plus reviewer discipline (verify against the diff, don't
  hallucinate, state confidence, be blunt, accuracy over approval).
- pg-history: OCR cannot call MCP, so a separate Bedrock tool-use agent
  (.github/ocr/pg-history.py) queries the Agora MCP server (pg.ddx.io) to
  tie the change to git + pgsql-hackers history, and upserts a comment
  linking threads as https://pg.ddx.io/m/pgsql-hackers/<message-id>.

The pg-history workflow job has been failing on every run with 'Bedrock call
failed: The read operation timed out'.  botocore's default 60s read timeout
on bedrock-runtime is too short for a multi-round (MAX_ROUNDS=14) tool-use
loop against a large PR diff on a reasoning-capable model: a single
converse() call can take several minutes under load (the sibling ocr-review
job's own LLM pass over a similarly large diff took 30-40 minutes).
Confirmed via two consecutive live runs against PR #26.

Set read_timeout=900s (15 min) explicitly via botocore.config.Config;
leave connect_timeout short since a stuck TCP handshake is a different,
cheaper-to-detect failure mode that shouldn't wait as long.

Keep master a pristine mirror of upstream plus our .github/ CI. These
workflows rebase the .github-only commits onto postgres/postgres and push
via SYNC_PAT (a PAT carrying the 'workflow' scope — required because the
default GITHUB_TOKEN cannot update files under .github/workflows/):
  - sync-upstream.yml         (hourly schedule + manual dispatch)
  - sync-upstream-manual.yml  (on-demand, with a force-push toggle)

Nix-based development environment for PostgreSQL hacking.  Not for
merge; staged here so per-commit build/test runs can share a single
toolchain.

Add regression coverage for existing classic Heap-Only Tuple (HOT) update
behavior, committed first so the behavioral changes in the later HOT-indexed
commits are diffable against a known-good baseline.  The tests explicitly
codify the HOT contract for the heap AM.

The new hot_updates regression test exercises:
- Basic HOT vs non-HOT update decisions
- The all-or-none property across multiple indexes
- Partial indexes and predicate handling
- BRIN (summarizing) indexes allowing HOT updates
- TOAST column handling with HOT
- Unique constraint behavior
- Multi-column indexes
- Partitioned table HOT updates
- HOT chain formation and the index-scan walk over a chain

Authored-by: Greg Burd <greg@burd.me>

Refactor executor update logic to determine which indexed columns have
actually changed during an UPDATE, rather than leaving this to
HeapDetermineColumnsInfo() in heap_update().  Finding this set of
attributes is not heap-specific; it applies to all table AMs.  Having the
information in the executor could also inform other decisions about when
index inserts are required, regardless of the table AM's MVCC
implementation strategy.

The heap-only tuple decision (HOT) in heap functions as it always has;
what moves to the executor is only the determination of the "modified
indexed attributes" (modified_idx_attrs).

ExecUpdateModifiedIdxAttrs() replaces HeapDetermineColumnsInfo() and is
called before table_tuple_update().  Crucially, it does not need an
exclusive buffer lock on the page that holds the tuple being updated,
which reduces the time the buffer lock is held later within
heapam_tuple_update() and heap_update().

Besides identifying the set of modified indexed attributes,
HeapDetermineColumnsInfo() was also partially responsible for deciding
what to WAL-log for the replica identity key.  That logic moves into
heap_update() and the replacement helper HeapUpdateModifiedIdxAttrs(), so
simple_heap_update() and heapam_tuple_update() share it, since both call
into heap_update().

Updates stemming from logical replication also use the new
ExecUpdateModifiedIdxAttrs() in ExecSimpleRelationUpdate().

ExecUpdateModifiedIdxAttrs() uses ExecCompareSlotAttrs() to identify
which attributes have changed and then intersects that with the set of
indexed attributes to identify the modified indexed set, the
modified_idx_attrs.
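
The compare-then-intersect step can be illustrated with a small model. This is only a sketch: rows are plain dicts and attributes are column names, whereas the patch works on tuple slots and attribute-number Bitmapsets, and the function names below are hypothetical.

```python
def compare_slot_attrs(old_row, new_row):
    # Attributes whose values differ between the old and new row versions.
    return {att for att in old_row if old_row[att] != new_row[att]}


def modified_idx_attrs(old_row, new_row, indexed_attrs):
    # Changed attributes, restricted to those some index references.
    return compare_slot_attrs(old_row, new_row) & set(indexed_attrs)
```

For a table with indexes on `id` and `email`, an update that changes `email` and `note` yields `{"email"}`: `note` changed but is not indexed, and `id` is indexed but unchanged.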

This patch introduces a few helper functions to reduce code duplication
and increase readability: HeapUpdateHotAllowable() and
HeapUpdateDetermineLockmode(), used in both heapam_tuple_update() and
simple_heap_update().

heap_update() is now called with lockmode pre-determined and a boolean
indicating whether the update may be HOT, both const. If during
heap_update() the new tuple fits on the same page and that boolean is
true, the update is HOT. So although the functions and timing of the
HOT decision code have changed, none of the logic governing when HOT is
allowed has changed.

Development of this feature exposed nondeterministic behavior in three
existing tests, which have been adjusted to avoid inconsistent results
due to tuple ordering during heap page scans.

Authored-by: Greg Burd <greg@burd.me>
Discussion: https://commitfest.postgresql.org/patch/5556/
Discussion: https://www.postgresql.org/message-id/flat/78574B24-BE0A-42C5-8075-3FA9FA63B8FC%40amazon.com

Define the on-disk representation a HOT-indexed update and its later
prune/collapse produce, ahead of the code that reads or writes it:

- HEAP_INDEXED_UPDATED (htup_details.h), the t_infomask2 bit marking a
  heap-only tuple whose producing UPDATE also changed an indexed column; and
- access/hot_indexed.h, the inline fixed-size modified-attrs bitmap stored in
  the tail of such a tuple, plus the xid-free "collapse-survivor stub" format
  (HEAP_INDEXED_UPDATED with natts == 0, a forward link, and the segment's
  bitmap) and the accessors both share.
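
To make the idea of a modified-attrs bitmap concrete, here is a rough model of encoding a set of attribute numbers into a byte string and back. The layout is an assumption for illustration only (one bit per attribute, attribute 1 in the lowest bit of the first byte, size rounded up to whole bytes); the actual format is whatever access/hot_indexed.h defines.

```python
def bitmap_size(natts):
    # Assumed sizing: one bit per attribute, rounded up to whole bytes.
    return (natts + 7) // 8


def encode_modified_attrs(attnums, natts):
    # Set bit (attnum - 1) for each modified attribute number.
    buf = bytearray(bitmap_size(natts))
    for attnum in attnums:
        if not 1 <= attnum <= natts:
            raise ValueError(f"attribute number {attnum} out of range")
        bit = attnum - 1
        buf[bit // 8] |= 1 << (bit % 8)
    return bytes(buf)


def decode_modified_attrs(buf):
    # Recover the attribute numbers whose bits are set.
    return {i * 8 + b + 1
            for i, byte in enumerate(buf)
            for b in range(8)
            if byte & (1 << b)}
```

Under this assumed layout, a 10-column table needs a 2-byte bitmap, and adding columns past a multiple of 8 grows it by a byte.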

README.HOT-INDEXED introduces the design and the relaxed classic-HOT
invariant; later commits document the eligibility, write, read, and
prune/collapse machinery in their own sections.

Co-authored-by: Greg Burd <greg@burd.me>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>

Implement the HOT-indexed (Selective Index Update) feature on the foundation
laid by the executor's modified-attribute identification.

Eligibility: HeapUpdateHotAllowable returns a HeapUpdateIndexMode --
HEAP_UPDATE_ALL_INDEXES (not HOT; every index needs an entry), HEAP_UPDATE_HOT
(classic HOT; no index needs an entry), or HEAP_SELECTIVE_INDEX_UPDATE (HOT
chain, only the changed indexes maintained) -- computed from modified_idx_attrs
and the per-relation indexed-attribute set (RelationGetIndexedAttrs).  An
UPDATE that changes a non-summarizing indexed attribute is
HEAP_SELECTIVE_INDEX_UPDATE unless it is forced to HEAP_UPDATE_ALL_INDEXES by
one of: every indexed attribute changed (nothing to skip), an attribute
referenced by an expression index changed (expression-aware maintenance is not
implemented yet), a system catalog, or the logical-replication apply gate (see
the apply-gating commit).  Partial indexes, exclusion constraints, partitioned
tables, and non-btree access methods are all eligible -- the read path is
access-method agnostic and the predicate column is part of the index's
attribute set, so no carve-out is needed for them.
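
The classification rules above can be sketched as a small decision function. The enum member names come from the commit message; everything else is a simplified model with hypothetical parameter names. In particular, `summarizing_only_attrs` is assumed to mean attributes referenced only by summarizing (e.g. BRIN) indexes, the apply gate is reduced to a boolean, and the separate requirement that the new tuple fit on the same page is not modelled.

```python
from enum import Enum, auto


class HeapUpdateIndexMode(Enum):
    HEAP_UPDATE_ALL_INDEXES = auto()      # not HOT; every index needs an entry
    HEAP_UPDATE_HOT = auto()              # classic HOT; no index needs an entry
    HEAP_SELECTIVE_INDEX_UPDATE = auto()  # HOT chain; only changed indexes


def classify_update(modified_idx_attrs, indexed_attrs,
                    summarizing_only_attrs=frozenset(),
                    expression_attrs=frozenset(),
                    is_system_catalog=False,
                    apply_gate_blocks=False):
    modified = set(modified_idx_attrs)
    # No non-summarizing indexed attribute changed: classic HOT.
    if not (modified - set(summarizing_only_attrs)):
        return HeapUpdateIndexMode.HEAP_UPDATE_HOT
    forced_all = (
        set(indexed_attrs) <= modified             # nothing to skip
        or bool(modified & set(expression_attrs))  # expression index touched
        or is_system_catalog
        or apply_gate_blocks
    )
    if forced_all:
        return HeapUpdateIndexMode.HEAP_UPDATE_ALL_INDEXES
    return HeapUpdateIndexMode.HEAP_SELECTIVE_INDEX_UPDATE
```

With indexes over `a`, `b`, and `c`, changing only `a` classifies as HEAP_SELECTIVE_INDEX_UPDATE, while changing all three falls back to HEAP_UPDATE_ALL_INDEXES.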

Write path: the table-AM update contract carries modified attributes IN/OUT as
a Bitmapset (on output the AM adds the whole-row sentinel,
TableTupleUpdateAllIndexes, to signal "every index needs an entry"), and
heap_update, for HEAP_SELECTIVE_INDEX_UPDATE, keeps the new version on the HOT
chain while ExecInsertIndexTuples maintains only the indexes whose attributes
changed.  The new heap-only tuple records, in an inline bitmap in its tail, the
attributes that changed at its hop.  Only the stored tuple carries the bitmap
and the HEAP_INDEXED_UPDATED flag; the caller's in-memory copy is left unmarked
so the flag never promises a trailing bitmap that is not present.
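
As a rough illustration of selective maintenance on the write path, the sketch below picks the indexes that need a new entry given the modified-attribute set. The `ALL_INDEXES` marker is a hypothetical stand-in for the whole-row sentinel (TableTupleUpdateAllIndexes), and indexes are modelled as a name-to-columns mapping rather than relcache entries.

```python
ALL_INDEXES = "*"  # hypothetical stand-in for the whole-row sentinel


def indexes_needing_entries(indexes, modified_attrs):
    modified = set(modified_attrs)
    # Sentinel present: the AM signalled that every index needs an entry.
    if ALL_INDEXES in modified:
        return sorted(indexes)
    # Otherwise only indexes whose attributes overlap the modified set.
    return sorted(name for name, attrs in indexes.items()
                  if set(attrs) & modified)
```

An update that changes only `email` on a table with three single-column indexes would touch one index and skip the other two.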

Read path: a chain walk to the live tuple unions the modified-attribute
bitmaps of every hop it crosses.  The index-access layer treats that
crossed-attribute bitmap as the staleness authority: if it overlaps the
arriving index's key columns the entry is stale and is dropped, and the row is
re-supplied by the fresh entry the same update planted.  The read path is
access-method agnostic and needs no value recheck or leaf key: it is correct
even when a key is cycled away and back, because the value-restoring update
planted a fresh entry whose walk crosses no later key-changing hop.
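
The staleness rule, including the cycled-key (ABA) case, can be modelled in a few lines. This sketch assumes each chain member's bitmap records the attributes changed by the update that created it, so an index entry arriving at position `arrival` crosses the bitmaps of every later member on its way to the live tuple (the last element). The list-of-sets representation and function names are illustrative only.

```python
def crossed_attrs(chain, arrival):
    # Union the per-hop bitmaps of every hop walked from the arrival
    # tuple to the live tuple at the end of the chain.
    crossed = set()
    for hop_bitmap in chain[arrival + 1:]:
        crossed |= hop_bitmap
    return crossed


def entry_is_stale(chain, arrival, index_key_attrs):
    # Stale if any crossed attribute is one of the index's key columns.
    return bool(crossed_attrs(chain, arrival) & set(index_key_attrs))


# ABA example: k goes 1 -> 2 -> 1 across two HOT-indexed updates.
aba_chain = [set(), {"k"}, {"k"}]
```

In the ABA chain, the entries planted for the first two versions are both stale for an index on `k`, and only the entry planted by the restoring update (which arrives directly at the live tuple and crosses no hop) returns the row. An index on an unrelated column still resolves from the root, since no crossed hop touched its key.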

Unique checks are the one place that does compare values: _bt_check_unique
fetches the conflicting tuple under SnapshotDirty and, on a crossed-hop
arrival, compares the live tuple's current key against the arriving leaf with
the index's own ordering procedure (_bt_heap_keys_equal_leaf, BTORDER_PROC
under each column's collation).  Using the opclass comparator -- not a bitwise
image comparison -- distinguishes a stale ancestor leaf from a genuinely live
duplicate (equal under the opclass even if not bitwise-identical) and, in the
in-flight window of a restoring update, routes the stale-ancestor hit into
_bt_doinsert's xwait so the duplicate is still caught.  The comparison reads
plain key columns straight from the heap slot; it never evaluates an indexed
expression, because an UPDATE touching an expression-index attribute is
ineligible for HOT-indexed, so an expression index is never the one receiving
the fresh entry whose insert runs this check.
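
Why the comparison uses the ordering procedure rather than a bitwise image comparison can be shown with a type where the two disagree. The sketch below uses Python's `Decimal` as an analogy: `1.0` and `1.00` have different representations but compare equal. The function names are hypothetical, and the in-flight xwait handling is not modelled.

```python
from decimal import Decimal


def image_equal(a, b):
    # Stand-in for a bitwise image comparison of the stored key.
    return str(a) == str(b)


def ordering_proc(a, b):
    # Three-way comparison in the spirit of a btree ordering procedure.
    return (a > b) - (a < b)


def classify_unique_hit(live_key, leaf_key):
    # Equal under the ordering procedure: a genuinely live duplicate.
    # Otherwise the leaf describes an older value on the chain.
    if ordering_proc(live_key, leaf_key) == 0:
        return "live duplicate"
    return "stale ancestor leaf"
```

Here `classify_unique_hit(Decimal("1.0"), Decimal("1.00"))` reports a live duplicate even though `image_equal()` is false for the same pair, which is the case a bitwise check would miss.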

Co-authored-by: Greg Burd <greg@burd.me>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>

A HOT-indexed update plants index entries that point at mid-chain heap-only
tuples, so a dead chain member cannot simply be removed: a not-yet-swept index
entry may still arrive at it, and the per-hop modified-attrs bitmap on it is
what a reader unions to judge staleness.

Teach prune to collapse a dead chain prefix into xid-free forwarding stubs:
each preserved dead key tuple is rewritten in place to a stub (frozen,
natts == 0, HEAP_INDEXED_UPDATED, forwarding via t_ctid.offnum) that keeps its
segment's modified-attrs bitmap, and a member whose attributes are wholly
subsumed by later hops is reclaimed instead.  Readers step through stubs
transparently and still cross every surviving hop's bitmap.  The collapse back
to classic HOT is driven by prune: once a chain is fully dead, a later prune
(heap_prune_chain / heap_prune_chain_find_live) reclaims its members and
re-points the root redirect straight at the first live tuple.  VACUUM's index
cleanup sweeps the stale leaves; its second pass (lazy_vacuum_heap_page) does
the usual LP_DEAD -> LP_UNUSED conversion and leaves the HOT-indexed collapse
to prune.

The collapse reuses the existing prune/freeze WAL via an xlhp_prune_items
sub-record carrying the (offset, forward) stub pairs; no new record type is
introduced.  A page that still carries a preserved stub (or a redirect that
forwards into a live HOT-indexed member) is kept non-all-visible so index-only
scans heap-fetch through the chain; heap_page_would_be_all_visible recognizes
both the redirect-to-SIU and the stub case explicitly.

Co-authored-by: Greg Burd <greg@burd.me>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>

verify_heapam must not flag the HOT-indexed artifacts as corruption: a live
HEAP_INDEXED_UPDATED heap-only tuple whose mid-chain line pointer is preserved
because an index entry still points at it, an xid-free collapse-survivor stub,
and more than one LP_REDIRECT forwarding to the same live tuple are all
legitimate.  Recognize them and continue checking the rest of the chain.

Cover this with an amcheck regression test, and add a pg_upgrade test that
carries a relation with HOT-indexed chains, an ABA-cycled indexed column, an
out-of-line indexed column, and VACUUM-collapsed stubs across an upgrade,
verifying the data, verify_heapam, bt_index_check, and the chain scans on the
new cluster.

Authored-by: Greg Burd <greg@burd.me>

Expose the HOT-indexed activity counters maintained by the write path:
pg_stat_all_tables.n_tup_hot_indexed_upd, the per-index
n_tup_hot_indexed_upd_matched / n_tup_hot_indexed_upd_skipped counters in
pg_stat_all_indexes, and pg_relation_hot_indexed_stats() reporting per-relation
HOT-indexed chain composition.  Document them in monitoring.sgml and the
README.

With statistics, prune/collapse, and amcheck recognition all in place, add the
full feature test suite, which uses those facilities to verify behavior:

- hot_indexed_updates (regression): eligibility and classification; selective
  maintenance across multiple/composite indexes; the crossed-attribute read
  path for equality, range, and inequality scans; a key cycled away and back
  (ABA), including across two distinct live rows; TOASTed indexed columns;
  partial-index predicate flips (key and non-key predicate columns);
  trigger-modified indexed columns; exclusion-constraint tables; partitioned
  tables; non-btree access methods (hash, GIN, GiST); a UNIQUE index on a type
  where image equality differs from operator equality; CREATE INDEX / REINDEX
  and DROP INDEX over live chains; prune reclamation, stub mixes, and
  re-collapse across partial VACUUMs; the never-all-visible guard; and DDL
  after a chain exists (ADD COLUMN crossing a bitmap-size boundary, DROP
  COLUMN).
- hot_indexed_adversarial (isolation): concurrent UPDATE / VACUUM / prune and
  index scans, key cycling, aborts, and reader consistency across a concurrent
  collapse.
- 054_hot_indexed_recovery (recovery): WAL replay of the chain and its collapse
  under wal_consistency_checking.
- pg_surgery handling of HOT-indexed tuples and collapse-survivor stubs.

Authored-by: Greg Burd <greg@burd.me>

A HOT-indexed update of a replica-identity attribute on a subscriber leaves a
stale index leaf that the apply worker's replica-identity lookups must
tolerate.  They do, but only when the subscriber's indexed attributes do not
extend past the columns those lookups key on.  Add the per-subscription
hot_indexed_on_apply option (subhotindexedonapply: off / subset_only (default)
/ always) and have HeapUpdateHotAllowable consult it when running in an apply
worker, comparing the relation's indexed-attribute set against its primary-key
attributes: "off" disqualifies HOT-indexed on the apply path outright,
"subset_only" requires the indexed attributes to be a subset of the primary
key, and "always" applies no apply-path gating.

Wire the option through CREATE/ALTER SUBSCRIPTION, pg_subscription, pg_dump,
and psql's \dRs+, and document it (create_subscription, alter_subscription,
catalogs).  Cover apply under each mode (039), apply under REPLICA IDENTITY
FULL and a non-PK USING INDEX whose key is cycled (040), and decoding of
HOT-indexed update chains (test_decoding).

Authored-by: Greg Burd <greg@burd.me>

A/B and single-variant benchmark scripts for HOT-indexed updates: they build
two postgres variants, run pgbench workloads exercising the classic-HOT,
non-HOT, and HOT-indexed paths, and include a self-contained bloat probe that
reports the skip count (index writes avoided on unchanged indexes) and
changed-index bounding.  Not for merge; kept for evaluating the feature.

# Conflicts:
#	.github/workflows/sync-upstream-manual.yml
#	.github/workflows/sync-upstream.yml
@gburd gburd closed this Jul 5, 2026
@gburd gburd deleted the tepid branch July 5, 2026 10:17