Skip to content

feat(format): enumerate tracked files via git ls-files in --scope=all#1155

Open
gregmagolan wants to merge 14 commits into
mainfrom
feat/format-scope-all-git-ls-files
Open

feat(format): enumerate tracked files via git ls-files in --scope=all#1155
gregmagolan wants to merge 14 commits into
mainfrom
feat/format-scope-all-git-ls-files

Conversation

@gregmagolan

@gregmagolan gregmagolan commented May 28, 2026

Copy link
Copy Markdown
Member

Adds --scope=git to enumerate all non-gitignored tracked files via git ls-files before invoking the formatter, instead of having the formatter self-discover. --scope=all (formatter self-discovery) is unchanged.

What --scope=git does:

  • Runs git ls-files --cached --others --exclude-standard --full-name from the git root, scoped to the invoking directory when run from a subdirectory
  • Applies .gitattributes filtering (rules-lint-ignored, linguist-generated, gitlab-generated — same attributes as rules_lint)
  • Excludes files that are tracked in the index but missing from disk
  • Applies --include-pattern / --exclude-pattern against repo-relative paths before invoking the formatter (patterns are now effective in scope=all-equivalent mode)
  • Passes the resulting file list to the formatter in batches (≤128 000 chars each, matching rules_lint's format.sh cap)
  • Falls back to formatter self-discovery with a warning when git is unavailable

Why: Bare formatter binaries (e.g. @buildifier_prebuilt//buildifier) in --scope=all mode walk the filesystem and may visit gitignored generated files. --scope=git restricts the run to files git knows about. Raw formatters that only handle specific file types should pair --scope=git with --include-pattern (e.g. --include-pattern='**/*.bzl' for buildifier).

Implementation notes:

  • New helpers: _chunk_files, _filter_git_attributes, _git_ls_files, _apply_pattern_filters (extracted from duplicate pattern-filter blocks in scope=changed and the new scope=git branch)
  • _git_capture_root runs git from the git root; uses -c core.quotePath=false so non-ASCII filenames are never C-quoted in output
  • _git_capture/_git_capture_root both delegate to _git_capture_in to avoid duplicating the process-builder chain
  • lib/format_results.axl: fixed template that hardcoded --scope=changed in the "no files to format" message; now uses the actual scope value

Changes are visible to end-users: yes

  • Searched for relevant documentation and updated as needed: no
  • Breaking change (forces users to change their own code or config): no
  • Suggested release notes appear below: yes

Suggested release notes:

  • New --scope=git for aspect format: enumerates all non-gitignored files via git ls-files (respecting .gitignore and .gitattributes) before invoking the formatter. Makes --include-pattern / --exclude-pattern effective in this mode. Bare formatters like buildifier should pair --scope=git --include-pattern='**/*.bzl' --include-pattern='**/BUILD*' to scope to their supported file types.

Test plan

  • Manual testing:
    1. aspect format --scope=git in a repo with gitignored generated BUILD files — they should not be formatted
    2. aspect format --scope=git --include-pattern='**/*.bzl' — only .bzl files formatted
    3. Mark a file linguist-generated in .gitattributes — should be excluded from --scope=git
    4. aspect format --scope=all — unchanged self-discovery behavior
    5. aspect format --scope=git outside a git repo — warning emitted, formatter falls back to self-discovery

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f6b113636

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread crates/aspect-cli/src/builtins/aspect/format.axl Outdated
Comment thread crates/aspect-cli/src/builtins/aspect/format.axl Outdated
@gregmagolan gregmagolan force-pushed the feat/format-scope-all-git-ls-files branch from 659fb9c to ecb46fe Compare May 28, 2026 18:48
@aspect-workflows

aspect-workflows Bot commented May 28, 2026

Copy link
Copy Markdown

✨ Aspect Workflows Tasks

📅 Sat May 30 04:50:16 UTC 2026

⚠️ 2 flagged tasks

  • ⚠️ delivery (delivery-gha) · ⏱ 33.6s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Delivery complete (2 warn · 3 skipped)
  • ⚠️ delivery (delivery-gha-debug) · ⏱ 39.6s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Delivery complete (2 warn · 3 skipped)

✅ 29 successful tasks

  • ✅ build (build-gha) · ⏱ 25.7s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Bazel build complete (154 built)
  • ✅ build (build-gha-debug) · ⏱ 1m 2s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Bazel build complete (154 built)
  • ✅ buildifier (buildifier-gha) · ⏱ 22.8s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-all) · ⏱ 30.9s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-all-debug) · ⏱ 31.4s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-debug) · ⏱ 29.5s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-ephemeral) · ⏱ 53s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-ephemeral-all) · ⏱ 6.4s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-ephemeral-tracked) · ⏱ 1m 4s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-tracked) · ⏱ 1m 16s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ buildifier (buildifier-gha-tracked-debug) · ⏱ 1m 10s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha) · ⏱ 20.7s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-all) · ⏱ 30.4s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-all-debug) · ⏱ 21.3s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-debug) · ⏱ 34.8s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-ephemeral) · ⏱ 2m 43s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-ephemeral-all) · ⏱ 7.9s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-ephemeral-tracked) · ⏱ 9.6s · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-tracked) · ⏱ 21.7s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ format (format-gha-tracked-debug) · ⏱ 22.6s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Format complete (clean)
  • ✅ gazelle (gazelle-from-source-gha) · ⏱ 21.2s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Gazelle complete (clean)
  • ✅ gazelle (gazelle-from-source-gha-debug) · ⏱ 52.8s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Gazelle complete (clean)
  • ✅ gazelle (gazelle-gha) · ⏱ 21.5s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Gazelle complete (clean)
  • ✅ gazelle (gazelle-gha-debug) · ⏱ 26.7s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Gazelle complete (clean)
  • ✅ gazelle (gazelle-gha-ephemeral) · ⏱ 52.1s · 🐙 GitHub Actions · ☑️ Check
    💬 Gazelle complete (clean)
  • ✅ lint (lint-gha) · ⏱ 21.6s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Lint complete (clean)
  • ✅ lint (lint-gha-debug) · ⏱ 21.6s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Lint complete (clean)
  • ✅ test (test-gha) · ⏱ 33.7s · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Bazel test complete (25/25 passed · 25 cached)
  • ✅ test (test-gha-debug) · ⏱ 3m · ✨ Aspect · 🐙 GitHub Actions · ☑️ Check
    💬 Bazel test complete (25/25 passed · 25 cached)

🔁 Reproduce

⚠️ delivery (delivery-gha · delivery-gha-debug)

# CI ran --mode=selective; --mode=always --track-state=false for off-runner with no state backend.
cd examples/deliverable
aspect delivery --mode=always --track-state=false --dry-run=true

Install aspect: docs.aspect.build/cli/install


⏱ Last updated Sat May 30 04:56:05 UTC 2026 · 📊 GitHub API quota 2,897/15,000 (19% used, resets in 1m)
🚀 Powered by Aspect CLI (v0.0.0-dev)  |  Aspect Build · X · LinkedIn · YouTube

@gregmagolan gregmagolan force-pushed the feat/format-scope-all-git-ls-files branch 2 times, most recently from 6437a06 to 8bd55f5 Compare May 30, 2026 04:48
gregmagolan and others added 14 commits May 29, 2026 21:48
In scope=all mode, enumerate files via `git ls-files --cached --others
--exclude-standard` before passing them to the formatter, instead of
letting the formatter self-discover. This mirrors what format_multirun
targets do internally, and fixes two issues:

- Bare binaries like @buildifier_prebuilt//buildifier would previously
  walk the filesystem and potentially visit gitignored generated files.
- --include-pattern and --exclude-pattern had no effect in scope=all;
  they are now applied to the git ls-files result before the formatter
  is invoked.

Also adds .gitattributes filtering (rules-lint-ignored, linguist-generated,
gitlab-generated) and batched invocations to stay under OS arg limits,
both mirroring the semantics of rules_lint's ls-files function.

When git is unavailable the existing fallback (formatter self-discovers,
format_args=[]) is preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…arden batching

- Extract _apply_pattern_filters() helper; replace duplicated include/exclude
  pattern blocks in scope=changed and scope=all branches
- Switch _filter_git_attributes from a fixed 500-file count to byte-budget
  batching via _chunk_files(), aligning with the formatter spawn loop and
  eliminating a latent ARG_MAX risk on repos with long paths
- Reorder helpers into dependency order: _chunk_files → _filter_git_attributes
  → _git_ls_files
- return [[]] instead of [files] in _chunk_files empty-input branch (clearer intent)
- Fix misleading comment "Split from the right" (code uses left split + parts[:-2])
- Fix inaccurate "same as format_multirun's internal behavior" in scope=all comment
- Fix "staged for deletion" → "deleted with or without staging" in _git_ls_files
- Fix else-branch comment to cover both scope=all-git-unavailable and
  scope=changed-degraded paths
- Update scope=changed degradation warn() text; "equivalent to --scope=all" is
  no longer accurate since scope=all now does git ls-files enumeration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ative paths

Address two issues raised in code review:

1. Bare formatter regression: passing all tracked files to a raw formatter like
   @buildifier_prebuilt//buildifier causes it to attempt parsing non-Starlark files
   (README.md etc.) as Starlark. Scope the git ls-files enumeration to only run
   when --include-pattern is specified; without patterns the formatter self-discovers
   its own file types as before.

2. Path consistency: git ls-files without --full-name returns CWD-relative paths
   from a subdirectory, so --exclude-pattern='pkg/generated/**' would fail to match
   'generated/foo.go' when running from pkg/. Add --full-name and run all git
   commands (ls-files, check-attr) from the git root so paths are always repo-relative,
   consistent with the documented behavior of --include-pattern / --exclude-pattern.
   Call _normalize_files_for_formatter to convert back to CWD-relative before spawning.

Also add _git_capture_root helper (parallel to _git_capture, runs from git root),
update --scope and --include-pattern arg descriptions to document the new behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Revert the include-pattern guard added in the previous commit. The main
point of scope=all git ls-files enumeration is to scope the formatter to
tracked files and exclude gitignored content — this applies regardless of
whether --include-pattern is set.

Raw formatters that only handle specific file types (e.g. buildifier for
.bzl/BUILD files) should pair --scope=all with --include-pattern to scope
the tracked-file list to their supported extensions.

Update --scope, --include-pattern, and --exclude-pattern descriptions to
reflect the new scope=all behavior accurately.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-scope=all to self-discovery

--scope=all continues to let the formatter walk the filesystem and discover
its own files (original behavior).

New --scope=git enumerates non-gitignored files via git ls-files before
passing them to the formatter. This scopes the run to files git knows about
(respecting .gitignore and .gitattributes) and makes --include-pattern /
--exclude-pattern effective with consistent repo-relative path matching.
Raw formatters that handle only specific file types (e.g. buildifier for
.bzl/BUILD files) should pair --scope=git with --include-pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bugs fixed:
- format_results.axl: template hardcoded "--scope=changed" in the
  "no files to format" message; now uses the actual scope value from
  data["format"]["scope"], which can be "changed", "all", or "git"
- scope=git: no warning was emitted when _git_ls_files() returned None
  (git unavailable); add warn() consistent with scope=changed fallback
- _git_ls_files: non-ASCII filenames could be C-quoted by git when
  core.quotePath=true; add -c core.quotePath=false to _git_capture_root
  so all root-based git commands emit unquoted paths

Simplifications:
- Extract _git_capture_in(ctx, work_dir, *args) as the single
  implementation; _git_capture and _git_capture_root become one-liner
  wrappers, eliminating the duplicated process-builder chain
- Remove redundant `cwd == git_root or` in path_scope condition (the
  startswith check already covers it)
- Remove verbose scope=git comment block that duplicated _git_ls_files
  docstring and --scope arg description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed, all]

"tracked" pairs more naturally with "changed" — both describe the set
of files by their git state. Reorder values narrowest-to-broadest:
changed (PR diff) → tracked (all non-gitignored) → all (self-discovery).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… valid Starlark)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… from --scope=tracked

Files marked 'aspect-format-ignored' in .gitattributes are excluded from
git ls-files enumeration in --scope=tracked, giving users a first-party
escape hatch without depending on rules_lint attributes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… fix lint→format wording

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The variable holds only file paths passed as positional args to the
formatter, never flags. The rename prevents future confusion if formatter
flags are added — they must not go through the chunking path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
format_results_test.axl:
- Scenario 17: scope=tracked with 15 affected files — verifies the scope
  value renders correctly in the Configuration and body sections
- Scenario 18: scope=tracked + no_files_to_format — verifies the template
  bug fix: body now says "--scope=tracked" instead of hardcoded "--scope=changed"
- New loop invariant: when no_files_to_format is set and scope != "changed",
  assert the rendered body contains "--scope=<actual scope>"

.buildkite/pipeline.yaml:
- buildifier-task and format-task each gain back-to-back --scope=tracked
  and --scope=all runs (both ASPECT_DEBUG=1 and normal variants) so all
  three scope values are exercised on every CI push

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub Actions (ephemeral runners), GitHub Actions (Workflows runners),
CircleCI, and GitLab all gain back-to-back --scope=tracked and --scope=all
runs alongside the existing default (--scope=changed) for both the
format-task and buildifier-task jobs, covering ASPECT_DEBUG=1 and normal
variants on every platform.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gregmagolan gregmagolan force-pushed the feat/format-scope-all-git-ls-files branch from 8bd55f5 to 7f92d0b Compare May 30, 2026 04:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants