feat(format): enumerate tracked files via git ls-files in --scope=all#1155
Open
gregmagolan wants to merge 14 commits into
Open
feat(format): enumerate tracked files via git ls-files in --scope=all#1155gregmagolan wants to merge 14 commits into
gregmagolan wants to merge 14 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f6b113636
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
thesayyn
approved these changes
May 28, 2026
659fb9c to
ecb46fe
Compare
✨ Aspect Workflows Tasks📅 Sat May 30 04:50:16 UTC 2026
|
6437a06 to
8bd55f5
Compare
In scope=all mode, enumerate files via `git ls-files --cached --others --exclude-standard` before passing them to the formatter, instead of letting the formatter self-discover. This mirrors what format_multirun targets do internally, and fixes two issues: - Bare binaries like @buildifier_prebuilt//buildifier would previously walk the filesystem and potentially visit gitignored generated files. - --include-pattern and --exclude-pattern had no effect in scope=all; they are now applied to the git ls-files result before the formatter is invoked. Also adds .gitattributes filtering (rules-lint-ignored, linguist-generated, gitlab-generated) and batched invocations to stay under OS arg limits, both mirroring the semantics of rules_lint's ls-files function. When git is unavailable the existing fallback (formatter self-discovers, format_args=[]) is preserved. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…arden batching - Extract _apply_pattern_filters() helper; replace duplicated include/exclude pattern blocks in scope=changed and scope=all branches - Switch _filter_git_attributes from a fixed 500-file count to byte-budget batching via _chunk_files(), aligning with the formatter spawn loop and eliminating a latent ARG_MAX risk on repos with long paths - Reorder helpers into dependency order: _chunk_files → _filter_git_attributes → _git_ls_files - return [[]] instead of [files] in _chunk_files empty-input branch (clearer intent) - Fix misleading comment "Split from the right" (code uses left split + parts[:-2]) - Fix inaccurate "same as format_multirun's internal behavior" in scope=all comment - Fix "staged for deletion" → "deleted with or without staging" in _git_ls_files - Fix else-branch comment to cover both scope=all-git-unavailable and scope=changed-degraded paths - Update scope=changed degradation warn() text; "equivalent to --scope=all" is no longer accurate since scope=all now does git ls-files enumeration Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ative paths Address two issues raised in code review: 1. Bare formatter regression: passing all tracked files to a raw formatter like @buildifier_prebuilt//buildifier causes it to attempt parsing non-Starlark files (README.md etc.) as Starlark. Scope the git ls-files enumeration to only run when --include-pattern is specified; without patterns the formatter self-discovers its own file types as before. 2. Path consistency: git ls-files without --full-name returns CWD-relative paths from a subdirectory, so --exclude-pattern='pkg/generated/**' would fail to match 'generated/foo.go' when running from pkg/. Add --full-name and run all git commands (ls-files, check-attr) from the git root so paths are always repo-relative, consistent with the documented behavior of --include-pattern / --exclude-pattern. Call _normalize_files_for_formatter to convert back to CWD-relative before spawning. Also add _git_capture_root helper (parallel to _git_capture, runs from git root), update --scope and --include-pattern arg descriptions to document the new behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Revert the include-pattern guard added in the previous commit. The main point of scope=all git ls-files enumeration is to scope the formatter to tracked files and exclude gitignored content — this applies regardless of whether --include-pattern is set. Raw formatters that only handle specific file types (e.g. buildifier for .bzl/BUILD files) should pair --scope=all with --include-pattern to scope the tracked-file list to their supported extensions. Update --scope, --include-pattern, and --exclude-pattern descriptions to reflect the new scope=all behavior accurately. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-scope=all to self-discovery --scope=all continues to let the formatter walk the filesystem and discover its own files (original behavior). New --scope=git enumerates non-gitignored files via git ls-files before passing them to the formatter. This scopes the run to files git knows about (respecting .gitignore and .gitattributes) and makes --include-pattern / --exclude-pattern effective with consistent repo-relative path matching. Raw formatters that handle only specific file types (e.g. buildifier for .bzl/BUILD files) should pair --scope=git with --include-pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bugs fixed: - format_results.axl: template hardcoded "--scope=changed" in the "no files to format" message; now uses the actual scope value from data["format"]["scope"], which can be "changed", "all", or "git" - scope=git: no warning was emitted when _git_ls_files() returned None (git unavailable); add warn() consistent with scope=changed fallback - _git_ls_files: non-ASCII filenames could be C-quoted by git when core.quotePath=true; add -c core.quotePath=false to _git_capture_root so all root-based git commands emit unquoted paths Simplifications: - Extract _git_capture_in(ctx, work_dir, *args) as the single implementation; _git_capture and _git_capture_root become one-liner wrappers, eliminating the duplicated process-builder chain - Remove redundant `cwd == git_root or` in path_scope condition (the startswith check already covers it) - Remove verbose scope=git comment block that duplicated _git_ls_files docstring and --scope arg description Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed, all] "tracked" pairs more naturally with "changed" — both describe the set of files by their git state. Reorder values narrowest-to-broadest: changed (PR diff) → tracked (all non-gitignored) → all (self-discovery). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… valid Starlark) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… from --scope=tracked Files marked 'aspect-format-ignored' in .gitattributes are excluded from git ls-files enumeration in --scope=tracked, giving users a first-party escape hatch without depending on rules_lint attributes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… fix lint→format wording Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The variable holds only file paths passed as positional args to the formatter, never flags. The rename prevents future confusion if formatter flags are added — they must not go through the chunking path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
format_results_test.axl: - Scenario 17: scope=tracked with 15 affected files — verifies the scope value renders correctly in the Configuration and body sections - Scenario 18: scope=tracked + no_files_to_format — verifies the template bug fix: body now says "--scope=tracked" instead of hardcoded "--scope=changed" - New loop invariant: when no_files_to_format is set and scope != "changed", assert the rendered body contains "--scope=<actual scope>" .buildkite/pipeline.yaml: - buildifier-task and format-task each gain back-to-back --scope=tracked and --scope=all runs (both ASPECT_DEBUG=1 and normal variants) so all three scope values are exercised on every CI push Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub Actions (ephemeral runners), GitHub Actions (Workflows runners), CircleCI, and GitLab all gain back-to-back --scope=tracked and --scope=all runs alongside the existing default (--scope=changed) for both the format-task and buildifier-task jobs, covering ASPECT_DEBUG=1 and normal variants on every platform. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8bd55f5 to
7f92d0b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds
--scope=gitto enumerate all non-gitignored tracked files viagit ls-filesbefore invoking the formatter, instead of having the formatter self-discover.--scope=all(formatter self-discovery) is unchanged.What
--scope=gitdoes:git ls-files --cached --others --exclude-standard --full-namefrom the git root, scoped to the invoking directory when run from a subdirectory.gitattributesfiltering (rules-lint-ignored,linguist-generated,gitlab-generated— same attributes as rules_lint)--include-pattern/--exclude-patternagainst repo-relative paths before invoking the formatter (patterns are now effective in scope=all-equivalent mode)format.shcap)Why: Bare formatter binaries (e.g.
@buildifier_prebuilt//buildifier) in--scope=allmode walk the filesystem and may visit gitignored generated files.--scope=gitrestricts the run to files git knows about. Raw formatters that only handle specific file types should pair--scope=gitwith--include-pattern(e.g.--include-pattern='**/*.bzl'for buildifier).Implementation notes:
_chunk_files,_filter_git_attributes,_git_ls_files,_apply_pattern_filters(extracted from duplicate pattern-filter blocks inscope=changedand the newscope=gitbranch)_git_capture_rootruns git from the git root; uses-c core.quotePath=falseso non-ASCII filenames are never C-quoted in output_git_capture/_git_capture_rootboth delegate to_git_capture_into avoid duplicating the process-builder chainlib/format_results.axl: fixed template that hardcoded--scope=changedin the "no files to format" message; now uses the actual scope valueChanges are visible to end-users: yes
Suggested release notes:
--scope=gitforaspect format: enumerates all non-gitignored files viagit ls-files(respecting.gitignoreand.gitattributes) before invoking the formatter. Makes--include-pattern/--exclude-patterneffective in this mode. Bare formatters like buildifier should pair--scope=git --include-pattern='**/*.bzl' --include-pattern='**/BUILD*'to scope to their supported file types.Test plan
aspect format --scope=gitin a repo with gitignored generated BUILD files — they should not be formattedaspect format --scope=git --include-pattern='**/*.bzl'— only .bzl files formattedlinguist-generatedin.gitattributes— should be excluded from--scope=gitaspect format --scope=all— unchanged self-discovery behavioraspect format --scope=gitoutside a git repo — warning emitted, formatter falls back to self-discovery