
Avoid detecting encoding comments in documentation text #1727

Merged
st0012 merged 1 commit into ruby:master from st0012:codex/avoid-encoding-comment-doc-text
Jun 15, 2026

Conversation

@st0012

@st0012 st0012 commented Jun 14, 2026

Member

RDoc currently reads file content with RDoc::Encoding.read_file during RDoc::RDoc#parse_file, before parser selection. The current header regexp starts with ^ and permits arbitrary comment prose before coding:, so documentation text can be interpreted as an encoding directive before the selected parser handles the file (current regexp).
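A minimal sketch of the failure mode described above. `PERMISSIVE` is a hypothetical, simplified pattern written for illustration; it is not RDoc's actual `HEADER_REGEXP`, and it only mirrors the two properties named here: a `^` anchor and arbitrary comment text allowed before `coding:`.

```ruby
# Hypothetical, simplified pattern (NOT RDoc's actual HEADER_REGEXP).
# It anchors with ^ and allows any comment text before "coding:".
PERMISSIVE = /^#[^\n]+\b(?:en)?coding[=:]\s*(?<name>[^\s;]+)/i

# Documentation text on the second line, not a magic comment.
doc_text = <<~'RUBY'
  require "set"
  # String#encoding: Returns the Encoding of the string.
  class Example; end
RUBY

match = PERMISSIVE.match(doc_text)
detected_name = match && match[:name]
puts detected_name.inspect # the word after "#encoding:" is captured as an encoding name
```

With this sketch the word following `#encoding:` in the prose is captured as if it were an encoding name, which is the kind of misdetection the PR is meant to prevent.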

Ruby documents ^ as the beginning of a line and \A as the beginning of a string, and documents magic comments as top-level directives with plain encoding / coding syntax or Emacs-style -*- ... -*- syntax (Regexp anchors, magic comments). This PR updates RDoc header detection to stay at the file header and limits the coding: branch to those magic-comment forms (updated regexp).
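The anchor difference can be checked in isolation. The patterns below are illustrative only and far simpler than anything in `lib/rdoc/encoding.rb`; the variable names are made up for this example.

```ruby
# A directive-looking comment that appears after the first line.
late_comment = "puts 'hello'\n# coding: utf-8\n"
# The same comment at the very start of the file.
header_comment = "# coding: utf-8\nputs 'hello'\n"

line_anchored   = /^# coding:/   # ^ matches at the start of any line
string_anchored = /\A# coding:/  # \A matches only at the start of the string

late_with_caret  = line_anchored.match?(late_comment)
late_with_a      = string_anchored.match?(late_comment)
header_with_a    = string_anchored.match?(header_comment)

puts [late_with_caret, late_with_a, header_with_a].inspect
```

Only the `\A` form rejects the comment that appears after the first line while still accepting it at the top of the file.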

The regression test covers documentation text that mentions #encoding: Returns, Encoding::US_ASCII.name, and a later prose # coding: line so those examples are treated as documentation text instead of encoding declarations (test fixture).
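For readers who want to try the idea without checking out the branch, here is a rough sketch of the kind of check the regression test performs. `STRICT_HEADER`, `header_encoding_name`, and the fixture text are all assumptions made for this example; the real pattern and fixture live in `lib/rdoc/encoding.rb` and `test/rdoc/rdoc_encoding_test.rb` and may differ. The sketch handles only the plain form and the simplest Emacs-style form.

```ruby
# Hypothetical strict pattern (NOT the PR's actual regexp):
# anchored with \A, optional shebang line, then "#", an optional
# Emacs-style "-*-" opener, and the coding/encoding keyword.
STRICT_HEADER = /\A(?:#!.*\n)?#\s*(?:-\*-\s*)?(?:en)?coding[=:]\s*(?<name>[\w.\-]+)/i

# Hypothetical helper: returns the declared name, or nil if none is found.
def header_encoding_name(source)
  match = STRICT_HEADER.match(source)
  match && match[:name]
end

# Fixture modelled on the description above; the wording is invented.
doc_text = <<~'RUBY'
  # String#encoding: Returns the Encoding of the string.
  # For example, Encoding::US_ASCII.name is "US-ASCII".
  # coding: this line is prose, not a directive
  class Example; end
RUBY

prose_result = header_encoding_name(doc_text)
plain_result = header_encoding_name("# coding: utf-8\nputs 1\n")
emacs_result = header_encoding_name("# -*- coding: us-ascii -*-\nputs 1\n")

puts [prose_result, plain_result, emacs_result].inspect
```

Under these assumptions the documentation fixture yields no encoding name, while the plain and Emacs-style headers still do.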

@st0012 st0012 requested a deployment to fork-preview-protection June 14, 2026 16:29 — with GitHub Actions Waiting
@st0012 st0012 added the bug label Jun 14, 2026
@st0012 st0012 marked this pull request as ready for review June 14, 2026 16:33
Copilot AI review requested due to automatic review settings June 14, 2026 16:33

Copilot AI left a comment

Contributor

Pull request overview

This PR tightens RDoc::Encoding’s header parsing so encoding “magic comments” are only detected at the start of a file and only in recognized magic-comment forms, preventing documentation prose from being misinterpreted as an encoding directive during RDoc::RDoc#parse_file.

Changes:

  • Anchor header detection to the beginning of the string (\A) to avoid matching encoding directives mid-file.
  • Restrict the coding:/encoding: branch to magic-comment forms (including Emacs -*- ... -*- style) rather than arbitrary comment prose containing coding: later in the line.
  • Add a regression test ensuring documentation text mentioning #encoding: / Encoding::... / later # coding: is not treated as an encoding declaration.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| lib/rdoc/encoding.rb | Refines HEADER_REGEXP to match encoding directives only in the file header and only in magic-comment formats. |
| test/rdoc/rdoc_encoding_test.rb | Adds a regression test for documentation-like comment text that previously could be misdetected as an encoding directive. |


@kou kou left a comment

Member

+1

Comment thread lib/rdoc/encoding.rb
@st0012 st0012 force-pushed the codex/avoid-encoding-comment-doc-text branch from 428bb8c to b06087f Compare June 15, 2026 11:07
@st0012 st0012 requested a deployment to fork-preview-protection June 15, 2026 11:07 — with GitHub Actions Waiting
@st0012 st0012 merged commit dd5acdb into ruby:master Jun 15, 2026
28 of 29 checks passed
3 participants