Add Java low-level tool definition E2E test and skill [1/6]#1721

Merged
edburns merged 3 commits into main from edburns/1682-01-java-new-test-and-skill on Jun 18, 2026

@edburns edburns commented Jun 18, 2026

Collaborator

Summary

Adds a new Java failsafe integration test (LowLevelToolDefinitionIT) and the accompanying replay proxy YAML snapshot that exercise the current explicit tool-definition APIs.

This is the first of six focused PRs that break apart the changes originally combined in #1692. It does not fix issue #1682; rather it establishes baseline E2E coverage of the low-level tool-definition API before the ergonomic annotations are introduced.

Changes

New: Java E2E integration test

  • `java/src/test/java/com/github/copilot/LowLevelToolDefinitionIT.java`
    Covers `CopilotTool.create`, `createOverride`, `getArgumentsAs(record)`,
    `getArguments()`, available-tools filtering (`custom:*` + `builtin:web_fetch`),
    and mutable handler state (`currentPhase`) asserted after tool execution.
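
To make the shape of that coverage concrete, here is a minimal, self-contained sketch of an explicit tool definition whose handler reads typed arguments and updates mutable state that a test can assert on afterwards. Every type in it (`Tool`, `Invocation`, `SetPhaseArgs`) and the tool name `set_phase` are hypothetical stand-ins written for illustration. They are not the SDK's classes, and the real signatures may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical stand-ins, NOT the SDK types: they only mirror the pieces the
// description names (name, description, schema map, handler, override flag).
final class LowLevelToolSketch {

    /** Hypothetical invocation wrapper: raw map access plus a typed view. */
    record Invocation(Map<String, Object> arguments) {
        <T> T argumentsAs(Function<Map<String, Object>, T> mapper) {
            return mapper.apply(arguments);
        }
    }

    /** Hypothetical tool definition with an explicit JSON-schema-like map. */
    record Tool(String name, String description, Map<String, Object> schema,
                boolean overridesBuiltIn, Function<Invocation, String> handler) {

        static Tool create(String name, String description, Map<String, Object> schema,
                           Function<Invocation, String> handler) {
            return new Tool(name, description, schema, false, handler);
        }

        static Tool createOverride(String name, String description, Map<String, Object> schema,
                                   Function<Invocation, String> handler) {
            return new Tool(name, description, schema, true, handler);
        }
    }

    /** Typed argument record for the hypothetical set_phase tool. */
    record SetPhaseArgs(String phase) {}

    // Mutable state written by the handler; AtomicReference keeps the write
    // visible even if the handler runs on a different thread than the test.
    static final AtomicReference<String> currentPhase = new AtomicReference<>("initial");

    static Tool setPhaseTool() {
        Map<String, Object> schema = Map.of(
                "type", "object",
                "properties", Map.of("phase", Map.of("type", "string")),
                "required", List.of("phase"));
        return Tool.create("set_phase", "Records the current phase", schema, invocation -> {
            SetPhaseArgs args = invocation.argumentsAs(
                    raw -> new SetPhaseArgs((String) raw.get("phase")));
            currentPhase.set(args.phase());
            return "phase set to " + args.phase();
        });
    }

    public static void main(String[] args) {
        Tool tool = setPhaseTool();
        String result = tool.handler().apply(new Invocation(Map.of("phase", "review")));
        System.out.println(result);             // prints "phase set to review"
        System.out.println(currentPhase.get()); // prints "review"
    }
}
```

Asserting on `currentPhase` after the call is what shows the handler actually ran, rather than the replayed response merely claiming it did.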

New: Replay snapshot

New: Copilot skill

  • `.github/skills/new-java-e2e-test-yaml-and-test/SKILL.md` and `examples.md`
    Packages the knowledge of creating net-new Java E2E integration tests with
    handcrafted YAML snapshots into a reusable Copilot skill.

Fix: abort snapshot

  • `test/snapshots/abort/should_abort_during_active_streaming.yaml`
    Adds a second conversation entry to handle the cleared-history recovery
    code path that can occur after an abort during active streaming.

Related

Add a new Java failsafe integration test and replay snapshot that exercise
the current explicit tool-definition APIs before ergonomic annotations are
added. Related to issue #1682 but does not fix #1682.

Changes:
- Add `LowLevelToolDefinitionIT` covering `create`, `createOverride`,
  `getArgumentsAs(record)`, `getArguments()`, and `ToolSet` available tools
- Add `test/snapshots/tools/low_level_tool_definition.yaml` with multi-turn
  tool call and final response replay conversations
- Add `.github/skills/new-java-e2e-test-yaml-and-test` skill documenting
  the workflow for creating new Java E2E tests with handcrafted YAML snapshots
- Fix `test/snapshots/abort/should_abort_during_active_streaming.yaml` to
  handle cleared-history recovery request (adds second conversation entry for
  the 2-message recovery code path)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 18, 2026 17:46
@edburns edburns requested a review from a team as a code owner June 18, 2026 17:46

Copilot AI left a comment


Pull request overview

This PR adds baseline Java E2E (failsafe) coverage for the current “low-level” tool-definition APIs, along with a new replay-proxy snapshot and a reusable Copilot skill documenting how to add Java E2E tests backed by handcrafted YAML snapshots.

Changes:

  • Added a new Java failsafe integration test that defines custom tools via ToolDefinition.create(...) / createOverride(...) and validates tool execution via a replay snapshot.
  • Added a new replay-proxy YAML snapshot for the low-level tool-definition scenario.
  • Added a new repository Copilot skill documenting the workflow for adding Java E2E tests + YAML snapshots.
  • Extended an existing abort snapshot with an additional conversation entry for a recovery path.
| File | Description |
| --- | --- |
| `test/snapshots/tools/low_level_tool_definition.yaml` | New replay snapshot for low-level tool definition E2E flow (custom tool calls + results). |
| `test/snapshots/abort/should_abort_during_active_streaming.yaml` | Adds an additional conversation entry to cover an abort recovery path. |
| `java/src/test/java/com/github/copilot/LowLevelToolDefinitionIT.java` | New Java failsafe IT that configures the proxy snapshot and exercises low-level tool-definition APIs. |
| `.github/skills/new-java-e2e-test-yaml-and-test/SKILL.md` | New Copilot skill describing how to add Java E2E tests and craft YAML snapshots. |
| `.github/skills/new-java-e2e-test-yaml-and-test/examples.md` | Examples supporting the new skill documentation. |

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 4

Comment thread java/src/test/java/com/github/copilot/LowLevelToolDefinitionIT.java
Comment thread .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md Outdated
Comment thread .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md Outdated
Comment thread .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md Outdated
edburns and others added 2 commits June 18, 2026 15:07
- Validate and assert search_items keyword in LowLevelToolDefinitionIT so getArguments() is meaningfully exercised.

- Correct skill docs to require explicit snapshot base names (no camelCase to snake_case conversion assumption).

- Correct replay matching description to 'next assistant message after matched request prefix' semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
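
The "next assistant message after matched request prefix" wording in the commit above can be read as follows: the incoming request's message list is compared against the start of each recorded conversation, and on a match the recorded message that comes next is replayed if it is an assistant message. The sketch below illustrates that reading only. The `Message` record, the method name, and the in-memory conversation list are assumptions made for the example, and the replay proxy's actual matching code may normalize or compare messages differently.

```java
import java.util.List;
import java.util.Optional;

// Illustrative only: a simplified model of prefix-based replay matching.
final class ReplayMatchSketch {

    /** Hypothetical message shape: a role plus text content. */
    record Message(String role, String content) {}

    /**
     * Returns the recorded assistant message that directly follows the request,
     * taken from the first conversation whose leading messages equal the request.
     */
    static Optional<Message> nextAssistantMessage(List<List<Message>> conversations,
                                                  List<Message> request) {
        int n = request.size();
        for (List<Message> conversation : conversations) {
            if (conversation.size() > n
                    && conversation.subList(0, n).equals(request)
                    && "assistant".equals(conversation.get(n).role())) {
                return Optional.of(conversation.get(n));
            }
        }
        return Optional.empty(); // nothing recorded for this request
    }

    public static void main(String[] args) {
        List<Message> recorded = List.of(
                new Message("user", "Say hello"),
                new Message("assistant", "hello"),
                new Message("user", "Say world"),
                new Message("assistant", "world"));
        List<Message> request = recorded.subList(0, 3);
        String reply = nextAssistantMessage(List.of(recorded), request)
                .map(Message::content)
                .orElse("no match");
        System.out.println(reply); // prints "world"
    }
}
```

Under this reading, a test prompt that differs from the recorded one by even a word matches no prefix, which is why a handcrafted snapshot and its test have to agree exactly on the prompts.
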
edburns added a commit that referenced this pull request Jun 18, 2026
modified:   .github/skills/java-coding-skill/SKILL.md

- While working on #1721, I discovered and hereby fix this important omission.

Signed-off-by: Ed Burns <edburns@microsoft.com>
@github-actions

Contributor

Cross-SDK Consistency Review ✅

I reviewed this PR against all six SDK implementations (Node.js, Python, Go, .NET, Rust, Java) for cross-language consistency.

Summary

No significant cross-SDK consistency gaps found. This PR is well-structured as part of an intentional incremental series.


Changes analyzed

| File | Assessment |
| --- | --- |
| `java/.../LowLevelToolDefinitionIT.java` | New Java E2E test; snapshot shared for PRs #2–#6 |
| `test/snapshots/tools/low_level_tool_definition.yaml` | Shared snapshot for all 6 SDKs ✅ |
| `test/snapshots/abort/should_abort_during_active_streaming.yaml` | Backward-compatible fix for cleared-history recovery path ✅ |
| `.github/skills/new-java-e2e-test-yaml-and-test/` | Java-specific Copilot skill; no SDK parity needed ✅ |

Feature parity check for APIs exercised in the new Java test

All APIs under test have consistent equivalents across SDKs (accounting for language idioms):

| Java API | Node.js | Go | .NET | Python | Rust |
| --- | --- | --- | --- | --- | --- |
| `ToolDefinition.create(name, desc, schemaMap, handler)` | `defineTool(name, { parameters: Record<string,unknown>, handler })` | `Tool{ Parameters: map[string]any, Handler }` | `AIFunctionDeclaration` | `Tool(name, desc, parameters: dict, handler)` | `define_tool(name, desc, handler)` |
| `ToolDefinition.createOverride(...)` | `defineTool("grep", { overridesBuiltInTool: true })` | `Tool{ OverridesBuiltInTool: true }` | `CopilotTool.OverridesBuiltInToolKey` | `overrides_built_in_tool=True` | `Tool::with_overrides_built_in_tool(true)` |
| `new ToolSet().addCustom("*").addBuiltIn("web_fetch")` | `new ToolSet().addCustom("*").addBuiltIn(...)` | `[]string{"custom:*", "builtin:web_fetch"}` (idiomatic Go) | `new ToolSet().AddCustom("*").AddBuiltIn(...)` | `ToolSet().add_custom("*").add_builtin(...)` | raw vec of strings |
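
The `ToolSet` row suggests that every SDK ultimately expresses the available-tools filter as `kind:pattern` strings such as `custom:*` and `builtin:web_fetch`. As a rough illustration of how such entries could be evaluated, the sketch below assumes that `*` matches any tool name of the given kind and that any other pattern must match the name exactly. Those semantics, the class name, and the method name are assumptions for the example and are not taken from the SDK or the runtime.

```java
import java.util.List;

// Illustrative only: evaluates "kind:pattern" filter entries under assumed rules.
final class ToolFilterSketch {

    /** Returns true if any filter entry admits the tool with this kind and name. */
    static boolean isAvailable(List<String> filters, String kind, String name) {
        for (String filter : filters) {
            int colon = filter.indexOf(':');
            if (colon < 0) {
                continue; // skip entries without a "kind:" prefix
            }
            String filterKind = filter.substring(0, colon);
            String pattern = filter.substring(colon + 1);
            // Assumed semantics: "*" admits every name, otherwise exact match.
            if (filterKind.equals(kind) && (pattern.equals("*") || pattern.equals(name))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> filters = List.of("custom:*", "builtin:web_fetch");
        System.out.println(isAvailable(filters, "custom", "search_items")); // prints "true"
        System.out.println(isAvailable(filters, "builtin", "web_fetch"));   // prints "true"
        System.out.println(isAvailable(filters, "builtin", "grep"));        // prints "false"
    }
}
```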

Abort snapshot fix (lines 31–37 added)

This fix is additive — it adds a second conversation entry to handle the cleared-history recovery code path. All SDKs with abort tests (Go, Node.js, Python, .NET, Rust) share this snapshot. The change is backward-compatible: the first conversation still handles the normal case; the second is a fallback for when session history is cleared after abort. No action needed in other SDKs.


The incremental PR structure (6 PRs, one per SDK) is a sound approach: the shared snapshot is already in place, so PRs #2–#6 can each add their equivalent test independently.

Generated by SDK Consistency Review Agent for issue #1721 · sonnet46 1.9M

edburns added a commit that referenced this pull request Jun 18, 2026
* On branch edburns/java-add-spotless-to-java-coding-skill
modified:   .github/skills/java-coding-skill/SKILL.md

- While working on #1721, I discovered and hereby fix this important omission.

Signed-off-by: Ed Burns <edburns@microsoft.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Signed-off-by: Ed Burns <edburns@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@edburns edburns merged commit 0763c09 into main Jun 18, 2026
40 checks passed
@edburns edburns deleted the edburns/1682-01-java-new-test-and-skill branch June 18, 2026 19:54
edburns added a commit that referenced this pull request Jun 18, 2026
Related to issue #1682 but does not fix #1682.

Align low_level_tool_definition coverage with PR #1721 snapshot behavior by only defining tools exercised by the shared snapshot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
edburns added a commit that referenced this pull request Jun 18, 2026
* Add Node.js low-level tool-definition E2E test

Related to issue #1682 but does not fix #1682.

Align low_level_tool_definition coverage with PR #1721 snapshot behavior by only defining tools exercised by the shared snapshot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix Node.js PR formatting and scope

- Apply Prettier formatting to tools.e2e.test.ts so Node ubuntu format check passes.

- Drop session lifecycle carryover from this PR by restoring Node session lifecycle files to upstream/main content, keeping this PR focused on low-level tool-definition coverage.

Related to issue #1682 but does not fix #1682.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
edburns added a commit that referenced this pull request Jun 18, 2026
* Add Go low-level tool-definition E2E test

Related to issue #1682 but does not fix #1682.

Align low_level_tool_definition coverage with PR #1721 snapshot behavior by only defining tools exercised by the shared snapshot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address Go PR review suggestions for low-level tool test

Synchronize handler-updated state with a mutex and move keyword assertion to the main test goroutine to avoid calling t.Fatalf from a tool handler goroutine.

Related to issue #1682 but does not fix #1682.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
edburns added a commit that referenced this pull request Jun 18, 2026
* Add .NET low-level tool-definition E2E test

Related to issue #1682 but does not fix #1682.

Align low_level_tool_definition coverage with PR #1721 snapshot behavior by only defining tools exercised by the shared snapshot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix .NET session lifecycle replay mismatch in PR 1728

Restore the second lifecycle prompt to 'Say world' to match the existing session_lifecycle snapshot and avoid replay cache misses in CI.

Related to issue #1682 but does not fix #1682.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
edburns added a commit that referenced this pull request Jun 18, 2026
* Add Rust low-level tool-definition E2E test

Related to issue #1682 but does not fix #1682.

Align low_level_tool_definition coverage with PR #1721 snapshot behavior by only defining tools exercised by the shared snapshot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: revert session_lifecycle.rs Say hi -> Say world to match snapshot

The snapshot expects 'Say world' but the branch had changed it to 'Say hi',
causing 'No cached response found' failures across all three OS variants.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>