fix(log_parser_bootstrap): guard Claude guardrail with safeOutputEntriesCount escape hatch by Copilot · Pull Request #41886 · github/gh-aw

Copilot · 2026-06-27T13:42:39Z

The Claude log_parser_bootstrap guardrail was calling setFailed on empty logEntries with no completion escape hatch — marking runs as failed even when the agent completed and emitted safe outputs. This was caused by a sandbox/firewall teardown race leaving agent-stdio.log unreadable for the parse step, intermittently hitting Avenger and Daily Rendering Scripts Verifier.

Changes

`log_parser_bootstrap.cjs`

Guardrail escape hatch: When safeOutputEntriesCount > 0, downgrade setFailed → core.warning for the Claude empty-logEntries guard. Mirrors the existing MCP failure handling pattern. setFailed is preserved for the zero-safe-outputs case (true startup failure).

if (parserName === "Claude" && (!logEntries || logEntries.length === 0)) {
  if (safeOutputEntriesCount > 0) {
    core.warning(`Claude produced no structured log entries, but agent completed with ${safeOutputEntriesCount} safe output … — treating as non-fatal post-completion infrastructure failure`);
  } else {
    core.setFailed(`${ERR_CONFIG}: Claude execution failed: no structured log entries …`);
  }
}

Hygiene: Suppress the misleading Claude log parsed successfully message on the empty-logEntries fallback path — it was printing immediately before setFailed, creating a contradictory log sequence.

`log_parser_bootstrap.test.cjs`

Three new cases covering: empty logEntries + safe outputs → warning only; empty logEntries + no safe outputs → setFailed; empty logEntries → no "parsed successfully" in core.info.

…iesCount escape hatch - [x] Fix Claude guardrail: gate `setFailed` on `safeOutputEntriesCount === 0`; downgrade to `core.warning` when safe outputs exist - [x] Suppress misleading 'parsed successfully' message for Claude on empty-logEntries fallback path - [x] Add 3 test cases: Claude + empty logEntries + safe outputs → warning (no setFailed); Claude + empty logEntries + no safe outputs → setFailed; Claude + empty logEntries → no 'parsed successfully' message Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-06-27T14:39:46Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #41886 does not have the 'implementation' label and has 0 new lines of code in business logic directories.

github-actions · 2026-06-27T14:39:49Z

✅ Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-27T14:39:53Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-27T14:40:03Z

✅ PR Code Quality Reviewer completed the code quality review.

Copilot

Pull request overview

This PR adjusts the log_parser_bootstrap Claude guardrail so that an empty logEntries result does not fail the run when there is evidence the agent completed successfully (i.e., safe outputs were emitted). This prevents intermittent false-negative failures caused by post-completion infrastructure teardown races that can make the structured log unreadable.

Changes:

Downgrade the Claude “empty structured log entries” guardrail from core.setFailed to core.warning when safeOutputEntriesCount > 0, while preserving failure when there are zero safe outputs.
Suppress the misleading ${parserName} log parsed successfully info message on the Claude empty-logEntries fallback path.
Add/extend unit tests covering the new warning vs failure behavior and the suppressed success-message behavior.

Show a summary per file

File	Description
actions/setup/js/log_parser_bootstrap.cjs	Adds a safe-outputs-based escape hatch for the Claude empty-`logEntries` guardrail and avoids contradictory “parsed successfully” messaging on the fallback path.
actions/setup/js/log_parser_bootstrap.test.cjs	Adds test cases asserting the new non-fatal warning behavior (when safe outputs exist), preserved failure behavior (when safe outputs do not exist), and the absence of the “parsed successfully” log for Claude on the fallback path.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 2/2 changed files
Comments generated: 0
Review effort level: Low

github-actions · 2026-06-27T14:45:20Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 90/100 — Excellent

Analyzed 3 test(s): 3 design, 0 implementation, 0 guideline violation(s).

📊 Metrics & Test Classification (3 tests analyzed)

Metric	Value
New/modified tests analyzed	3
✅ Design tests (behavioral contracts)	3 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	3 (100%)
Duplicate test clusters	0
Test inflation detected	YES — 3.24:1 (55 test lines / 17 prod lines)
🚨 Coding-guideline violations	0

Test	File	Classification	Issues Detected
`should warn (non-fatal) when Claude has empty logEntries but safe outputs exist`	`log_parser_bootstrap.test.cjs:~206`	✅ Design	—
`should fail when Claude has empty logEntries and no safe outputs (startup failure)`	`log_parser_bootstrap.test.cjs:~225`	✅ Design	—
`should not print parsed successfully for Claude when logEntries is empty`	`log_parser_bootstrap.test.cjs:~240`	✅ Design	—

Go: 0 (*_test.go); JavaScript: 3 (*.test.cjs, *.test.js). No other languages detected.

⚠️ Flagged Tests — Inflation Note (1 issue)

Test file inflation (non-blocking): log_parser_bootstrap.test.cjs added 55 lines vs 17 production lines (3.24:1 ratio, threshold 2:1). This is typical for a fix introducing branching logic — the 3 tests each cover a distinct observable outcome of the new conditional (warn vs. fail vs. suppress-info-message), so the extra test lines reflect thorough behavioral coverage rather than padding. No action required.

Mock usage note: mockCore mocks global.core (GitHub Actions external I/O — legitimate). mockParseLog = vi.fn() is a dependency-injection parameter mock — not an internal business-logic bypass. All assertions are on observable outputs (setFailed, warning, info calls).

Verdict

✅ Check passed. 0% implementation tests (threshold: 30%). All 3 new tests verify behavioral contracts of the safeOutputEntriesCount escape hatch: the non-fatal warning path, the fatal-failure path, and the suppressed success message — full branch coverage of the new conditional logic.

🧪 Test quality analysis by Test Quality Sentinel · 34 AIC · ⌖ 12.3 AIC · ⊞ 8.4K · ◷

github-actions

✅ Test Quality Sentinel: 90/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%).

github-actions

Non-blocking observations only — logic is sound

The escape-hatch design (safeOutputEntriesCount > 0 → warning instead of setFailed) correctly mirrors the existing MCP-failure pattern and addresses the described teardown-race root cause. No blocking issues found.

Findings summary

Test coverage gaps (both non-blocking):

Third test ("should not print...") only asserts the absence of the success message; it does not assert setFailed was not called or warning was called. Tests 1 and 2 provide that coverage, but the third test stands on weak ground as a standalone regression guard.
All three new tests use logEntries: [] exclusively; the !logEntries (null) branch of the guardrail is never exercised by the new suite. Both paths emit the same warning, so this is a coverage gap, not a correctness gap.

Design note (acknowledged):
The escape hatch assumes GH_AW_SAFE_OUTPUTS always reflects the current run. In the GitHub Actions environment this holds, and it is the same assumption made by the pre-existing MCP-failure escape hatch. Worth having a comment documenting that assumption explicitly so the intent is clear to future readers.

🔎 Code quality review by PR Code Quality Reviewer · 76.4 AIC · ⌖ 7.08 AIC · ⊞ 5.2K

github-actions · 2026-06-27T14:49:07Z

+            const mockParseLog = vi.fn().mockReturnValue({ markdown: "## Result\n", mcpFailures: [], maxTurnsHit: false, logEntries: [] });
+            runLogParser({ parseLog: mockParseLog, parserName: "Claude" });
+            const infoCalls = mockCore.info.mock.calls.map(c => c[0]);
+            expect(infoCalls.some(msg => msg.includes("Claude log parsed successfully"))).toBe(false);


Weak assertion — the escape-hatch outcome is unverified in this test, so a future regression in the safeOutputEntriesCount > 0 branch would not be caught here.

💡 Suggested improvement

The test currently only checks that "Claude log parsed successfully" is absent from core.info, which would remain true even if core.setFailed fired. Adding explicit positive/negative assertions ties this test to the behavior it is supposed to guard:

// already covered by test 1, but worth repeating here for completeness expect(mockCore.setFailed).not.toHaveBeenCalled(); expect(mockCore.warning).toHaveBeenCalledWith( expect.stringContaining("treating as non-fatal post-completion infrastructure failure") );

Test 1 does assert both of these for the same code path, so this is non-blocking — but the third test as written only validates the hygiene side-effect, not the escape-hatch logic it is co-located with.

github-actions · 2026-06-27T14:49:07Z

+            process.env.GH_AW_SAFE_OUTPUTS = safeOutputsFile;
+            // Parser returns markdown but no structured logEntries — simulates sandbox teardown
+            // race leaving agent-stdio.log unreadable after the agent completed successfully.
+            const mockParseLog = vi.fn().mockReturnValue({ markdown: "## Result\n", mcpFailures: [], maxTurnsHit: false, logEntries: [] });


Missing coverage for logEntries: null — all three new test cases use logEntries: [], leaving the !logEntries branch of the guardrail untested.

💡 Detail

The condition at log_parser_bootstrap.cjs line 341 is:

if (parserName === "Claude" && (!logEntries || logEntries.length === 0))

logEntries is initialized as null and stays null if the parser returns undefined/nothing for that field. The new tests only exercise the logEntries.length === 0 path. A test like:

const mockParseLog = vi.fn().mockReturnValue({ markdown: "## Result\n", mcpFailures: [], maxTurnsHit: false /* no logEntries key */ });

...would exercise !logEntries (null) with safe outputs present. Both code paths lead to the same warning, so this is not a correctness gap — but the null path is darker and would be worth a targeted test.

github-actions

Skills-Based Review 🧠

Applied /tdd and /diagnose — approving with minor suggestions on test completeness and observability.

📋 Key Themes & Highlights

Key Themes

Test assertion gaps: two of the three new tests are underspecified (missing complementary assertions that would catch regressions in the other branch of the new conditional)
Test pollution: process.env is mutated but not restored in finally blocks — a pre-existing pattern, but worth correcting now that three more tests follow it
Observability: the new warning message is clear but missing the log file path, which would help correlate future occurrences with storage/teardown logs

Positive Highlights

✅ Correct root-cause analysis: the escape hatch targets the right signal (safeOutputEntriesCount > 0 means the agent genuinely completed)
✅ Mirrors the existing MCP failure handling pattern — consistent with the codebase's established convention
✅ setFailed is preserved for the zero-safe-outputs case, maintaining the safety guarantee
✅ Three new test cases covering all three new paths — good coverage of the main scenarios
✅ countSafeOutputEntries already handles empty/malformed JSONL correctly (returns 0), so the escape hatch is robust by construction

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 71 AIC · ⌖ 7.42 AIC · ⊞ 6.6K

github-actions · 2026-06-27T14:49:35Z

+            delete process.env.GH_AW_SAFE_OUTPUTS;
+            const mockParseLog = vi.fn().mockReturnValue({ markdown: "## Result\n", mcpFailures: [], maxTurnsHit: false, logEntries: [] });
+            runLogParser({ parseLog: mockParseLog, parserName: "Claude" });
+            expect(mockCore.setFailed).toHaveBeenCalledWith(`${ERR_CONFIG}: Claude execution failed: no structured log entries were produced. This usually indicates a startup or configuration error before tool execution.`);


[/tdd] The startup-failure test only asserts setFailed was called, but does not verify core.warning was NOT called — making it impossible to detect a regression that accidentally fires both.

💡 Suggested addition

Add this assertion alongside the existing one:

expect(mockCore.setFailed).toHaveBeenCalledWith(`${ERR_CONFIG}: Claude execution failed: ...`); expect(mockCore.warning).not.toHaveBeenCalled(); // confirm non-fatal path was not taken

Without it, a bug that emits both warning and setFailed would pass this test.

@copilot please address this.

github-actions · 2026-06-27T14:49:35Z

+            const mockParseLog = vi.fn().mockReturnValue({ markdown: "## Result\n", mcpFailures: [], maxTurnsHit: false, logEntries: [] });
+            runLogParser({ parseLog: mockParseLog, parserName: "Claude" });
+            const infoCalls = mockCore.info.mock.calls.map(c => c[0]);
+            expect(infoCalls.some(msg => msg.includes("Claude log parsed successfully"))).toBe(false);


[/tdd] Test 3 only checks the absence of the misleading info message — it is silent about whether core.warning was actually emitted. This leaves the hygiene fix incompletely specified.

💡 Suggested addition

Add the companion positive assertion to complete the specification:

const infoCalls = mockCore.info.mock.calls.map(c => c[0]); expect(infoCalls.some(msg => msg.includes("Claude log parsed successfully"))).toBe(false); // Also verify warning WAS emitted on this path expect(mockCore.warning).toHaveBeenCalledWith(expect.stringContaining("Claude produced no structured log entries"));

Without this, the test passes even if the warning call were accidentally removed.

@copilot please address this.

github-actions · 2026-06-27T14:49:35Z

+            expect(mockCore.setFailed).not.toHaveBeenCalled();
+            expect(mockCore.warning).toHaveBeenCalledWith("Claude produced no structured log entries, but agent completed with 1 safe output entry — treating as non-fatal post-completion infrastructure failure");
+          } finally {
+            fs.unlinkSync(logFile);


[/tdd] The finally block removes the temp files but leaves process.env.GH_AW_AGENT_OUTPUT and process.env.GH_AW_SAFE_OUTPUTS set — this is a pre-existing pattern in the suite, but extending it to three more tests increases the likelihood of ordering-dependent failures.

💡 Suggested finally block

} finally { fs.unlinkSync(logFile); fs.unlinkSync(safeOutputsFile); fs.rmdirSync(tmpDir); delete process.env.GH_AW_AGENT_OUTPUT; // add delete process.env.GH_AW_SAFE_OUTPUTS; // add }

Applying the same fix to all three new tests (and ideally the existing ones) would make the suite fully order-independent.

@copilot please address this.

github-actions · 2026-06-27T14:49:35Z

-      core.setFailed(`${ERR_CONFIG}: Claude execution failed: no structured log entries were produced. This usually indicates a startup or configuration error before tool execution.`);
+      if (safeOutputEntriesCount > 0) {
+        core.warning(
+          `Claude produced no structured log entries, but agent completed with ${safeOutputEntriesCount} safe output ${safeOutputEntriesCount === 1 ? "entry" : "entries"} — treating as non-fatal post-completion infrastructure failure`


[/diagnose] The warning is clear about what happened but omits the log file path, making it harder to correlate with storage/teardown logs in a post-incident investigation.

💡 Suggested enhancement

const logFilePath = process.env.GH_AW_AGENT_OUTPUT ?? "(unknown)"; core.warning( `Claude produced no structured log entries (log: ${logFilePath}), but agent completed with ${safeOutputEntriesCount} safe output ${...} — treating as non-fatal post-completion infrastructure failure` );

The log path is already available in scope and has low overhead to include; it would meaningfully reduce triage time on future occurrences.

@copilot please address this.

gh-aw-bot · 2026-06-27T15:19:23Z

@copilot please run the pr-finisher skill, tighten up the missing assertions in the third test, cover the logEntries: null path, and restore process.env in cleanup if that is the intended pattern.

Generated by 👨‍🍳 PR Sous Chef · 69.1 AIC · ⌖ 0.98 AIC · ⊞ 17.1K · ◷

…restore process.env - Third test: add setFailed not-called + warning called assertions - Tests 1 and 3: add delete process.env.GH_AW_SAFE_OUTPUTS in finally blocks - Two new tests cover logEntries: null path (no safe outputs -> setFailed, safe outputs -> warning) 28 tests pass (was 26) Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

github-actions · 2026-06-27T17:28:01Z

Great work, @app/copilot-swe-agent 🎉 — this is a clean, targeted fix for the Claude guardrail false-failure on sandbox/firewall teardown races.

The escape hatch logic (safeOutputEntriesCount > 0 → downgrade setFailed → core.warning) mirrors the existing MCP failure handling pattern, the suppression of the contradictory "parsed successfully" log message is a nice hygiene win, and the test suite covers all the meaningful branches: empty logEntries with and without safe outputs, null logEntries for both cases, and the no-spurious-success-message assertion. The PR is focused on exactly one bug and nothing else.

This looks ready for review. ✅

Generated by ✅ Contribution Check · 360.5 AIC · ⌖ 10.1 AIC · ⊞ 6K · ◷

Initial plan

4190454

Copilot AI assigned Copilot and pelikhan Jun 27, 2026

Copilot started work on behalf of pelikhan June 27, 2026 13:43 View session

Copilot AI changed the title ~~[WIP] Fix log_parser_bootstrap to handle empty logEntries~~ fix(log_parser_bootstrap): guard Claude guardrail with safeOutputEntriesCount escape hatch Jun 27, 2026

Copilot finished work on behalf of pelikhan June 27, 2026 14:00

Copilot AI requested a review from pelikhan June 27, 2026 14:00

pelikhan marked this pull request as ready for review June 27, 2026 14:38

Copilot AI review requested due to automatic review settings June 27, 2026 14:38

Copilot started reviewing on behalf of pelikhan June 27, 2026 14:39 View session

Copilot AI reviewed Jun 27, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 27, 2026

[aw] No-Op Runs #39849

Open

github-actions Bot approved these changes Jun 27, 2026

View reviewed changes

github-actions Bot reviewed Jun 27, 2026

View reviewed changes

github-actions Bot approved these changes Jun 27, 2026

View reviewed changes

Copilot started work on behalf of gh-aw-bot June 27, 2026 15:20 View session

Copilot AI requested a review from gh-aw-bot June 27, 2026 15:50

Copilot finished work on behalf of gh-aw-bot June 27, 2026 15:50

pelikhan merged commit c7c642f into main Jun 27, 2026
14 checks passed

pelikhan deleted the copilot/aw-failures-fix-log-parser-bootstrap branch June 27, 2026 17:19

github-actions Bot mentioned this pull request Jun 27, 2026

[Contribution Check Report] Contribution Check — 2026-06-27 #41828

Open

Uh oh!

Conversation

Copilot AI commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

log_parser_bootstrap.cjs

log_parser_bootstrap.test.cjs

Uh oh!

github-actions Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Review details

Uh oh!

github-actions Bot commented Jun 27, 2026

🧪 Test Quality Sentinel Report

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Non-blocking observations only — logic is sound

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Skills-Based Review 🧠

Key Themes

Positive Highlights

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

gh-aw-bot commented Jun 27, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented Jun 27, 2026 •

edited

Loading

`log_parser_bootstrap.cjs`

`log_parser_bootstrap.test.cjs`

github-actions Bot commented Jun 27, 2026 •

edited

Loading

github-actions Bot commented Jun 27, 2026 •

edited

Loading

github-actions Bot commented Jun 27, 2026 •

edited

Loading

github-actions Bot commented Jun 27, 2026 •

edited

Loading