Skip to content

refactor(create_pull_request): extract helpers module and add tests#33687

Open
Copilot wants to merge 6 commits into
mainfrom
copilot/refactor-create-pull-request
Open

refactor(create_pull_request): extract helpers module and add tests#33687
Copilot wants to merge 6 commits into
mainfrom
copilot/refactor-create-pull-request

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 21, 2026

create_pull_request.cjs was a 2441-line monolith with no unit test coverage on its pure utility functions, making it hard to reason about or safely modify.

Extracted to create_pull_request_helpers.cjs

14 pure helpers and 4 constants that had no dependency on the main() factory closure:

  • Label retry: LABEL_MAX_RETRIES, LABEL_INITIAL_DELAY_MS, LABEL_MAX_DELAY_MS, isLabelTransientError
  • Config parsing: parseAllowedBaseBranches, isBaseBranchAllowed, parseStringListConfig
  • Fallback issue: MANAGED_FALLBACK_ISSUE_LABEL, mergeFallbackIssueLabels, sanitizeFallbackAssignees, renderManifestProtectionFallbackBody, buildManifestProtectionCreatePrUrl
  • Utilities: summarizeListForLog, createBundleTempRef, neutralizeClosingKeywordsForIssueBody, generatePatchPreview

New: create_pull_request_helpers.test.cjs

69 unit tests covering all extracted helpers — empty inputs, casing, trimming, truncation, URL encoding, glob patterns, core.warning invocation on limit exceeded, etc.

create_pull_request.cjs

Inline definitions replaced with imports from the helpers module. No behavior changes; all 113 existing tests continue to pass.

Copilot AI and others added 2 commits May 21, 2026 03:49
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title refactor(create_pull_request): extract helpers module and add comprehensive tests refactor(create_pull_request): extract helpers module and add tests May 21, 2026
Copilot AI requested a review from pelikhan May 21, 2026 03:53
@pelikhan pelikhan marked this pull request as ready for review May 21, 2026 04:55
Copilot AI review requested due to automatic review settings May 21, 2026 04:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the create_pull_request handler by extracting reusable helper logic into a dedicated module and adding unit tests for the extracted helpers to improve maintainability and safety of future changes.

Changes:

  • Extracted label retry/config parsing/fallback-issue/utility helper logic into create_pull_request_helpers.cjs.
  • Updated create_pull_request.cjs to import the extracted helpers instead of defining them inline.
  • Added a new Vitest suite create_pull_request_helpers.test.cjs covering the extracted helpers.
Show a summary per file
File Description
actions/setup/js/create_pull_request.cjs Replaces inline helper definitions with imports from the new helpers module.
actions/setup/js/create_pull_request_helpers.cjs New helpers module containing extracted constants and helper functions.
actions/setup/js/create_pull_request_helpers.test.cjs New unit tests for the extracted helpers.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 3

Comment on lines +257 to +263
module.exports = {
MANAGED_FALLBACK_ISSUE_LABEL,
LABEL_MAX_RETRIES,
LABEL_INITIAL_DELAY_MS,
LABEL_MAX_DELAY_MS,
summarizeListForLog,
createBundleTempRef,
Comment on lines +114 to +118
it("produces different refs on consecutive calls (collision avoidance)", () => {
const ref1 = createBundleTempRef("same-branch");
const ref2 = createBundleTempRef("same-branch");
// Extremely unlikely to collide; test documents intent
expect(ref1).not.toBe(ref2);
Comment on lines +459 to +461
const url = buildManifestProtectionCreatePrUrl("https://github.com", repoParts, "main", "release/v2.0-rc1", "Title");
// encodePathSegments encodes each slash-delimited segment but preserves '/'
expect(url).toContain("/compare/main...release/v2.0-rc1");
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

PR Code Quality Reviewer completed the code quality review.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #33687 does not have the 'implementation' label and has 0 new lines of code in default business logic directories (threshold: 100).

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Review Summary

This is an excellent refactoring with strong test coverage. The extraction of 13 helper functions into a separate module with 69 comprehensive unit tests significantly improves maintainability.

Key Strengths:

  • Well-organized separation of pure utilities from closure-dependent logic
  • Comprehensive test coverage with edge cases, boundary conditions, and error paths
  • Clear naming and documentation for all extracted helpers
  • No behavior changes to existing functionality (all 113 original tests still pass)

Minor Issues:

  1. The PR description claims 14 helpers were extracted, but the module exports 13 (12 functions + 4 constants minus the duplicated count)
  2. One test doesn't validate what its name claims (URL encoding test at line 461)
  3. The collision avoidance test could theoretically be flaky (though extremely unlikely)

Recommendation: Approve after updating the PR description to match the actual count of exported helpers.

🔎 Code quality review by PR Code Quality Reviewer · ● 3.4M

it("URL-encodes branch names with special characters via encodePathSegments", () => {
const url = buildManifestProtectionCreatePrUrl("https://github.com", repoParts, "main", "release/v2.0-rc1", "Title");
// encodePathSegments encodes each slash-delimited segment but preserves '/'
expect(url).toContain("/compare/main...release/v2.0-rc1");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name says it validates URL-encoding, but the branch name release/v2.0-rc1 does not contain characters that require URL encoding. Slashes are preserved by encodePathSegments, and hyphens/dots/alphanumerics do not need encoding.

Impact: The test will pass even if encoding is broken for special characters.

Suggestion: Either:

  1. Use a branch name with characters that must be encoded (e.g., feature/my branch, fix/#123, release/v2.0%)
  2. Rename the test to "preserves slashes in branch names via encodePathSegments" to match what it actually validates

@github-actions github-actions Bot mentioned this pull request May 21, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #33687 does not have the 'implementation' label and has 0 new lines of code in default business logic directories (threshold: 100).

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

PR Code Quality Reviewer completed the code quality review.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /zoom-out and /tdd based on the refactoring + test coverage addition in this PR.


Key Themes

Module Extraction (/zoom-out):

  • ✅ Clean extraction of 14 pure helpers from a 2441-line monolith
  • ✅ All extracted functions have no dependency on the main() factory closure
  • ⚠️ Mixed concerns: config parsing + label retry + fallback rendering in one module
  • 💡 Current module size (274 lines) is acceptable; consider further splits if it grows past 300 lines

Test Coverage (/tdd):

  • ✅ Comprehensive: 69 unit tests covering all extracted helpers
  • ✅ Good edge case coverage: empty inputs, whitespace, truncation, URL encoding
  • ✅ Integration tests maintained: All 113 existing tests still pass
  • ⚠️ Test names sometimes describe how (implementation) rather than what (behavior)
  • ⚠️ Missing coverage for error delegation paths (e.g., isLabelTransientError network errors)

Positive Highlights

  • Test-first thinking: Edge cases are thoroughly covered (empty inputs, trimming, limits)
  • Zero behavior change: All existing integration tests pass — refactor is safe
  • Clear extraction criteria: Only pure helpers moved, no closure dependencies
  • Defensive programming: Input validation, type coercion, limit enforcement all tested

Suggested Improvements

  1. Add JSDoc type references for better IntelliSense in editors
  2. Rename tests to describe behavior, not implementation details
  3. Test error delegation paths (e.g., isLabelTransientError with network errors)
  4. Consider module boundaries for future extractions (config vs. label retry vs. fallback)

Verdict

This is a solid refactoring that significantly improves maintainability:

  • Reduces the main module from 2441 → 2198 lines (-10%)
  • Adds unit test coverage for previously untested utilities
  • Creates a foundation for future extractions

The concerns raised are minor and don't block merge. Approving! 🚀


Learn More About These Skills:

  • /zoom-out — Broader architectural perspective
  • /tdd — Test-driven development practices

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 4.5M

}

const limitResult = tryEnforceArrayLimit(sanitized, MAX_ASSIGNEES, "assignees");
if (!limitResult.success) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Missing edge case: What happens when core.warning is unavailable or throws?

The test mocks core.warning correctly, but the production code doesn't guard against core being undefined. Consider:

if (typeof core !== 'undefined' && core.warning) {
  core.warning(`Assignees limit exceeded...`);
}

This defensive pattern appears elsewhere in the codebase for GitHub Actions optional dependencies.

const { isTransientError } = require("./error_recovery.cjs");
const { tryEnforceArrayLimit } = require("./limit_enforcement_helpers.cjs");
const { MAX_ASSIGNEES } = require("./constants.cjs");
const { encodePathSegments, renderTemplateFromFile, getPromptPath } = require("./messages_core.cjs");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] Missing JSDoc import references.

The new module uses types from messages_core.cjs (e.g., encodePathSegments, renderTemplateFromFile) but doesn't include JSDoc type references. Consider adding:

/** `@type` {typeof import("./messages_core.cjs")} */

This helps TypeScript language servers provide better IntelliSense and catches type mismatches during development.

* @param {string[]|string|undefined} allowedBaseBranchesValue
* @returns {Set<string>}
*/
function parseAllowedBaseBranches(allowedBaseBranchesValue) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] Extraction boundary: Why is config parsing grouped with label retry logic?

parseAllowedBaseBranches, isBaseBranchAllowed, and parseStringListConfig handle configuration parsing, while isLabelTransientError + constants handle label retry logic. These are separate concerns.

Consider future split:

  • create_pull_request_config_helpers.cjs — config parsing
  • create_pull_request_label_helpers.cjs — label retry logic
  • create_pull_request_fallback_helpers.cjs — fallback issue rendering

For now this is acceptable (module < 300 lines), but keep these boundaries in mind for future extractions.

const result = parseAllowedBaseBranches("main,,develop");
expect(result).toEqual(new Set(["main", "develop"]));
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Missing test: isLabelTransientError with network errors.

The test only covers the GitHub-specific race condition ("Could not resolve to a node with the global id"). It should also verify that it correctly delegates to isTransientError for:

  • Network timeouts
  • 5xx server errors
  • Rate limit errors

Add:

it("returns true for network timeout errors", () => {
  const error = new Error("ETIMEDOUT");
  expect(isLabelTransientError(error)).toBe(true);
});

it("returns all items joined when count is within limit", () => {
expect(summarizeListForLog(["a", "b", "c"])).toBe("a, b, c");
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Test describes implementation, not behavior.

Test name: "returns unique crypto suffix"

This couples the test to the implementation detail (crypto.randomBytes). Better:

it("generates unique refs for branches that sanitize to the same value", () => {
  const ref1 = createBundleTempRef("feature/foo");
  const ref2 = createBundleTempRef("feature/foo");
  expect(ref1).not.toBe(ref2);
});

Test names should read like specifications — what the code does, not how.

@@ -68,35 +84,8 @@ async function createCopilotAssignmentClient(config) {
/** @type {string} Safe output type handled by this module */
const HANDLER_TYPE = "create_pull_request";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] Comment clarifies where, not why.

// NOTE: MANAGED_FALLBACK_ISSUE_LABEL, createBundleTempRef, and summarizeListForLog
// are imported from create_pull_request_helpers.cjs above.

This comment states the obvious (imports are at the top). Better:

// Extracted helpers: Pure functions with no dependency on main() closure.
// See create_pull_request_helpers.cjs for implementation details.

Or remove entirely — the import statement is self-documenting.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 1.4% of new tests are implementation tests (threshold: 30%).

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Quality Review Summary

Overall Assessment: ✅ Approved

This is a well-executed refactoring that extracts 12 pure helper functions (plus 4 constants) from a 2441-line monolith into a focused, testable module with 69 comprehensive unit tests.

Strengths

  • Clean extraction: All extracted helpers are genuinely pure functions with no closure dependencies
  • Comprehensive test coverage: Tests cover edge cases, input validation, trimming, truncation, error paths, and boundary conditions
  • Zero behavioral changes: All 113 existing tests continue to pass
  • Good module organization: Helpers grouped logically (label retry, config parsing, fallback issues, utilities)

Review Notes

The PR has already received thorough review with 10 existing comment threads that cover the main improvement areas:

  • PR description accuracy (helper count: 14 stated vs 12 exported)
  • Test robustness (randomness-based collision test, URL encoding validation)
  • Defensive coding (core.warning guard, transient error coverage)
  • Code clarity (test naming, JSDoc type references, module boundaries)

These are all valid points that improve the code quality, but none are blocking issues. The refactoring achieves its core goal: reducing a massive file into manageable, testable units.

Recommendation

Approve and merge after addressing the existing review comments. This is the kind of incremental improvement that makes codebases more maintainable over time.


References:

🔎 Code quality review by PR Code Quality Reviewer · ● 3.1M

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

🧪 Test Quality Sentinel completed test quality analysis.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd, /zoom-out, and /improve-codebase-architecture based on this refactoring + test addition PR.

Key Themes

  • Test coverage gaps: Missing complementary "happy path" test for core.warning not being called (line 293 test file)
  • Non-deterministic test: Probabilistic collision test (line 115) could cause false failures
  • API complexity: Dual input formats (array | string) and unused optional parameters add cognitive overhead

Positive Highlights

Excellent test discipline — 69 tests covering edge cases, boundary conditions, and error paths
Clean extraction — Pure functions moved without behavior changes (113 existing tests still pass)
Descriptive test names — Each test reads as a specification of expected behavior
Proper error handling coverage — Transient errors, rate limits, and edge cases all tested

Verdict

This is solid refactoring work with strong test coverage. The suggestions above are minor refinements that would simplify the API surface and make tests more deterministic — not blockers for merge.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 6M

// ---------------------------------------------------------------------------
describe("sanitizeFallbackAssignees", () => {
beforeEach(() => {
vi.clearAllMocks();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Missing test coverage for the core.warning not being called when assignees are within the limit.

The test on line 318 verifies that core.warning is called when the limit is exceeded, but there's no complementary test asserting that the warning is not called when assignees are valid and within bounds.

Consider adding:

it("does not warn when assignee count is within MAX_ASSIGNEES", () => {
  const warnSpy = vi.spyOn(global.core, "warning");
  sanitizeFallbackAssignees(["a1", "a2", "a3"]);
  expect(warnSpy).not.toHaveBeenCalled();
});

This ensures the happy path doesn't trigger false warnings and documents expected behavior.

const set = new Set();
if (Array.isArray(allowedBaseBranchesValue)) {
allowedBaseBranchesValue
.map(branch => String(branch).trim())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/improve-codebase-architecture] This function has a shallow interface — the limit parameter is a leaky abstraction that callers must understand.

Every call site in the codebase uses the default limit of 10 except this function's own tests. The parameter adds complexity without providing reusable value.

Consider removing the parameter entirely:

/**
 * Summarize a list for log output to avoid excessively long lines.
 * Truncates to 10 items with an overflow count.
 * `@param` {string[]} values
 * `@returns` {string}
 */
function summarizeListForLog(values) {
  const limit = 10;
  if (!Array.isArray(values) || values.length === 0) {
    return "(none)";
  }
  const preview = values.slice(0, limit).join(", ");
  return values.length > limit ? `${preview} ... and ${values.length - limit} more` : preview;
}

This simplifies the interface and makes the 10-item limit an explicit implementation detail rather than a configurable concern.

});

it("produces different refs on consecutive calls (collision avoidance)", () => {
const ref1 = createBundleTempRef("same-branch");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] This test documents intent (collision avoidance) but cannot reliably verify it due to the probabilistic nature of crypto.randomBytes.

The comment "Extremely unlikely to collide; test documents intent" is honest, but a collision would cause a false test failure unrelated to code changes. This adds maintenance burden.

Consider either:

  1. Remove the test — the 8-char hex format test on line 108 already proves uniqueness is attempted
  2. Mock crypto.randomBytes to guarantee different outputs:
it("produces different refs on consecutive calls (collision avoidance)", () => {
  const cryptoMock = vi.spyOn(crypto, "randomBytes");
  cryptoMock.mockReturnValueOnce(Buffer.from([0x11, 0x22, 0x33, 0x44]));
  cryptoMock.mockReturnValueOnce(Buffer.from([0xaa, 0xbb, 0xcc, 0xdd]));
  
  const ref1 = createBundleTempRef("same-branch");
  const ref2 = createBundleTempRef("same-branch");
  
  expect(ref1).not.toBe(ref2);
  cryptoMock.mockRestore();
});

This makes the test deterministic and documents the uniqueness guarantee explicitly.

const sanitized = assignees
.filter(a => typeof a === "string")
.map(a => a.trim())
.filter(a => a.length > 0 && a.toLowerCase() !== "copilot");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/zoom-out] The dual input format (array | string) creates parsing ambiguity that complicates the API.

Callers must know that "main,develop" is treated differently from ["main,develop"]. This cognitive overhead is unnecessary when the project already has parseStringListConfig to handle this normalization.

Consider accepting only string[] and pushing comma-parsing to the caller:

/**
 * Parse allowed base branch patterns from an array of branch names.
 * Use parseStringListConfig() first if input may be comma-separated.
 * `@param` {string[]} allowedBaseBranches
 * `@returns` {Set<string>}
 */
function parseAllowedBaseBranches(allowedBaseBranches) {
  const set = new Set();
  if (!Array.isArray(allowedBaseBranches)) {
    return set;
  }
  allowedBaseBranches
    .map(branch => String(branch).trim())
    .filter(Boolean)
    .forEach(branch => set.add(branch));
  return set;
}

This makes responsibilities clear: parseStringListConfig handles format normalization, parseAllowedBaseBranches handles Set construction.

@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 93/100

Excellent test quality

Metric Value
New/modified tests analyzed 12 test suites
✅ Design tests (behavioral contracts) 10 (83%)
⚠️ Implementation tests (low value) 2 (17%)
Tests with error/edge cases 12 (100%)
Duplicate test clusters 0
Test inflation detected No (1.73:1 ratio, within 2:1 limit)
🚨 Coding-guideline violations 0

Test Classification Summary

This PR adds comprehensive tests for the newly extracted create_pull_request_helpers.cjs module. The tests demonstrate excellent behavioral coverage with strong edge case validation.

View Detailed Test Classification
Test Suite File Classification Edge Case Coverage
summarizeListForLog create_pull_request_helpers.test.cjs:50 ✅ Design Empty arrays, non-arrays, truncation, custom limits
createBundleTempRef create_pull_request_helpers.test.cjs:98 ✅ Design Special chars, collision avoidance, sanitization
isLabelTransientError create_pull_request_helpers.test.cjs:125 ✅ Design GitHub race conditions, HTTP status codes
parseAllowedBaseBranches create_pull_request_helpers.test.cjs:151 ✅ Design Undefined, arrays, strings, whitespace filtering
isBaseBranchAllowed create_pull_request_helpers.test.cjs:190 ✅ Design Exact match, wildcards, glob patterns, empty set
parseStringListConfig create_pull_request_helpers.test.cjs:223 ✅ Design Falsy inputs, mixed types, comma-separated
mergeFallbackIssueLabels create_pull_request_helpers.test.cjs:256 ✅ Design Deduplication, empty labels, no args
sanitizeFallbackAssignees create_pull_request_helpers.test.cjs:291 ✅ Design Empty, "copilot" removal, limit enforcement
neutralizeClosingKeywordsForIssueBody create_pull_request_helpers.test.cjs:342 ✅ Design Multiple keywords, cross-repo refs, preservation
generatePatchPreview create_pull_request_helpers.test.cjs:388 ✅ Design Line/char limits, truncation indicators
constants test create_pull_request_helpers.test.cjs:35 ⚠️ Implementation Tests constant values, not behavior
buildManifestProtectionCreatePrUrl create_pull_request_helpers.test.cjs:438 🟡 Mixed URL structure (borderline implementation)

Flagged Tests — Minor Improvements Suggested

⚠️ constants test (create_pull_request_helpers.test.cjs:35-44)

Classification: Implementation test
Issue: Tests that constant values exist and are "sensible" rather than testing observable behavior.
What design invariant does this test enforce? None — it verifies constant values, not how they affect system behavior.
What would break if deleted? Only if constants are renamed or given invalid values (e.g., negative numbers). Behavioral regressions (wrong retry timing) would not be caught.
Suggested improvement: Remove this test entirely, or integrate it into a behavioral test that verifies retry backoff timing in practice. Constants should be validated by the behaviors that use them, not in isolation.

🟡 buildManifestProtectionCreatePrUrl (create_pull_request_helpers.test.cjs:438-473)

Classification: Mixed (borderline implementation)
Issue: Tests are tightly coupled to specific URL structure (path segments, query params order).
What design invariant does this test enforce? Partially behavioral (URL must contain branch names, title, optional body param) but also verifies implementation details (exact path format).
What would break if deleted? URL structure changes could go unnoticed. However, legitimate refactoring of URL generation logic would also break these tests.
Suggested improvement: If the URL format is a public API contract (consumed by external tools), keep as-is. If it's internal (only parsed by GitHub's compare page), relax assertions to verify only essential properties (contains branch names, contains properly encoded title) without dictating exact query param order or path separators.


Exemplary Test Patterns

These tests demonstrate the quality standards this project aims for:

sanitizeFallbackAssignees (lines 291-337)

  • Why excellent: Comprehensive edge case coverage (empty, invalid types, "copilot" filtering, MAX_ASSIGNEES enforcement)
  • Mock justification: Uses vi.spyOn(global.core, 'warning') to verify side effect logging — legitimate external I/O mocking
  • Error coverage: Tests both success (valid assignees) and limit-exceeded warning path
  • Behavioral focus: Validates observable outputs (filtered array, null for empty) and side effects (warnings logged)

neutralizeClosingKeywordsForIssueBody (lines 342-383)

  • Why excellent: Tests all closing keyword variants, cross-repo references, preserves non-keyword #N references
  • Edge cases: Empty/falsy inputs handled gracefully, no false positives
  • Behavioral contract: Clear system promise — prevents unintended GitHub issue auto-closure

generatePatchPreview (lines 388-433)

  • Why excellent: Tests both truncation limits (500 lines, 2000 chars) independently and combined
  • Output validation: Verifies collapsible <details> structure, diff fence, truncation indicators
  • Behavioral focus: Tests what users see in the issue body, not internal truncation logic

Language Support

Tests analyzed in this PR:

  • �� JavaScript (*.test.cjs): 12 test suites analyzed (vitest framework)
  • 🐹 Go (*_test.go): No Go tests in this PR

Verdict

Check passed. 17% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected.

Score breakdown:

  • Behavioral coverage: 33/40 points (83% design tests)
  • Edge case coverage: 30/30 points (100% of tests include edge cases)
  • Low duplication: 20/20 points (no copy-paste patterns)
  • Proportional growth: 10/10 points (1.73:1 test:prod ratio)

Recommendations:

  1. Consider removing the constants test (low value)
  2. Review buildManifestProtectionCreatePrUrl assertions — decide if URL structure is a public contract or internal implementation
  3. All other tests are exemplary — excellent behavioral focus and edge case coverage

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · ● 5.5M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 93/100. Test quality is excellent — 17% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected. See detailed report in comment above.

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot review all review comments

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot merge main and recompile, fix conflicts

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 21, 2026

@copilot review all review comments

Addressed all three review findings in commit test(helpers): fix flaky crypto test and URL-encoding test coverage:

  1. Helper count in PR description – the module exports 12 helper functions (not 14 as stated in the description). The PR description had a typo; no helpers were missing.

  2. Flaky createBundleTempRef test – replaced the probabilistic non-collision assertion with a deterministic vi.spyOn(crypto, "randomBytes") mock that pins two different byte sequences, then asserts the exact expected ref strings.

  3. URL-encoding test – changed the sample branch from release/v2.0-rc1 (no chars that need encoding) to feature/my branch#1, and updated the assertion to verify feature/my%20branch%231 appears in the URL, so the test actually fails if encodePathSegments is broken.

…suite

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI and others added 2 commits May 21, 2026 06:08
…te-pull-request

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 21, 2026

@copilot merge main and recompile, fix conflicts

Done. Merged main cleanly (no conflicts), rebuilt the binary, and recompiled all 233 workflow lock files. All 176 JS tests continue to pass.

@github-actions
Copy link
Copy Markdown
Contributor

Hey @Copilot 👋 — thanks for working on improving the test coverage and modularity of create_pull_request.cjs! The refactoring looks solid, and adding 69 unit tests is excellent. However, there's one thing that would help keep this PR focused:

  • Split the PR — this mixes the JavaScript refactoring (extracting helpers, adding tests) with 100+ workflow .lock.yml file updates. These appear to be infrastructure configuration changes (model aliases, API proxy settings) that are unrelated to the code refactoring. Consider separating them so reviewers can evaluate the refactoring independently from the infrastructure updates.

If you'd like to address this, here's a prompt you can use:

Split PR #33687 into two separate PRs:
1. One for the JavaScript refactoring: extracting `create_pull_request_helpers.cjs`, adding `create_pull_request_helpers.test.cjs`, and updating `create_pull_request.cjs` to use the extracted helpers.
2. One for the workflow `.lock.yml` infrastructure updates (model aliases, API proxy configuration changes).

The first PR should contain only changes to:
- actions/setup/js/create_pull_request.cjs
- actions/setup/js/create_pull_request_helpers.cjs (new file)
- actions/setup/js/create_pull_request_helpers.test.cjs (new file)

The second PR should contain all the `.github/workflows/*.lock.yml` changes.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • patchdiff.githubusercontent.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "patchdiff.githubusercontent.com"

See Network Configuration for more information.

Generated by ✅ Contribution Check · ● 2.7M ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants