fix: prevent zombie MCP processes via handshake timeout + deferred init by PatrickSys · Pull Request #89 · PatrickSys/codebase-context

PatrickSys · 2026-04-04T19:42:07Z

Adds a 30-second handshake timer that exits the process if no MCP client
sends an initialize message within the timeout window. Defers CPU-heavy
indexing and file-watcher setup to the oninitialized callback so a misfire
consumes near-zero resources before exiting.

Closes the recurring zombie process bug where Codex CLI spawns

px codebase-context \ without a subcommand, starting the MCP server
with no client to connect. The 4 previous guards (ppid poller, stdin close,
server.onclose, SIGHUP) all failed because cmd.exe holds the pipe open
while waiting for the child to produce output.

The fix is tool-agnostic: it does not rely on who is calling it or how.
Any invocation that does not result in an MCP initialize handshake within
30 seconds exits cleanly with code 1.

What / Why

Explain the change briefly.

Release notes

This repo uses an automated release flow.

Please use a Conventional-Commits style PR title (we squash-merge): feat: ..., fix: ..., docs: ..., chore: ..., refactor: ...
If this should not appear in release notes, say so here

greptile-apps · 2026-04-04T19:46:29Z

Greptile Summary

This PR fixes the recurring zombie MCP process bug by adding a 30-second handshake timeout that exits the process if no MCP initialize message arrives, and defers all CPU-heavy indexing and file-watcher setup to the oninitialized callback. It also ships a new discovery evaluation harness (src/eval/discovery-harness.ts) with frozen fixtures, gate logic, and comprehensive tests.

Key changes:

src/index.ts: Handshake timer (HANDSHAKE_TIMEOUT_MS, env-configurable) with unref() so it doesn't block natural exits; server.oninitialized now owns all project initialization (indexing, watcher, root refresh)
tests/zombie-guard.test.ts: Real child-process integration tests validating exit code 1, timeout message, deferred init ([DEBUG] Server ready absent before handshake), and custom timeout env var
src/eval/discovery-harness.ts + src/eval/types.ts: New discovery benchmark harness with surface runners, per-task scoring (usefulnessScore, firstRelevantHit, bestExampleUseful), gate evaluation with baseline + comparator lanes, and report formatting
tests/fixtures/discovery-*.json: Three new frozen fixtures (angular-spotify, excalidraw, benchmark protocol) pinned to concrete repository refs
scripts/run-eval.mjs + docs/capabilities.md: CLI and documentation updated for --mode retrieval|discovery

Confidence Score: 5/5

Safe to merge — the zombie guard logic is correct and well-tested; all findings are P2 style/maintenance suggestions

The core handshake-timeout implementation is sound: unref() is used correctly, oninitialized defers all heavy work, and four integration tests exercise the real binary. The discovery harness is comprehensive and its gate logic is correct for the current frozen protocol. Three P2 findings (lowerIsBetter inconsistency, missing undefined-runner guard, and build-dependency test guard) are non-blocking quality improvements that do not affect correctness at the current protocol version.

src/eval/discovery-harness.ts deserves a follow-up to align lowerIsBetter across compareMetric and compareMetricWithinTolerance, and to add a guard for unregistered surface runners

Important Files Changed

Filename	Overview
src/index.ts	Adds handshake timeout (30 s default, env-configurable) and defers all project init to oninitialized callback to prevent zombie MCP processes
tests/zombie-guard.test.ts	Integration tests spawning real child processes to verify exit-code-1 behavior, timeout message, deferred init, and custom timeout env var
src/eval/discovery-harness.ts	New discovery harness with surface runners, task scoring, gate evaluation, and report formatting; minor lowerIsBetter inconsistency with compareMetric
src/eval/types.ts	New discovery eval types: DiscoveryTask, DiscoveryFixture, DiscoverySummary, gate types, and benchmark protocol interfaces
tests/discovery-harness.test.ts	Comprehensive unit tests for discovery harness scoring, fixture freezing, report formatting, and gate evaluation edge cases
tests/eval-harness.test.ts	Imports countUtf8Bytes and estimateTokenCountFromBytes as shared exports; existing harness tests unchanged
scripts/run-eval.mjs	Extends eval CLI with --mode flag for retrieval
src/eval/harness.ts	Extracts countUtf8Bytes and estimateTokenCountFromBytes as public exports for reuse by the discovery harness
docs/capabilities.md	Updates eval docs to reflect discovery mode, new fixtures, metrics, and gate semantics
tests/fixtures/discovery-benchmark-protocol.json	Frozen benchmark protocol with ship gate, 15% comparator tolerance, and named comparator list
tests/fixtures/discovery-angular-spotify.json	Frozen discovery fixture for angular-spotify: 12 tasks, balanced 4/4/4 map/find/search split, pinned to commit SHA
tests/fixtures/discovery-excalidraw.json	Frozen discovery fixture for excalidraw: 12 tasks, balanced 4/4/4 map/find/search split, pinned to commit SHA
src/resources/codebase-intelligence.ts	No functional changes; serves as the surface runner for codebase://context discovery tasks
tests/fixtures/README.md	Documentation update describing the new discovery fixtures and their frozen-eval semantics

Sequence Diagram

sequenceDiagram
    participant CLI as npx codebase-context
    participant Main as main() in index.ts
    participant Timer as HandshakeTimer (30s, unref'd)
    participant MCP as MCP Client
    participant Init as Project Init

    CLI->>Main: start (STDIO mode)
    Main->>Timer: setTimeout(exit(1), 30s).unref()
    Main->>MCP: server.connect(StdioTransport)

    alt Zombie scenario — no client connects
        Timer-->>Main: fires after 30s
        Main->>CLI: process.exit(1) + stderr message
    else Real MCP client connects
        MCP->>Main: initialize message
        Main->>Timer: clearTimeout(handshakeTimer)
        Main->>Init: oninitialized() — index + file-watcher
        Init-->>MCP: ready
        MCP->>Main: (normal lifecycle)
        Main->>CLI: process.exit(0) on close/SIGTERM
    end

_{Reviews (1): Last reviewed commit: "fix: prevent zombie MCP processes via ha..." | Re-trigger Greptile}

greptile-apps · 2026-04-04T19:46:33Z

src/eval/discovery-harness.ts

+  metric: DiscoveryMetricName,
+  tolerancePercent: number
+): DiscoveryMetricComparison {
+  const lowerIsBetter = metric === 'averageFirstRelevantHit';


lowerIsBetter inconsistency with compareMetric

In compareMetric (line 243), both averageEstimatedTokens and averageFirstRelevantHit are treated as lower-is-better:

const lowerIsBetter = metric === 'averageEstimatedTokens' || metric === 'averageFirstRelevantHit';

But in compareMetricWithinTolerance only averageFirstRelevantHit is:

const lowerIsBetter = metric === 'averageFirstRelevantHit';

This is safe today because the comparator gate's usefulnessMetrics in the frozen protocol only contains averageUsefulness, averageFirstRelevantHit, and bestExampleUsefulnessRate — never averageEstimatedTokens. However, since usefulnessMetrics is typed as DiscoveryMetricName[], nothing prevents a future protocol update from adding averageEstimatedTokens to the comparator gate, which would silently evaluate it as higher-is-better and produce an incorrect gate result.

Consider aligning the two functions:

Suggested change

const lowerIsBetter = metric === 'averageFirstRelevantHit';

const lowerIsBetter = metric === 'averageEstimatedTokens' || metric === 'averageFirstRelevantHit';

Fixed. now includes and in its check, consistent with . This was already addressed in the rebase by taking the metric-alignment commit from master (21c3e43 — fix: align discovery protocol metrics), which added averagePayloadBytes to both functions and to DiscoveryMetricName.

greptile-apps · 2026-04-04T19:46:34Z

src/eval/discovery-harness.ts

+  for (const task of fixture.tasks) {
+    const runner = runners[task.surface];
+    const payload = await runner(task, rootPath);
+    results.push(evaluateDiscoveryTask(task, payload));


No guard for undefined surface runner

If task.surface is a value not present in the runners map (e.g., a typo in a fixture file or a newly added DiscoverySurface value without a corresponding runner), runner is undefined and calling await runner(task, rootPath) throws a confusing TypeError: runner is not a function rather than a descriptive error.

Consider adding an explicit check:

for (const task of fixture.tasks) { const runner = runners[task.surface]; if (!runner) { throw new Error(`No runner registered for surface: ${task.surface}`); } const payload = await runner(task, rootPath); results.push(evaluateDiscoveryTask(task, payload)); }

Fixed. Added an explicit guard before calling the runner:

const runner = runners[task.surface]; if (!runner) { throw new Error(`No runner registered for surface: ${task.surface}`); }

This surfaces fixture typos or unregistered surfaces with a clear message instead of a cryptic TypeError.

greptile-apps · 2026-04-04T19:46:35Z

tests/zombie-guard.test.ts

+import { fileURLToPath } from 'node:url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const ENTRY_POINT = path.resolve(__dirname, '..', 'dist', 'index.js');


Tests depend on a pre-built dist/index.js

The test resolves the entry point from dist/index.js, so it will fail with a confusing rejection (from child.on('error', reject)) if the project hasn't been built before running tests. A descriptive beforeAll guard would make the failure more actionable:

import { existsSync } from 'node:fs'; beforeAll(() => { if (!existsSync(ENTRY_POINT)) { throw new Error( `dist/index.js not found — run \`npm run build\` before the zombie-guard tests.` ); } });

This is especially relevant for contributors running vitest without a prior build step.

Fixed. Added a beforeAll guard at the top of the describe block:

beforeAll(() => { if (!existsSync(ENTRY_POINT)) { throw new Error( `dist/index.js not found - run \`npm run build\` before the zombie-guard tests.` ); } });

Throws early with an actionable message rather than letting the test fail with a cryptic spawn rejection.

Adds a 30-second handshake timer that exits the process if no MCP client sends an initialize message within the timeout window. Defers CPU-heavy indexing and file-watcher setup to the oninitialized callback so a misfire consumes near-zero resources before exiting. Closes the recurring zombie process bug where Codex CLI spawns \ px codebase-context <path>\ without a subcommand, starting the MCP server with no client to connect. The 4 previous guards (ppid poller, stdin close, server.onclose, SIGHUP) all failed because cmd.exe holds the pipe open while waiting for the child to produce output. The fix is tool-agnostic: it does not rely on who is calling it or how. Any invocation that does not result in an MCP initialize handshake within 30 seconds exits cleanly with code 1.

greptile-apps bot reviewed Apr 4, 2026

View reviewed changes

PatrickSys force-pushed the fix/zombie-process-deterministic branch from 0344cc1 to dfcbbce Compare April 5, 2026 10:08

PatrickSys merged commit 37fd4b9 into master Apr 5, 2026
3 checks passed

github-actions bot mentioned this pull request Apr 5, 2026

chore(master): release 1.10.0 #79

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent zombie MCP processes via handshake timeout + deferred init#89

fix: prevent zombie MCP processes via handshake timeout + deferred init#89
PatrickSys merged 1 commit intomasterfrom
fix/zombie-process-deterministic

PatrickSys commented Apr 4, 2026

Uh oh!

greptile-apps bot commented Apr 4, 2026

Important Files Changed

Uh oh!

greptile-apps bot Apr 4, 2026

Uh oh!

PatrickSys Apr 5, 2026

Uh oh!

greptile-apps bot Apr 4, 2026

Uh oh!

PatrickSys Apr 5, 2026

Uh oh!

greptile-apps bot Apr 4, 2026

Uh oh!

PatrickSys Apr 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	const lowerIsBetter = metric === 'averageFirstRelevantHit';
	const lowerIsBetter = metric === 'averageEstimatedTokens' \|\| metric === 'averageFirstRelevantHit';

Conversation

PatrickSys commented Apr 4, 2026

What / Why

Release notes

Uh oh!

greptile-apps bot commented Apr 4, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

PatrickSys Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

PatrickSys Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

PatrickSys Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant