Skip to content

perf(native): fix 1238% incremental rebuild regression#856

Merged
carlos-alm merged 5 commits intomainfrom
fix/native-incremental-regression
Apr 5, 2026
Merged

perf(native): fix 1238% incremental rebuild regression#856
carlos-alm merged 5 commits intomainfrom
fix/native-incremental-regression

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

Fixes the native 1-file incremental rebuild regression from 42ms → 562ms (1238%) introduced in v3.9.0 by PR #807.

Root cause: The post-native-orchestrator processing unconditionally ran a full JS structure rebuild (changedFilePaths=null) and loaded all nodes from DB after every native build, even when the Rust fast-path already handled structure for small incremental builds.

Fixes:

  • Rust BuildPipelineResult now reports structureHandled: bool — true when the small-incremental fast path ran (≤5 changed files, >20 existing)
  • JS skips the structure phase entirely when Rust already handled it
  • JS scopes the DB reconstruction query to changed files only (instead of loading the entire graph) when structure was handled
  • For large incremental builds where Rust didn't handle structure, JS now passes changedFiles to buildStructureFn (matching the JS pipeline's medium-incremental path) instead of null (full rebuild)

Expected improvement: Native 1-file incremental should return to ~42ms range (from 562ms). WASM pipeline is unaffected (uses separate JS stage pipeline).

Both engines tested: All incremental parity, build parity, pipeline, and scoped-rebuild tests pass.

Test plan

  • TypeScript compiles (tsc --noEmit)
  • Biome lint clean on changed files
  • tests/builder/pipeline.test.ts — 4/4 pass
  • tests/integration/incremental-parity.test.ts — all pass
  • tests/integration/incremental-edge-parity.test.ts — all pass
  • tests/integration/scoped-rebuild.test.ts — all pass
  • 521 integration/builder tests pass (6 failures are pre-existing WASM grammar issue in worktree)
  • CI benchmark confirms native 1-file rebuild returns to ~42ms range

…dundant structure phase

The native orchestrator's post-processing unconditionally ran a full JS
structure rebuild (changedFilePaths=null) and loaded all nodes from DB
after every native build, regardless of whether the Rust fast-path
already handled structure. For a 1-file change on a 567-file codebase,
this turned a 42ms rebuild into 562ms.

Three fixes:
1. Rust reports `structureHandled: bool` in BuildPipelineResult so the
   JS side knows whether the small-incremental fast path ran
2. When Rust handled structure, JS skips the structure phase entirely
   and scopes the DB reconstruction query to changed files only
3. When Rust didn't handle structure (full/large incremental), JS now
   passes changedFiles to buildStructureFn instead of null, matching
   the JS pipeline's medium-incremental path behavior
@claude
Copy link
Copy Markdown

claude bot commented Apr 4, 2026

Claude finished @carlos-alm's task —— View job


PR Review for #856: perf(native): fix 1238% incremental rebuild regression

Review Progress

  • Phase 1: Root Cause Analysis
  • Phase 2: Configuration Change Detection
  • Phase 3: Test Integrity Check
  • Phase 4: Code Quality Review
  • Phase 5: Backlog Compliance Assessment
  • Final Assessment & Recommendation

Root Cause Analysis

Problem Being Solved: The native incremental build pipeline had a 1238% performance regression (42ms → 562ms) for 1-file incremental builds introduced in v3.9.0 by PR #807. The root cause was post-native processing unconditionally running a full JS structure rebuild and loading all nodes from the database even when the Rust fast-path already handled structure for small incremental builds.

Why This Approach: The fix adds a structureHandled: bool flag to the Rust BuildPipelineResult to communicate when the small-incremental fast path ran (≤5 changed files, >20 existing files). This allows the JS side to skip redundant work when Rust already handled structure updates.

Risk Assessment: Low risk. The change is surgical and well-contained. Both engines are tested for parity, and the fix only affects the coordination between native and JS processing phases.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added
  • Foundation-aligned: ✓ — Improves core build performance without violating any principles
  • Problem-fit: 4 — Directly improves AI agent experience by reducing build times, which reduces round-trips and token waste
  • Breaking: No — Purely internal optimization, no API changes
  • Tier: 1

Code Quality Assessment

✅ Strengths

  1. Surgical precision: The fix is minimal and targeted, adding just one boolean flag
  2. Clear logic flow: The use_fast_path condition is well-defined (≤5 files, >20 existing)
  3. Proper scoping: When structure is handled, only changedFiles are loaded for analysis instead of the entire graph
  4. Fallback safety: For large incrementals where Rust skips structure, JS correctly passes changedFiles instead of null (full rebuild)
  5. Test coverage: All incremental parity and pipeline tests pass

✅ Implementation Quality

  • Rust side (build_pipeline.rs:596): structure_handled: use_fast_path correctly reports when fast path ran
  • TypeScript side (pipeline.ts:605-616): Proper conditional logic to skip structure when result.structureHandled is true
  • Scoped reconstruction (pipeline.ts:616): Smart optimization to only load changed files for analysis when structure was already handled
  • Fallback logic (pipeline.ts:423): Correctly passes changedFiles for medium incrementals vs null for full builds

✅ Edge Case Handling

  • Early exit path maintains structure_handled: true (line 213) — correct since no work needed
  • Debug logging clearly indicates when fast path is skipped (lines 514-519)
  • WAL handoff logic remains intact and unaffected by the change

🔍 Performance Logic Validation

The fast path thresholds are well-chosen:

  • FAST_PATH_MAX_CHANGED_FILES = 5: Targets single-file edits and small batch changes
  • FAST_PATH_MIN_EXISTING_FILES = 20: Ensures meaningful codebase size before optimization kicks in

The performance improvement logic is sound:

  1. Before: Native build + unconditional JS structure rebuild + full DB load
  2. After: Native build + skip JS structure + scoped DB load

Expected improvement: ~92% reduction (562ms → ~42ms) aligns with eliminating the redundant full structure phase.

Critical Concerns

None identified. This is a clean performance optimization that:

  • Maintains engine parity (both paths tested)
  • Preserves all functionality for edge cases
  • Uses established patterns (fast path thresholds)
  • Has comprehensive test coverage

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Excellent performance fix that directly addresses a significant regression. The implementation is surgical, well-tested, and maintains backward compatibility. The only reason for 4/5 instead of 5/5 is that this is fixing a regression rather than adding new value, but it's executed perfectly within those constraints.

Next Steps: Monitor CI benchmark to confirm the ~92% improvement is achieved in practice.


@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR fixes the 1238% native incremental rebuild regression introduced in v3.9.0 by removing two expensive unconditional operations that ran after every native build: a full-graph structure rebuild (changedFilePaths=null) and a full DB node load. The fix introduces a structure_handled: bool signal on BuildPipelineResult that lets the JS layer know whether the Rust fast path already handled structure, and scopes the DB reconstruction query to changed files only when it did.

Key changes:

  • Rust BuildPipelineResult gains structure_handled: booltrue for the small-incremental fast path (≤5 content-changed files, >20 existing files)
  • JS needsStructure = !result.structureHandled correctly gates the expensive JS buildStructure call
  • JS reconstructFileSymbolsFromDb is now scoped to result.changedFiles when structure was handled, avoiding a full-graph DB load for single-file incrementals
  • For large incrementals where Rust skipped structure, JS now passes structureScope (includes reverse-dep files) instead of null to buildStructureFn, matching the JS pipeline's medium-incremental path
  • The empty-array edge case flagged in the previous review thread is addressed with isFullBuild || !changedFiles?.length ? null : changedFiles

One P2 concern identified: For pure file-deletion builds (no content changes), parse_changes.len() == 0 satisfies the ≤5 fast-path threshold, so structure_handled = true and JS skips structure. If the deleted file was the sole occupant of its directory, the directory node becomes orphaned in the DB until the next full rebuild. Pre-PR this was masked by the unconditional JS structure pass. See inline comment for a targeted fix."

Confidence Score: 4/5

Safe to merge for the vast majority of use-cases; one narrow edge case (removal-only builds with fast path) may leave orphaned directory nodes that persist until the next full rebuild.

All prior review concerns are resolved. The regression fix is logically correct and well-scoped. One P2 finding remains: the use_fast_path condition is based on parse_changes.len() which is 0 for removal-only builds, causing structure_handled = true even though update_changed_file_metrics is a no-op in that case, leaving stale directory nodes. This is a real behavioural regression from pre-PR (where JS always ran structure) but has mild, recoverable impact. Keeping at 4/5 per the guidance that at least one P2 describing a correctness issue warrants the reduction.

crates/codegraph-core/src/build_pipeline.rs — the use_fast_path guard around line 518 should consider excluding removal-only builds (parse_changes.len() > 0) to avoid orphaned directory nodes.

Important Files Changed

Filename Overview
crates/codegraph-core/src/build_pipeline.rs Adds structure_handled: bool to BuildPipelineResult; correctly derives it from use_fast_path (≤5 content-changed files, >20 existing). Minor edge case: removal-only builds set structure_handled = true with parse_changes.len() = 0, potentially leaving orphaned directory nodes.
src/domain/graph/builder/pipeline.ts JS orchestrator correctly gates structure and DB reconstruction on structureHandled; scoped file load and structureScope ?? changedFiles fallback are both correct; the previous thread's empty-array guard is properly applied.

Sequence Diagram

sequenceDiagram
    participant JS as pipeline.ts (JS)
    participant Rust as build_pipeline.rs (Rust)
    participant DB as SQLite DB

    JS->>Rust: buildGraph(rootDir, config, aliases, opts)
    Rust->>Rust: collect + detect changes
    alt No changes (early exit)
        Rust-->>JS: { earlyExit: true, structureHandled: true }
        JS-->>JS: return early-exit
    else Small incremental (≤5 changed, >20 existing)
        Rust->>DB: purge changed/removed files
        Rust->>DB: parse + insert nodes
        Rust->>DB: resolve imports + build edges
        Rust->>DB: update_changed_file_metrics (fast path)
        Rust-->>JS: { structureHandled: true, changedFiles: [...] }
        JS->>DB: reconstructFileSymbolsFromDb(changedFiles only)
        Note over JS,DB: Scoped load — avoids full graph scan
        JS->>JS: runPostNativeAnalysis(changedFiles)
    else Large incremental or full build
        Rust->>DB: purge + parse + insert + edges
        Note over Rust: Skips structure phase
        Rust-->>JS: { structureHandled: false, structureScope: [...] }
        JS->>DB: reconstructFileSymbolsFromDb(all files)
        JS->>JS: runPostNativeStructure(structureScope ?? changedFiles)
        JS->>JS: runPostNativeAnalysis(changedFiles)
    end
Loading

Reviews (3): Last reviewed commit: "fix(native): resolve use-after-move comp..." | Re-trigger Greptile

// Full builds need null (rebuild everything). Incremental builds pass the
// changed file list so buildStructure only updates those files' metrics
// and contains edges — matching the JS pipeline's medium-incremental path.
const changedFilePaths = isFullBuild ? null : (changedFiles ?? null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Empty-array changedFiles passes through ?? and silences structure updates for removal-only builds

When use_fast_path = false and the only change is file removals (no content-changed files), the Rust pipeline sets analysis_scope = Some([]) and the result carries changedFiles = []. The ?? operator only falls back to null for undefined or nullnot for an empty array — so changedFilePaths becomes [] and buildStructureFn is called with an empty list. This may skip all structure updates, leaving stale directory-containment edges for the removed files.

Pre-PR this code always passed null (full rebuild), so this edge case never triggered. A defensive fix:

Suggested change
const changedFilePaths = isFullBuild ? null : (changedFiles ?? null);
const changedFilePaths = isFullBuild || !changedFiles?.length ? null : changedFiles;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Applied the suggested guard — changedFilePaths now falls back to null when changedFiles is an empty array (removal-only builds), matching the pre-PR behavior of always doing a full structure rebuild in that case.

const changedFilePaths = isFullBuild || !changedFiles?.length ? null : changedFiles;

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Codegraph Impact Analysis

6 functions changed6 callers affected across 5 files

  • run_pipeline in crates/codegraph-core/src/build_pipeline.rs:100 (0 transitive callers)
  • NativeOrchestratorResult.structureScope in src/domain/graph/builder/pipeline.ts:267 (0 transitive callers)
  • NativeOrchestratorResult.structureHandled in src/domain/graph/builder/pipeline.ts:269 (0 transitive callers)
  • reconstructFileSymbolsFromDb in src/domain/graph/builder/pipeline.ts:307 (4 transitive callers)
  • runPostNativeStructure in src/domain/graph/builder/pipeline.ts:394 (4 transitive callers)
  • tryNativeOrchestrator in src/domain/graph/builder/pipeline.ts:548 (5 transitive callers)

…ucture scope

- Guard against empty changedFiles array passing through ?? operator in
  runPostNativeStructure, which would skip structure updates for
  removal-only builds (behavioral regression vs pre-PR)
- Remove noisy eprintln! that fires on every large-incremental build
  with a misleading message about running JS pipeline
- Add structureScope field to BuildPipelineResult containing the full
  changed_files list (including reverse-dep files) so the JS structure
  fallback path can update metrics for files whose edges changed even
  though their content didn't
@carlos-alm
Copy link
Copy Markdown
Contributor Author

All three Greptile findings addressed in 1716f4d:

1. eprintln! noise on every large-incremental build (lines 514-519)
Removed the eprintln! entirely. The JS side already knows when structure needs to run via structureHandled, so the message was both noisy and misleading — users don't need to "run JS pipeline" manually.

2. analysis_scope reuse as structure scope (lines 592-596)
Added a new structure_scope field to BuildPipelineResult that carries the full changed_files list (including reverse-dep files). The JS structure fallback now uses result.structureScope ?? result.changedFiles so reverse-dep files whose import_count/export_count changed due to edge rebuilding get their metrics updated. This matches how the fast path already uses the full changed_files in structure::update_changed_file_metrics.

3. Empty-array changedFiles edge case (line 423)
Applied the suggested guard: isFullBuild || !changedFiles?.length ? null : changedFiles. An empty array from removal-only builds now correctly falls back to null (full structure rebuild), matching pre-PR behavior.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

changed_files was moved into changed_file_list on line 544 and then
referenced again for structure_scope on line 607, which Rust rejects
as a use-after-move. Reuse changed_file_list.clone() which already
holds the correct None/Some(changed_files) value.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Fixed the use-after-move compile error identified in the Greptile summary.

changed_files was moved into changed_file_list on line 544 (Some(changed_files)) and then referenced again on line 607 for structure_scope. Since changed_file_list already holds the correct None (full builds) / Some(changed_files) (incremental builds) value from lines 541-544, the fix replaces the redundant conditional with changed_file_list.clone(). This is both correct and simpler.

Commit: 9f83fd4

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

1 similar comment
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 54ba5c0 into main Apr 5, 2026
19 checks passed
@carlos-alm carlos-alm deleted the fix/native-incremental-regression branch April 5, 2026 07:03
@github-actions github-actions bot locked and limited conversation to collaborators Apr 5, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant