feat(graph): add graph check command for pre-rebuild conflict detection#999
feat(graph): add graph check command for pre-rebuild conflict detection#999stmcgovern wants to merge 3 commits intopython-wheel-build:mainfrom
graph check command for pre-rebuild conflict detection#999Conversation
…tion Add a pre-rebuild check that answers two questions in seconds: 1. Is this graph structurally sound? (well-formed, acyclic) 2. How many wheels are extra? (collapsible vs required) The conflict classification uses the same specifier.filter logic as show_explain_duplicates and write_constraints_file. It is conservative: collapsible here guarantees write_constraints_file succeeds. Self-loops (e.g. safetensors[numpy]) are reported as warnings, not failures — they are common in extras and harmless for install deps. Real cycles (e.g. haystack-ai ↔ haystack-experimental) cause failure. Supports --json for CI integration and --constraints for pin output. Closes: python-wheel-build#998
📝 WalkthroughWalkthroughThis change introduces a new Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/fromager/commands/graph.py (2)
1075-1100: Consider making--jsonand--constraintsmutually exclusive.If both flags are specified,
--constraintssilently takes precedence. This could surprise users.Option: Add a click mutex constraint
`@click.option`( "--json", "as_json", is_flag=True, default=False, help="Output results as JSON.", ) `@click.option`( "--constraints", "as_constraints", is_flag=True, default=False, help="Output collapsible pins in constraints.txt format.", )You could add validation at the start of
check():if as_json and as_constraints: raise click.UsageError("--json and --constraints are mutually exclusive")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/commands/graph.py` around lines 1075 - 1100, The check() command accepts both --json (as_json) and --constraints (as_constraints) but currently lets --constraints silently win; add an explicit mutual-exclusion check at the start of the check() function to detect if both as_json and as_constraints are True and raise a click.UsageError (with a clear message like "--json and --constraints are mutually exclusive"). This change should be implemented inside the check() function body that receives wkctx, graph_file, as_json, as_constraints so any callers get an immediate, user-facing error instead of surprising behavior.
1113-1129: Avoid reading the graph file twice.The graph JSON is loaded at line 1115, then re-read at line 1127 via
from_file. Usefrom_dict(graph_dict)instead.Proposed fix
if not wf_issues: - graph = DependencyGraph.from_file(graph_file) + graph = DependencyGraph.from_dict(graph_dict) entries = _classify_conflicts(graph) n_nodes = len(graph) - 1 # exclude root🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/commands/graph.py` around lines 1113 - 1129, The code currently loads graph_file into graph_dict and later re-reads the file via DependencyGraph.from_file; replace that second read with DependencyGraph.from_dict(graph_dict) to avoid double I/O: after verifying wf_issues and acyclicity (using _check_well_formed and _check_acyclicity), call DependencyGraph.from_dict(graph_dict) (not from_file) to build the graph used by _classify_conflicts and to compute n_nodes, keeping the existing logic for entries = _classify_conflicts(graph) and n_nodes = len(graph) - 1.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/fromager/commands/graph.py`:
- Around line 1075-1100: The check() command accepts both --json (as_json) and
--constraints (as_constraints) but currently lets --constraints silently win;
add an explicit mutual-exclusion check at the start of the check() function to
detect if both as_json and as_constraints are True and raise a click.UsageError
(with a clear message like "--json and --constraints are mutually exclusive").
This change should be implemented inside the check() function body that receives
wkctx, graph_file, as_json, as_constraints so any callers get an immediate,
user-facing error instead of surprising behavior.
- Around line 1113-1129: The code currently loads graph_file into graph_dict and
later re-reads the file via DependencyGraph.from_file; replace that second read
with DependencyGraph.from_dict(graph_dict) to avoid double I/O: after verifying
wf_issues and acyclicity (using _check_well_formed and _check_acyclicity), call
DependencyGraph.from_dict(graph_dict) (not from_file) to build the graph used by
_classify_conflicts and to compute n_nodes, keeping the existing logic for
entries = _classify_conflicts(graph) and n_nodes = len(graph) - 1.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 59054cbc-aa40-420f-b03f-7f71ab591faa
📒 Files selected for processing (2)
src/fromager/commands/graph.pytests/test_graph_validate.py
…e file read Address review feedback: raise UsageError when both output flags are passed, and use from_dict(graph_dict) instead of re-reading the file.
Align test filename with the command name per CONTRIBUTING.md convention.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_graph_check.py (1)
347-360: Consider adding assertion forbuild_efficiencyfield.The JSON output includes
build_efficiencywithtotal_wheelsandextra_buildsper the implementation, but no test verifies this field. Minor gap.assert "build_efficiency" in data assert "total_wheels" in data["build_efficiency"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_graph_check.py` around lines 347 - 360, The test_clean_graph_json test is missing assertions for the build_efficiency section of the output; update the test (in test_clean_graph_json) to assert that "build_efficiency" exists in the parsed data dict and that it contains the expected keys (e.g., "total_wheels" and "extra_builds") and optionally that those values are present/not-None (use the local variable data to check keys and simple truthiness or type). Ensure assertions reference data["build_efficiency"]["total_wheels"] and data["build_efficiency"]["extra_builds"] so the test will fail if those fields are absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/test_graph_check.py`:
- Around line 347-360: The test_clean_graph_json test is missing assertions for
the build_efficiency section of the output; update the test (in
test_clean_graph_json) to assert that "build_efficiency" exists in the parsed
data dict and that it contains the expected keys (e.g., "total_wheels" and
"extra_builds") and optionally that those values are present/not-None (use the
local variable data to check keys and simple truthiness or type). Ensure
assertions reference data["build_efficiency"]["total_wheels"] and
data["build_efficiency"]["extra_builds"] so the test will fail if those fields
are absent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 923cb1b7-9918-481e-aa54-06d541398075
📒 Files selected for processing (1)
tests/test_graph_check.py
Pull Request Description
What
Add a pre-rebuild check that answers two questions in seconds:
The conflict classification uses the same specifier.filter logic as show_explain_duplicates and write_constraints_file. It is conservative: collapsible here guarantees write_constraints_file succeeds.
Self-loops (e.g. safetensors[numpy]) are reported as warnings, not failures — they are common in extras and harmless for install deps. Real cycles (e.g. haystack-ai ↔ haystack-experimental) cause failure.
Supports --json for CI integration and --constraints for pin output.
Why
Closes: #998