[DOCS] Add annotated partitioning documentation by yuslepukhin · Pull Request #27972 · microsoft/onnxruntime

yuslepukhin · 2026-04-03T18:01:55Z

This pull request introduces a new documentation page, PartitioningWithAnnotationsAndMemoryConstraints.md, which explains advanced ONNX Runtime features for partitioning model graphs across devices with explicit control. The doc covers how to annotate model layers for device assignment, collect per-node memory statistics, and enforce GPU memory budgets during partitioning. These features enable precise control over device placement and memory usage for large models.

The most important changes are:

New Documentation: Advanced Partitioning Features

Adds a comprehensive guide (PartitioningWithAnnotationsAndMemoryConstraints.md) describing how to use ONNX Runtime’s layer annotation and memory constraint features for graph partitioning.

Layer Assignment via Annotations

Explains how to annotate ONNX model nodes with layer_ann metadata, including manual annotation and automated annotation using Olive’s CaptureLayerAnnotations pass.
Provides configuration examples for mapping annotation patterns to devices at runtime using the session.layer_assignment_settings session option.
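As a rough illustration of the runtime side, the sketch below sets the session.layer_assignment_settings key through the add_session_config_entry method that onnxruntime.SessionOptions exposes. The helper name apply_config_entries and the RecordingOptions stand-in are hypothetical and exist only so the snippet runs without ONNX Runtime installed. The value string is a placeholder: the actual pattern-to-device syntax is defined in the new documentation page and is not reproduced here.

```python
from typing import Dict

# Session option key named in this PR description.
LAYER_ASSIGNMENT_KEY = "session.layer_assignment_settings"


class RecordingOptions:
    """Hypothetical stand-in for onnxruntime.SessionOptions.

    It records entries instead of configuring a real session, so the sketch
    can run without ONNX Runtime installed.
    """

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def add_session_config_entry(self, key: str, value: str) -> None:
        self.entries[key] = value


def apply_config_entries(session_options, entries: Dict[str, str]) -> None:
    """Copy each key/value pair onto an object exposing add_session_config_entry."""
    for key, value in entries.items():
        session_options.add_session_config_entry(key, value)


# Placeholder value: the real mapping syntax comes from the documentation page.
mapping_value = "<annotation-pattern-to-device mapping>"

options = RecordingOptions()
apply_config_entries(options, {LAYER_ASSIGNMENT_KEY: mapping_value})
print(options.entries)
```

With ONNX Runtime installed, passing an onnxruntime.SessionOptions() instance in place of RecordingOptions should work the same way, since the helper only relies on add_session_config_entry.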

Capacity-Aware Partitioning

Details a two-phase workflow for profiling per-node memory usage and then enforcing a memory budget with the session.resource_cuda_partitioning_settings session option.
Covers both profiling-based and ad-hoc (estimation-only) approaches for memory-constrained partitioning. (docs/annotated_partitioning/PartitioningWithAnnotationsAndMemoryConstraints.md, R1-R267)
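The two-phase workflow can be sketched as two separate sets of session config entries: one for a profiling run and one for a later budget-constrained run. The key names are the ones mentioned in this PR (session.collect_node_memory_stats_to_file appears in the review summary). The function names are hypothetical, and the partitioning value is treated as an opaque string because its exact format is specified in the documentation page rather than here.

```python
from typing import Dict

# Session option keys named in this PR.
COLLECT_STATS_KEY = "session.collect_node_memory_stats_to_file"
CUDA_PARTITIONING_KEY = "session.resource_cuda_partitioning_settings"


def profiling_entries(stats_file: str) -> Dict[str, str]:
    """Phase 1 (assumed shape): ask the session to write per-node memory stats."""
    if not stats_file:
        raise ValueError("stats_file must be a non-empty path")
    return {COLLECT_STATS_KEY: stats_file}


def constrained_entries(partitioning_settings: str) -> Dict[str, str]:
    """Phase 2 (assumed shape): pass the memory-budget settings string through.

    The string is not parsed or validated here; its format is defined by the
    documentation page, so this sketch only forwards it under the right key.
    """
    if not partitioning_settings:
        raise ValueError("partitioning_settings must be a non-empty string")
    return {CUDA_PARTITIONING_KEY: partitioning_settings}


# Hypothetical file name and placeholder settings value, for illustration only.
phase1 = profiling_entries("node_memory_stats.csv")
phase2 = constrained_entries("<budget settings string>")
print(phase1)
print(phase2)
```

Each dictionary would then be applied to its own session options object (for example via add_session_config_entry on onnxruntime.SessionOptions), keeping the profiling run and the constrained run separate.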

This is a follow-up to #27595.

Copilot

Pull request overview

Adds a new documentation page describing advanced ONNX Runtime graph partitioning features—layer annotation–based device assignment and CUDA capacity-aware partitioning—intended to guide users through annotation, profiling, and constrained partitioning workflows.

Changes:

Documented layer_ann node annotations and runtime mapping via session.layer_assignment_settings.
Documented memory-stat collection (session.collect_node_memory_stats_to_file) and CUDA memory-budget partitioning (session.resource_cuda_partitioning_settings).
Included end-to-end examples for combining annotation-based assignment with memory-constrained partitioning.

docs/annotated_partitioning/ParitioningWithAnnotationsAndMemoryConstraints.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

docs/annotated_partitioning/PartitioningWithAnnotationsAndMemoryConstraints.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

docs/annotated_partitioning/PartitioningWithAnnotationsAndMemoryConstraints.md

yuslepukhin · 2026-04-04T00:03:53Z

/azp run web_Release / build_onnxruntime_web

azure-pipelines · 2026-04-04T00:04:01Z

No pipelines are associated with this pull request.

Add annotated partitioning

93ab698

yuslepukhin requested a review from Copilot April 3, 2026 18:01

yuslepukhin added the release:1.25.0 label Apr 3, 2026

Copilot started reviewing on behalf of yuslepukhin April 3, 2026 18:03

Copilot AI reviewed Apr 3, 2026

Address Copilot review comments

0e38eff

yuslepukhin requested review from Copilot and tianleiwu April 3, 2026 18:23

Copilot AI reviewed Apr 3, 2026

Copilot started reviewing on behalf of yuslepukhin April 3, 2026 18:34

Address review round 2

79be28c

yuslepukhin requested a review from Copilot April 3, 2026 18:37

Copilot started reviewing on behalf of yuslepukhin April 3, 2026 18:38

Copilot AI reviewed Apr 3, 2026

Address Copilot review round 3

2fe2614