[DOCS] Add annotated partitioning documentation #27972
yuslepukhin wants to merge 6 commits into main
Pull request overview
Adds a new documentation page describing advanced ONNX Runtime graph partitioning features—layer annotation–based device assignment and CUDA capacity-aware partitioning—intended to guide users through annotation, profiling, and constrained partitioning workflows.
Changes:
- Documented `layer_ann` node annotations and runtime mapping via `session.layer_assignment_settings`.
- Documented memory-stat collection (`session.collect_node_memory_stats_to_file`) and CUDA memory-budget partitioning (`session.resource_cuda_partitioning_settings`).
- Included end-to-end examples for combining annotation-based assignment with memory-constrained partitioning.
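As a rough illustration of how the three session options listed above might be supplied from Python, the sketch below passes them through `SessionOptions.add_session_config_entry`. The option keys are the ones named in this PR. Every value is a placeholder, since the value formats are defined in the new documentation page and are not reproduced here. The split into a profiling run and a partitioning run, and the helper name `apply_entries`, are assumptions made for illustration.

```python
# Keys come from this PR; all values are placeholders (assumptions), not the
# documented formats. Consult the new documentation page for the real syntax.
PROFILING_ENTRIES = {
    "session.collect_node_memory_stats_to_file": "<path to stats file>",
}

PARTITIONING_ENTRIES = {
    "session.layer_assignment_settings": "<annotation-to-device mapping>",
    "session.resource_cuda_partitioning_settings": "<CUDA memory budget settings>",
}


def apply_entries(session_options, entries):
    """Copy each key/value pair into a SessionOptions-like object.

    `apply_entries` is a hypothetical helper; it only relies on the object
    exposing add_session_config_entry(key, value).
    """
    for key, value in entries.items():
        session_options.add_session_config_entry(key, value)
    return session_options


try:
    import onnxruntime as ort
except ImportError:  # onnxruntime is optional for reading this sketch
    ort = None

if ort is not None:
    # Assumed workflow: one run to collect per-node memory statistics,
    # then a second run that partitions under a CUDA memory budget.
    profiling_options = apply_entries(ort.SessionOptions(), PROFILING_ENTRIES)
    partitioning_options = apply_entries(ort.SessionOptions(), PARTITIONING_ENTRIES)
    # An InferenceSession would then be created with these options and a model
    # path; that step is omitted because it needs a real model and real values.
```

Keeping the entries in plain dictionaries makes it easy to review which options each run uses before any session is created.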
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
/azp run web_Release / build_onnxruntime_web
No pipelines are associated with this pull request.
This pull request introduces a new documentation page, `PartitioningWithAnnotationsAndMemoryConstraints.md`, which explains advanced ONNX Runtime features for partitioning model graphs across devices with explicit control. The doc covers how to annotate model layers for device assignment, collect per-node memory statistics, and enforce GPU memory budgets during partitioning. These features enable precise control over device placement and memory usage for large models.

The most important changes are:

New Documentation: Advanced Partitioning Features
- A new page (`PartitioningWithAnnotationsAndMemoryConstraints.md`) describing how to use ONNX Runtime's layer annotation and memory constraint features for graph partitioning.

Layer Assignment via Annotations
- `layer_ann` metadata, including manual annotation and automated annotation using Olive's `CaptureLayerAnnotations` pass.
- The `session.layer_assignment_settings` session option.

Capacity-Aware Partitioning
- The `session.resource_cuda_partitioning_settings` session option.

This is a follow-up to #27595.