ci: add SynapseML-Internal compatibility check to OSS pipeline #2542

Open
BrendanWalsh wants to merge 5 commits into master from brwals/internal-compat-check

@BrendanWalsh
Collaborator

Summary

Adds a non-blocking CI job (InternalCompat) to the OSS pipeline that validates SynapseML-Internal compiles and passes unit tests against the current OSS build. This catches breaking changes before they reach Internal.

What it does

  1. publishM2 — builds OSS JARs and publishes to local Maven repo
  2. Retarget — seds Internal's build.sbt to use the OSS version from this build + adds Resolver.mavenLocal
  3. Compile — runs sbt compile Test/compile against retargeted Internal
  4. Unit tests — creates Internal's conda env (with Synapse-Conda feed auth), fetches AI service secrets from mmlspark-keys, and runs spark.aifunc tests (128 tests)
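
The retarget step (step 2) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual script: the `build.sbt` contents, the `val synapseMLVersion = "..."` layout, and the version string are assumptions made so the example runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for Internal's build.sbt; the real file's layout is not shown in this PR.
workdir="$(mktemp -d)"
cat > "$workdir/build.sbt" <<'EOF'
val synapseMLVersion = "1.0.0"
libraryDependencies += "com.microsoft.azure" %% "synapseml-core" % synapseMLVersion
EOF

# Version produced by the OSS build in this run (placeholder value).
oss_version="1.0.0-42-abcdef-SNAPSHOT"

# 1. Point synapseMLVersion at the just-built OSS version.
sed -i.bak -E "s|^(val synapseMLVersion = )\"[^\"]*\"|\1\"${oss_version}\"|" "$workdir/build.sbt"

# 2. Append the local Maven resolver so sbt can find artifacts published by publishM2.
printf '\nresolvers += Resolver.mavenLocal\n' >> "$workdir/build.sbt"

cat "$workdir/build.sbt"
```

Using `sed -i.bak` keeps a backup of the original file and works with both GNU and BSD sed, which may matter if the same script is tried locally on macOS.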

Design decisions

  • Always runs, never blocks: the job sets continueOnError: true, so failures surface as warnings rather than build failures
  • spark.aifunc only — other test packages (powerbi, ebm, predict) extend HasSparkSession which eagerly initializes FabricTestConstants, requiring Fabric credentials from fabrictest-cert-admin-kv (not available in the OSS pipeline)
  • No Java pin — Internal uses agent-default Java 11 (not Java 8 like OSS), so we match that
  • Disk cleanup — Internal's conda env pulls PyTorch/CUDA (~15GB); we free ~30GB by removing Android SDK, .NET, GHC, Boost, and Docker images
  • CREATE_SEMPY_WRITER=false — the SemPy parquet writer dotnet codegen step is not needed for compat testing
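
A disk cleanup step along the lines described above might look like the sketch below. The directory paths are assumptions based on common hosted-agent image layouts (the PR does not list the exact paths it removes), and the function defaults to a dry run so it is safe to try anywhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Candidate directories are ASSUMPTIONS; verify them on the actual agent image.
cleanup_targets=(
  /usr/local/lib/android   # Android SDK
  /usr/share/dotnet        # .NET
  /opt/ghc                 # GHC
  /usr/local/share/boost   # Boost
)

# With DRY_RUN=1 (the default here) the function only prints what it would do.
free_disk() {
  local dry_run="${DRY_RUN:-1}"
  local dir
  for dir in "${cleanup_targets[@]}"; do
    if [ "$dry_run" = "1" ]; then
      echo "would remove: $dir"
    else
      sudo rm -rf "$dir"
    fi
  done
  if [ "$dry_run" = "1" ]; then
    echo "would run: docker image prune --all --force"
  else
    docker image prune --all --force
  fi
}

df -h /      # show free space before cleanup
free_disk
```

Setting `DRY_RUN=0` would perform the deletions. Note the ordering constraint mentioned in the design decisions: once .NET is removed, any build step that invokes dotnet (such as the SemPy writer codegen) has to be disabled.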

CI validation

  • Build #213700535 — ✅ all green, 128/128 spark.aifunc tests passed
  • Prior runs validated compile-only, disk space, feed auth, and Java version fixes

Changes

All changes are in pipeline.yaml:

  • Added ADO repo resource for SynapseML-Internal
  • Added InternalCompat job (~100 lines)

Add an InternalCompat job that validates SynapseML-Internal compiles
against the current OSS build. Triggered via the testInternalCompat
pipeline parameter (default: false).

The job:
1. Checks out both OSS and Internal repos
2. Publishes OSS JARs to local Maven (~/.m2) via sbt publishM2
3. Retargets Internal's build.sbt to use the just-built OSS version
   (sed-replaces synapseMLVersion and adds Resolver.mavenLocal)
4. Runs sbt compile + Test/compile on Internal

This catches API-breaking changes (removed classes, changed signatures,
renamed packages) before they land in a release and break Internal.

Locally validated:
- sed patterns correctly modify Internal's build.sbt
- sbt version extraction works (core/version -> [info] line parsing)
- Internal compile + Test/compile succeeds against OSS artifacts in M2
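
The version extraction mentioned above (parsing the `[info]` line printed by `core/version`) could be done along these lines. The sample log text and the `extract_version` helper are hypothetical; real sbt output varies by project and sbt version.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: keep the last "[info] <version>" line and strip the prefix.
extract_version() {
  grep -E '^\[info\] [0-9]+\.[0-9]+' | tail -n 1 | sed -E 's/^\[info\] //'
}

# Illustrative sample only; in the pipeline the input would come from
# `sbt core/version` with colored output disabled.
sample_output='[info] welcome to sbt (hypothetical log line)
[info] loading settings for project root from build.sbt ...
[info] 1.0.0-42-abcdef-SNAPSHOT'

oss_version="$(printf '%s\n' "$sample_output" | extract_version)"
echo "OSS version: $oss_version"
```

Matching only lines where a digit immediately follows `[info] ` is one way to skip sbt's startup chatter; it assumes the version starts with a numeric major.minor pair.
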
The conda env creation was failing because synapseml-utils is in the
private A365/Synapse-Conda ADO feed. Added PipAuthenticate@1 before
conda env create, matching SynapseML-Internal's templates/conda.yml.

Also split conda setup into discrete steps (PATH, permissions, auth,
TOS, create) for clearer logs.

Internal's environment.yaml pulls PyTorch, CUDA libs, etc. (~15GB).
Agent disk fills up before pip finishes. Remove Android SDK, .NET,
Boost, GHC, and docker images to reclaim ~30GB. Also add pip cache
purge after env creation and bump job timeout to 90min.

The generateSemPyParquetWriterToolTask runs dotnet publish during
Compile/managedResources when CREATE_SEMPY_WRITER=true. We removed
the .NET SDK in the disk cleanup step, and the SemPy writer isn't
needed for compat testing. Set CREATE_SEMPY_WRITER=false.

1. Fix false-green: the || echo pattern swallowed exit codes, so the
   step always reported success even when all tests failed. Now tracks
   failures and exits non-zero.

2. Only run spark.aifunc tests. The other packages (powerbi, ebm,
   predict) extend HasSparkSession which eagerly initializes
   FabricTestConstants.INTEGRATION_WORKSPACE_ID — this requires
   INTEGRATION_ACCOUNT from fabrictest-cert-admin-kv, a Fabric-only
   Key Vault we don't have access to in the OSS pipeline.
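
The false-green fix in item 1 can be illustrated with a small sketch. `run_suite` is a hypothetical stand-in for the real per-package test command; the point is only the difference between swallowing the exit code with `|| echo` and counting failures.

```shell
#!/usr/bin/env bash
set -uo pipefail   # deliberately no -e: keep going after a failed suite

# Hypothetical stand-in for a per-package test invocation.
run_suite() {
  case "$1" in
    passing) return 0 ;;
    *)       return 1 ;;
  esac
}

run_all() {
  local failures=0
  local suite
  for suite in "$@"; do
    # Anti-pattern: `run_suite "$suite" || echo "failed"` always yields exit code 0.
    if ! run_suite "$suite"; then
      echo "FAILED: $suite"
      failures=$((failures + 1))
    fi
  done
  echo "$failures suite(s) failed"
  [ "$failures" -eq 0 ]   # non-zero status if anything failed
}

run_all passing passing && echo "exit code: 0"
run_all passing broken || echo "exit code: $?"
```

In a pipeline step, ending the script with the status of `run_all` lets the step fail visibly, while `continueOnError: true` on the job still keeps the overall build from being blocked.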
Copilot AI review requested due to automatic review settings April 4, 2026 01:09
@github-actions

github-actions bot commented Apr 4, 2026

Hey @BrendanWalsh 👋!
Thank you so much for contributing to our repository 🙌.
Someone from SynapseML Team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

To test your commit locally, please follow our guide on building from source.
Check out the developer guide for additional guidance on testing your change.

@github-actions

github-actions bot commented Apr 4, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 3143041.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Contributor

Copilot AI left a comment

Pull request overview

Adds an Azure DevOps pipeline job to continuously validate that the closed-source SynapseML-Internal repo still compiles and passes a targeted unit-test subset when built against the current OSS SynapseML artifacts, helping detect breaking changes earlier.

Changes:

  • Adds a SynapseML-Internal repository resource to the pipeline.
  • Introduces a new non-blocking InternalCompat job that publishes OSS artifacts to local Maven, retargets Internal to that version, compiles, creates the Internal conda environment, and runs spark.aifunc tests.
  • Publishes Internal test results while keeping the job non-gating (continueOnError: true).

| File | Description |
| --- | --- |
| pipeline.yaml | Adds a repo resource and a new non-blocking CI job to compile/test SynapseML-Internal against locally published OSS artifacts. |

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

Comment on lines +3 to +4
  - repository: self
    type: self

Copilot AI Apr 4, 2026

In the resources.repositories block, declaring the pipeline repo (repository: self) is typically unnecessary (the pipeline already has an implicit self repo for checkout: self). Consider removing this entry to reduce confusion and keep only the external SynapseML-Internal repository resource.

Suggested change
  - repository: self
    type: self

Collaborator Author

The explicit self declaration is needed here. The original pipeline had - repo: self (shorthand syntax). When restructuring to the full resources.repositories block to add the SynapseML-Internal external repo, the self entry must be included — ADO requires it in the list format for multi-repo checkout (checkout: self + checkout: SynapseML-Internal) to work correctly.
