fix: clean PVC source workspace before each on-cluster build#3583
fix: clean PVC source workspace before each on-cluster build#3583Ankitsinghsisodya wants to merge 4 commits intoknative:mainfrom
Conversation
The named PVC reused across builds was accumulating stale source files, eventually causing 'No space left on device' errors during git-init. - Add CleanAndUploadToVolume which removes the source/ directory before re-extracting the tar stream, leaving the cache/ subpath intact - Switch the direct-upload path to use CleanAndUploadToVolume - Pass deleteExisting: "true" to the git-clone StepAction in both buildpack and s2i task templates so the git path also gets a clean slate Fixes knative#3516
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Ankitsinghsisodya The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @Ankitsinghsisodya. Thanks for your PR. I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with Tip We noticed you've done this a few times! Consider joining the org to skip this step and gain Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Pull request overview
This PR addresses PVC workspace bloat in on-cluster Tekton builds by ensuring each build starts from a clean source/ workspace while preserving the cache/ path for build caching.
Changes:
- Added
k8s.CleanAndUploadToVolumeto remove the PVC’ssource/directory before extracting a new source tar stream (leavingcache/intact). - Switched the non-Git (direct upload) pipeline path to use
CleanAndUploadToVolume. - Configured Tekton
git-cloneStepAction to wipe the existing checkout viadeleteExisting: "true"for Git-based builds.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/k8s/persistent_volumes.go | Adds a new clean+upload helper that removes source/ before untarring content. |
| pkg/pipelines/tekton/pipelines_provider.go | Updates direct-upload pipeline flow to use the clean+upload helper. |
| pkg/pipelines/tekton/task-buildpack.yaml.tmpl | Sets deleteExisting: "true" for git-clone to avoid accumulating workspace content. |
| pkg/pipelines/tekton/task-s2i.yaml.tmpl | Sets deleteExisting: "true" for git-clone to avoid accumulating workspace content. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // CleanAndUploadToVolume removes the "source" directory from the volume root, | ||
| // then extracts the provided tar stream into the volume. The "cache" | ||
| // subdirectory is intentionally left intact so that build-layer caches | ||
| // accumulated by previous runs are preserved. | ||
| func CleanAndUploadToVolume(ctx context.Context, content io.Reader, claimName, namespace string) error { | ||
| return runWithVolumeMounted(ctx, TarImage, []string{"sh", "-c", "umask 0000 && rm -rf source && exec tar -xmf -"}, content, claimName, namespace) | ||
| } |
There was a problem hiding this comment.
CleanAndUploadToVolume introduces new cleanup behavior (deleting the existing source/ directory while preserving cache/), but there’s no test covering that contract. Consider adding an integration test that (1) pre-populates both source/ and cache/, (2) calls CleanAndUploadToVolume, and (3) asserts source/ is replaced and cache/ remains intact, to prevent regressions that could reintroduce PVC bloat or accidentally wipe caches.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3583 +/- ##
==========================================
- Coverage 56.24% 47.08% -9.17%
==========================================
Files 180 180
Lines 20465 20521 +56
==========================================
- Hits 11511 9662 -1849
- Misses 7755 9967 +2212
+ Partials 1199 892 -307
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…full Root cause: source/ and cache/ shared a single 256Mi PVC. The buildpack layer cache (cache/) grows across builds while source/ is already cleaned by git-clone's deleteExisting default. Once cache/ consumes all available space, git-clone fails with "No space left on device" during git-init. Changes: - Add getPipelineCachePvcName and a by-name DeletePersistentVolumeClaim - Standard pipeline: delete and recreate the source PVC before every build for a guaranteed clean workspace; create a separate persistent cache PVC so the buildpack layer cache survives across builds - PAC pipeline: use volumeClaimTemplate for the source workspace so Tekton provisions a fresh, ephemeral source PVC per run and cleans it up automatically; create a persistent cache PVC at repo-config time - Revert the redundant deleteExisting: "true" additions to both task templates — the pinned git-clone StepAction already defaults to true Fixes knative#3516
- Restore subPath: source on the standard PipelineRun source-workspace binding; sourcesAsTarStream archives under source/ so the workspace mount must point at that subPath for the task to find func.yaml at the expected path - Honour build.remoteStorageClass in PAC volumeClaimTemplate by adding a StorageClassName field to templateData and conditionally emitting storageClassName in both pack and s2i PAC PipelineRun templates - Remove the dead source PVC creation from createClusterPACResources; PAC PipelineRuns now bind source-workspace via volumeClaimTemplate so the named source PVC was provisioned but never mounted
Two bugs: 1. Race condition: DeletePersistentVolumeClaim is asynchronous. The PVC enters Terminating state and the subsequent Create call was receiving AlreadyExists, treating it as success, and proceeding against a terminating object. Fix: add WaitForPVCDeletion which polls until the PVC is NotFound before returning, and call it after every delete. 2. Cache accumulation: the dedicated cache PVC was explicitly kept across builds and never trimmed. CNB does not automatically evict unused layers so the cache PVC would grow unboundedly over time — the same disk-full failure the fix was meant to prevent, just deferred. Fix: rotate both source and cache PVCs before every standard build. For PAC, switch cache workspace to volumeClaimTemplate so both workspaces are ephemeral per run with no long-lived PVCs on the cluster at all.
Root cause
source/andcache/share a single 256 Mi PVC. The buildpack layer cache (cache/) grows across builds whilesource/is already cleaned by git-clone'sdeleteExistingdefault. Oncecache/consumes all available space, git-clone fails with "No space left on device" during git-init — affecting both the standard pipeline and PAC paths.Fix
Separate source and cache into two distinct PVCs so they can no longer starve each other:
volumeClaimTemplate— Tekton provisions a fresh ephemeral PVC per run and auto-deletes itBuildpack layer cache is preserved. Every build starts from a clean source workspace. The two volumes can never fill each other's disk.
Changes
pkg/k8s/persistent_volumes.go— addDeletePersistentVolumeClaim(by name) andCleanAndUploadToVolumepkg/pipelines/tekton/resources.go— addgetPipelineCachePvcNamepkg/pipelines/tekton/pipelines_provider.go— delete+recreate source PVC before each standard build; create cache PVC separatelypkg/pipelines/tekton/pipelines_pac_provider.go— create cache PVC at repo-config timetemplates.go— addCachePvcName,PvcSizefields; addpipelinePvcSizehelpertemplates_pack.go/templates_s2i.go— standard runs reference separate PVCs; PAC runs usevolumeClaimTemplatefor sourcetask-buildpack.yaml.tmpl/task-s2i.yaml.tmpl— revert redundantdeleteExisting: "true"(already the default in the pinned StepAction)Fixes #3516
Test plan
func deploy(non-Git) 10+ times; confirm source PVC is recreated fresh each time and PVC never fillsfunc deploy(Git URL) 10+ times; confirm git-clone always succeeds and cache PVC grows independentlyfunc deleteremoves both source and cache PVCsgo test ./pkg/k8s/... ./pkg/pipelines/tekton/...