fix: build-push-docker-manifest polling #1567

Merged: erikburt merged 4 commits into main from fix/docker-manifest-polling on May 28, 2026

@erikburt erikburt commented May 28, 2026

This adds polling for the pushed manifest before trying to get its digest from ECR.

Changes

Notes

We are seeing exit code 255 during the manifest push step. Example: https://github.com/smartcontractkit/chainlink/actions/runs/26590320161/job/78349080233?pr=22679

#1 0.000 copying sha256:2531e76391220f8e6ab3d252722f30143ea2f220cf53472633372303bff1aed6 from ***.dkr.ecr.us-west-2.amazonaws.com/chainlink@sha256:db57e3d3a3ecc538a0cbb810da9860179bb1c6e1abfd1476b1bba7a8404ca9e0 to ***.dkr.ecr.us-west-2.amazonaws.com/chainlink
#1 1.306 pushing sha256:3ee95908cced393678e9214d8325c66e40e7ee08d5506cc9f4ee055d42648726 to ***.dkr.ecr.us-west-2.amazonaws.com/chainlink:pr-22679-b2b70d0-plugins-testing
#1 DONE 1.7s
Error: Process completed with exit code 255.
  • After we push the arch-specific images, and before we push the manifest index, we poll for those images to ensure that the manifest will be valid.
    • - name: Wait for source images to be available
        shell: bash
        env:
          DOCKER_MANIFEST_NAME: ${{ steps.manifest-name.outputs.name }}
          DOCKER_IMAGE_NAME_DIGESTS: ${{ inputs.docker-image-name-digests }}
          DOCKER_IMAGE_AVAILABILITY_CHECK:
            ${{ inputs.docker-image-availability-check }}
        run: |
          # Convert comma-separated list into array
          IFS=',' read -ra DIGESTS <<< "$DOCKER_IMAGE_NAME_DIGESTS"
          if [[ "${DOCKER_IMAGE_AVAILABILITY_CHECK}" == "true" ]]; then
            echo "Checking image availability before creating manifest..."
            MAX_RETRIES=5
            RETRY_DELAY=10
            ALL_AVAILABLE=false
            for attempt in $(seq 1 $MAX_RETRIES); do
              echo "Attempt ${attempt}/${MAX_RETRIES}: Checking image availability..."
              MISSING_IMAGES=()
              for digest in "${DIGESTS[@]}"; do
                # Trim whitespace from digest
                digest=$(echo "$digest" | xargs)
                IMAGE_WITH_DIGEST="${DOCKER_MANIFEST_NAME}@${digest}"
                if docker buildx imagetools inspect "${IMAGE_WITH_DIGEST}" >/dev/null 2>&1; then
                  echo " ✓ ${digest} is available"
                else
                  echo " ⏳ ${digest} not yet available"
                  MISSING_IMAGES+=("${digest}")
                fi
              done
              if [[ ${#MISSING_IMAGES[@]} -eq 0 ]]; then
                echo "✅ All images are available!"
                ALL_AVAILABLE=true
                break
              else
                echo "❌ ${#MISSING_IMAGES[@]} image(s) still missing: ${MISSING_IMAGES[*]}"
                if [[ $attempt -lt $MAX_RETRIES ]]; then
                  echo "Waiting ${RETRY_DELAY} seconds before next attempt..."
                  sleep $RETRY_DELAY
                fi
              fi
            done
            if [[ "$ALL_AVAILABLE" != "true" ]]; then
              echo "::error::Failed to verify image availability after ${MAX_RETRIES} attempts"
              echo "::error::Missing images: ${MISSING_IMAGES[*]}"
              exit 1
            fi
          else
            echo "Image availability check disabled, using fixed delay..."
            echo "Waiting 30 seconds for source images to be fully available..."
            sleep 30
          fi
  • But after we pushed the manifest, we immediately queried ECR for the digest, which I believe is why we were seeing errors.
  • # Execute the command
    docker buildx imagetools create "${CMD_ARGS[@]}"
    # Get manifest digest (format: sha256:hash)
    MANIFEST_DIGEST=$(docker buildx imagetools inspect "${DOCKER_MANIFEST_NAME_WITH_TAG}" | grep -m1 'Digest:' | awk '{print $2}')
  • This fixes that issue by similarly polling for the manifest, ensuring its existence, and safely extracting the digest.
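
The manifest polling described in the last bullet might look roughly like the sketch below. This is not the code merged in this PR, which is not quoted here. The function names `get_manifest_digest` and `inspect_manifest`, the 10-second default delay, and the `sha256:<64 hex>` validation are assumptions made for illustration. The log messages and the 5-attempt default mirror the Testing output and the existing image availability check.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: poll a registry for a freshly pushed manifest and
# extract its digest only once the inspect call succeeds.

# Thin wrapper around the real inspect command (assumed helper name).
# Keeping it separate lets the retry loop be exercised without a registry.
inspect_manifest() {
  docker buildx imagetools inspect "$1"
}

# Usage: get_manifest_digest <image:tag> [max_retries] [retry_delay_seconds]
# Prints the digest on stdout; progress and errors go to stderr.
get_manifest_digest() {
  local image="$1"
  local max_retries="${2:-5}"
  local retry_delay="${3:-10}"
  local attempt output digest

  for attempt in $(seq 1 "$max_retries"); do
    echo "Attempt ${attempt}/${max_retries}: Inspecting manifest (${image}) to retrieve digest..." >&2

    # Capture the output first so a failed inspect never reaches the parser.
    if output=$(inspect_manifest "$image" 2>/dev/null); then
      # Take the first "Digest:" line, which describes the manifest itself.
      digest=$(awk '/Digest:/ {print $2; exit}' <<< "$output")

      # Accept only a well-formed sha256 digest (assumed validation rule).
      if [[ "$digest" =~ ^sha256:[0-9a-f]{64}$ ]]; then
        echo "Successfully retrieved manifest digest on attempt ${attempt}: ${digest}" >&2
        echo "$digest"
        return 0
      fi
    fi

    if [[ "$attempt" -lt "$max_retries" ]]; then
      echo "Waiting ${retry_delay} seconds before next attempt..." >&2
      sleep "$retry_delay"
    fi
  done

  echo "::error::Failed to retrieve manifest digest after ${max_retries} attempts" >&2
  return 1
}
```

In a workflow step this could be called as `MANIFEST_DIGEST=$(get_manifest_digest "${DOCKER_MANIFEST_NAME_WITH_TAG}")`. Unlike the original one-line pipeline, a failed or empty inspect result is retried instead of failing the step on the first attempt.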

Testing

Attempt 1/5: Inspecting manifest (***.dkr.ecr.us-west-2.amazonaws.com/chainlink:pr-22681-8bb010f) to retrieve digest...
Successfully retrieved manifest digest on attempt 1: sha256:6008273de81214df86c3a3e809167a32c60fd00827be1b08bacfd970c133ad41
manifest-digest=sha256:6008273de81214df86c3a3e809167a32c60fd00827be1b08bacfd970c133ad41
manifest-name=***.dkr.ecr.us-west-2.amazonaws.com/chainlink
manifest-name-with-digest=***.dkr.ecr.us-west-2.amazonaws.com/chainlink@sha256:6008273de81214df86c3a3e809167a32c60fd00827be1b08bacfd970c133ad41
manifest-name-with-tag=***.dkr.ecr.us-west-2.amazonaws.com/chainlink:pr-22681-8bb010f

^ This at least proves there's no regression. We will see if it fixes the issue that actually caused the 255 exits.

@erikburt erikburt self-assigned this May 28, 2026
@erikburt erikburt requested a review from chainchad May 28, 2026 19:20
@erikburt erikburt marked this pull request as ready for review May 28, 2026 19:20
@erikburt erikburt requested a review from a team as a code owner May 28, 2026 19:20
@erikburt erikburt merged commit a8bc5b6 into main May 28, 2026
18 checks passed
@erikburt erikburt deleted the fix/docker-manifest-polling branch May 28, 2026 19:56