fix: retry transient failures in container image metadata fetch#386

Open
Copilot wants to merge 2 commits into main from copilot/fix-container-detail-fetching

Conversation

Copilot AI commented Jun 25, 2026

Contributor

Registry requests for container image metadata intermittently fail with timeouts or transient network errors. Because there was no retry logic, a single transient failure caused the entire fetch to fail immediately.

Backend (docker-registry.js)

  • Added withRetry helper: retries up to 2 times with linear backoff (2 s, 4 s) on transient errors — timeout, ECONNRESET, ECONNREFUSED, ETIMEDOUT, ENOTFOUND, HTTP 5xx
  • Non-transient errors (404, auth failures) are rethrown immediately without retrying
  • getImageConfig (the entry point for the /metadata API) is wrapped with withRetry
async function withRetry(fn, maxRetries = 2, retryDelay = 2000) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxRetries) throw err;
      const isTransient = err.message.includes('timeout') || /HTTP 5\d\d/.test(err.message) || ...;
      if (!isTransient) throw err;
      await new Promise(resolve => setTimeout(resolve, retryDelay * (attempt + 1)));
    }
  }
}

async function getImageConfig(registry, repo, tag) {
  return withRetry(() => _getImageConfig(registry, repo, tag));
}
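
The transient-error check in the snippet above is partly elided (`|| ...`). As a rough sketch only, the classification described in the bullet list could look like the following. The helper name `isTransientError` and the choice to inspect both `err.code` and `err.message` are assumptions made for illustration, not the PR's actual code.

```javascript
// Hypothetical helper (not from the PR): classifies an error as transient
// using the categories listed in the description above.
const TRANSIENT_CODES = ['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'ENOTFOUND'];

function isTransientError(err) {
  // Guard against errors that lack a message or code.
  const message = String((err && err.message) || '');
  const code = (err && err.code) || '';

  if (message.includes('timeout')) return true;   // request timeouts
  if (/HTTP 5\d\d/.test(message)) return true;    // server-side HTTP errors
  // Assumption: network errors may carry the code on err.code or in the message.
  return TRANSIENT_CODES.some(c => code === c || message.includes(c));
}
```

Under these assumptions, an error with the message `HTTP 503 Service Unavailable` is treated as retryable, while `HTTP 404 Not Found` is rethrown immediately.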

Frontend (ContainerFormPage.tsx)

  • When all retries are exhausted, the error message now ends with the appended text ' — Use "Fetch metadata" to try again.', which directs the user to the existing retry button.
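
The component change itself is not shown in this description. A minimal sketch of the appending step, written in plain JavaScript, might look like this. The names `RETRY_HINT` and `withRetryHint` are hypothetical, and the guard against appending the hint twice is an added assumption.

```javascript
// Hypothetical helper (not from the PR): appends the retry hint quoted in the
// description to an error message, at most once.
const RETRY_HINT = ' — Use "Fetch metadata" to try again.';

function withRetryHint(message) {
  const base = String(message ?? '');
  // Avoid duplicating the hint if the message already ends with it.
  return base.endsWith(RETRY_HINT) ? base : base + RETRY_HINT;
}
```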

- Add `withRetry` helper to docker-registry.js that retries up to 2 times
  on transient errors (timeout, ECONNRESET, ECONNREFUSED, ETIMEDOUT,
  ENOTFOUND, HTTP 5xx) with linear backoff (2s, 4s)
- Wrap `getImageConfig` with `withRetry` so registry timeouts no longer
  fail the metadata endpoint on the first attempt
- Improve frontend error message to hint the user can click
  "Fetch metadata" to retry manually if all retries are exhausted

Closes #385
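
To make the "linear backoff (2s, 4s)" figure concrete, the sketch below reproduces the delay arithmetic from `withRetry` using its default arguments. The function name `backoffDelays` is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper (not from the PR): lists the wait before each retry,
// mirroring `retryDelay * (attempt + 1)` from withRetry.
function backoffDelays(maxRetries = 2, retryDelay = 2000) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(retryDelay * (attempt + 1));
  }
  return delays;
}

const delays = backoffDelays();                                // [2000, 4000]
const totalWaitMs = delays.reduce((sum, d) => sum + d, 0);     // 6000
console.log(delays, totalWaitMs);
```

With the defaults, a request that keeps failing transiently is attempted three times in total and spends up to 6 seconds waiting between attempts, in addition to the time each attempt takes.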
Copilot AI changed the title from "[WIP] Fix container detail fetching timeout issue" to "fix: retry transient failures in container image metadata fetch" Jun 25, 2026
Copilot AI requested a review from runleveldev June 25, 2026 17:06
runleveldev marked this pull request as ready for review June 25, 2026 17:15