Skip to content

fix: fail immediately on non-retriable navigation errors#4976

Open
andrewneilson wants to merge 1 commit intoSkyvern-AI:mainfrom
andrewneilson:non-retriable-nav-errors
Open

fix: fail immediately on non-retriable navigation errors#4976
andrewneilson wants to merge 1 commit intoSkyvern-AI:mainfrom
andrewneilson:non-retriable-nav-errors

Conversation

@andrewneilson
Copy link
Copy Markdown
Contributor

@andrewneilson andrewneilson commented Mar 4, 2026

Summary

  • Classify navigation failures into three buckets:
    • permanent: ERR_INVALID_URL
    • proxy-sensitive: ERR_NAME_NOT_RESOLVED, ERR_NAME_RESOLUTION_FAILED, ERR_CERT_*, ERR_SSL_*
    • transient: other net::ERR failures
  • Skip same-context retries for permanent and proxy-sensitive errors in navigate_with_retry()
  • Preserve the outer context-recreation retry in get_or_create_page() for proxy-sensitive and transient
    errors, since recreating the browser context may recover by selecting a different proxy node
  • Only permanent failures like ERR_INVALID_URL now bypass both retry layers
  • Use FailedToNavigateToUrl.error_message in the outer retry path when available instead of relying on
    str(e)

Test plan

  • ruff check skyvern/constants.py skyvern/webeye/navigation.py skyvern/webeye/real_browser_state.py tests/unit_tests/webeye/test_navigation_retry.py
  • ruff format --check skyvern/constants.py skyvern/webeye/navigation.py skyvern/webeye/ real_browser_state.py tests/unit_tests/webeye/test_navigation_retry.py
  • Unit coverage for:
    • inner retry short-circuit for permanent and proxy-sensitive errors
    • inner retry exhaustion for transient errors
    • transient recovery on retry
    • outer retry skip for permanent errors
    • outer context-recreation retry for DNS errors
    • outer context-recreation retry for transient net::ERR errors

🤖 Generated with Claude Code

@andrewneilson andrewneilson force-pushed the non-retriable-nav-errors branch from 97183ef to f97994e Compare March 4, 2026 11:22
Copy link
Copy Markdown
Contributor

@AronPerez AronPerez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto-Review: PR #4976

Build Status

  • All CI checks pass.

Findings Summary

  • 1 bug (see inline comment) — outer retry loop undermines the non-retriable logic
  • 1 suggestion (see inline comment) — tests duplicate production logic
  • Good idea with clean implementation, but an existing outer retry loop will still retry non-retriable errors.

Agreement with Claude's Review

  • No Claude review found.

Reviewed by AronPerez's auto-reviewer

error_str = str(e)

if any(pattern in error_str for pattern in NON_RETRIABLE_NAV_ERRORS):
LOG.warning(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The outer retry in get_or_create_page (further down in this file) catches any exception whose str() contains "net::ERR" and retries by closing the browser context. Since FailedToNavigateToUrl embeds the original error in its message (e.g., "Failed to navigate to url ... Error message: net::ERR_NAME_NOT_RESOLVED"), these non-retriable errors will still be caught and retried at the get_or_create_page level — partially defeating the purpose of this PR. Consider either:

  1. Having get_or_create_page re-raise FailedToNavigateToUrl without retrying, or
  2. Adding the non-retriable check there as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. I ended up not disabling the outer retry for all of the error types in question since the two layers serve different purposes:

  • navigate_with_retry() retries in the same page/context/proxy, which is not useful for ERR_NAME_*,
    ERR_CERT_*, and ERR_SSL_*
  • the outer get_or_create_page() retry recreates the browser context, which is the only path that can pick a different proxy node. DNS/TLS failures can be proxy-specific so fully skipping the outer retry would turn recoverable proxy failures into immediate task failures.
  • I've kept permanent errors (currently just ERR_INVALID_URL) in a separate category that skips both

from skyvern.constants import NON_RETRIABLE_NAV_ERRORS
from skyvern.exceptions import FailedToNavigateToUrl


Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: These tests re-implement navigate_to_url's retry logic in a local helper rather than testing the real RealBrowserState.navigate_to_url. If the production code changes its retry behavior, these tests will still pass. Consider mocking the heavy dependencies and testing the real method, or extracting the retry logic into a standalone testable function.

Also worth considering: adding net::ERR_SSL_ to the non-retriable patterns in constants.py — errors like ERR_SSL_PROTOCOL_ERROR indicate equally non-retriable TLS incompatibilities.

@andrewneilson andrewneilson force-pushed the non-retriable-nav-errors branch from f97994e to 8ac1a61 Compare March 19, 2026 15:16
@andrewneilson
Copy link
Copy Markdown
Contributor Author

made these changes to address feedback:

    fix: three-tier nav error classification for proxy-aware retry

    Split navigation errors into permanent (ERR_INVALID_URL) and
    proxy-sensitive (DNS/TLS) categories. Permanent errors bypass all
    retry layers. Proxy-sensitive errors skip same-context retry but
    still get the outer context-recreation path in get_or_create_page,
    which can pick a different proxy node.

    Also extracts navigate_with_retry into a shared helper and uses
    the typed FailedToNavigateToUrl.error_message field in the outer
    except block instead of falling back to str(e).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andrewneilson andrewneilson force-pushed the non-retriable-nav-errors branch from 8ac1a61 to 8cb76d3 Compare April 11, 2026 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants