[pull] master from apify:master #226

Merged: pull[bot] merged 3 commits into threatcode:master from apify:master on Jun 22, 2026
pull[bot] commented Jun 22, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

Copilot AI and others added 3 commits June 22, 2026 13:12
…nt reuse (#3771)

`ImpitHttpClient` always reused cached `Impit` instances, which might be
undesirable, as described in #3769. This change introduces an explicit
cache toggle so callers can opt out of client reuse.

Adds `cacheClients?: boolean` to `ImpitHttpClient` constructor options
(default: `true`), with JSDoc on the new option.

```ts
const httpClient = new ImpitHttpClient({
  cacheClients: false,
  // other impit options...
});
```
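
To illustrate what such a toggle changes, here is a minimal sketch of a cache that can be switched off. It is not the `ImpitHttpClient` implementation: `FakeClient`, `ClientProvider`, `getClient` and the string cache key are all hypothetical names chosen for the example, and the source does not say how the real client keys its cache.

```typescript
// Hypothetical stand-in for an expensive-to-create HTTP client (not the real Impit class).
class FakeClient {
    static created = 0;
    readonly key: string;

    constructor(key: string) {
        this.key = key;
        FakeClient.created += 1;
    }
}

interface ClientProviderOptions {
    // Mirrors the `cacheClients` option described above; defaults to true.
    cacheClients?: boolean;
}

// Hypothetical wrapper that either reuses clients per key or builds a fresh one each time.
class ClientProvider {
    private readonly cache = new Map<string, FakeClient>();
    private readonly cacheClients: boolean;

    constructor(options: ClientProviderOptions = {}) {
        this.cacheClients = options.cacheClients ?? true;
    }

    getClient(key: string): FakeClient {
        // With caching disabled, every call gets its own instance.
        if (!this.cacheClients) return new FakeClient(key);

        let client = this.cache.get(key);
        if (!client) {
            client = new FakeClient(key);
            this.cache.set(key, client);
        }
        return client;
    }
}

const cached = new ClientProvider();
const uncached = new ClientProvider({ cacheClients: false });

// Same key twice: the default provider reuses, the opted-out provider does not.
const cachedReused = cached.getClient('proxy-a') === cached.getClient('proxy-a');
const uncachedReused = uncached.getClient('proxy-a') === uncached.getClient('proxy-a');
```

Defaulting the flag to `true` keeps existing callers on the reuse path, which matches the default stated above.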

Fixes #3769
…#3765)

## What & why

`RequestProvider.addRequestsBatched` retried `unprocessedRequests` via a
recursive helper with no retry limit. When the request queue permanently
rejects a request — e.g. a `400` from the platform batch-add API for a
malformed `userData` shape — the same request kept coming back as
unprocessed, so the recursion never terminated and the crawler hung
forever with no actionable error.

## Fix

Bound the recursion to a small number of *consecutive no-progress*
attempts (`MAX_UNPROCESSED_REQUESTS_RETRIES = 3`). Once exhausted, the
remaining unprocessed requests are skipped and a warning is logged
pointing at the likely cause. Any progress (some requests accepted)
resets the counter, so transient backpressure is still retried as
before.
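
The control flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: it uses a loop where the original uses a recursive helper, it is synchronous for brevity whereas the real method is asynchronous, and `addWithBoundedRetries`, `BatchResult`, `addBatch` and `warn` are hypothetical names. Only the constant name and its value come from the description; whether the first attempt counts toward the limit is also an assumption.

```typescript
interface BatchResult<T> {
    processed: T[];
    unprocessed: T[];
}

// Name and value taken from the commit description.
const MAX_UNPROCESSED_REQUESTS_RETRIES = 3;

// Hypothetical helper: retries unprocessed items, giving up after several
// consecutive attempts in which nothing new was accepted.
function addWithBoundedRetries<T>(
    requests: T[],
    addBatch: (batch: T[]) => BatchResult<T>,
    warn: (message: string) => void = console.warn,
): BatchResult<T> {
    const processed: T[] = [];
    let pending = requests;
    let noProgressAttempts = 0;

    while (pending.length > 0) {
        const result = addBatch(pending);
        processed.push(...result.processed);

        if (result.unprocessed.length < pending.length) {
            // Some requests were accepted: treat it as transient backpressure and reset.
            noProgressAttempts = 0;
        } else {
            noProgressAttempts += 1;
        }
        pending = result.unprocessed;

        if (pending.length > 0 && noProgressAttempts >= MAX_UNPROCESSED_REQUESTS_RETRIES) {
            warn(`Skipping ${pending.length} request(s) that the queue keeps rejecting.`);
            break;
        }
    }

    return { processed, unprocessed: pending };
}

// Usage: a fake queue that permanently rejects the item 'bad'.
let calls = 0;
const warnings: string[] = [];
const outcome = addWithBoundedRetries(
    ['ok-1', 'bad', 'ok-2'],
    (batch) => {
        calls += 1;
        return {
            processed: batch.filter((r) => r !== 'bad'),
            unprocessed: batch.filter((r) => r === 'bad'),
        };
    },
    (message) => warnings.push(message),
);
```

In this run the first call makes progress and the next three do not, so the helper stops after four calls and reports `'bad'` as skipped instead of looping forever.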

Closes #3764
## What

Lets users declare a `label → userData` map once and pass it as the
router's `Routes` type argument, so route handlers get
`request.userData` typed by label — and unknown labels become a
compile-time error.

```ts
interface Routes {
    PRODUCT: { sku: string; price: number };
    CATEGORY: { categoryId: string };
}

const router = createCheerioRouter<CheerioCrawlingContext, Routes>();

router.addHandler('PRODUCT', async ({ request }) => {
    request.userData.sku;   // string
    request.userData.price; // number
});

router.addHandler('TYPO', async () => {}); // ❌ compile error: not a known label
```

## How

- New `Routes` generic on `Router`, `RouterHandler` and `RouterRoutes`,
defaulting to an open `Record<string, …>` so existing untyped usage is
unchanged.
- `Router.create` and all 10 `createXRouter` factories expose **two
overloads** — a route-map form (per-label typing) and the legacy
flat-`userData` form. The second type argument selects between them: a
route map matches the first overload, any other shape falls through to
the second. This makes the documented `createXRouter<Ctx, Routes>()`
form work **without changing the meaning of the released `<Ctx,
UserData>` type argument**, keeping it fully backwards compatible. (The
sole ambiguous case — a flat `userData` whose every field is itself an
object — is read as a route map; a single scalar field disambiguates
it.)
- `addHandler` gains two overloads: a **label-strict** one (infers
`userData` from the label) tried first, and the existing
**`<UserData>`-generic** one as the fallback. The default handler
receives the union of all declared `userData` shapes.
- The route-map constraint uses the F-bounded `Record<keyof Routes,
Dictionary>` form so both `interface` and `type` route maps are accepted
(a plain `Record<string, …>` constraint silently rejects `interface`
declarations).
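
The label-strict typing and the `Record<keyof Routes, Dictionary>` constraint can be tried out in isolation with a toy router. The sketch below is not Crawlee's `Router`: `MiniRouter`, `dispatch` and the local `Dictionary` alias are hypothetical stand-ins, and handlers receive `userData` directly rather than a crawling context.

```typescript
// Local stand-in for a dictionary type (an assumption, not Crawlee's own definition).
type Dictionary = Record<string, unknown>;

// The self-referential constraint only checks the keys the route map declares,
// so an `interface` without an index signature is accepted.
class MiniRouter<RouteMap extends Record<keyof RouteMap, Dictionary>> {
    private readonly handlers = new Map<PropertyKey, (userData: any) => string>();

    // `Label` is inferred from the first argument, which types `userData` per label.
    addHandler<Label extends keyof RouteMap>(
        label: Label,
        handler: (userData: RouteMap[Label]) => string,
    ): void {
        this.handlers.set(label, handler);
    }

    dispatch<Label extends keyof RouteMap>(label: Label, userData: RouteMap[Label]): string {
        const handler = this.handlers.get(label);
        if (!handler) throw new Error(`No handler for label ${String(label)}`);
        return handler(userData);
    }
}

interface Routes {
    PRODUCT: { sku: string; price: number };
    CATEGORY: { categoryId: string };
}

const router = new MiniRouter<Routes>();

router.addHandler('PRODUCT', (userData) => `${userData.sku} costs ${userData.price}`);
router.addHandler('CATEGORY', (userData) => `category ${userData.categoryId}`);

// router.addHandler('TYPO', () => ''); // would not compile: 'TYPO' is not a key of Routes

const productLine = router.dispatch('PRODUCT', { sku: 'A-1', price: 9.5 });
const categoryLine = router.dispatch('CATEGORY', { categoryId: 'c1' });
```

Swapping the constraint for `RouteMap extends Record<string, Dictionary>` makes `new MiniRouter<Routes>()` fail to compile for the `interface` above, which is the pitfall the last bullet describes.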

This is a compile-time-only change (zero runtime cost). A follow-up
targeting `v4` adds an opt-in [Standard
Schema](https://standardschema.dev) variant that also validates
`userData` at runtime.

Relates to #3082

🤖 Generated with [Claude Code](https://claude.com/claude-code)
pull[bot] locked and limited conversation to collaborators Jun 22, 2026
pull[bot] added the ⤵️ pull label Jun 22, 2026
pull[bot] merged commit 1124aca into threatcode:master Jun 22, 2026
pull[bot] had a problem deploying to github-pages June 22, 2026 16:19 (Failure)