Skip to content

fix(NODE-7606): restore serializer performance by removing generator-based iteration#890

Open
tadjik1 wants to merge 6 commits into
mainfrom
fix/NODE-5937-serializer-perf
Open

fix(NODE-7606): restore serializer performance by removing generator-based iteration#890
tadjik1 wants to merge 6 commits into
mainfrom
fix/NODE-5937-serializer-perf

Conversation

@tadjik1

@tadjik1 tadjik1 commented Jun 9, 2026

Copy link
Copy Markdown
Member

Description

Summary of Changes

#886 (NODE-5937) made serializeInto iterative and, along the way, started producing each frame's key/value pairs from a generator (toKvPairs). Generators aren't free: every .next() saves and restores execution state and hands back a fresh {value, done} object, and Object.entries() builds a [key, value] array for each field on top of that. On a flat document with 145 fields that's hundreds of throwaway allocations per serialize, and it cost about 54% on flat_bson._serialize_bson.

This PR drops the generator. makeFrame() calls Object.keys() once when the frame is created and the loop walks them by index. Maps keep an entries() iterator since they have no random access, and arrays index straight into keyIndex.

A few smaller changes in the same hot loops:

  • regexp.test(key) replaces key.match(regexp), and the ignoreKeys Set lookup only runs when key[0] === '$'.
  • value._bsontype is read once per BSON-typed value instead of several times down the dispatch chain.
  • In the deserializer, the per-field isArray recompute becomes a maintained currentIsArray; the setValue closure becomes a top-level assignValue(dest, …) with currentDest tracked next to currentFrame so the destination is never recomputed per field; and a dead globalUTFValidation frame field is removed.
  • Both parsers now thread their frame stack as a linked list (a prev pointer per frame) instead of a separate array, which drops the array push/pop per nested frame (suggested by @addaleax in the fix(NODE-5937): rewrite deserializeObject as iterative, not recursive #886 review).

What is the motivation for this change?

flat_bson._serialize_bson normalized throughput dropped from ~216 to ~99 after #886 merged. The perf alerting context on the perf build variant caught it.

Benchmarks

rhel90-dbx-perf-large, Node v22.11.0, 10 000 iterations, 1 000 warmup. normalized_throughput, higher is better.

Test Baseline (2ea8dbb) Regression (0231603) This PR vs Baseline vs Regression
flat_bson._serialize_bson 215.58 98.87 228.42 +6.0% +131%
deep_bson._serialize_bson 76.61 56.94 83.63 +9.2% +47%
full_bson._serialize_bson 92.33 85.76 132.01 +43% +54%
flat_bson._deserialize_bson 93.62 90.76 92.23 −1.5% +1.6%
deep_bson._deserialize_bson 69.00 63.87 64.19 −7.0% +0.5%
full_bson._deserialize_bson 82.04 77.12 80.12 −2.3% +3.9%

Serialize is back above baseline on all three document shapes. Flat and full deserialize sit within noise of baseline.

https://spruce.corp.mongodb.com/task/node_bson_perf_run_spec_benchmarks_patch_c5d1dd89d933556a9c9880152156d15caa0f293b_6a282aa58fce9500076b6fbc_26_06_09_15_00_55/trend-charts?execution=0

image image
image image
image image
image image
image image
image image
image

The only metric still under baseline is deep_bson._deserialize_bson (−7.0%), and that's structural rather than something this PR introduced. deep_bson is 63 documents nested six deep, so the iterative deserializer allocates a frame object for each nested document where the old recursive code just leaned on the call stack. Those allocations are most of the gap. This PR nudges the number up (63.87 to 64.19) and makes deep serialize a lot faster, but it doesn't close it. We can probably claw the rest back with lighter or pooled frames; I left that for a follow-up, since what remains is small and inside the benchmark's run-to-run noise.

Driver benchmarks

I also ran the driver's spec benchmark suite end to end with node-mongodb-native pointed at this branch (the performance-tests variant), comparing against the bson ^7.2.0 it currently ships, which predates #886 and is the known-good recursive code. Nothing regressed on the BSON-sensitive benchmarks. Serialize came out at parity or faster. Deserialize was at parity apart from a ~6% dip on the 1M-tiny-document aggregate, which is the same per-document setup cost from #886 noted above and not new here. The large-document bulk-insert benchmark bounced between −22% and +5% across runs, with the slow slot moving between it and the single-document variant; a micro-benchmark and the benchmark's own ~8% mainline CV confirm that's noise from the 2.73 MB document, not the serializer.
1 run
2 run

Double check the following

  • Lint is passing (npm run check:lint)
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

tadjik1 added 5 commits June 9, 2026 11:57
…based iteration

The iterative rewrite in 0231603 introduced a ~54% throughput regression on
flat documents (flat_bson serialize: ~21 µs vs ~10 µs baseline).

Root cause: serializeInto used a generator function (toKvPairs / yield*)
to produce key-value pairs, driving the hot loop via Iterator.next(). For
every field this paid: generator state-machine transitions, {value,done}
heap allocations, and [key,value] pair arrays from Object.entries().

Fix: replace the generator with index-based iteration in SerializationFrame.
makeFrame() pre-computes Object.keys() once per plain object and stores the
array and a keyIndex counter on the frame. The hot loop reads keys[keyIndex++]
and accesses the value directly via property lookup — no allocations per field.
Arrays use keyIndex as the numeric index directly. Maps retain an explicit
iterator since they lack numeric indexing.

Cycle detection and toBSON() semantics are unchanged.
- Replace key.match(regexp) with regexp.test(key) — avoids allocating a
  match-result array on every key of every object
- Guard ignoreKeys.has(key) with key[0] === '$' — all four ignored keys
  start with '$', so non-dollar keys skip the Set lookup entirely
- Remove redundant typeof key === 'string' guard — keys are always strings
- Cache value._bsontype in the BSON-type dispatch branch — eliminates
  4-5 repeated property reads per BSON-typed value (Long, ObjectId, etc.)
The iterative deserializer rewrite in #886 left several costs that the
recursive version did not have:

- A per-field recompute of `isArray` (read currentFrame, branch, assign on
  every key) — replaced with a `currentIsArray` flag updated only when a
  frame is pushed or popped.
- A `setValue` closure invoked once per field that re-derived the
  destination document from `currentFrame` on every call — replaced with a
  maintained `currentDest` and a module-level `assignValue`, so the common
  path is a direct property write with no closure call or branch.
- A dead `globalUTFValidation` field written into every frame literal but
  never read — removed, shrinking each frame allocation (helps nested docs
  where frame churn dominates).

`toPotentialDbRef` is also hoisted to module scope so it is not recreated
per deserialize call. Behavior is unchanged; the proto-pollution guard,
DBRef promotion, code-with-scope, raw, and array handling are all preserved.
Both the serializer and deserializer carried an explicit array as the frame
stack (push/pop plus length/top bookkeeping). This replaces that array with a
linked list threaded through a `prev` pointer on each frame: descending into a
nested document points the new frame at its parent, and finishing a frame
restores `currentFrame = frame.prev`. The deserializer's `nestedParsingStack`
array is removed entirely, since it duplicated the `currentFrame` pointer.

Suggested by @addaleax during the #886 review as a follow-up to try. Local
interleaved A/B measurement (both builds in one process, min of 11 reps per
shape) shows serialize consistently ~3% faster on flat/deep/full and
deserialize neutral; the array-vs-linked difference is removed allocation and
index bookkeeping per nested frame.

Behavior is unchanged: deep-nesting, DBRef promotion, code-with-scope, raw,
Map, and __proto__ handling all verified.
Commit a22387a dropped the `typeof key === 'string'` guard from the key
validation block on the assumption that keys are always strings. That holds
for plain objects and arrays, but Map keys can be non-string. Without the
guard, a non-string Map key with checkKeys:true reached `key.includes('.')`
and threw a TypeError from a different site than the baseline.

Restoring the guard makes behavior byte-identical to the pre-regression
implementation: non-string Map keys (unsupported by design, NODE-3027) fail
in encodeUTF8Into exactly as before, regardless of checkKeys. The Set lookup
optimization is unaffected — it remains gated by `key[0] === '$'`.
@tadjik1 tadjik1 force-pushed the fix/NODE-5937-serializer-perf branch from b6fb5f3 to be49fe2 Compare June 9, 2026 15:00
@tadjik1 tadjik1 changed the title fix(NODE-5937): restore serializer performance by removing generator-based iteration fix(NODE-7606): restore serializer performance by removing generator-based iteration Jun 9, 2026
@tadjik1 tadjik1 marked this pull request as ready for review June 9, 2026 16:44
@tadjik1 tadjik1 requested a review from a team as a code owner June 9, 2026 16:44
Copilot AI review requested due to automatic review settings June 9, 2026 16:44

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restores BSON serializer performance by replacing generator-based key/value iteration with a lower-allocation frame iterator, and applies additional hot-path optimizations in both serialization and deserialization while preserving the iterative (non-recursive) parsing approach introduced in #886.

Changes:

  • Replaced generator-based iteration (toKvPairs) with a makeFrame() helper that precomputes keys once and advances with numeric indices; switched the frame stack from an array to a linked list (prev).
  • Reduced per-field overhead in the serializer (e.g., regexp.test, gated $ ignoreKeys check, cached _bsontype lookup).
  • Reduced per-field overhead in the deserializer by maintaining currentDest/currentIsArray, removing a dead frame field, and switching the parsing stack to a linked list.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/parser/serializer.ts Removes generator/Object.entries iteration in favor of indexed frame iteration and linked-list frames; applies small per-field hot-path reductions.
src/parser/deserializer.ts Switches nested frame stack to a linked list and reduces per-field work via maintained destination/array state and helper functions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@tadjik1 tadjik1 marked this pull request as draft June 9, 2026 18:19
@tadjik1 tadjik1 marked this pull request as ready for review June 9, 2026 20:09
@tadjik1 tadjik1 requested review from addaleax and nbbeeken June 9, 2026 20:09
addaleax
addaleax previously approved these changes Jun 10, 2026
Comment thread src/parser/serializer.ts Outdated
throw new BSONVersionError();
} else if (value._bsontype === 'ObjectId') {
}
const bsontype = value._bsontype;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const bsontype = value._bsontype;
const bsontype = value[bsonType];

Might be a really good opportunity for this drive-by cleanup

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Replaced with const tag = ... to avoid naming problems.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if ('_bsontype' in object && typeof object._bsontype === 'string') {
      throw new BSONError(`BSON types cannot be serialized as a document`);

We have this error checking code, I don't think we can/should change it but it's an interesting limitation that's now lifted 😅 we can support { "_bsontype" : "str" } documents for those who like to have fun with their data schema....

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not even just necessarily a limitation – I think it would be a reasonable invariant to expect that bson.serialize() never throws if its argument is a return value of bson.deserialize(). I'm pretty sure you this means that for example in the driver, right now there are documents that you can read from the database but not copy/replace back into it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes another driver could totally insert a document that looks like the example I gave and our driver couldn't round trip it. so, bug ticket?

Are there issues with backwards compat 🤔 like if we lift the prevention of this is there the chance that it could be more likely to accidentally insert a document of this shape when you meant to just have a BSON primitive... need to think on that.

@tadjik1 tadjik1 Jun 11, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems like a pre-existing issue - it was there before the PR. Worth a ticket though, I'll file one, thanks!
https://jira.mongodb.org/browse/NODE-7608

tadjik1 added a commit that referenced this pull request Jun 10, 2026
Per @addaleax's review suggestion on #890, dispatch on the public `bsonType`
symbol (`value[bsonType]`) instead of the `_bsontype` string in serializeInto
and the serializeLong/serializeMinMax helpers, so the tag read goes through
the documented interface. The local is renamed to `tag` so it doesn't collide
with the imported `bsonType` symbol.

The checks that guard against duck-typed or foreign objects keep reading the
`_bsontype` string: the root guard, the plain-object check, and the
"unrecognized _bsontype" catch-all all need to fire for an object that carries
a string `_bsontype` (and possibly a forged version symbol) but no `bsonType`
symbol. Routing those through the symbol would let such objects slip past, as
the existing serializer tests confirm. Output is unchanged, verified
byte-identical against the recursive baseline (33 832 differential checks).
Per @addaleax's review suggestion on #890, dispatch on the public `bsonType`
symbol (`value[bsonType]`) instead of the `_bsontype` string in serializeInto
and the serializeLong/serializeMinMax helpers, so the tag read goes through
the documented interface. The local is renamed to `tag` so it doesn't collide
with the imported `bsonType` symbol.
@tadjik1 tadjik1 force-pushed the fix/NODE-5937-serializer-perf branch from e0fea98 to 0954119 Compare June 10, 2026 14:28

@nbbeeken nbbeeken left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work! linked lists are under utilized nice to see it lending a hand :)))

@PavelSafronov PavelSafronov self-assigned this Jun 12, 2026

@PavelSafronov PavelSafronov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, approving. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants