Scope the primary-key → _id rename to the outermost document#204
Scope the primary-key → _id rename to the outermost document#204scottmessinger wants to merge 2 commits into
Conversation
The MongoDB convention of renaming the schema's primary-key field to
`_id` was being applied recursively. Once `Conversions.from_ecto_pk/2`
descended into a nested map (for example an `embeds_one` sub-document),
the parent schema's `pk` value was still threaded through, so any
nested field named `:id` was also rewritten to `:_id`.
That meant an embedded value like
%{jurisdiction: %{id: "MD", title: "Maryland"}}
was stored as
{jurisdiction: {_id: "MD", title: "Maryland"}}
even when the embedded schema declared its own `@primary_key {:id, :string,
autogenerate: false}`. Queries that filter on `jurisdiction.id` (the
common shape for apps that put a string identifier on a sub-document)
return nothing, and the wire format diverges from what other clients
(Ruby driver, Node driver, etc.) write for the same model.
The pk -> _id rename is only meaningful at the top of a document. Below
that, the field names are user-controlled and should be passed through
unchanged. This patch makes the recursive call inside `document/2` and
`document/3` pass `nil` for the pk argument, so the rename only fires
for the outermost document.
`Conversions.to_ecto_pk/2` (the read path) is left as-is for now. Its
rename of nested `"_id"` -> `Atom.to_string(pk)` only matters when a
sub-document actually contains an `_id` field, which is uncommon in
hand-written documents (Mongo only auto-adds `_id` to top-level docs).
A symmetric read-side change can follow separately if there's appetite.
There was a problem hiding this comment.
Pull request overview
This PR fixes MongoDB serialization so the schema primary-key → _id rename only applies to the outermost document, preventing nested sub-documents (e.g., embeds_one) from having their own :id fields incorrectly rewritten to :_id.
Changes:
- Update
document/2to recurse into nested values withpk = nil, avoiding accidental:id→:_idrewrites inside embedded/nested documents. - Update
document/3(used viainject_params/3) with the samepk = nilrecursion behavior for parameter injection paths. - Add explanatory inline comments describing the bug and rationale.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Pure unit tests on `Mongo.Ecto.Conversions.from_ecto_pk/2` and `inject_params/3` covering the recursion fix: - top-level `:id` is renamed to `:_id` - a nested map's `:id` is left alone even when the parent pk is `:id` - the same holds inside a list of nested maps - `pk: nil` is a no-op Verified by reverting `conversions.ex` to the pre-patch version: 3 of the 5 tests fail (nested `:id` ends up as `:_id`). With the patch they all pass.
mweidner037
left a comment
There was a problem hiding this comment.
The rationale sounds valid, though I am not familiar with the actual code.
I'm surprised we have not run into this issue before. It looks like our StandardSet.jurisdiction fields are at least saved correctly already (the DB has jurisdiction.id instead of jurisdiction._id).
I believe this counts as a breaking change, in case consumers depend on the old functionality (e.g., they renamed their embedded document IDs to _id).
The MongoDB convention of renaming the schema's primary-key field to
_idis currently applied at every level of recursion. OnceConversions.from_ecto_pk/2descends into a nested map (for example anembeds_onesub-document), the parent schema'spkis still threadedthrough, so any nested field named
:idis also rewritten to:_id.That means an embedded value like
gets stored as
even when the embedded schema declared its own
@primary_key {:id, :string, autogenerate: false}. Queries that filteron
jurisdiction.id(the common shape for apps that put a stringidentifier on a sub-document) return nothing, and the wire format
diverges from what other Mongo clients (the Ruby driver, the Node
driver, plain
mongoshinserts) would write for the same model.The pk →
_idrename is only meaningful at the top of a document.Below that, the field names are user-controlled and should be passed
through unchanged. This patch makes the recursive calls inside
document/2anddocument/3passnilfor the pk, so the renameonly fires for the outermost document.
Repro (before this patch)
After the patch the nested key is
id, not_id.Read path
Conversions.to_ecto_pk/2(the read path) is left as-is for now. Itsrename of nested
"_id"→Atom.to_string(pk)only matters when asub-document actually contains an
_idfield. Mongo only auto-creates_idon top-level documents, so this is rare in practice. A symmetricread-side change can follow separately if there's appetite — happy to
add it to this PR if you'd prefer.
Tests
The existing mongodb_ecto test suite passes against this change.
Downstream verification: the CSP API
(
commonstandardsproject/api, vendored copy atelixir/vendor_deps/mongodb_ecto/)goes from 17 broken read-side contract tests to a clean run of 34/34
once
jurisdiction.idqueries stop missing.