messages: Fix msg_headers deser to consume trailing tx-count varint#321
Open
djdarcy wants to merge 1 commit into
Open
messages: Fix msg_headers deser to consume trailing tx-count varint#321djdarcy wants to merge 1 commit into
djdarcy wants to merge 1 commit into
Conversation
The P2P 'headers' message wire format is a varint N followed by N entries, where each entry is an 80-byte CBlockHeader plus the standard CBlock framing's transaction-count varint. The varint is always 0 (the whole point of a 'headers' message is blocks-without-transactions) but it is still part of the on-wire bytes, so a deserializer that reads only the 80-byte header gets misaligned for header[1] onwards. The bug doesn't show up in self-round-trip tests because msg_ser also skipped the trailing byte, so serialize-then-deserialize was symmetric. It only surfaces against on-wire data from a real peer, where header[1] .hashPrevBlock ends up shifted one byte left of header[0].GetHash(). This fix consumes the trailing varint in msg_deser and writes one back out in msg_ser, keeping the symmetric round-trip property intact while matching the actual on-wire format. The existing Test_msg_headers.test_serialization round-trip test continues to pass. A regression test (genesis + block 1 chain continuity, exact byte consumption, round-trip identity against a wire-formatted body) is available as a gist for review and can be added to bitcoin/tests/test_messages.py if desired: https://gist.github.com/djdarcy/dc8843b6183f65165621c8fc09fcedc4 Closes petertodd#320.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hey @petertodd there is a small bug in
msg_headers.msg_deser(see: #320). Basically the P2Pheadersmessage wire format puts each entry inside the standard CBlock framing: 80-byte block header + transaction-count varint. The varint is always 0 (the whole point of aheadersmessage is blocks-without-transactions) but it is still part of the on-wire bytes. The current deserializer callsVectorSerializer.stream_deserialize(CBlockHeader, f), which reads exactly 80 bytes per header and skips the trailing varint. Meaning afterheader[0]the stream is one byte ahead of where the deserializer thinks it is, andheader[1]onwards is misaligned by one byte.The bug doesn't show up in self-round-trip tests because
msg_seralso skipped the trailing byte, so serialize-then-deserialize was symmetric. It only surfaces against on-wire data from a real peer:Observed output against any mainnet peer:
Note
header[1].hashPrevBlockisheader[0].GetHash()shifted left by one byte.This PR consumes the trailing varint in
msg_deserand writes one back out inmsg_ser, keeping the symmetric self-round-trip property intact. Note: the existingTest_msg_headers.test_serializationstill passes unchanged.A small wire-format regression test (builds a body from mainnet block 0 + block 1 and checks chain continuity, exact byte consumption, and round-trip identity) is available as a gist: https://gist.github.com/djdarcy/dc8843b6183f65165621c8fc09fcedc4. I held it back from this PR's diff to keep the change minimal -- happy to drop it into
bitcoin/tests/test_messages.pyif you'd like in-tree coverage.FWIW the bug is also why
opentimestamps-clientcurrently carries a workaround that intercepts theheaderscommand at the framing layer and parses the body manually (see opentimestamps/opentimestamps-client#164). Once this is added / released, that workaround can come out.Closes #320.
Tested on Python 3.12 + Win11; all 20 tests in
bitcoin/tests/test_messages.pystill pass.