# msgpack

This repository is a fork of github.com/vmihailenco/msgpack (v5 API), maintained by Quad4 at github.com/Quad4-Software/msgpack.

The wire format and public API are unchanged: every `Marshal` / `Unmarshal` / `Encoder` / `Decoder` call signature, struct tag, and option from upstream `v5.4.1` continues to work. This fork adds security, correctness, and performance fixes; it does not introduce breaking changes.

## Drop-in migration from `github.com/vmihailenco/msgpack/v5`

1. Replace the import path:

       // before
       import "github.com/vmihailenco/msgpack/v5"

       // after
       import "github.com/Quad4-Software/msgpack/v5/pkg/msgpack"

The package is still imported as `msgpack`, so call sites do not need to change.

2. Pull the module:

   ```bash
   go get github.com/Quad4-Software/msgpack/v5@latest
   ```

3. (Optional) For users of the App Engine helpers:

       // before
       import "github.com/vmihailenco/msgpack/v5/msgpappengine"

       // after
       import "github.com/Quad4-Software/msgpack/extra/msgpappengine"


This is a separate Go module; install with `go get github.com/Quad4-Software/msgpack/extra/msgpappengine@latest`.

The wire codes (subpackage `msgpcode`) move from `github.com/vmihailenco/msgpack/v5/msgpcode` to `github.com/Quad4-Software/msgpack/v5/pkg/msgpack/msgpcode`. Constants are unchanged.

## What this fork fixes vs. upstream `v5.4.1`

The upstream module has been effectively unmaintained for several years. This fork addresses the following issues without changing the API:

### Security

- **OOM via forged length prefixes.** Six decode paths (`bytes`, `bytesPtr`, `decodeSlice`, `decodeSliceValue`, `DecodeMap`, `DecodeUntypedMap`) trusted the on-wire length and allocated up front, so a single hostile `bin32` / `str32` / `array32` / `map32` header could request a multi-gigabyte allocation before the underlying short read failed. All six now clamp the initial allocation to the documented per-decoder limit and grow incrementally as real bytes arrive. The `disableAllocLimitFlag` toggle (`Decoder.DisableAllocLimit(true)`) preserves the legacy unbounded behaviour for callers that want it.
- **Stack exhaustion and 32-bit length overflow hardening.** Decoder recursion now enforces a depth cap (`SetDecodeDepthLimit`, default 10,000) so hostile deeply nested payloads fail with an error instead of consuming unbounded stack. `str32` / `bin32` / `array32` / `map32` / `ext32` length parsing now rejects `uint32` lengths that overflow `int` on 32-bit builds rather than silently wrapping.
- **`disableAllocLimitFlag` was a no-op in `decodeSliceValue`.** Upstream compared the bit flag against the literal `1`, but the flag is `1 << 1 == 2`, so the limit was never applied along the typed-slice path. The comparison is now `!= 0`.
- **Goroutine and memory leak in the per-type preallocator.** The upstream `cachedValue` spawned one perpetual goroutine for every distinct `reflect.Type` ever decoded into, plus a 256-slot buffered channel each, and held a global `sync.RWMutex` on the hot decode path. A long-running process that ever decoded into N distinct Go types retained N goroutines forever. The preallocator is now backed by a `sync.Map` of `*sync.Pool`; idle entries are reclaimed by the GC, the global mutex is gone from the lookup path, and goroutine count stays flat under churn. Verified by `TestNoGoroutineLeakOnDistinctTypes` and `TestConcurrentDistinctTypesPreallocate` in `pkg/msgpack/leak_test.go`.
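
The allocation clamp and the flag test described above can be sketched with the standard library alone. This is a minimal illustration, not the fork's code: `readBounded`, `decodeBin`, `minInt`, and `allocChunk` are hypothetical names, and the 1 MiB chunk size is an assumed value. Only `disableAllocLimitFlag` and its value `1 << 1` come from the description above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// allocChunk is an assumed cap on how far allocation may run ahead of the
// bytes that have actually been read. The fork's real limit may differ.
const allocChunk = 1 << 20

// disableAllocLimitFlag has the value 2, as described in the list above.
const disableAllocLimitFlag uint32 = 1 << 1

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// readBounded reads exactly n bytes from r. Unless the limit is disabled, it
// grows the buffer one chunk at a time, so a forged length ends in a
// short-read error after a small allocation rather than a huge one.
func readBounded(r io.Reader, n int, flags uint32) ([]byte, error) {
	if n < 0 {
		return nil, fmt.Errorf("invalid length %d", n)
	}
	// Test the bit with != 0. Comparing the masked value against 1 can
	// never match, because the flag's value is 2.
	if flags&disableAllocLimitFlag != 0 {
		buf := make([]byte, n) // legacy behaviour: trust the header
		if _, err := io.ReadFull(r, buf); err != nil {
			return nil, err
		}
		return buf, nil
	}
	buf := make([]byte, 0, minInt(n, allocChunk))
	for len(buf) < n {
		start := len(buf)
		buf = append(buf, make([]byte, minInt(n-start, allocChunk))...)
		if _, err := io.ReadFull(r, buf[start:]); err != nil {
			return nil, err
		}
	}
	return buf, nil
}

// decodeBin pretends payload followed a header that claimed `claimed` bytes.
func decodeBin(payload []byte, claimed int, flags uint32) ([]byte, error) {
	return readBounded(bytes.NewReader(payload), claimed, flags)
}

func main() {
	// The header claims 1 GiB, but only three payload bytes follow.
	_, err := decodeBin([]byte("abc"), 1<<30, 0)
	fmt.Println("forged length:", err)

	b, err := decodeBin([]byte("hello"), 5, 0)
	fmt.Printf("honest length: %q, err=%v\n", b, err)
}
```

In the forged case the sketch allocates a single chunk before `io.ReadFull` reports the short read, which is the property the clamp is meant to provide.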

### Correctness

- **`invalid code=cb decoding int64` on JSON-sourced data.** `(*Decoder).int` and `(*Decoder).uint` now accept `msgpcode.Float` / `msgpcode.Double` payloads when the destination is a Go integer, provided the value is finite, integer-valued, and in range. This unblocks the common JSON -> `map[string]any` -> msgpack -> struct round-trip (`encoding/json` decodes JSON numbers into `float64` regardless of declared field type). NaN, infinities, fractional values, negative values into `uint64`, and out-of-range magnitudes still error rather than silently truncating.
- **Pooled `*bytes.Reader` does not retain caller data.** `Marshal` / `Unmarshal` reuse internal buffers via `sync.Pool`; the wrappers are reset to `nil` before being returned to the pool so they cannot leak the previous caller's slice into a subsequent call. Verified by `TestPoolDoesNotRetainCallerData` and `TestInvariantPoolDoesNotAlias`.
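
The float-to-integer acceptance rule above (finite, integer-valued, in range) can be written as a small standalone check. The sketch below is an assumption about how such a check might look, not the fork's implementation; `floatToInt64`, `floatToUint64`, and `errNotRepresentable` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

var errNotRepresentable = errors.New("float value is not representable as the target integer")

const (
	two63 = float64(1 << 63) // first value above the int64 range
	two64 = float64(1 << 64) // first value above the uint64 range
)

// isWhole reports whether f is finite and has no fractional part.
func isWhole(f float64) bool {
	return !math.IsNaN(f) && !math.IsInf(f, 0) && f == math.Trunc(f)
}

// floatToInt64 converts f only when the result is exact.
func floatToInt64(f float64) (int64, error) {
	// -2^63 is a valid int64, +2^63 is not, hence the asymmetric bounds.
	if !isWhole(f) || f < -two63 || f >= two63 {
		return 0, errNotRepresentable
	}
	return int64(f), nil
}

// floatToUint64 additionally rejects negative values.
func floatToUint64(f float64) (uint64, error) {
	if !isWhole(f) || f < 0 || f >= two64 {
		return 0, errNotRepresentable
	}
	return uint64(f), nil
}

func main() {
	for _, f := range []float64{42, -7, 1.5, math.NaN(), math.Inf(1), 1e19} {
		v, err := floatToInt64(f)
		fmt.Printf("int64(%v): value=%d err=%v\n", f, v, err)
	}
	_, err := floatToUint64(-1)
	fmt.Println("uint64(-1):", err)
}
```

The range test runs before the conversion because converting an out-of-range float to an integer in Go gives an implementation-dependent result.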

### Performance (vs. upstream `v5.4.1`, geomean over 5 x 2s `benchstat` runs)

- `Unmarshal`: pooled `*bytes.Reader` wrapper. `BenchmarkStructUnmarshal` -50% B/op (96 -> 48), -1 alloc/op, -5.16% time. `BenchmarkStructUnmarshalPartially` -75% B/op (64 -> 16), -1 alloc/op, -6.68% time.
- `Marshal`: pre-grows the encode buffer to 64 bytes, skipping the first one or two backing-array doublings for typical small payloads. The returned slice still owns its backing array; aliasing semantics are preserved.
- `AppendMarshal` / `(*Encoder).Append`: caller-owned destination-buffer APIs for hot paths that reuse output capacity; with a warm buffer they run at zero allocs/op on both scalar and representative struct benchmarks.
- `byteWriter.WriteByte`: writes through a 1-byte field on the wrapper struct instead of allocating a fresh `[]byte{c}` per call. `BenchmarkDiscard` -100% B/op, -100% allocs/op.
- Pooled `*Encoder` and `*Decoder` (`GetEncoder` / `PutEncoder`, `GetDecoder` / `PutDecoder`) work as before; reuse benchmarks added in `pkg/msgpack/bench_test.go`.
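
The `byteWriter.WriteByte` change can be illustrated as follows. This is a sketch of the general pattern under stated assumptions: the field name `scratch` and the helper `writeAll` are hypothetical, and the fork's actual struct layout may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// byteWriter adapts any io.Writer to io.ByteWriter. The one-byte buffer lives
// on the struct, so it is allocated once with the wrapper.
type byteWriter struct {
	io.Writer
	scratch [1]byte
}

// WriteByte stores c in the reusable scratch array and writes that one-byte
// slice, instead of building a new []byte{c} on every call.
func (w *byteWriter) WriteByte(c byte) error {
	w.scratch[0] = c
	_, err := w.Write(w.scratch[:])
	return err
}

// writeAll pushes every byte of s through a byteWriter and returns what the
// underlying buffer received.
func writeAll(s string) (string, error) {
	var dst bytes.Buffer
	w := &byteWriter{Writer: &dst}
	for i := 0; i < len(s); i++ {
		if err := w.WriteByte(s[i]); err != nil {
			return "", err
		}
	}
	return dst.String(), nil
}

func main() {
	out, err := writeAll("msgpack")
	fmt.Println(out, err)
}
```

Because the scratch array is reused, the wrapper must not be shared between goroutines without synchronisation.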

## Install

```bash
go get github.com/Quad4-Software/msgpack/v5@latest
```

```go
import "github.com/Quad4-Software/msgpack/v5/pkg/msgpack"
```

The module path is `github.com/Quad4-Software/msgpack/v5` (the `/v5` suffix matches the major version). Source lives under `pkg/msgpack/`; subpackage `msgpcode` is at `pkg/msgpack/msgpcode`.

## Features

- Primitives, arrays, maps, structs, `time.Time`, and `interface{}`.
- Allocation-aware API surface: `Marshal` for convenience and `AppendMarshal` / `(*Encoder).Append` for caller-managed reusable output buffers.
- App Engine `*datastore.Key` and `datastore.Cursor` via `extra/msgpappengine` (optional module).
- `CustomEncoder` / `CustomDecoder` for custom encoding.
- Extensions, struct tags (`msgpack:"..."`), `omitempty`, sorted map keys, array-encoded structs, and `Decoder.Query`-style path queries.

## Layout

| Path | Purpose |
| --- | --- |
| `pkg/msgpack` | Public API (`Marshal`, `Encoder`, `Decoder`, etc.) |
| `pkg/msgpack/msgpcode` | Wire format opcode constants |
| `extra/msgpappengine` | Optional Google App Engine helpers (separate `go.mod`) |
| `go.work` | Workspace: root module + `extra/msgpappengine` for local `go test ./...` |
| `scripts/ci` | Local CI parity with Gitea (`test-all.sh`, `scan-all.sh`, `setup-go.sh`, ...) |
| `.gitea/workflows` | CI (`ci.yml`) and security scan (`scan.yml`) |

## Testing

- Unit tests in `pkg/msgpack/*_test.go` cover the decoder/encoder surface, struct round-trips, time, ext, intern, and query paths.
- Property-based tests via `pbt` (`git.quad4.io/Go-Libs/pbt/pkg/pbt`) in `pbt_test.go` (round-trip properties for `[]byte`, `string`, `[]int`, `[]string`, `map[string]int`, `map[string]string`).
- Fuzz targets in `fuzz_test.go`: `FuzzMarshalUnmarshalRoundtrip`, `FuzzUnmarshalArbitrary`, `FuzzDecoderQuery`, `FuzzDecodeIntoStruct`, `FuzzDecodeExtHeader`, `FuzzDecodeTime`, `FuzzDecodeInternedString`, `FuzzDecodeSkip`, `FuzzDecodeMulti`. Regression corpora for the allocation-limit fixes are committed under `pkg/msgpack/testdata/fuzz/`.
- Stress tests in `stress_test.go`: concurrent `Marshal` / `Unmarshal` under `-race`, large byte slice / string round-trips, deeply nested map and slice round-trips, repeated reuse of pooled encoder/decoder.
- Invariant tests in `invariant_test.go`: `Unmarshal(nil)` / empty input never panics; `Marshal(nil)` is a single `msgpcode.Nil` byte; bit-exact round-trip of `int64` min, `uint64` max, NaN, -Inf; pool aliasing checks.
- Leak tests in `leak_test.go`: per-type preallocator must not retain goroutines or values across decoder churn.

Run `go test -race ./...` for the full suite; `make` runs `go vet` plus tests.

## License

BSD 2-clause; see LICENSE. Original copyright remains with the vmihailenco authors; fork maintenance is attributed in this README.
