
Move columnar_support to the core library #730

Merged
frankmcsherry merged 1 commit into TimelyDataflow:master-next from frankmcsherry:columnar_movement
Apr 25, 2026


@frankmcsherry
Member

This lightly updates the columnar_support example code and moves it into the core library, so that it can evolve with less confusion. The main changes in this PR are splitting the 2kLOC file into smaller parts, and making work that should be a single sequential pass (e.g. filtering zeros, partitioning by time) more efficient.
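
As a rough illustration of the "sequential pass" idea (not the code in this PR), the sketch below drops zero-diff updates and buckets the rest by time in one loop over the input. The `Update` alias and the function name `partition_nonzero_by_time` are hypothetical, and plain `Vec`/`BTreeMap` stand in for the columnar containers.

```rust
use std::collections::BTreeMap;

/// Hypothetical update triple: (data, time, diff).
type Update = (String, u64, i64);

/// One sequential pass over `updates`: skip zero diffs and bucket the
/// remaining (data, diff) pairs by time. Not the PR's implementation.
fn partition_nonzero_by_time(updates: Vec<Update>) -> BTreeMap<u64, Vec<(String, i64)>> {
    let mut parts: BTreeMap<u64, Vec<(String, i64)>> = BTreeMap::new();
    for (data, time, diff) in updates {
        // Filtering happens inline, so no intermediate filtered Vec is built.
        if diff != 0 {
            parts.entry(time).or_default().push((data, diff));
        }
    }
    parts
}

fn main() {
    let updates = vec![
        ("a".to_string(), 1, 1),
        ("b".to_string(), 1, 0), // zero diff: dropped
        ("c".to_string(), 2, -1),
        ("d".to_string(), 1, 2),
    ];
    let parts = partition_nonzero_by_time(updates);
    assert_eq!(parts.len(), 2);
    assert_eq!(parts[&1], vec![("a".to_string(), 1), ("d".to_string(), 2)]);
    assert_eq!(parts[&2], vec![("c".to_string(), -1)]);
}
```

The point of fusing the two steps is that each update is touched once, in input order, rather than once per filter or partition stage.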

Moves the columnar arrangement / container infrastructure from the
`examples/columnar/columnar_support/` tree into `src/columnar/` as a
public, experimental module. API and internals are explicitly marked as
unstable in the module-level docs; rough edges (`unimplemented!`
`ContainerBytes`, eager-consolidate `leave_dynamic`, single-`U`
`join_function`) are listed up front.

Also generalizes the dynamic-scope helpers: `DynTime<TOuter, T>` is now
parametric (was hardcoded to `Product<u64, PointStamp<u64>>`), and
`leave_dynamic` carries matching bounds plus a `level > 0` assert.
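
To make the generalization concrete, here is a minimal self-contained sketch. The `Product` and `PointStamp` structs below are local stand-ins rather than the real timely / differential types, `leave_level` is a hypothetical helper, and the "truncate to `level - 1` coordinates" behavior is an assumption about what leaving a dynamic scope does. Under that assumption, the `level > 0` assert is what keeps `level - 1` from underflowing.

```rust
/// Local stand-in for a product timestamp: an outer time paired with an inner time.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Product<TOuter, TInner> {
    outer: TOuter,
    inner: TInner,
}

/// Local stand-in for a pointstamp: one coordinate per dynamic scope level.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PointStamp<T> {
    coords: Vec<T>,
}

/// Parametric alias; the previously hardcoded form corresponds to `DynTime<u64, u64>`.
type DynTime<TOuter, T> = Product<TOuter, PointStamp<T>>;

/// Hypothetical helper: leave dynamic scope `level` by dropping the
/// coordinates at that depth and deeper (assumed semantics).
fn leave_level<TOuter, T>(mut time: DynTime<TOuter, T>, level: usize) -> DynTime<TOuter, T> {
    // Without this check, `level - 1` would underflow for `level == 0`.
    assert!(level > 0, "dynamic scope levels start at 1");
    time.inner.coords.truncate(level - 1);
    time
}

fn main() {
    // Any outer / coordinate types work, not only `u64`.
    let t: DynTime<u64, u32> = Product {
        outer: 7,
        inner: PointStamp { coords: vec![3, 5, 8] },
    };
    let left = leave_level(t, 2);
    assert_eq!(left.outer, 7);
    assert_eq!(left.inner.coords, vec![3u32]);
}
```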

Updates the in-tree consumers: `examples/columnar/main.rs` and
`interactive/examples/ddir_col.rs` switch from path-mounted modules to
`use differential_dataflow::columnar`. `ddir_vec.rs` + `interactive/src/ir.rs`
gain a `RowLike` impl for `SmallVec<A>`, used by the vec-backed ddir
example for its row representation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@frankmcsherry frankmcsherry changed the base branch from master to master-next April 25, 2026 21:48
@frankmcsherry frankmcsherry merged commit 01482fb into TimelyDataflow:master-next Apr 25, 2026
6 checks passed
frankmcsherry added a commit that referenced this pull request Apr 29, 2026
* Restore pre-#725 spines.rs and inline EditList::load

Brings back the spines arrangement bake-off (deleted in #724 Spring
cleaning, then RHH-dependent) with three modes: `key` (OrdKeySpine),
`val` (OrdValSpine with Val=()), and `col` (columnar ValSpine via the
columnar module added in #730). All three feed the same Vec-shaped
input collections through one driver loop; `col` repacks via a small
in-dataflow `unary` (`ToRecorded`) that builds `RecordedUpdates`
containers before `arrange_core`.

Bisecting against the example exposed a regression introduced in #725:
EditList::load now delegates to populate_key, which seek_keys + checks
+ rewinds vals on every call. In the merge-join inner loop (join.rs
Ordering::Equal arm), the cursor is already positioned by the upstream
`match trace_key.cmp(&batch_key)` work, so the seek is redundant.
Repeated 1M times in the spines query phase, this added ~3s (+40%
queries time vs pre-#725 baseline).

Restoring EditList::load to its pre-#725 division of labor — assume
the cursor is positioned, walk vals inline — recovers performance.
populate_key and replay_key keep the seek for callers that legitimately
need it (reduce, ValueHistory). The Option-based meet API from #725
stays.

Measurements (1M keys, 1000 size, key mode):
- v0.23.0 baseline: 6.56s queries
- pre-#725 (f4e7550): 7.16s queries
- master HEAD before this commit: 10.12s queries
- this commit: 7.00s queries

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Tighten up spines examples

* Extract common target columnar size

* TrieChunker work

* De-penalize col in spines.rs

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
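
For readers following the `EditList::load` regression described in the referenced commit, the toy sketch below illustrates why a seek inside the `Ordering::Equal` arm is redundant. None of this is differential-dataflow code: `Cursor`, `load_positioned`, `load_with_seek`, and `merge_join` are hypothetical names, the cursor walks a sorted slice, and the val rewinding mentioned in the commit message is omitted.

```rust
use std::cmp::Ordering;

/// Toy cursor over sorted (key, vals) pairs that counts how often it seeks.
struct Cursor<'a> {
    data: &'a [(u32, Vec<u32>)],
    pos: usize,
    seeks: usize,
}

impl<'a> Cursor<'a> {
    fn new(data: &'a [(u32, Vec<u32>)]) -> Self {
        Cursor { data, pos: 0, seeks: 0 }
    }
    fn key(&self) -> Option<u32> {
        self.data.get(self.pos).map(|(k, _)| *k)
    }
    /// Advance to the first key that is >= `key`.
    fn seek_key(&mut self, key: u32) {
        self.seeks += 1;
        while self.pos < self.data.len() && self.data[self.pos].0 < key {
            self.pos += 1;
        }
    }
    fn step_key(&mut self) {
        self.pos += 1;
    }
    /// Values at the current key; callers must ensure the cursor is valid.
    fn vals(&self) -> &[u32] {
        &self.data[self.pos].1
    }
}

/// Load values assuming the caller already positioned the cursor.
fn load_positioned(cursor: &Cursor<'_>, out: &mut Vec<u32>) {
    out.extend_from_slice(cursor.vals());
}

/// Load values after seeking and re-checking the key (the slower shape).
fn load_with_seek(cursor: &mut Cursor<'_>, key: u32, out: &mut Vec<u32>) {
    cursor.seek_key(key);
    if cursor.key() == Some(key) {
        out.extend_from_slice(cursor.vals());
    }
}

/// Merge-join two sorted inputs; returns the loaded trace values and the
/// number of seeks performed on the trace cursor.
fn merge_join(
    trace: &[(u32, Vec<u32>)],
    batch: &[(u32, Vec<u32>)],
    redundant_seek: bool,
) -> (Vec<u32>, usize) {
    let mut t = Cursor::new(trace);
    let mut b = Cursor::new(batch);
    let mut out = Vec::new();
    while let (Some(tk), Some(bk)) = (t.key(), b.key()) {
        match tk.cmp(&bk) {
            Ordering::Less => t.seek_key(bk),
            Ordering::Greater => b.seek_key(tk),
            Ordering::Equal => {
                // The comparison above already proved the cursor sits on `tk`.
                if redundant_seek {
                    load_with_seek(&mut t, tk, &mut out);
                } else {
                    load_positioned(&t, &mut out);
                }
                t.step_key();
                b.step_key();
            }
        }
    }
    (out, t.seeks)
}

fn main() {
    let trace = vec![(1, vec![10]), (2, vec![20, 21]), (4, vec![40])];
    let batch = vec![(2, vec![0]), (3, vec![0]), (4, vec![0])];
    let (fast_out, fast_seeks) = merge_join(&trace, &batch, false);
    let (slow_out, slow_seeks) = merge_join(&trace, &batch, true);
    assert_eq!(fast_out, slow_out); // same result either way
    assert!(slow_seeks > fast_seeks); // one extra seek per matched key
}
```

Both variants produce the same output; the seeking variant simply pays one extra seek per matched key, which is the kind of per-key overhead the commit message attributes the slowdown to.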