
Several HDF5 printer (v1 + v2) fixes #587

Merged
anderkve merged 25 commits into master from hdf5_work on May 8, 2026

Conversation

Collaborator

anderkve commented May 5, 2026

A series of targeted fixes to the v1 (HDF5Printer) and v2 (HDF5Printer2) HDF5 printers, surfaced by a code audit of the two printer source trees plus their combine tools. Each commit is independently small, locally verified against the spartan example (1 and 2 MPI processes, where applicable), and behaviourally additive — no on-disk format changes, no API changes for callers.

  • [v1] Default-initialise hid_t members to -1 to harden aux-printer paths. HDF5Printer's seven hid_t file/group/location handles had no in-class initialisers; the aux constructor only assigned the three *_location_id members, leaving the four file/group handles indeterminate on every aux instance. This was the root cause of #495 ("Multi-nest fails with hdf5 version 1"), which was resolved at the call site by using the file handle through primary_printer; this commit fixes the underlying class invariant so any future aux call site that forgets to route through primary_printer fails deterministically (-1 is the sentinel the existing checks already treat as "unset").

  • [v2] Close HDF5 type handle leak in HDF5DataSet::write_buffer. The synchronised-write path called H5Dget_type for an H5Tequal sanity check but never closed the returned id, leaking one type handle per flush per dataset. Mirrors the close that already exists in the sibling write_RA_buffer. Slow-burn leak, mechanical fix.

  • [v1] Clarify why buffer / dataset destructors are intentionally empty. Replaces the misleading "TODO: close on destruct" comments in DataSetInterfaceBase and VertexBufferNumeric1D_HDF5 with comments documenting the real reason: these objects are populated via assign-from-temporary, so a closing destructor would invalidate the surviving copy's hid_t. Cleanup is correctly driven from HDF5Printer::flush()/finalise(). Pure documentation change to prevent a future "fix" from reintroducing handle aliasing.

  • [v1] Replace bare exit(1) in printer constructor with printer_error().raise(). The "file exists, group missing, not in resume mode" branch in HDF5Printer::common_constructor called exit(1) with no message — bypassing MPI shutdown (worker ranks then hung at the downstream Barrier) and giving the user no diagnostic. Replaced with the same printer_error pattern its three sibling branches already use. (Does not address the broader rank-0-error-deadlocks-workers class of issues, which need BarrierWithTimeout.)

  • [v2] Delete unused MPI_flush_to_rank / MPI_recv_all_buffers / MPI_request_buffer_data path. The v2 printer carried a complete second MPI shipping mechanism (seven functions, eight dedicated MPI tag constants, plus recv_counter/send_counter debug scaffolding) with no external callers and a missing Recv for the h5v2_BEGIN Send — abandoned mid-wiring. The currently-used gather_and_print path is correct and tested. Net -262 lines.

  • [v1] Close H5Dget_type ids leaked in combine_tools. Eight call sites in hdf5_combine_tools.cpp (4 in the primary loop, 2 in the auxiliary branch, 2 in the metadata loop) obtained datatype ids via H5Dget_type for setup_hdf5_points and never released them. Fixed by adding H5Tclose after each datatype's last use.

  • [v1] Avoid undefined iterator decrement past begin() in combine size-trim loops. Two near-identical for (auto it = valids.end()-1; size > 0; --it) loops in hdf5_stuff incurred UB on empty input (end()-1 is undefined) and again on the final post-step decrement when all entries are invalid (--it to begin()-1). Replaced with the equivalent while (size > 0 && !valids[size-1]) --size;, which carries no iterator arithmetic.

  • [v2] HDF5Printer2::flush() now also flushes registered aux buffermasters. Previously flush() invoked only buffermaster.flush(), leaving every aux printer's metadata / RA / sync stream still in RAM until either an auto-flush or finalise() ran. Now flushes the primary's own buffermaster plus each entry in aux_buffers (sync streams first, then RA, matching finalise()'s ordering). No MPI gather added — each rank still writes its own data under the existing cross-rank Utils::FileLock, consistent with how the auto-flush path works.

  • [v1] Guard sync_pos == 0 underflow in synchronise_buffers and get_buffer fast_forward. Both sites computed get_sync_pos() - 1 on an unsigned member that's zero-initialised, underflowing to ULONG_MAX. The bug was dormant in practice (call ordering ensured the underflow value was never delivered to a buffer with non-zero state), but fragile. Adds explicit if (sync_pos > 0) guards at both sites; sync_pos == 0 is now a no-op, which matches the semantics ("no previous slot to catch up to").

Tests:

  • Builds cleanly (for each new commit).
  • spartan.yaml + v2 printer + 1 MPI rank
  • spartan.yaml + v2 printer + 2 MPI ranks
  • spartan.yaml + v1 printer + 1 MPI rank
  • spartan.yaml + v1 printer + 2 MPI ranks (with combine routines enabled)
  • spartan.yaml + v1 printer + MultiNest + 1 MPI rank (the scenario that originally reproduced #495, "Multi-nest fails with hdf5 version 1")
  • Reproduced the pre-fix exit(1) deadlock scenario for the printer-ctor commit and confirmed it now terminates with a FATAL ERROR and MPI_Finalize.

anderkve and others added 23 commits May 5, 2026 08:41
The hdf5_v1 printer aborted with "H5Fflush returned error value (-1)... not a file or file object" whenever a scanner explicitly called flush() on an auxiliary print stream. MultiNest hits this on its very first dumper callback, where it flushes the "txt" (Posterior) and "live" (LastLive) RA streams.

Root cause: only the primary HDF5Printer opens the HDF5 file and sets its file_id member; auxiliary printers inherit location_id, RA_location_id, and metadata_location_id from the primary but never initialise their own file_id. HDF5Printer::empty_sync_buffers() called H5Fflush(file_id, ...) unconditionally, so on an aux printer it was flushing an uninitialised hid_t.

Fix: route the flush through primary_printer->file_id. primary_printer defaults to "this" for the primary and is set to the actual primary in the aux constructor, so the same call is correct in both cases. This also makes the dumper's flush() calls actually useful (they now push HDF5's internal buffers to disk) instead of erroring out.

Closes #495.
…t final_size on workers

Issue
-----
In HDF5Printer2::finalise() (Printers/src/printers/hdf5printer_v2/
hdf5printer_v2.cpp), the local `std::size_t final_size` was declared
unconditionally but only assigned inside an `if(myRank==0)` block (via
`buffermaster.get_next_free_position()`). Immediately after, the loop
that flushes unsynchronised (random-access) aux buffers and calls
`(*it)->extend_all_datasets_to(final_size)` ran on **every** rank.
Worker ranks therefore passed an uninitialised value into
`extend_all_datasets_to`, which is undefined behaviour.

In practice this hit any MPI scan using the v2 hdf5 printer with
auxiliary RA streams (i.e. anything beyond a single-rank run): worker
ranks with non-empty `aux_buffers` would attempt local writes / dataset
extensions that they have no business performing — only rank 0 owns
the output file. The matching synchronised-buffer flush block earlier
in finalise() is already correctly gated by `if(myRank==0)`, so the
fix is to apply the same guard symmetrically.

Fix
---
- Move the `final_size` query and the entire RA-flush /
  extend_all_datasets_to loop inside `if(myRank==0)`. Worker ranks
  have already shipped their RA buffer contents to rank 0 via the
  preceding `gather_and_print(...)` call, which remains unguarded
  (it is a collective and must be called by all ranks).
- Tighten the scope of the `final_size` declaration so it lives only
  inside the rank-0 block where it is meaningful.
- Move the `add_aux_buffer(RAbuffer)` call inside the rank-0 block
  too, since the RAbuffer only contains gathered data on rank 0 and
  the loop that consumes `aux_buffers` is now rank-0-only. (Note:
  RAbuffer being a stack local appended to the member `aux_buffers`
  vector is a separate latent issue, tracked in
  `hdf5_printer_issues.md`.)
- Added a comment explaining why the block is rank-0-only.

Verification
------------
- Built GAMBIT with `-DWITH_MPI=True`.
- Ran the spartan example single-rank
  (`OMP_NUM_THREADS=1 ./gambit -rf yaml_files/spartan.yaml`):
  successful, "Final dataset size is 9003".
- Ran spartan with 2 MPI ranks
  (`mpiexec -np 2 ./gambit -rf yaml_files/spartan.yaml`):
  successful, "Final dataset size is 9001". Inspected the output
  HDF5: all 34 datasets under /data have identical length (9001),
  confirming sync and RA datasets were extended consistently and no
  output was corrupted.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
Add an "Optional: MPI" subsection to step 1 covering the
`libopenmpi-dev openmpi-bin` install and the corresponding
`-DWITH_MPI=True` cmake flag, plus an MPI run example in step 6
(including the `--allow-run-as-root --oversubscribe` flags that are
typically needed in containerised dev environments).

Needed for testing any printer / scanner code paths gated on
`#ifdef WITH_MPI` (e.g. the hdf5_v2 finalise rank-0 guard fixed in
the preceding commit).

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
Cmake writes downloaded_tarball_paths.txt at the repo root listing the
paths of scanner / backend tarballs it downloaded into
ScannerBit/downloaded/ and Backends/downloaded/ during a build. It is
purely a build-time bookkeeping file and should not be tracked.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…F5 errors

Issue
-----
In Printers/src/printers/hdf5printer/hdf5tools.cpp, errorsOff() saved
the current HDF5 error handler into module-globals
old_func / old_client_data and then installed a NULL handler;
errorsOn() restored from those globals. This is fine for an
isolated pair, but it breaks when errorsOff() is called while errors
are already off:

  errorsOff();   // saves real handler -> globals; installs NULL
  ...
  errorsOff();   // saves NULL (current handler) -> globals; installs NULL
  ...
  errorsOn();    // restores globals = NULL  -> errors stay off forever

There is exactly one such nested case in the v1 printer code today
(hdf5tools.cpp lines 369/377/386 in the dataset corruption-recovery
helper), and it permanently disables HDF5 error reporting for the rest
of the process whenever it is hit. Any subsequent HDF5 problem then
fails silently. There are also ~25 paired call sites across
hdf5tools.cpp and hdf5_combine_tools.cpp that work fine in isolation
but would corrupt the global state if any of them were ever wrapped
in another errorsOff() / errorsOn() block.

Fix
---
Make errorsOff() / errorsOn() idempotent with a bool guard:
- errorsOff(): if errors are already off, do nothing (in particular,
  do not call H5Eget_auto2, which would overwrite the saved handler
  with the NULL we installed ourselves).
- errorsOn(): if errors are already on (i.e. unbalanced call), do
  nothing rather than blindly restoring whatever is in the saved
  slots.

The bool / handler state is moved into an unnamed namespace at file
scope, giving it internal linkage. Functionally equivalent to the
existing namespace-scope globals for this codebase (no other TU
references them), but it (a) makes intent explicit -- these are
strictly the implementation detail of the two functions just below
them -- and (b) prevents any future accidental ODR collision on the
short generic names.

A counter-based nesting scheme was considered but rejected: no real
caller pairs an inner errorsOn() with an inner errorsOff(), so all the
counter would do is silently absorb the one buggy nested call we
already have. The bool keeps the logic minimal and matches what every
existing caller actually wants.

Note: the underlying state is still a module global, so errorsOff /
errorsOn remain not thread-safe. That is a separate (lower-priority)
issue tracked in hdf5_printer_issues.md.

Verification
------------
- Rebuilt GAMBIT (-DWITH_MPI=True) cleanly.
- Single-rank v1 spartan run: success.
- 2-rank MPI v1 spartan run: success.
- Resume run via combine-routines path through hdf5_combine_tools.cpp
  (which has the bulk of the errorsOff / errorsOn call sites): success.
  No HDF5 warnings or errors in any of the runs.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…r paths

The HDF5Printer class has seven hid_t members for HDF5 file / group /
location handles:

  file_id, group_id, RA_group_id, metadata_id,
  location_id, RA_location_id, metadata_location_id

The four file/group handles are only assigned by the *primary*
constructor; the aux constructor at hdf5printer.cpp:683-699 only sets
the three *_location_id members (inheriting them from
primary_printer). With no in-class initialisers, the four file/group
handles held indeterminate values on every auxiliary printer instance.

This was the root cause of issue #495 (H5Fflush(file_id, ...) firing
on aux printers when MultiNest's dumper called flush() on its 'txt' /
'live' aux streams). That specific call site is now correctly routed
through primary_printer->file_id (merge of upstream
fix_hdf5v1_and_multinest_issue), but the underlying class invariant
remained fragile: any future aux call site that forgets to go through
primary_printer would silently read uninitialised memory.

Fix: give all seven hid_t members an in-class default initialiser of
-1. This matches the sentinel value the class already treats as "unset
HDF5 handle" -- the existing get_location() / get_RA_location() /
get_metadata_location() checks at hdf5printer.cpp:1144,1155,1166
already raise a printer_error when they see -1. With this change the
class invariant becomes uniform: an unset HDF5 handle in this class
is -1, period. Any accidental HDF5 call on an aux's file/group handle
now passes H5I_INVALID_HID, which HDF5 reliably rejects -- failures
will surface deterministically as printer_errors instead of as
indeterminate behaviour or random "lucky" successes.

The bottom three *_location_id members are assigned by both
constructors today, so the = -1 there is purely defensive. Including
them keeps the rule uniform across all seven handles.

Verification
------------
Rebuilt and ran four scenarios cleanly (no HDF5 errors, all exit 0):
- v1 + Diver, single rank
- v1 + Diver, 2-rank MPI
- v1 + MultiNest, single rank (was the original #495 reproducer)
- v2 + Diver, single rank (regression check)

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
… leak)

In HDF5DataSet<T>::write_buffer (the v2 hdf5 printer's per-flush write
path for synchronised datasets), the on-disk dtype is queried with

    hid_t dtype = H5Dget_type(get_dset_id());

and used only for an H5Tequal sanity check against the in-memory
expected_dtype. The handle was never closed before the function
returned, leaking one HDF5 type handle per call. write_buffer is
invoked once per flush per synchronised dataset, so on long scans with
many datasets and frequent flushes the leak accumulates -- eventually
HDF5's internal table or the OS file-handle limit gives, depending on
the build/runtime.

The companion write_RA_buffer (lines 438/467 in the same header)
already closes its corresponding H5Dget_type handle, so this is purely
a missing close in the synchronised path.

Fix: H5Tclose(dtype) immediately after the H5Tequal check, where
dtype stops being needed. Mirrors the close site in write_RA_buffer.
A short comment notes why it's there so the next reader doesn't
"clean it up" again.

Verification
------------
- Built cleanly.
- v2 + Diver, single rank: exit 0, no HDF5 errors, output file has
  34 datasets all sized 9002.
- v2 + Diver, 2 MPI ranks: exit 0, no HDF5 errors.

(The leak itself is slow-burn -- not visible in a short spartan run --
but the fix is mechanical and the smoke tests confirm it doesn't
disturb the write path.)

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…empty

The destructors of DataSetInterfaceBase and VertexBufferNumeric1D_HDF5
each carried a TODO suggesting that one day they ought to close their
HDF5 dataset / flush their pending writes, with the actual cleanup
calls left commented out. The TODOs implied the empty body was a
hack waiting to be fixed, but in fact the empty body is required by
how these objects are stored:

- DataSetInterfaceBase (and DataSetInterfaceScalar) are populated via
  the assign-from-temporary pattern, e.g.
      _dsetdata = DataSetInterfaceScalar<T,CL>(loc, name, ...);
      local_buffers[key] = BuffType(loc, label, vertexID, ...);
  With the default copy/move semantics, the temporary and the
  surviving slot share the same hid_t dset_id; closing in the
  destructor would invalidate the handle that the surviving copy
  still relies on.

- The same shape applies to VertexBufferNumeric1D_HDF5 itself, whose
  destructor would similarly run on a temporary that shares all the
  buffer state (HDF5 handles, sync counters, write queues) with the
  container slot. Doing write_to_disk() / RA_write_to_disk() there
  would either double-write or write through stale state.

The actual close / flush is driven deterministically from
HDF5Printer::flush() and HDF5Printer::finalise(), iterating over the
surviving buffers in the printer's all_buffers map. If finalise() is
bypassed (exception during scan setup or shutdown), HDF5's own
H5Fclose() at process exit auto-closes any still-open dataset handles
and flushes their pending I/O, so the file on disk remains
well-formed.

This commit replaces the misleading TODOs with comments explaining
the actual ownership invariant, so the next reader doesn't try to
"fix" the empty body and reintroduce the original handle-aliasing
bug. Properly closing in the destructor would require giving these
classes ownership semantics they currently don't have (e.g. wrapping
the hid_t in a shared_ptr with an H5Dclose deleter, or making the
chain non-copyable / move-only) -- a larger change that is out of
scope here.

No behaviour change. Verified by rebuilding and running the spartan
example with both the v1 and v2 hdf5 printer (both exit 0, no HDF5
errors).

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…aise()

In HDF5Printer::common_constructor (Printers/src/printers/hdf5printer/
hdf5printer.cpp), the rank-0 setup block at lines 414-472 handles the
case where the requested output file already exists. Inside the
"restart, not resume, not overwrite" branch (lines 448-469), the code
opens the file, checks whether the requested group is already present,
and either:
  - group exists -> raise a printer_error with a clear, multi-option
    message explaining how to fix the situation (line 465); or
  - group missing -> closeFile + bare exit(1) (line 468), with no
    message at all.

The exit(1) path was buggy in two ways:

 1. exit(1) bypasses MPI shutdown -- no MPI_Finalize, no MPI_Abort.
    Worker ranks (which never enter this rank-0-only block) sail on
    to the upcoming myComm.Barrier() at line 610 and hang forever.
    The user sees a job that never terminates, with no log message
    explaining why.

 2. The user has no idea what went wrong: the path triggers when
    they pointed the printer at a file that already exists from
    some unrelated previous run, and the only signal is a non-zero
    exit code from rank 0.

Fix: mirror the sibling group-exists case at line 465. Build a
clear printer_error message ("file exists, requested group not in it,
not in resume mode, delete_file_on_restart not set; here are the
three things you can do") and raise it. printer_error().raise()
throws an exception that goes through GAMBIT's normal error-reporting
machinery, which at least tells the user what happened on rank 0
before tearing down -- a strict improvement over silent exit(1).

Note that this commit does *not* address the broader rank-0-error-
deadlocks-workers problem: if any of the four printer_error sites in
this constructor block (lines 438, 445, 465, and now this one) fire
on rank 0, worker ranks still hang at the Barrier downstream. That
deadlock-class issue is the same one tracked under "v1
masterWaitForAll(FINAL_SYNC) deadlocks" and "Same risk on the
resume-mode myComm.Barrier() at line 610" in
hdf5_printer_issues.md, and needs a separate, larger fix using
BarrierWithTimeout. The point of the present commit is just to bring
this one site up to the same error-reporting standard as its three
siblings.

Verification
------------
- Built cleanly.
- Regular v1 spartan run: exit 0, no change to the happy path.
- Reproduced the failing scenario by seeding an output file with
  group "/data" via a normal run, then re-running with the same
  output_file but group "/data2", restart mode (-rf), no
  delete_file_on_restart. Before the fix this would silent-exit(1);
  after the fix the job terminates with a FATAL ERROR carrying the
  full guidance message and "Calling MPI_Finalize..." in the trailer.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…I_request_buffer_data path

The v2 hdf5 printer carried two parallel sets of MPI machinery for
shipping buffer data from worker ranks to the rank-0 master:

 1. The currently-used "gather and Gatherv" path in finalise() and
    flush(), built around HDF5MasterBuffer::add_to_buffers and the
    HDF5bufferchunk message format. This is what actually runs in
    every MPI scan today.

 2. A second, unwired Send/Recv path consisting of:
      - HDF5MasterBuffer::MPI_flush_to_rank
      - HDF5MasterBuffer::MPI_request_buffer_data
      - HDF5MasterBuffer::MPI_recv_all_buffers
      - HDF5MasterBuffer::MPI_recv_buffer<T>
      - HDF5Buffer<T>::MPI_flush_to_rank
      - HDF5Buffer<T>::MPI_recv_from_rank
      - HDF5BufferBase::MPI_flush_to_rank (pure-virtual decl)
    plus eight dedicated MPI tag constants
    (h5v2_bufname / h5v2_bufdata_points / h5v2_bufdata_ranks /
    h5v2_bufdata_valid / h5v2_bufdata_type / h5v2_bufdata_values /
    h5v2_BLOCK / h5v2_BEGIN) and some commented-out
    recv_counter / send_counter debug scaffolding.

The path in (2) is internally consistent (the functions only call each
other) but has no external callers anywhere in the codebase, and the
MPI_request_buffer_data side sends an h5v2_BEGIN tag for which there
is no matching Recv -- a strong sign the path was abandoned mid-wiring
rather than being a finished alternative.

This commit deletes (2) outright. Two reasons for choosing delete over
finishing the wiring:

 - The currently-used gather_and_print path is correct and tested
   (passes our spartan smoke tests in single-rank and 2-rank MPI
   modes, including with auxiliary RA streams).
 - The send-side comment in MPI_recv_all_buffers explicitly warns of a
   deadlock if MPI_request_buffer_data is not called first; revisiting
   that ordering and the request/recv handshake to make it deadlock-free
   would be a real design effort, not a hygiene fix. The audit
   recommended either deleting or wiring up; given there is no concrete
   reason to believe the unfinished path would outperform the working
   one, deletion is the lower-risk choice.

Net change: -266 lines / +4 lines across hdf5printer_v2.hpp and
hdf5printer_v2.cpp. add_to_buffers, MPI_add_int_block_to_buffer,
MPI_add_float_block_to_buffer (which ARE used by the live path) are
untouched.

Verification
------------
- Built cleanly.
- v2 + Diver, single rank: exit 0, output has 34 datasets all sized
  9001 (uniform, as expected).
- v2 + Diver, 2 MPI ranks: exit 0, output has 34 datasets all sized
  9001 (uniform).

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
hdf5_combine_tools.cpp called H5Dget_type at 8 sites to obtain the
HDF5 datatype of an existing dataset (in order to recreate the dataset
in the combined output via setup_hdf5_points), but never released the
returned datatype id. H5Dget_type returns a fresh id that the caller
owns and must close with H5Tclose; otherwise the id (and its underlying
in-memory state) lingers until the file/library is shut down.

The leaks were at three call sites within hdf5_stuff::Combine_HDF5:

 - Primary-dataset combine loop (~L949-956): type/type2 obtained per
   batch from either the previous combined dataset or a temp-file
   dataset, used by setup_hdf5_points only on batch 0, but allocated
   on every batch and never freed. Fixed by H5Tclose(type/type2) after
   the dataset_out/dataset2_out closes inside the same per-batch
   if(valid_dset>=0 or old_dataset>=0) block, so each iteration
   balances its allocations.

 - Auxiliary-dataset path (~L1144-1145): type/type2 obtained inside
   the "param not also a primary" branch, used immediately by
   setup_hdf5_points. Fixed by H5Tclose right after that call, guarded
   by >=0 since the variables are pre-initialised to -1 above.

 - Metadata-dataset combine loop (~L1273, L1278): single 'type' id,
   same pattern as the primary loop. Fixed by H5Tclose(type) after
   the dataset_out close inside the per-batch valid branch.

These were straightforward resource leaks: setup_hdf5_points only
reads 'type' to pass it to H5Dcreate2 and does not take ownership,
so adding the closes at the end of the using-scope is correct and
safe.

Verification
------------
- Built cleanly (only Printers translation unit recompiled).
- Ran spartan with the v1 (hdf5_v1) printer, 2 MPI ranks, with
  disable_combine_routines: false so the combine path actually
  exercises all three call sites. Run completed cleanly (exit 0):
   * 17 primary parameters combined, no errors
   * 51 metadata parameters combined, no errors
   * 0 auxiliary parameters in this scan (path not exercised, but
     that branch is structurally identical to the primary path)
   * combined output gambit_output.hdf5 contains uniform 9501-row
     1-D datasets across all primary parameters.

Out of scope (not fixed here)
-----------------------------
- hdf5_combine_tools.hpp Enter_HDF5() at line 377 calls
  H5Tget_native_type(dtype, ...) and stores the result in 'type', then
  closes only 'dtype' (line 448) before returning. The native-type id
  is also a fresh id that needs H5Tclose. This is a separate leak (a
  different HDF5 API) and the issue title is scoped specifically to
  H5Dget_type; flagging here for follow-up.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…size-trim loops

Two near-identical loops in hdf5_stuff (one in the primary-dataset
scanning pass, one in the auxiliary-dataset Enter_Aux_Parameters pass)
walked a std::vector<bool> backwards to count the number of trailing
invalid entries to trim from a per-file size:

    for (auto it = valids.end()-1; size > 0; --it)
    {
        if (*it) break;
        else --size;
    }

There were two problems with this idiom:

 1. If valids.size() == 0 (which happens cleanly when the temp file's
    sync group is empty), valids.end()-1 is undefined behavior. In
    practice the loop condition `size > 0` would skip the body, but
    the iterator is still constructed.

 2. More importantly: when ALL entries in valids are invalid (so the
    "break" branch is never taken), the final iteration decrements
    size from 1 to 0 and dereferences valids.front() correctly, but
    then the for-loop's post-step runs `--it` once more, producing an
    iterator one before begin(). That decrement is undefined behavior
    on std::vector::iterator -- even though it's typically a pointer
    under the hood and "happens to work", a sanitizer or a debug
    iterator implementation can flag/abort. The condition `size > 0`
    is then evaluated on the already-invalid iterator's surrounding
    state, but the UB has already occurred at the --it.

Replaced both loops with a straightforward index-based trim:

    while (size > 0 && !valids[size-1])
        --size;

Equivalent semantics (trim trailing invalids, keep first valid),
but no iterator arithmetic, no past-the-begin decrement, and no
empty-vector trap. Both call sites already guarantee
size <= valids.size() (size and valids both come from the
pointID_isvalid / RA_pointID_isvalid dataset, with the prior
size != size2 check raising on mismatch), so valids[size-1] is
safely in range whenever size > 0.

Verification
------------
- Built cleanly (Printers TU only).
- Spartan + hdf5_v1 + 2 MPI ranks + disable_combine_routines: false:
  exit 0, combined output has uniform 9000-row primary datasets and
  1-row metadata datasets. The primary-pass trim loop runs once per
  temp file (2x in this scan); the aux-pass trim loop is structurally
  identical and exercised by the same Enter_HDF5/read_hdf5 path.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
Before this change, HDF5Printer2::flush() only invoked
buffermaster.flush() on the calling printer instance. That meant a
public flush() on the primary printer left every auxiliary printer's
buffermaster -- metadata streams, RA streams, and any other aux
streams the scan has registered -- with their points still in RAM
until either the per-stream sync auto-flush triggered (when a sync
buffer fills) or finalise() ran. The two known external callers of
the public API (twalk's out_stream->flush() at the end of a temp-file
replay batch and the Python scanner binding in py_module_scanbit.hpp)
both expect "persist what I've written so far", so the partial flush
was a behavioural mismatch with caller intent and a divergence from
the v1 printer (HDF5Printer::flush() iterates all buffers).

The fix: after flushing this->buffermaster, if this is the primary
printer, iterate the aux_buffers list and flush each. Aux printers
register their own buffermaster into the primary's aux_buffers in
their constructor (~hdf5printer_v2.cpp:1342 add_aux_buffer) and
have an empty aux_buffers themselves, so guarding the loop on
not is_auxilliary_printer() preserves aux-printer semantics
(unchanged: still just flushes its own buffermaster).

Two passes -- sync streams first, RA streams second -- mirror the
ordering already used in finalise() (~:1824-1833 then :1895-1909).
Reason: RA writes anchor to positions established by sync writes,
so flushing RA before its sync targets are on disk can leave RA
points stuck in the postpone queue.

What is intentionally NOT done here:

 * No MPI gather. The architecture defers cross-rank gathering to
   finalise() as a performance optimisation for HPC networked
   filesystems (see comment block at hdf5printer_v2.cpp:1786-1793).
   Mid-scan, every rank just writes its own buffered points to the
   shared file under the existing cross-rank Utils::FileLock taken
   inside HDF5MasterBuffer::flush() (lock_and_open_file ->
   hdf5out.get_lock() at :741). Adding an MPI collective inside the
   public flush() would deadlock for non-collective callers (twalk's
   per-rank flush, Python scanners) and is unnecessary for
   "persist my data" semantics.

 * No explicit H5Fflush. v1 keeps the file open between buffer
   flushes and ends flush() with H5Fflush(file_id, H5F_SCOPE_GLOBAL).
   v2's HDF5MasterBuffer::flush() always closes the file at the end
   via close_and_unlock_file -> H5Fclose, which already drains the
   HDF5 internal cache to disk, making H5Fflush redundant.

 * Auto-flush path unchanged. add_buffered_point's "buffer full" auto
   flush at hdf5printer_v2.hpp:1128 / 1137 calls
   HDF5MasterBuffer::flush() (same class), not HDF5Printer2::flush().
   That path is untouched.

 * finalise() unchanged. It still does its own MPI gather and
   per-buffer flush logic without going through the public flush().

Cost: the new loop incurs N additional lock_and_open_file ->
close_and_unlock_file cycles per public flush() call (N = number of
registered aux buffers), but HDF5MasterBuffer::flush() short-circuits
on get_Npoints() == 0 (:479) without taking the lock, so empty aux
buffers cost essentially zero. flush() is rare on the hot path; this
is fine.

Verification
------------
- Built cleanly (Printers TU only).
- Spartan + v2 + single rank: exit 0, 34 primary datasets all sized
  9502 (uniform).
- Spartan + v2 + 2 MPI ranks: exit 0, 34 primary datasets all sized
  9003 (uniform).

Both runs produce well-formed output; the change is additive and
preserves existing on-disk layout because finalise() already flushes
everything at end-of-scan -- this commit just makes the same
flushing happen on demand when flush() is called.

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
…buffer fast_forward

HDF5Printer's sync_pos member is unsigned long, initialised to 0
(hdf5printer.hpp:416), and only ever incremented (in
increment_sync_pos at hdf5printer.cpp:1650). Two call sites compute
get_sync_pos()-1 without first checking that sync_pos>0, which
would underflow to ULONG_MAX when sync_pos==0:

 1. HDF5Printer::synchronise_buffers (hdf5printer.cpp:1320):
        const unsigned long sync_pos = get_sync_pos()-1;
        ...
        it->second->synchronise_output_to_position(sync_pos);
    The underflowed ULONG_MAX would be stored into each buffer's
    target_sync_pos. For RA buffers this later feeds extend_dset()
    in RA_write_to_disk, which would attempt to extend the dataset
    to ULONG_MAX rows. For sync buffers, the movediff calculation
    in synchronise_output_to_position relies on sync_pos+1
    (ulong), which then wraps to 0 -- harmless only when
    dset_head_pos() also happens to be 0.

 2. H5P_LocalBufferManager::get_buffer (hdf5printer.hpp:558):
        if(synchronised) it->second.fast_forward(
            printer->get_sync_pos()-1);
    fast_forward takes a signed long target_pos. With sync_pos==0,
    get_sync_pos()-1 == ULONG_MAX (ulong); implicitly converted to
    long, it becomes -1 on two's-complement platforms. fast_forward
    then computes needed_steps = -1 - 0 = -1 (negative) and raises
    the "would need to move backwards" error
    (VertexBufferBase.hpp:218).

In normal scans both bugs are dormant because of call ordering:

 - synchronise_buffers' first invocation (from the first
   check_for_new_point at :1632) happens before any buffer has been
   added to all_buffers, so the for-loop iterates over nothing and
   the underflowed value is never delivered.

 - fast_forward via get_buffer is reached only after
   check_for_new_point has run increment_sync_pos at :1650, taking
   sync_pos from 0 to 1, so by the time fast_forward executes
   sync_pos>=1 and target_pos>=0.

The underflow is therefore real but masked by today's lifecycle.
Adding a buffer earlier than the first check_for_new_point, or
calling synchronise_buffers from a new code path before any
pointID arrives, would surface either bug.

Fix: guard each call site with sync_pos>0. Semantically, sync_pos==0
means "no point has yet been registered as the current sync
position", so there is nothing to synchronise to or catch up to;
returning early / skipping the call is the correct no-op behaviour
and is consistent with buffers being at dset_head_pos==0 from
construction.

This is a surgical fix that preserves all existing behaviour: the
two guarded paths were never executed with sync_pos==0 in practice,
so guarding them doesn't change any observable behaviour today --
it only removes the latent underflow.

Verification
------------
- Built cleanly.
- Spartan + hdf5_v1 + 1 rank: exit 0, 34 primary datasets all
  sized 9000 (uniform).
- Spartan + hdf5_v1 + 2 MPI ranks (with combine routines enabled):
  exit 0, 34 primary datasets all sized 9501 (uniform).

https://claude.ai/code/session_01QZuy7gyVUMZa3FCXShjj4X
@anderkve anderkve requested a review from ChrisJChang May 5, 2026 08:06
@anderkve
Collaborator Author

anderkve commented May 5, 2026

Sorry for requesting one more review from you, @ChrisJChang, feel free to forward it to someone else. Or I can just take care of it myself – I've already gone over the changes once to check that the Claude-suggested changes make sense to me, and to tweak some comments and code snippets.

@carlmfe carlmfe self-requested a review May 5, 2026 08:44
@anderkve anderkve removed the request for review from ChrisJChang May 5, 2026 08:47
@anderkve
Collaborator Author

anderkve commented May 5, 2026

@carlmfe volunteered to review this one (thanks!), so you can forget about it @ChrisJChang.

@anderkve anderkve added Core Core group task labels May 6, 2026
Comment thread Printers/src/printers/hdf5printer/hdf5printer.cpp Outdated
Contributor

@carlmfe carlmfe left a comment

Code is good. Comments are generally on the verbose side, but do convey clearly why the changes are as they are. Other than a specific comment referencing old code, I approve of the rest.

@anderkve
Collaborator Author

anderkve commented May 7, 2026

Thanks for reviewing – and I agree, I'll revise some of the comments before I merge.

@anderkve anderkve merged commit fb14173 into master May 8, 2026
2 of 4 checks passed
@anderkve anderkve deleted the hdf5_work branch May 8, 2026 04:58
Labels

Core Core group task Printers
