Skip to content

Feat/save persistency#144

Merged
ipmach merged 40 commits into
developmentfrom
feat/save_persistency
May 17, 2026
Merged

Feat/save persistency#144
ipmach merged 40 commits into
developmentfrom
feat/save_persistency

Conversation

@viktorbeck98
Copy link
Copy Markdown
Collaborator

@viktorbeck98 viktorbeck98 commented May 6, 2026

Task

Closes #140

Description

Adds a full state persistency system to DetectMateLibrary so detectors can checkpoint and resume their internal state across runs.

Core protocol

  • dump/load protocol on EventDataStructure, implemented for EventTracker (msgpack), EventDataFrame and ChunkedEventDataFrame (Parquet + msgpack header), and SingleStabilityTracker (to_state/from_state).
  • New PersistencyLoadError for load failures.

PersistencySaver (src/detectmatelibrary/utils/persistency/persistency_saver.py)

  • fsspec-backed save/load, backend-agnostic (local, in-memory, optional s3/gcs/azure).
  • Triggers: background _SaveTimer and events_until_save event-count threshold.
  • Idempotent stop() that joins the timer thread; atexit cleanup registered/unregistered correctly.
  • Non-serializable event_data_kwargs (e.g. callables like converter_function) are filtered before writing metadata.json.

Config + detector wiring

  • New PersistConfig model and persist block on CoreDetectorConfig, integrated into config serialization and the MissingParams warning, and preserved across set_configuration() rebuilds.
  • CoreDetector._register_persistency() helper wired into NewValueDetector, NewValueComboDetector, and NewEventDetector.
  • EventPersistency gains an ingest callback and events_since_save counter to drive event-count saves.
  • Component gains a context-manager protocol and _Stoppable protocol for clean saver shutdown.

Other

  • New deps: fsspec, msgpack, pyarrow (with optional s3/gcs/azure extras).
  • Docs added under docs/auxiliar/persistency.md, docs/detectors.md, and AGENTS.md (renamed from CLAUDE.md).
  • Also folds in from_polars helpers and the TrainBuffer class from the development merge.

How Has This Been Tested?

  • New unit tests for dump/load on each structure: tests/test_utils/test_persistency_dump_load.py
  • PersistencySaver trigger/save/load tests (timer, events_until_save, idempotent stop): tests/test_utils/test_persistency_saver.py
  • PersistConfig config-roundtrip tests: tests/test_common/test_persist_config.py
  • End-to-end detector save → reload integration tests across backends: tests/test_detectors/test_persist_integration.py
  • from_polars helper tests: tests/test_helper/test_from_to.py
  • Full suite green via pytest.

Checklist

  • This Pull-Request goes to the development branch.
  • I have successfully run prek locally.
  • I have added tests to cover my changes.
  • I have linked the issue-id to the task-description.
  • I have performed a self-review of my own code.

viktorbeck98 and others added 21 commits May 4, 2026 10:17
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds full PersistencySaver class to persistency_saver.py with save(),
load(), start(), stop() methods using fsspec for backend-agnostic I/O.
Includes 8 new TDD tests covering save/load round-trip via in-memory fs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@viktorbeck98 viktorbeck98 requested a review from ipmach May 6, 2026 07:28
@viktorbeck98 viktorbeck98 self-assigned this May 6, 2026
@viktorbeck98 viktorbeck98 added the enhancement New feature or request label May 6, 2026
@viktorbeck98 viktorbeck98 marked this pull request as draft May 6, 2026 08:06
viktorbeck98 and others added 10 commits May 6, 2026 12:33
…tent

- PersistencySaver.save() now calls _safe_event_data_kwargs() to exclude
  non-JSON-serializable values (e.g. callables like converter_function)
  before writing metadata.json, preventing silent save failures.
- PersistencySaver.stop() is now idempotent: returns early if _timer is
  None and unregisters the atexit hook after the first call, eliminating
  spurious I/O-on-closed-file tracebacks during test teardown.
- EventTracker.load() uses strict_map_key=False with object_pairs_hook
  to restore tuple dict keys that msgpack deserializes as lists.
- SingleStabilityTracker.from_state() converts list elements in unique_set
  back to tuples to preserve hashability after msgpack round-trip.
- Adds test_save_and_reload for NewValueComboDetector to verify full
  save/reload cycle with callable event_data_kwargs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns examples

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add ingest callback hook to EventPersistency and wire _check_event_count
into PersistencySaver so state is saved automatically after N events,
independently of the background timer. Includes three tests covering
at-threshold, before-threshold, and re-trigger-after-reset behaviour.
@viktorbeck98 viktorbeck98 marked this pull request as ready for review May 6, 2026 12:39
@viktorbeck98
Copy link
Copy Markdown
Collaborator Author

@ipmach ready for review

Resolve conflict in core.py: keep both _Stoppable protocol (from this
branch) and TrainBuffer class (from development), plus use_config_data_as_training
and buffer_train additions from development.
@viktorbeck98 viktorbeck98 force-pushed the feat/save_persistency branch from bd149da to 57830a2 Compare May 6, 2026 12:55
@viktorbeck98 viktorbeck98 linked an issue May 12, 2026 that may be closed by this pull request
viktorbeck98 and others added 2 commits May 12, 2026 09:51
Use pandas' to_parquet/read_parquet instead of importing pyarrow directly.
pyarrow remains the underlying engine (declared dependency), but no source
file imports it anymore.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there is a protocol, and what is the purpose of this?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s only there to keep the dependency clean. core.py only needs to call .stop() on whatever saver-like object gets attached. So we don't need to make core dependent on smth from the detector persistency

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is saturated. It needs some refactor.

All the methods outside the CoreDetector that are not call in the core detector should not be in this file as they are not part of it. I suggest to move this to a persist file.

_register_persistency method should be outside of the class. It takes to much space in the class an does not need to be part of it as it never use the self pointer.

Suggestion:

def init_persistency(config: CoreDetectorConfig, persistency: EventPersistency) -> EventPersistency:
        config = cast(CoreDetectorConfig, self.config)
        if config.persist is None:
            return
        p = config.persist
        saver = PersistencySaver(
            persistency,
            PersistencySaverConfig(
                path=f"{p.path}/{self.name}",
                save_interval_seconds=p.interval_seconds,
                events_until_save=p.events_until_save,
                auto_load=p.auto_load,
                storage_options=p.storage_options,
            ),
        )
        saver.start()
        return saver


class CoreDetector:
            ........
        def _register_persistency(self, persistency: EventPersistency) -> None:
                  self.saver = init_persistency(self.config, persistency=persistency)

Copy link
Copy Markdown
Contributor

@ipmach ipmach May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to treat every persistency method as its own library. (Similar as the schemas)

So, everytime you need a method you do:

import detectmatelibrary.utils.persistency as persistency

persistency.get_configure_variables(.....)

If you follow this syntax then is clear all the methods of each file that is part of the persistency and we have a better separation.

For example in the NewValueDetector:

class NewValueDetector(CoreDetector):
    """Detect new values in log data as anomalies based on learned values."""

    def __init__(
        self,
        name: str = "NewValueDetector",
        config: NewValueDetectorConfig = NewValueDetectorConfig()
    ) -> None:

        if isinstance(config, dict):
            config = NewValueDetectorConfig.from_dict(config, name)

        super().__init__(name=name, buffer_mode=BufferMode.NO_BUF, config=config)
        self.config: NewValueDetectorConfig  # type narrowing for IDE
        self.persistency = persistency.EventPersistency(
            event_data_class=persistency.EventStabilityTracker,
        )
        # auto config checks if individual variables are stable to select combos from
        self.auto_conf_persistency = persistency.EventPersistency(
            event_data_class=persistency.EventStabilityTracker
        )
        self._register_persistency(self.persistency)

So I do not need to look in the imports where each class from persistency comes from. Same as

import numpy as np

np.load()  # I know this load comes from numpy

but if I do:

from numpy import load

load()  # I need to go to the top of the document to know where this comes from

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

files that are only use in persistency, should be inside persistency module even if they are generic. Example: RLE_list.py should be inside persistency.

So the dependency lines can be easily draw as:

(DetectMate) <------ (Persistency)

DetectMate dependes on the persistency module, but not in viceversa. It will also help if persistency has a clear API to better split between the two so the module is more modular.

(DetectMate) <---- (Persistency API ) <--- (Persistency)

Same thing as the schemas, where the division is clear:
(DetectMate) <--- (schemas/_classes.py) <--- (schemas methods)

This will make the code more clear and would helps us to later refactor the Detector class or methods to reduce the replicated code

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets leave it like this for now, but I think this documentation can be a little hard to follow if you dont know how the persistency works

Copy link
Copy Markdown
Contributor

@ipmach ipmach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The persistency itself looks good, but it should be a more clear separation between the persistency module and the rest of DetectMate.

Otherwise we may run into future problems to maintain the code.

Lifts the persistency setup out of CoreDetector into a module-level
init_persistency helper, expands the persistency package's public API so
callers use the `persistency.X` namespace pattern, and rewrites the
persistency doc with a concepts-first structure. Also moves RLE_list
inside the persistency package since it has no consumers elsewhere, and
documents why _Stoppable is a Protocol (dependency direction).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@viktorbeck98
Copy link
Copy Markdown
Collaborator Author

@ipmach thanks for your comments, much cleaner now. Everything has been addressed :)

@ipmach ipmach merged commit 72b2f72 into development May 17, 2026
4 checks passed
@ipmach ipmach deleted the feat/save_persistency branch May 17, 2026 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add persistency in process logic

2 participants