Skip to content

feat: add Clock2QPlus eviction algorithm#314

Merged
1a1a11a merged 7 commits into
1a1a11a:developfrom
YiyanZhai:develop
May 4, 2026
Merged

feat: add Clock2QPlus eviction algorithm#314
1a1a11a merged 7 commits into
1a1a11a:developfrom
YiyanZhai:develop

Conversation

@YiyanZhai
Copy link
Copy Markdown
Contributor

@YiyanZhai YiyanZhai commented Apr 29, 2026

Summary

Adds Clock2QPlus, which is designed for metadata-cache workloads with correlated references. S3-FIFO sets the Small FIFO reference bit after a short-term re-reference, while Clock2Q+ suppresses reference-bit updates inside a correlation window near the head of the Small FIFO. This prevents bursty metadata accesses from being promoted into the Main Clock too early.

Behavior

New objects enter a small FIFO; those hit outside the correlation window are promoted to a main queue on eviction. A ghost cache admits repeatedly-seen objects directly into the main queue.

Changes

  • cache/eviction/Clock2QPlus.c: algorithm implementation
  • include/libCacheSim/cacheObj.h: add Clock2QPlus_obj_metadata_t and union member
  • include/libCacheSim/evictionAlgo.h: declare Clock2QPlus_init
  • cache/CMakeLists.txt: register source file
  • bin/cachesim/cache_init.h: add "clock2qplus" entry

Parameters

Parameter Default
fifo-size-ratio 0.10 (of cache)
ghost-size-ratio 0.90 (of cache)
move-to-main-threshold 1
corr-window-ratio 0.50 (of Small FIFO)

Results

The paper (Clock2Q+: A Simple and Efficient Replacement Algorithm for Metadata Cache in VMware vSAN) reports that Clock2Q+ consistently improves miss ratio on metadata-style traces and achieves the best miss-ratio improvement among the evaluated policies. On derived metadata traces, Clock2Q+ outperforms S3-FIFO 2-bit by up to 28.5% in miss ratio at the larger cache-size setting. It also improves over Clock, ARC, and S3-FIFO, etc across cache sizes on both derived metadata traces and original data traces.

Figure 1: miss_ratio_improvement_data.pdf
Figure 2: miss_ratio_improvement_metadata.pdf
Figure 1 & 2: Comparing Clock2Q+ with state-of-the-art eviction algorithms on data traces and metadata traces. Clock2Q+ shows strong improvement over the tested algorithms at both cache-size ratios.

Figure 3: CloudPhysics_miss_ratio_curves.pdf
Figure 3: Miss-ratio curves for CloudPhysics w56. Lower curves indicate better performance (fewer misses) at a given cache size. Clock2Q+ consistently has the lowest miss ratio across the tested cache sizes.

@YiyanZhai YiyanZhai requested a review from haochengxia as a code owner April 29, 2026 06:45
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the Clock2QPlus eviction algorithm, adding the necessary logic for initialization, insertion, and eviction across a multi-queue structure. The feedback identifies several critical issues: a potential use-after-free vulnerability when removing objects from the main cache, a possible crash during parameter parsing due to unchecked string splitting, and an integer overflow risk in the correlation window logic. Other recommendations include correctly accounting for metadata overhead in cache size calculations, removing unused configuration fields, and adjusting the insertion check to allow objects larger than the FIFO queue.

Comment on lines +481 to +482
char *key = strsep((char **)&params_str, "=");
char *value = strsep((char **)&params_str, ",");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The value pointer returned by strsep can be NULL if the = delimiter is missing in the parameter string. Subsequent calls to strtod or atoi with a NULL argument will cause a crash. It is safer to validate that both key and value are present.

    char *key = strsep((char **)&params_str, "=");
    char *value = strsep((char **)&params_str, ",");
    if (key == NULL || value == NULL) break;

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix this?

Comment on lines +378 to +381
bool removed = main->remove(main, obj_to_evict->obj_id);
if (!removed) {
ERROR("cannot remove obj %ld\n", obj_to_evict->obj_id);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Accessing obj_to_evict->obj_id after calling main->remove is potentially a use-after-free (UAF) vulnerability if the remove implementation frees the object. It is safer to store the obj_id in a local variable before performing the removal.

      obj_id_t obj_id = obj_to_evict->obj_id;
      bool removed = main->remove(main, obj_id);
      if (!removed) {
        ERROR("cannot remove obj %ld\n", obj_id);
      }

cache->get_occupied_byte = Clock2QPlus_get_occupied_byte;
cache->can_insert = Clock2QPlus_can_insert;

cache->obj_md_size = 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Setting obj_md_size to 0 means the metadata overhead (Clock2QPlus_obj_metadata_t) is not accounted for in the cache's occupied bytes. This leads to inaccurate cache size simulations. It should be set to the size of the algorithm-specific metadata.

  cache->obj_md_size = sizeof(Clock2QPlus_obj_metadata_t);

Comment on lines +231 to +232
int time_since_insertion =
params->n_obj_admit_to_fifo - obj->Clock2QPlus.insertion_time;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

time_since_insertion is declared as an int, but it is calculated from int64_t counters. In long-running simulations where the number of object admissions exceeds 2^31, this will overflow and break the correlation window logic.

    int64_t time_since_insertion =
        params->n_obj_admit_to_fifo - obj->Clock2QPlus.insertion_time;

Comment on lines +34 to +35
int64_t corr_window_size_upper_bound;
int64_t corr_window_size_lower_bound;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fields corr_window_size_upper_bound and corr_window_size_lower_bound are calculated during initialization but never used in the eviction logic. They should be removed to keep the code clean.

const request_t *req) {
Clock2QPlus_params_t *params = (Clock2QPlus_params_t *)cache->eviction_params;

return req->obj_size <= params->fifo->cache_size;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The can_insert check is currently restricted to the size of the small FIFO. This prevents objects that are larger than the FIFO but smaller than the total cache from being admitted, even if they could be placed directly into the main cache (e.g., during warmup or on a ghost hit).

  return req->obj_size <= cache->cache_size;

@1a1a11a
Copy link
Copy Markdown
Owner

1a1a11a commented Apr 29, 2026

Can you add some results?

@1a1a11a
Copy link
Copy Markdown
Owner

1a1a11a commented May 1, 2026

@YiyanZhai

@YiyanZhai
Copy link
Copy Markdown
Contributor Author

Can you add some results?

Do we want to add results to the repo or show results in this PR?

@haochengxia
Copy link
Copy Markdown
Collaborator

Can you add some results?

Do we want to add results to the repo or show results in this PR?

We may need to add some test results, for example, these existing ones.

params->n_obj_admit_to_fifo - obj->Clock2QPlus.insertion_time;
if (time_since_insertion >= params->corr_window_size) {
obj->Clock2QPlus.freq += 1;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Clock2Q+ work under ignore-obj-size only? If so, we may need to add a clarification in comments.

Copy link
Copy Markdown
Owner

@1a1a11a 1a1a11a May 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment added, thanks for pointing out

@1a1a11a
Copy link
Copy Markdown
Owner

1a1a11a commented May 2, 2026

Can you add some results?

Do we want to add results to the repo or show results in this PR?

Just add to PR

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 3, 2026

📝 Walkthrough

Walkthrough

A new cache eviction algorithm Clock2QPlus is introduced, consisting of a FIFO cache routing objects to a main cache based on frequency thresholds and correlation windows, with an optional ghost cache. The implementation includes metadata definitions, a 521-line core algorithm file, build integration, API registration, and test coverage.

Changes

Clock2QPlus Algorithm Addition

Layer / File(s) Summary
Data Shape
libCacheSim/include/libCacheSim/cacheObj.h
Added Clock2QPlus_obj_metadata_t typedef with insertion_time and freq fields; extended cache_obj_t union to include the new metadata type.
Core Implementation
libCacheSim/cache/eviction/Clock2QPlus.c
Implemented the Clock2QPlus algorithm (521 lines) with FIFO + main cache + optional ghost cache structure. Includes parameter parsing, initialization/free, find/insert/evict logic with correlation-window checks, and ghost-cache promotion rules.
Public API
libCacheSim/include/libCacheSim/evictionAlgo.h
Added Clock2QPlus_init(...) function declaration for the new eviction algorithm.
Build Integration
libCacheSim/cache/CMakeLists.txt
Added eviction/Clock2QPlus.c to eviction_sources_c list to include the implementation in the build.
Algorithm Registry
libCacheSim/bin/cachesim/cache_init.h
Registered "clock2qplus" in the simple_algos lookup table, wired to Clock2QPlus_init.
Testing Support
test/common.h, test/test_evictionAlgo.c
Added Clock2QPlus recognition in create_test_cache helper; inserted new truth-data entry and test_Clock2QPlus function into test suite; registered test case in main().

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A new clock ticks in cache's halls,
Two queues and ghosts where wisdom calls,
FIFO feeds the main parade,
While correlation windows fade,
Eviction flows where frequency reigns! 🕐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add Clock2QPlus eviction algorithm' directly and clearly describes the main change in the pull request, which is the addition of a new cache eviction algorithm called Clock2QPlus.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libCacheSim/cache/eviction/Clock2QPlus.c`:
- Around line 111-119: The code computes fifo_cache_size/main_cache_size and
sets corr_window_* using parsed params without validating numeric inputs or
checking insert return values; update parsing call sites (where strtod/atoi are
used) to validate conversion (check errno, endptr, and numeric ranges) and
clamp/return on invalid values, ensure fifo_cache_size and main_cache_size are
>= 1 and that ghost sizes and window bounds are within sensible ranges before
using them in Clock2QPlus.c (references: ccache_params, params->fifo_size_ratio,
params->ghost_size_ratio, params->corr_window_ratio,
params->corr_window_size_upper_bound/lower_bound), and add NULL checks for
allocation/insert results (e.g., verify new_obj returned from insert operations
before accessing its fields at the insert sites referenced around the new_obj
usages) with proper error handling/logging and early exit to avoid dereference.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5cfc7988-3194-4ca8-a668-fd112a419ed8

📥 Commits

Reviewing files that changed from the base of the PR and between f2f8b79 and 25f2d3d.

📒 Files selected for processing (7)
  • libCacheSim/bin/cachesim/cache_init.h
  • libCacheSim/cache/CMakeLists.txt
  • libCacheSim/cache/eviction/Clock2QPlus.c
  • libCacheSim/include/libCacheSim/cacheObj.h
  • libCacheSim/include/libCacheSim/evictionAlgo.h
  • test/common.h
  • test/test_evictionAlgo.c

Comment on lines +111 to +119
int64_t fifo_cache_size =
(int64_t)ccache_params.cache_size * params->fifo_size_ratio;
int64_t main_cache_size = ccache_params.cache_size - fifo_cache_size;
int64_t fifo_ghost_cache_size =
(int64_t)(ccache_params.cache_size * params->ghost_size_ratio);
params->corr_window_size_upper_bound = (int64_t)(fifo_cache_size * 0.5);
params->corr_window_size_lower_bound = (int64_t)(fifo_cache_size * 0.1);
params->corr_window_size =
(double)(fifo_cache_size * params->corr_window_ratio);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate parsed parameters before sizing sub-caches and using insert results.

At Line 499–Line 506, non-numeric or out-of-range values are accepted (strtod/atoi without end/range checks). That can produce invalid fifo_cache_size/main_cache_size at Line 111–Line 115 and lead to NULL dereference at Line 345 or Line 377 (new_obj->...) if insert fails.

Proposed hardening patch
 static void Clock2QPlus_parse_params(cache_t *cache,
                                      const char *cache_specific_params) {
@@
-    if (strcasecmp(key, "fifo-size-ratio") == 0) {
-      params->fifo_size_ratio = strtod(value, NULL);
-    } else if (strcasecmp(key, "ghost-size-ratio") == 0) {
-      params->ghost_size_ratio = strtod(value, NULL);
-    } else if (strcasecmp(key, "corr-window-ratio") == 0) {
-      params->corr_window_ratio = strtod(value, NULL);
-    } else if (strcasecmp(key, "move-to-main-threshold") == 0) {
-      params->move_to_main_threshold = atoi(value);
+    if (strcasecmp(key, "fifo-size-ratio") == 0) {
+      char *end = NULL;
+      double v = strtod(value, &end);
+      if (end == value || *end != '\0' || v <= 0.0 || v >= 1.0) {
+        ERROR("invalid fifo-size-ratio: %s\n", value);
+        exit(1);
+      }
+      params->fifo_size_ratio = v;
+    } else if (strcasecmp(key, "ghost-size-ratio") == 0) {
+      char *end = NULL;
+      double v = strtod(value, &end);
+      if (end == value || *end != '\0' || v < 0.0 || v > 1.0) {
+        ERROR("invalid ghost-size-ratio: %s\n", value);
+        exit(1);
+      }
+      params->ghost_size_ratio = v;
+    } else if (strcasecmp(key, "corr-window-ratio") == 0) {
+      char *end = NULL;
+      double v = strtod(value, &end);
+      if (end == value || *end != '\0' || v < 0.0 || v > 1.0) {
+        ERROR("invalid corr-window-ratio: %s\n", value);
+        exit(1);
+      }
+      params->corr_window_ratio = v;
+    } else if (strcasecmp(key, "move-to-main-threshold") == 0) {
+      char *end = NULL;
+      long v = strtol(value, &end, 10);
+      if (end == value || *end != '\0' || v < 0) {
+        ERROR("invalid move-to-main-threshold: %s\n", value);
+        exit(1);
+      }
+      params->move_to_main_threshold = (int)v;
@@
 }
   int64_t fifo_cache_size =
       (int64_t)ccache_params.cache_size * params->fifo_size_ratio;
   int64_t main_cache_size = ccache_params.cache_size - fifo_cache_size;
+  if (fifo_cache_size <= 0 || main_cache_size <= 0) {
+    ERROR("invalid cache split: fifo=%" PRId64 ", main=%" PRId64 "\n",
+          fifo_cache_size, main_cache_size);
+    exit(1);
+  }
       cache_obj_t *new_obj = main->insert(main, params->req_local);
+      if (new_obj == NULL) {
+        ERROR("failed to move object %" PRIu64 " into main cache\n",
+              params->req_local->obj_id);
+        exit(1);
+      }
       new_obj->freq = obj_to_evict->freq;
@@
       cache_obj_t *new_obj = main->insert(main, params->req_local);
+      if (new_obj == NULL) {
+        ERROR("failed to reinsert object %" PRIu64 " into main cache\n",
+              params->req_local->obj_id);
+        exit(1);
+      }
       // 1-bit counter
       new_obj->Clock2QPlus.freq = 0;

Also applies to: 345-347, 377-380, 475-517

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libCacheSim/cache/eviction/Clock2QPlus.c` around lines 111 - 119, The code
computes fifo_cache_size/main_cache_size and sets corr_window_* using parsed
params without validating numeric inputs or checking insert return values;
update parsing call sites (where strtod/atoi are used) to validate conversion
(check errno, endptr, and numeric ranges) and clamp/return on invalid values,
ensure fifo_cache_size and main_cache_size are >= 1 and that ghost sizes and
window bounds are within sensible ranges before using them in Clock2QPlus.c
(references: ccache_params, params->fifo_size_ratio, params->ghost_size_ratio,
params->corr_window_ratio, params->corr_window_size_upper_bound/lower_bound),
and add NULL checks for allocation/insert results (e.g., verify new_obj returned
from insert operations before accessing its fields at the insert sites
referenced around the new_obj usages) with proper error handling/logging and
early exit to avoid dereference.

@YiyanZhai
Copy link
Copy Markdown
Contributor Author

YiyanZhai commented May 4, 2026

Can you add some results?

Do we want to add results to the repo or show results in this PR?

Just add to PR

Got it, I updated the PR message

@1a1a11a 1a1a11a merged commit da022c2 into 1a1a11a:develop May 4, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants