Skip to content

HBASE-29669 Implement basic row cache#7901

Open
EungsopYoo wants to merge 10 commits intoapache:HBASE-29585from
EungsopYoo:HBASE-29669
Open

HBASE-29669 Implement basic row cache#7901
EungsopYoo wants to merge 10 commits intoapache:HBASE-29585from
EungsopYoo:HBASE-29669

Conversation

@EungsopYoo
Copy link
Copy Markdown
Contributor

No description provided.

@EungsopYoo
Copy link
Copy Markdown
Contributor Author

EungsopYoo commented Mar 11, 2026

@wchevreuil
I have completed the implementation of the basic row cache. Please review it.

By the way, it seems like CI is not working, do you know why?

@liuxiaocs7
Copy link
Copy Markdown
Member

Hi, @EungsopYoo

CI has been migrated from Jenkins to GitHub Actions, and the pre-commit GitHub Jenkins job should have been disabled, more details: https://lists.apache.org/thread/z6o5hhsd9goh5j7fcl4bnwzsktlwlwl4

Some previously created feature branches also have this issue; perhaps we could give it a try, such as: https://lists.apache.org/thread/hrl7dktkrb8zwwbv62my68lvtbplv7sr

Thanks!

Comment on lines +258 to +264

void removeTableLevelBarrier(HRegion region) {
regionLevelBarrierMap.computeIfPresent(region, (k, counter) -> {
int remaining = counter.decrementAndGet();
return (remaining <= 0) ? null : counter;
});
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method should be renamed by removeRegionLevelBarrier?

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, renamed to removeRegionLevelBarrier. (95de81b)

Comment on lines +960 to +963
// For testing only
void setRowCache(RowCache rowCache) {
this.rowCache = rowCache;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testing only, we could use @RestrictedApi annotation?

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added @RestrictedApi annotation. (267f4ef)

private BlockCache l2Cache = null;
private MobFileCache mobFileCache;
private CacheStats cacheStats;
private final RowCache rowCache;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better move to L71 under MobFileCache?

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved rowCache field next to MobFileCache. (782dd75)

Comment on lines +103 to +104
RSRpcServices rsRpcServices = this.regionServer.getRSRpcServices();
this.rowCache = rsRpcServices == null ? null : rsRpcServices.getServer().getRowCache();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to use a method initRowCache() here, just like blockcache and mobfilecache does?

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, extracted initRowCache() method. (782dd75)

Comment on lines +2362 to +2363
// The row cache for the region has been enabled again
rowCache.removeTableLevelBarrier(region);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto region level

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, renamed to removeRegionLevelBarrier. (95de81b)

Comment on lines +74 to +78
@Category({ RegionServerTests.class, MediumTests.class })
public class TestRowCache {
@ClassRule
public static final HBaseClassTestRule CLASS_RULE =
HBaseClassTestRule.forClass(TestRowCache.class);
Copy link
Copy Markdown
Member

@liuxiaocs7 liuxiaocs7 Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For new tests, it is best to use JUnit5 directly, since existing unit tests are currently being migrated to JUnit5.

Copy link
Copy Markdown
Contributor Author

@EungsopYoo EungsopYoo Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, migrated TestRowCache to JUnit5. (1fc9d5c)

@EungsopYoo
Copy link
Copy Markdown
Contributor Author

@liuxiaocs7
Sorry, I just noticed your comments. I’ll take a look and follow up early next week.

// TODO: implement row cache
rowCacheStrategy = null;
// Currently we only support TinyLfu implementation
rowCacheStrategy = new TinyLfuRowCacheStrategy(MemorySizeUtil.getRowCacheSize(conf));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be made pluggable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Introduced row.cache.strategy.class config key so a custom RowCacheStrategy can be plugged in via configuration; default remains TinyLfuRowCacheStrategy. This follows the same convention used by MemStore and RegionSplitPolicy (Configuration-arg constructor + ReflectionUtils.newInstance).

Commit: 6739f06

} catch (CloneNotSupportedException ignored) {
// Not able to cache row cells, ignore
}
void populateCache(HRegion region, List<Cell> results, RowCacheKey key) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just call this method "cache".

Do we really need the regionLevelBarrierMap key type to be HRegion, or could it be just the String encoded region name? That way, we don't need an extra HRegion parameter in this method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

  • Renamed populateCache to cache.
  • Changed regionLevelBarrierMap key type from HRegion to the encoded region name (String). The encoded name is the canonical region identifier already used elsewhere (e.g., RowCacheKey.isSameRegion). As a result, cache no longer needs an HRegion parameter; it derives the encoded region name from the RowCacheKey.

I kept the external signatures of create/remove/getRegionLevelBarrier taking HRegion to keep the caller's intent explicit; only the internal map key type changes. Let me know if you'd prefer those to take String as well.

Commit: e985135

Comment on lines +3476 to +3483
// Put with TTL is not allowed on tables with row cache enabled, because cached rows cannot
// track TTL expiration
if (isRowCacheEnabled) {
if (put.getTTL() != Long.MAX_VALUE) {
throw new DoNotRetryIOException(
"Tables with row cache enabled do not allow setting TTL on Puts");
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we simply implement TTL expiration within the RowCache.getRow logic?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — turned out to be feasible. Done.

Quick verification: I was initially worried that Get results don't carry TTL tags (the design doc assumed this), but that's actually a client-side property — the RPC codec strips tags. Server-side cells delivered to RowCache preserve their TTL tag (carried forward by TagUtil.carryForwardTTLTag during mutation, the same tag ScanQueryMatcher.isCellTTLExpired reads). Confirmed with a probe test against a live region.

Implementation:

  • RowCells now precomputes the earliest TTL expiration time across its cells during construction. Cells without a TTL tag yield Long.MAX_VALUE, so RowCells.isExpired(now) is an O(1) long comparison on every cache hit — no per-cell tag iteration on the hot path.
  • RowCache.tryGetFromCache: if the row is expired, evict and fall back to the storage read path.
  • RowCache.cache: skip caching when results are empty (otherwise an evicted-then-empty row would be re-cached as an empty entry).
  • HRegion.put: removed the guard that rejected Puts with TTL.

CF-level TTL still disables the row cache via canCacheRow's isDefaultTtl check; that policy is unchanged for now.

Commit: d8fef38

Introduce row.cache.strategy.class configuration key to allow
operators to plug in custom RowCacheStrategy implementations.
The default remains TinyLfuRowCacheStrategy.

RowCacheStrategy implementations must now provide a public
constructor that takes a Configuration argument, following the
same convention used by MemStore and RegionSplitPolicy.
Address review feedback:
- Rename RowCache.populateCache to cache.
- Change regionLevelBarrierMap key type from HRegion to the encoded
  region name (String). The encoded name is the canonical region
  identifier already used elsewhere (e.g., RowCacheKey.isSameRegion).
- The cache method no longer needs an HRegion parameter; it derives
  the encoded region name from the RowCacheKey.

The external signatures of create/remove/getRegionLevelBarrier still
take HRegion to make the caller's intent explicit; only the internal
map key type changes.
…n hit

Address review feedback: instead of rejecting Puts with TTL on row
cache-enabled tables, check TTL expiration when serving cache hits.

- RowCells: precompute the earliest TTL expiration time across the
  contained cells during construction, exposing isExpired(now) for an
  O(1) check on each cache hit. Cells without a TTL tag yield
  Long.MAX_VALUE so the check short-circuits.
- RowCache.tryGetFromCache: if the cached row is expired, evict it
  and fall back to the storage read path.
- RowCache.cache: skip caching when results are empty.
- HRegion.put: remove the guard that rejected Puts with TTL on row
  cache-enabled tables.

Server-side cells preserve their TTL tag (carried forward by
TagUtil.carryForwardTTLTag during mutation), so the same expiration
check used by ScanQueryMatcher can be applied at the cache layer.

CF-level TTL still disables the row cache via canCacheRow's
isDefaultTtl check; that policy is unchanged.
The CI workflows invoke scripts under dev-support/ that were missing
or outdated on this branch, causing GHA runs to fail at startup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants