
MDEV-39325 Excessive purge-related failed index page merge attempts#4975

Draft
iMineLink wants to merge 6 commits into MariaDB:11.8 from iMineLink:MDEV-39325

Conversation

@iMineLink
Contributor

Draft based on #4909, opened to get some CI coverage and allow initial review.

Optimize purge-related index page merge attempts by deferring them to the end of the batch.

Add 4 debug-only counters:

1. Innodb_btr_cur_n_index_lock_upgrades
2. Innodb_btr_cur_pessimistic_update_calls
3. Innodb_btr_cur_pessimistic_update_optim_err_underflows
4. Innodb_btr_cur_pessimistic_update_optim_err_overflows

to track pessimistic update fallbacks and monitor how often the
exclusive index lock is acquired, exposed via SHOW GLOBAL
STATUS in debug builds.

Add a test that inserts and updates 1000 rows in a 4K-page table,
verifying the resulting counter values.

Issue:
In btr_cur_optimistic_update(), freshly split pages can trigger
DB_UNDERFLOW because the new page size (after the delete+insert) falls
below the BTR_CUR_PAGE_COMPRESS_LIMIT(index) target, even when the
record itself is growing.

Fix:
In btr_cur_optimistic_update(), avoid this behavior by gating the
DB_UNDERFLOW error condition behind a record-shrinkage check.

Nothing is changed if the record is not shrinking.
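The gating described above can be sketched as follows; this is a minimal illustration with simplified stand-in names and signature (`db_err`, `underflow_check`, and the plain size parameters are assumptions), not the actual InnoDB code:

```cpp
#include <cstddef>

// Stand-in for the relevant dberr_t values (names assumed).
enum class db_err { SUCCESS, UNDERFLOW };

// Report DB_UNDERFLOW (scheduling a merge/compress attempt) only when
// the updated record is actually shrinking. `page_data_size` is the
// page payload after the delete+insert; `compress_limit` stands in for
// BTR_CUR_PAGE_COMPRESS_LIMIT(index).
db_err underflow_check(std::size_t old_rec_size, std::size_t new_rec_size,
                       std::size_t page_data_size, std::size_t compress_limit)
{
  const bool record_shrinking = new_rec_size < old_rec_size;
  if (record_shrinking && page_data_size < compress_limit)
    return db_err::UNDERFLOW;  // page became underfull due to this update
  return db_err::SUCCESS;      // growing/same-size record: no underflow
}
```

With this gate, a freshly split page updated with a growing record no longer reports an underflow merely because its size sits below the compress limit.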

The counters in the index_lock_upgrade.result file are updated
accordingly: the count of DB_UNDERFLOW optimistic-update errors during
the test drops to 0, and the number of index lock upgrades is reduced
accordingly.

Issue:
In btr_cur_optimistic_update(), the DB_OVERFLOW error condition is
returned when BTR_CUR_PAGE_REORGANIZE_LIMIT is not met, to prevent CPU
thrashing due to excessive reorganization attempts on an almost-full
page.
In btr_cur_pessimistic_update(), however, many of these errors were not
followed by a page split attempt, so repeated pessimistic fallbacks
occurred for the same page, which has less than
BTR_CUR_PAGE_REORGANIZE_LIMIT free space.

Fix:
In btr_cur_pessimistic_update(), if the optimistic update error is
DB_OVERFLOW, the page is uncompressed, and
BTR_CUR_PAGE_REORGANIZE_LIMIT is not met, skip
btr_cur_insert_if_possible() and fall through to attempt a page split.
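The decision can be sketched as a predicate; the function name and plain parameters below are stand-ins for illustration, not the actual InnoDB code:

```cpp
#include <cstddef>

// Hypothetical sketch of the new check in btr_cur_pessimistic_update():
// when the optimistic attempt failed with DB_OVERFLOW on an uncompressed
// page that lacks BTR_CUR_PAGE_REORGANIZE_LIMIT free space, skip the
// in-page insert retry (btr_cur_insert_if_possible) and go straight to
// the page split. `reorganize_limit` stands in for the limit macro.
bool should_skip_insert_retry(bool optim_err_overflow,
                              bool page_is_compressed,
                              std::size_t page_free_space,
                              std::size_t reorganize_limit)
{
  return optim_err_overflow && !page_is_compressed &&
         page_free_space < reorganize_limit;
}
```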

In the index_lock_upgrade.result file, the counters are updated
accordingly: one extra page split is recorded, and both the number of
reorganization attempts and the number of index lock upgrades are
reduced.

Extend debug-only monitoring to pessimistic inserts and deletes,
adding five new counters:

- INNODB_BTR_CUR_PESSIMISTIC_INSERT_CALLS
- INNODB_BTR_CUR_PESSIMISTIC_DELETE_CALLS
- INNODB_MTR_N_INDEX_S_LOCK_CALLS
- INNODB_MTR_N_INDEX_X_LOCK_CALLS
- INNODB_MTR_N_INDEX_SX_LOCK_CALLS

Add two include files to encapsulate the monitoring operations and
reduce code duplication in index_lock_upgrade.test.

In index_lock_upgrade.test, add a second table t2 with a secondary
index on a datetime column, along with same-size and decreasing-size
UPDATEs. Also add DELETEs in both dense and scattered patterns to
monitor the behavior of purge.

Unit test range_set.

Before these changes:

1. [64,281] + [281,282] produced [64,281] instead of [64,282]
2. [10,20],[30,40] would not contain(15)
3. [10,20],[30,40] would not contain(20)
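The fixed semantics implied by the three cases above (adjacent ranges coalesce, and both endpoints are inclusive for contains()) can be illustrated with a minimal self-contained range set; this is a stand-in sketch, not InnoDB's range_set implementation:

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>

// Illustrative range set: ranges stored as inclusive [start, end] pairs,
// overlapping or touching ranges coalesced on insert.
struct range_set_demo {
  std::map<uint32_t, uint32_t> m;  // start -> end, inclusive

  void add(uint32_t s, uint32_t e) {
    auto it = m.lower_bound(s);
    // Merge with a preceding range that overlaps or touches [s, e].
    if (it != m.begin()) {
      auto prev = std::prev(it);
      if (prev->second + 1 >= s) {
        s = prev->first;
        e = std::max(e, prev->second);
        m.erase(prev);
      }
    }
    // Absorb all following ranges that overlap or touch [s, e].
    while (it != m.end() && it->first <= e + 1) {
      e = std::max(e, it->second);
      it = m.erase(it);
    }
    m[s] = e;
  }

  bool contains(uint32_t v) const {
    auto it = m.upper_bound(v);
    if (it == m.begin()) return false;
    auto p = std::prev(it);
    return v >= p->first && v <= p->second;  // both endpoints inclusive
  }
};
```

Under these semantics, adding [64,281] and then [281,282] yields the single range [64,282], and a set holding [10,20],[30,40] reports contains(15) and contains(20) as true.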
Defer B-tree index merge operations arising during a purge batch to
the end of the batch.
This should avoid repeated failed merge attempts, decreasing the number
of pessimistic delete fallbacks.

Optimistic delete during purge is decorated with a new
BTR_PURGE_DELETE_FLAG which allows it to proceed in more cases even if
the page would become underfull.
Pages are marked for deferred processing if they need compression after
any successful delete attempt, and removed from the deferred-processing
set if a pessimistic delete was successful and the page no longer
requires compression.

A std::set is used to track the std::pair<dict_index_t*, page_id_t>
entries required for deferred processing.
This allows O(log(N)) deduplication and avoids the need for a hash
function, which std::unordered_set would require.
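The tracking structure can be sketched as follows, with stand-in types for dict_index_t* and page_id_t (the real types are InnoDB's; everything here is an illustrative assumption). std::set orders std::pair lexicographically out of the box, which is what makes the hash-free O(log N) deduplication work:

```cpp
#include <cstdint>
#include <set>
#include <utility>

struct fake_index {};       // stand-in for dict_index_t
using page_id = uint64_t;   // stand-in for page_id_t

// One entry per (index, page) pair awaiting a deferred merge attempt.
using deferred_set = std::set<std::pair<fake_index*, page_id>>;

// Mark a page for deferred processing; returns true if newly inserted,
// false if the pair was already tracked (deduplicated for free).
bool mark_deferred(deferred_set& s, fake_index* idx, page_id id) {
  return s.insert({idx, id}).second;
}
```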

Deferred processing is executed after the purge batch is handled,
before closing the node.
Each deferred page is processed in two steps:

1. The page is peeked to obtain a key that enables traversal from the root
2. If the page is valid, the index is X-locked and traversed to reach the
   leaf page, which is compressed if possible

After processing, the tracking set is cleared.
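The drain loop can be sketched generically as below; every name here (the stub index, peek_page, drain_deferred) is a hypothetical stand-in, since the real steps involve InnoDB latching and B-tree traversal that cannot be reproduced self-contained:

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

using page_id = uint64_t;
struct index_stub { std::set<page_id> live_pages; };  // stand-in index

// Step 1 stand-in: "peek" the page - check it still exists so a search
// key could be obtained for traversal from the root.
bool peek_page(const index_stub& idx, page_id id) {
  return idx.live_pages.count(id) != 0;
}

// Drain the deferred set after the purge batch: process each page that
// is still valid, then clear the tracking set. Returns the number of
// pages processed. Step 2 (X-lock the index, traverse to the leaf,
// compress if possible) is elided.
std::size_t drain_deferred(index_stub& idx, std::set<page_id>& deferred) {
  std::size_t processed = 0;
  for (page_id id : deferred)
    if (peek_page(idx, id))
      ++processed;
  deferred.clear();
  return processed;
}
```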
