fix(trashbin): avoid full filecache scan when restoring versions (#31682)#41641
fix(trashbin): avoid full filecache scan when restoring versions (#31682)#41641DeepDiver1975 wants to merge 2 commits into
Conversation
`Trashbin::getVersionsFromTrash()` looked up a deleted file's versions with
`View::searchRaw('<name>.v%[.d<timestamp>]')`, which reaches
`Cache::search()` as `WHERE storage = ? AND name ILIKE ?`. The leading
literal + trailing `%` is non-sargable, and on MySQL the `ILIKE`->`COLLATE`
rewrite defeats any `name` index entirely, so every version restore — and
every `ExpireTrash`/`ExpireVersions` cron run that restores versions — full
scans `oc_filecache`. On large installations this is a heavy, repeated load
spike (issue #31682).
The version files for a given deleted file all live in a single known
directory under `files_trashbin/versions` (the folder the file was in, or
the versions root). Replace the whole-cache name search with an index-backed
`getDirectoryContent()` of that one directory (uses the existing
`(parent, name)` index) and filter the `<name>.v<version>[.d<timestamp>]`
entries in PHP. The directory to list is now passed in from both call sites
so version files of files deleted from inside a folder are still found.
Behavior is preserved: the same versions are returned, in the same shape.
Literal PHP matching is also slightly stricter than the previous SQL `LIKE`
(filenames containing `_`/`%` are no longer treated as wildcards).
Adds a regression test covering both the timestamped root lookup and the
non-timestamped sub-folder lookup, and asserting that version files of other
files / other timestamps are ignored.
Fixes #31682
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Thomas Müller <1005065+DeepDiver1975@users.noreply.github.com>
|
Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes. |
DeepDiver1975
left a comment
There was a problem hiding this comment.
🤖 Automated code review by Claude Code review agent
Overview
This PR fixes a real and well-diagnosed performance bug (#31682): Trashbin::getVersionsFromTrash() ran a non-sargable name ILIKE '<file>.v%[.d<ts>]' search that, combined with the MySQL ILIKE→COLLATE rewrite, full-scanned oc_filecache on every version restore / trash+version cron run. Replacing it with an index-backed getDirectoryContent() of the single known versions directory + PHP-side filtering is the right, appropriately-scoped approach (no schema change, no change to the generic Cache::search()).
The core fix looks correct, including the subtle part: passing $dir from both call sites so version files of files deleted from inside a sub-folder are still found (the old whole-cache search matched those by name regardless of location). 👍
🔴 Blocking — the new test will fail CI
In testGetVersionsFromTrash(), the first assertion is:
$versions = $method->invoke(null, 'test.txt', $timestamp, $this->user);
\sort($versions);
$this->assertEquals(['1', '10', '2'], $versions);sort() defaults to SORT_REGULAR, which sorts the numeric strings '1','2','10' numerically → ['1', '2', '10']. But the test asserts ['1', '10', '2']. assertEquals compares sequential arrays element-by-element (no canonicalization by default), so this fails:
mismatch at [1]: expected '10', got '2'
mismatch at [2]: expected '2', got '10'
Fix — make the expectation match numeric sort order:
$this->assertEquals(['1', '2', '10'], $versions);(The second assertion, ['5', '6'], is fine.) An even more robust option is sort($versions, SORT_STRING) paired with a string-sorted expectation, or assertEqualsCanonicalizing() — but the minimal correct change is just reordering the expected array. Since the suite is @group DB it can't be run in the agent's sandbox, which is exactly why this slipped through; worth running locally/CI before merge.
Minor / non-blocking
$prefixmatching is a substring-prefix, not a segment boundary.\strpos($name, $filename . '.v') !== 0correctly anchors at the start, and the'.backup.v9'ignore-case in the test confirms a longer name isn't matched. Good. One residual edge: a sibling file literally named<filename>.vNN...where<filename>is a prefix of another real filename is already handled because you match the full$filename+.v. No action needed — just noting it was checked.- Comment/style: matches surrounding code (tabs,
\-prefixed globals). Good. - Behavior note in the PR description is accurate: literal PHP matching is stricter than SQL
LIKEfor filenames containing_/%; that's a latent correctness improvement, not a regression.
Test coverage
Good instinct adding both the timestamped-root and non-timestamped-subfolder cases plus the "ignore other files/timestamps" entries — that subfolder case is precisely the regression risk of this change. Once the assertion above is corrected, coverage is solid. Consider also asserting the empty-result case (lookup for a filename with no versions) since getDirectoryContent on a non-existent dir returns [].
Verdict
Approach and core logic are correct; one blocking test bug (assertEquals ordering) must be fixed before this can pass CI. Everything else is minor.
The first assertion in testGetVersionsFromTrash() expected ['1', '10', '2'], but sort() uses SORT_REGULAR and sorts the numeric version strings numerically -> ['1', '2', '10'], so the test failed. Correct the expected order and add a case asserting an empty result for a file that has no stored versions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Thomas Müller <1005065+DeepDiver1975@users.noreply.github.com>
Summary
Trashbin::getVersionsFromTrash()looks up a deleted file's versions withView::searchRaw('<name>.v%[.d<timestamp>]'), which reachesCache::search()as:The pattern is a leading literal followed by a trailing
%, which is already non-sargable; on MySQL/MariaDB theILIKE→COLLATE ..._general_ci LIKErewrite (AdapterMySQL) additionally defeats any index onname. As a result every version restore — and everyExpireTrash/ExpireVersionscron run that restores versions — performs a full table scan ofoc_filecache. On installations with a large filecache this is a heavy, repeated DB load spike. (Fixes #31682.)Change
The version files for a given deleted file all live in a single, known directory under
files_trashbin/versions(the folder the file was in, or the versions root). This PR replaces the whole-cache name search with an index-backed directory listing of that one directory (View::getDirectoryContent(), which uses the existing(parent, name)index) and filters the<name>.v<version>[.d<timestamp>]entries in PHP.The directory to list is passed in from both call sites, so version files of files deleted from inside a folder are still found (the previous whole-cache search matched those by name regardless of location — the directory listing must look in the right sub-folder, not just the versions root).
Scope
db_structure.xmlchange.Cache::search()/AdapterMySQLILIKEbehavior — the fix is contained to the trashbin version-restore hot path.Behavior / compatibility
LIKE: filenames containing_or%are no longer treated as wildcards (a latent edge-case bug in the old code), so if anything the result set is more correct, never broader.Tests
Adds
StorageTest::testGetVersionsFromTrash()covering:🤖 This PR was prepared by the Claude Code review agent from the analysis of #31682. Please review carefully before merging.