Fix #2698: Prevent unnecessary cache purges for hierarchical pages#7613
Fix #2698: Prevent unnecessary cache purges for hierarchical pages#7613Jeroen-1978 wants to merge 19 commits intowp-media:developfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a critical caching bug by replacing parent page cache purging with child page cache purging. The change prevents unintentional deletion of child page caches and ensures proper database synchronization when page hierarchies are modified.
- Replaces parent page purging with explicit child page purging to prevent unintended cache deletion
- Adds a new helper function to recursively find all descendant pages
- Updates slug change handling to purge all affected child pages and maintain database consistency
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
This pull request is ment to solve issue wp-media#2698 This pull request fixes a critical bug where purging a parent page's cache would unintentionally clear the cache for all of its descendant pages. It also ensures the _wpr_rocket_cache database table is correctly and fully updated, especially when page slugs are modified within a hierarchy. Problem The previous implementation purged parent pages when a post was updated. This led to several issues: 1. Unintentional Cache Deletion: Purging a parent URL (e.g., /parent/) caused rocket_clean_files to delete the entire directory, which unintentionally wiped out the cache for all child pages (e.g., /parent/child/). 2. Database Inconsistency on Update: Because child pages were deleted indirectly via directory removal, they were not explicitly listed in the purge process. This meant their entries in the _wpr_rocket_cache table were not updated, leading to a mismatch between the filesystem cache and the database. 3. Database Inconsistency on Slug Change: When a slug was changed for a page in the middle of a page tree, only the cache for that single post was purged. The URLs of all its child pages also change as a result, but their old cached versions were not being purged, and their entries in the _wpr_rocket_cache table were not updated to reflect the new URL structure. Solution The logic has been inverted: instead of purging parent pages, the process now explicitly targets all affected child pages. 1. Added get_all_descendant() function: A new helper function has been added to recursively retrieve all descendant page IDs (children, grandchildren, etc.) for a given post. 2. Modified rocket_get_purge_urls() and rocket_clean_post_cache_on_slug_change(): Both functions have been updated to use the new get_all_descendant() helper. Instead of adding parent URLs to the purge list, they now build a precise list of all individual child pages that need to be purged. This change ensures that only the intended cache files are removed and that each purged URL is explicitly processed, guaranteeing that the _wpr_rocket_cache table is always kept in sync with the actual state of the cache.
|
@DahmaniAdame Can you have a look to maybe include this to a future release? |
|
Hello @Jeroen-1978, and thank you for this PR. I tested the changes and was able to reproduce the original issue using the same reproduction steps described in issue #2698. I tested both against the latest develop branch and the PR as-is, and observed the same behavior in both cases: when updating the related page that shares the same slug, all posts were cleared. Based on these results, I have moved the main ticket back to To Do. |
Add permalink structure check for cache deletion logic.
Removed handling of query parameters in file entry, this was added by mistake.
|
Hi @hanna-meda, thank you for having a look at the PR. You are right, the PR didn't work as it should. I will make the necessary changes and let you know when I'm done. |
Add logic to handle slug changes for child pages.
Added a function to count path segments in a URL and updated cache file deletion logic to handle directories and files more effectively.
|
Hi @hanna-meda , I’ve made several changes to ensure this PR resolves #2698 . The core fix is that we now only delete cached files for a page when the cache folder contains subfolders (i.e. when there are cached subpages). This prevents removing cache entries unnecessarily while still correctly handling hierarchical page trees. I considered adding extra logic to analyze custom slugs, but that’s only one specific case where a URL segment can be “predefined”. The same kind of edge case can occur with any custom post type or custom rewrite structure, so I focused the fix on the general folder-structure behavior rather than one slug scenario. There’s also a reference to #2549 , but I couldn’t reproduce that issue on the current version, so I didn’t include a targeted fix for it in this PR. Additionally, the behavior described in #2698 (comment) is addressed here as well: when a page in the middle of a hierarchy changes, we should not purge all parent/top-level pages. With these changes, when a page slug changes in the middle of a hierarchy, only the affected child pages are cleared, and cache for child pages under the new URL is rebuilt as expected. Let me know if you’d like me to also add a separate follow-up patch that tries to detect “custom slug” patterns more explicitly, but I’d prefer to keep that as a distinct enhancement since it goes beyond the general case fixed here. |
| $purge_urls = []; | ||
| $purge_urls[] = get_the_permalink( $post_id ); | ||
|
|
||
| // Clear cache for all child pages. | ||
| $children = rocket_get_all_descendant( $post_id ); | ||
| if ( (bool) $children ) { | ||
| foreach ( $children as $child_id ) { | ||
| $purge_urls[] = get_the_permalink( $child_id ); | ||
| } | ||
| } | ||
|
|
||
| rocket_clean_files( $purge_urls ); |
There was a problem hiding this comment.
rocket_clean_post_cache_on_slug_change() now purges an array of URLs (parent + all descendants) instead of a single permalink, and it introduces new behavior (child pages being cleared when a parent slug changes), but existing unit/integration tests for this function currently only assert a single call to rocket_clean_files() with one URL and don’t verify that child URLs are included. Please update those tests to expect the new array-based call signature and add assertions/fixtures that cover the descendant purge behavior so this logic is protected against regressions.
| // Check whether the directory contains subfolders (vs only files). | ||
| $has_subdirs = false; | ||
| try { | ||
| foreach ( new FilesystemIterator( $entry, FilesystemIterator::SKIP_DOTS ) as $child ) { | ||
| if ( $child->isDir() ) { | ||
| $has_subdirs = true; | ||
| break; | ||
| } | ||
| } | ||
| } catch ( Exception $e ) { | ||
| // If we can't inspect the directory, be conservative and only delete files. | ||
| $has_subdirs = true; | ||
| } | ||
|
|
||
| if ( $filesystem->is_dir( $entry ) ) { | ||
| rocket_rrmdir( $entry, [], $filesystem ); | ||
| } else { | ||
| $filesystem->delete( $entry ); | ||
| if ( ! $has_subdirs ) { | ||
| // Directory contains only files: remove it entirely. | ||
| rocket_rrmdir( $entry, [], $filesystem ); | ||
| continue; | ||
| } | ||
|
|
||
| // Directory contains subfolders: delete only the files in the top-level. | ||
| foreach ( _rocket_get_dir_files_by_regex( $entry, '#.+#' ) as $child ) { | ||
| if ( $child->isFile() ) { |
There was a problem hiding this comment.
This block changes rocket_clean_files() semantics in a non-trivial way: instead of always calling rocket_rrmdir() on matching directories, it now conditionally deletes entire directories only when they have no subdirectories and otherwise removes only top-level files. Given how widely rocket_clean_files() is used (including deprecated helpers like rocket_clean_directory_for_default_language_on_wpml()), please extend the existing unit/integration tests for rocket_clean_files() to cover these new branches, including cases where a cache directory contains nested subfolders that must be preserved and verifying that only the intended files are removed.
| // Sort: most segments first (deepest URLs first). | ||
| usort( | ||
| $urls, | ||
| static function ( $a, $b ) { | ||
| $da = rocket_count_path_segments( (string) $a ); | ||
| $db = rocket_count_path_segments( (string) $b ); | ||
|
|
||
| // First by depth (descending) | ||
| if ( $da !== $db ) { | ||
| return $db <=> $da; | ||
| } | ||
|
|
||
| // If depth is equal, alphabetically | ||
| return strcmp( (string) $a, (string) $b ); | ||
| } | ||
| ); |
There was a problem hiding this comment.
Introducing the usort() here changes the order of $urls before firing the before_rocket_clean_files and before_rocket_clean_file actions, and it also means deeper URLs are processed first. Existing tests and 3rd-party code may implicitly rely on the previous iteration order, so it would be good to (1) update the unit tests for rocket_clean_files() to assert the new ordering semantics where relevant and (2) document in tests/fixtures that purge order is depth-first (deepest paths first) to guard against accidental reordering in future refactors.
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Add functionality to purge URLs for child pages when the post slug changes.
|
Hi @hanna-meda , while testing i found there was an error in this PR that now is corrected. Please let me know if you have any questions. |
Corrected comments for clarity in sorting function.
Description
This pull request is ment to solve issue #2698
This pull request fixes a critical bug where purging a parent page's cache would unintentionally clear the cache for all of its descendant pages. It also ensures the _wpr_rocket_cache database table is correctly and fully updated, especially when page slugs are modified within a hierarchy.
Type of change
Detailed scenario
What was tested
Describe the scenarios that you tested, and specify if it is automated or manual. For manual scenarios, provide a screenshot of the results.
How to test
Describe how the PR can be tested so that the validator can be autonomous: environment, dependencies, specific setup, steps to perform, API requests, etc.
Technical description
Documentation
Problem
The previous implementation purged parent pages when a post was updated. This led to several issues:
Solution
The logic has been inverted: instead of purging parent pages, the process now explicitly targets all affected child pages.
This change ensures that only the intended cache files are removed and that each purged URL is explicitly processed, guaranteeing that the _wpr_rocket_cache table is always kept in sync with the actual state of the cache.
New dependencies
List any new dependencies that are required for this change.
Risks
List possible performance & security issues or risks, and explain how they have been mitigated.
Mandatory Checklist
Code validation
Code style
Unticked items justification
If some mandatory items are not relevant, explain why in this section.
Additional Checks