Skip to content

Conversation

@jasonvarga
Copy link
Member

@jasonvarga jasonvarga commented Jun 13, 2024

At the moment when you enable static caching and hit a 404-ing page, it does not get cached. If that 404-ing page is heavy, you could visit it repeatedly and increase the load on the server.

This PR has two parts. Both are only supported when using half measure, for now.

Allow caching 404s

This PR allows 404s to be statically cached when using half measure. This is automatic and doesn't require a setting. This likely didn't already exist because it's not possible when using full measure. We handle this by a simple check for whether half measure is being used.

Sharing errors

Additionally, this PR allows those 404s to be shared. For instance, the first time you encounter a 404, it will be rendered and cached. Then if you visit another 404ing page, it will serve the previously cached version regardless of the URL.

Each URL will have their own cache, but the rendering will be avoided. Subsequent 404s will just get copied from the result of the first one.

Since we can't know in the middleware if a cold page will be a 404 or not, we will manually retrieve it from the static cache just before Statamic would normally render the 404's view.

@jasonvarga jasonvarga marked this pull request as ready for review June 17, 2024 19:45
@ChrisWorkhouse
Copy link

Forgive me if I've misunderstood this, but would this change not allow a nefarious actor to disable Statamic sites and their servers?

For example: if share_errors is set to false (the default behaviour), then Statamic would create a new static cache file for each new 404 page visited. So an attacker could write a script to visit a site with a randomly generated URL - which would always result in a 404. And if persistent enough, or with enough traffic doing this simultaneously then eventually a server may run out of available disk space, essentially resulting in the server going offline, a DOS attack of sorts if you will.

Similarly to what this recent change is also trying to address: #10701 only I think the issue is larger than simple bloat but could be used maliciously.

@duncanmcclean
Copy link
Member

Statamic would create a new static cache file for each new 404 page visited.

This isn't unique to 404 pages, you could visit any pages with query parameters and it'd cache each vartiation separately (provided the developer hasn't enabled ignore_query_strings).

Even without static caching, sending thousands of requests to a small website at the same time is likely to cause a DOS in other ways (like CPU, memory usage).

@ChrisWorkhouse
Copy link

ChrisWorkhouse commented Sep 16, 2024

@duncanmcclean I totally agree, and there is a change in the latest Statamic version to attempt to address the query strings aspect of this (I referenced in my comment).

And your comment on more traditional DOS attacks is also valid. Also an attacker could more closely mimic what I've described just in access / error log entries too.

But to have this enabled by default seems like an odd step when trying to limit the surface area for abuse is always a positive thing.

@duncanmcclean
Copy link
Member

duncanmcclean commented Sep 16, 2024

The share_errors config option is currently disabled by default since enabling it could cause an unintended behaviour for existing sites.

If the setting hasn't been added to a site's config file, it'll fall back to what we have in the cms config file. Although, we could enable it by default for new sites, feel free to open a pull request on the statamic/statamic repository.

@ChrisWorkhouse
Copy link

@duncanmcclean Thanks for your prompt replies on this.

I think I've perhaps misunderstood the purpose of the share_errors setting anyway. Having re-read the above and done some more digging:

Each URL will have their own cache, but the rendering will be avoided. Subsequent 404s will just get copied from the result of the first one.

So the static cache engine will add a new cache file for each 404 regardless of the share_errors setting anyway - and the setting only affects whether the 404 is re-rendered each time.

Thanks for your help.

@duncanmcclean
Copy link
Member

So the static cache engine will add a new cache file for each 404 regardless of the share_errors setting anyway - and the setting only affects whether the 404 is re-rendered each time.

Correct.

With share_errors enabled, it'll only cache the 404 page once, then all subsequent 404s will be served the already-cached 404 page.

@stuartcusackie
Copy link

stuartcusackie commented Oct 29, 2024

There still seems to be a problem here. Could a malicious actor write a simple loop to constantly request random URLs, forcing the 404 template to be regenerated over-and-over and crashing the server? This can happen naturally too with old domains - there could be tens of thousands of old URLs in Google indexes, which produces the same problem (more here).

Maybe error templates should be cached once using a different method; i.e. cache the template and not the url? I think this is how Laravel does it by default, where as Statamic static caching is url based.

@jasonvarga
Copy link
Member Author

jasonvarga commented Oct 29, 2024

Let us repeat:

With share_errors enabled, it'll only cache the 404 page once, then all subsequent 404s will be served the already-cached 404 page.

So, enable that and the page won't be generated every time. The cached version will be loaded, which is faster. That's the whole point of this PR - so that bad actors can't do exactly what you've described.

@stuartcusackie
Copy link

crap, sorry! Didn't read the full post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants