-
-
Notifications
You must be signed in to change notification settings - Fork 829
Description
Application Context
We have a rails application currently running in production that used to be able to handle a metric ton of jobs with ease.
We recently upgraded from Rails 7.0 to 7.1 (that was it's own can of worms) and ransack to v4.0
Before deployment we rarely if ever saw this query
SELECT a.attname, format_type(a.atttypid, a.atttypmod),
pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
c.collname, col_description(a.attrelid, a.attnum) AS comment,
a.attidentity AS identity,
attgenerated as attgenerated
FROM pg_attribute a
LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
LEFT JOIN pg_type t ON a.atttypid = t.oid
LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
WHERE a.attrelid = ?::regclass
AND a.attnum > ? AND NOT a.attisdropped
ORDER BY a.attnumAfter release this query (among others) started popping up during peak hours around 15k-20k times per hour. This leads to huge performance impacts on our users when trying to navigate the website during those peak job hours. At first I thought something was wrong with resque, but eventually I came across this chain of events that I think is the culprit.
We use authorizable_ransackable_attributes for many of our models because they have no sensitive data.
- in
authorizable_ransackable_attributesthere is a call tocolumn_names - in
column_namesthere is a call tocolumns - in
columnsthere is a call tocolumn_definitions - in
column_definitionswe find the query mentioned before that started appearing exactly as we upgraded ransack
The reason why this appeared so intensely and might not have been noticed is that in a web scenario the schema caching might come into play reducing the need from rails to run the column_definitions sql query. But in jobs like Resque, which we use, connections are forked for each job => if you have 400 workers starting and ending every couple of seconds non-stop for an hour or so, you end up with a non-responsive frontend and extremely slow queries because the database seems to be saturated with these calls.
I know it sounds odd that a query as "basic" as this might cause extreme slowdowns to the database, but at this point this is the only explanation I have left as to why upgrading rails/ransack led to extreme slowdowns we are currently facing.
We're currently looking into ways to minimize the forking process whilst preserving the recommendations from resque (to fork each worker) but I'd love to have an alternative way to whitelist all columns in a table for ransack (maybe allow the use of Rails.cache to cache the authorizable_ransackable_attributes with a TTL so it uses the default rails caching which in our case is configured in to a redis cache ?)