-
Notifications
You must be signed in to change notification settings - Fork 16k
Description
Apache Airflow version
3.0.3
If "Other Airflow 2 version" selected, which one?
No response
What happened?
Seeing long running active sessions in the postgres database and causing high CPU utilization.
Caused by this query:
SELECT serialized_dag.data, serialized_dag.data_compressed, serialized_dag.id, serialized_dag.dag_id, serialized_dag.created_at, serialized_dag.last_updated, serialized_dag.dag_hash, serialized_dag.dag_version_id
FROM serialized_dag
WHERE serialized_dag.dag_version_id IN (SELECT task_instance.dag_version_id
FROM task_instance JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id
WHERE dag_run.id IN (103279, 103260, 103240, 103221, 103202, 103183, 103164, 103143, 103125, 103104, 103086, 103068, 103047, 103037, 103009, 102990, 102971, 102952, 102933, 102916, 102895, 102874, 102856, 102837, 102817, 102800, 102777, 102762, 102740, 102723, 102702, 102686, 102663, 102645, 102624, 102606, 102589, 102573, 102553, 102533, 102511, 102495, 102476, 102458, 102436, 102415, 102398, 102376, 102358, 102344, 102322, 102297, 102282, 102261, 102243, 102223, 102202, 102181, 102162, 102146, 102124, 102106, 102090, 102068, 102047, 102031, 102012, 101992, 101976, 101953, 101932, 101912, 101894, 101877, 101859, 101836, 101819, 101797, 101780, 101764, 101743, 101722, 101704, 101685, 101668, 101646, 101634, 101609, 101592, 101571, 101550, 101533, 101516, 101497, 101479, 101460, 101441, 101423, 101406, 101386, 101365, 101346, 101326, 101308, 101286, 101266, 101246, 101224, 101208, 101191, 101172, 101152, 101128, 101113, 101091, 101070, 101053, 101033, 101014, 100993, 100975, 100956, 100938, 100916, 100896, 100877, 100860, 100841, 100821, 100805, 100783, 100766, 100745, 100727, 100711, 100686, 100666, 100648, 100631, 100609, 100589, 100574, 100551, 100535, 100515, 100493, 100473, 100458, 100436, 100417, 100400, 100381, 100363, 100341, 100324, 100308, 100286, 100271, 100249, 100232, 100211, 100200, 100195, 100182, 100167, 100162, 100152, 100140, 100130, 100122, 100109, 100098, 100092, 100082, 100070, 100064, 100056, 100041, 100035, 100025, 100018, 100010, 100001, 99991, 99978, 99970, 99960, 99954, 99942, 99933, 99925, 99915, 99906, 99897, 99885, 99878, 99867, 99856, 99850, 99839, 99830, 99820, 99808, 99799, 99791, 99783, 99775, 99764, 99751, 99744, 99736, 99727, 99712, 99707, 99695, 99687, 99678, 99667, 99655, 99646, 99637, 99629, 99618, 99613, 99597, 99589, 99578, 99572, 99560, 99551, 99542, 99533, 99525, 99515, 99506, 99496, 99488, 99476, 99468, 99459, 99450, 99443, 99433, 99421, 99412, 99403, 99395, 99386, 99377, 99361, 99352, 99345, 99334, 99324, 99319, 99305, 99294, 99289, 99277, 99270, 99256, 99249, 99237, 99229, 99223, 99210, 99201, 99192, 99183, 99174, 99162, 99156, 99143, 99133, 99125, 99114, 99107, 99097, 99085, 99077, 99066, 99058, 99050, 99040, 99032, 99021, 99008, 99001, 98991, 98983, 98976, 98963, 98956, 98947, 98936, 98927, 98915, 98908, 98897, 98890, 98879, 98871, 98859, 98851, 98843, 98832, 98825, 98812, 98805, 98795, 98783, 98776, 98766, 98758, 98749, 98737, 98730, 98722, 98712, 98702, 98693, 98682, 98672, 98662, 98650, 98644, 98633, 98622, 98612, 98602, 98593, 98584, 98573, 98564, 98554, 98546, 98538, 98529, 98520, 98505, 98496, 98487, 98481, 98468, 98459, 98450, 98440, 98430, 98421, 98413, 98402, 98392, 98384, 98375, 98365, 98359, 98351, 98339, 98328, 98318, 98308, 98300, 98290, 98281, 98272) AND serialized_dag.id != '01997ae2-22b1-7959-8639-dcf8ab870d45'::uuid);
looping (i.e correlation) and and sequential scan due to serialized_dag.id present inside the inner sql block is causing the query to run for long..
What you think should happen instead?
Take out "AND serialized_dag.id != '01997ae2-22b1-7959-8639-dcf8ab870d45'::uuid)” from inner SQL and add this condition to outer/main/parent sql. This will help to reduce looping (i.e avoid correlation) and avoid sequential scan and use index scan which helps to seed-up up the execution.
We have seen significant improvement from 4+ hours execution time to 35ms with same result set as per our analysis/observation.
SELECT serialized_dag.data, serialized_dag.data_compressed, serialized_dag.id, serialized_dag.dag_id, serialized_dag.created_at, serialized_dag.last_updated, serialized_dag.dag_hash, serialized_dag.dag_version_id
FROM serialized_dag
WHERE serialized_dag.dag_version_id IN (SELECT task_instance.dag_version_id
FROM task_instance JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id
WHERE dag_run.id IN (103279, 103260, 103240, 103221, 103202, 103183, 103164, 103143, 103125, 103104, 103086, 103068, 103047, 103037, 103009, 102990, 102971, 102952, 102933, 102916, 102895, 102874, 102856, 102837, 102817, 102800, 102777, 102762, 102740, 102723, 102702, 102686, 102663, 102645, 102624, 102606, 102589, 102573, 102553, 102533, 102511, 102495, 102476, 102458, 102436, 102415, 102398, 102376, 102358, 102344, 102322, 102297, 102282, 102261, 102243, 102223, 102202, 102181, 102162, 102146, 102124, 102106, 102090, 102068, 102047, 102031, 102012, 101992, 101976, 101953, 101932, 101912, 101894, 101877, 101859, 101836, 101819, 101797, 101780, 101764, 101743, 101722, 101704, 101685, 101668, 101646, 101634, 101609, 101592, 101571, 101550, 101533, 101516, 101497, 101479, 101460, 101441, 101423, 101406, 101386, 101365, 101346, 101326, 101308, 101286, 101266, 101246, 101224, 101208, 101191, 101172, 101152, 101128, 101113, 101091, 101070, 101053, 101033, 101014, 100993, 100975, 100956, 100938, 100916, 100896, 100877, 100860, 100841, 100821, 100805, 100783, 100766, 100745, 100727, 100711, 100686, 100666, 100648, 100631, 100609, 100589, 100574, 100551, 100535, 100515, 100493, 100473, 100458, 100436, 100417, 100400, 100381, 100363, 100341, 100324, 100308, 100286, 100271, 100249, 100232, 100211, 100200, 100195, 100182, 100167, 100162, 100152, 100140, 100130, 100122, 100109, 100098, 100092, 100082, 100070, 100064, 100056, 100041, 100035, 100025, 100018, 100010, 100001, 99991, 99978, 99970, 99960, 99954, 99942, 99933, 99925, 99915, 99906, 99897, 99885, 99878, 99867, 99856, 99850, 99839, 99830, 99820, 99808, 99799, 99791, 99783, 99775, 99764, 99751, 99744, 99736, 99727, 99712, 99707, 99695, 99687, 99678, 99667, 99655, 99646, 99637, 99629, 99618, 99613, 99597, 99589, 99578, 99572, 99560, 99551, 99542, 99533, 99525, 99515, 99506, 99496, 99488, 99476, 99468, 99459, 99450, 99443, 99433, 99421, 99412, 99403, 99395, 99386, 99377, 99361, 99352, 99345, 99334, 99324, 99319, 99305, 99294, 99289, 99277, 99270, 99256, 99249, 99237, 99229, 99223, 99210, 99201, 99192, 99183, 99174, 99162, 99156, 99143, 99133, 99125, 99114, 99107, 99097, 99085, 99077, 99066, 99058, 99050, 99040, 99032, 99021, 99008, 99001, 98991, 98983, 98976, 98963, 98956, 98947, 98936, 98927, 98915, 98908, 98897, 98890, 98879, 98871, 98859, 98851, 98843, 98832, 98825, 98812, 98805, 98795, 98783, 98776, 98766, 98758, 98749, 98737, 98730, 98722, 98712, 98702, 98693, 98682, 98672, 98662, 98650, 98644, 98633, 98622, 98612, 98602, 98593, 98584, 98573, 98564, 98554, 98546, 98538, 98529, 98520, 98505, 98496, 98487, 98481, 98468, 98459, 98450, 98440, 98430, 98421, 98413, 98402, 98392, 98384, 98375, 98365, 98359, 98351, 98339, 98328, 98318, 98308, 98300, 98290, 98281, 98272)) AND serialized_dag.id != '01997ae2-22b1-7959-8639-dcf8ab870d45'::uuid;
How to reproduce
Connect airflow to postgres and have it query multiple dag ids and observe postgres metrics.
Operating System
linux
Versions of Apache Airflow Providers
apache-airflow-providers-common-sql==1.27.3
apache-airflow-providers-fab==2.3.0
apache-airflow-providers-mysql==6.3.2
apache-airflow-providers-odbc==4.10.1
apache-airflow-providers-postgres==6.2.1
apache-airflow-providers-sendgrid==4.1.2
apache-airflow-providers-standard==1.4.1
Deployment
Official Apache Airflow Helm Chart
Deployment details
Airflow: 3.0.3
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct