@kaxil kaxil commented Oct 16, 2025

Workers no longer import the full kubernetes client library (~32-42 MB) when performing routine operations like secret masking and DAG serialization. The kubernetes client is only imported when actually processing kubernetes objects.

With the default 32 LocalExecutor workers, this could reduce memory usage by approximately 1 GB in deployments that don't use Kubernetes.
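A quick check of that estimate, as a rough sketch. It assumes every worker process holds its own private copy of the imported modules; copy-on-write sharing between forked workers could make the real figure smaller.

```python
# Back-of-the-envelope estimate only, not a measurement.
# Assumption: each worker pays the full import cost independently.
WORKERS = 32                # default LocalExecutor worker count cited above
PER_WORKER_MB = (32, 42)    # reported range for importing the kubernetes client

low_gb, high_gb = (WORKERS * mb / 1024 for mb in PER_WORKER_MB)
print(f"Estimated savings: {low_gb:.2f}-{high_gb:.2f} GB")
```

The lower bound works out to exactly 1 GB, which matches the "approximately 1 GB" figure.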

Part of #56641 (Kudos to @wjddn279 for investigation)

```py
import sys
import tracemalloc

assert 'kubernetes' not in sys.modules

tracemalloc.start()
snapshot_before = tracemalloc.take_snapshot()

from kubernetes.client import V1EnvVar

snapshot_after = tracemalloc.take_snapshot()

top_stats = snapshot_after.compare_to(snapshot_before, 'traceback')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
    print(stat)

total = sum(stat.size_diff for stat in top_stats)
print(f"\nTotal memory increase: {total / 1024 / 1024:.2f} MB")
```

Output: Total memory increase: 41.62 MB
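One way to avoid paying that import cost in routine code paths is to check `sys.modules` before importing. The sketch below is illustrative and not necessarily the exact change made in this PR; the helper name `is_k8s_env_var` is hypothetical. It relies on the fact that if `kubernetes.client` was never imported in the process, no object can be an instance of one of its classes.

```python
import sys
from typing import Any


def is_k8s_env_var(obj: Any) -> bool:
    """Return True if obj is a kubernetes V1EnvVar, without forcing the import.

    Hypothetical helper: if ``kubernetes.client`` was never imported in this
    process, no object can be an instance of its classes, so skip the import.
    """
    if "kubernetes.client" not in sys.modules:
        return False
    # Already loaded elsewhere, so this import is only a sys.modules lookup.
    from kubernetes.client import V1EnvVar

    return isinstance(obj, V1EnvVar)


already_loaded = "kubernetes.client" in sys.modules
# A plain dict is never a V1EnvVar, whether or not the library is loaded.
print(is_k8s_env_var({"name": "API_KEY", "value": "secret"}))
# The check itself must not have pulled the library in.
assert ("kubernetes.client" in sys.modules) == already_loaded
```

The guard is only valid for type checks on objects that already exist. Code that has to construct kubernetes objects still needs the real import.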


@kaxil kaxil added this to the Airflow 3.1.1 milestone Oct 16, 2025
@kaxil kaxil requested a review from potiuk October 16, 2025 02:07
@kaxil kaxil merged commit 4926999 into apache:main Oct 16, 2025
62 checks passed
@kaxil kaxil deleted the skip-k8s-client-import branch October 16, 2025 10:40
snreddygopu pushed a commit to Teradata/airflow that referenced this pull request Oct 16, 2025
potiuk added a commit to potiuk/airflow that referenced this pull request Oct 16, 2025
apache#56692 introduced an optimization for PodGenerator imports, but
deserializing a Pod then failed when no k8s classes were loaded. That is
a failure, not an optimization: nothing prevents us from importing the
k8s classes, and we actually **have to** import them in order to
deserialize a serialized Pod.
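The asymmetry described in this commit message can be sketched as follows: a type check may skip the import, but a deserialization path has to import the classes it rebuilds, and should fail clearly if they are unavailable. This is a generic illustration, not the code from the follow-up commit; `load_for_deserialization` is a hypothetical helper, and `json` stands in for `kubernetes.client` so the example runs without kubernetes installed.

```python
import importlib
from types import ModuleType


def load_for_deserialization(module_name: str) -> ModuleType:
    """Import the module needed to rebuild a serialized object, or fail loudly."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Cannot deserialize: {module_name!r} is required but not installed"
        ) from exc


# Stand-in module for the sketch; the real case would pass "kubernetes.client".
json_mod = load_for_deserialization("json")
print(json_mod.loads('{"kind": "Pod"}')["kind"])
```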
kaxil added a commit that referenced this pull request Oct 16, 2025
#56692 introduced an optimization for PodGenerator imports, but
deserializing a Pod then failed when no k8s classes were loaded. That is
a failure, not an optimization: nothing prevents us from importing the
k8s classes, and we actually have to import them in order to deserialize
a serialized Pod.

* fixup! Skip PodGenerator import for deserialization when no k8s installed

* fixup! fixup! Skip PodGenerator import for deserialization when no k8s installed

---------

Co-authored-by: Kaxil Naik
abdulrahman305 bot pushed a commit to abdulrahman305/airflow that referenced this pull request Oct 17, 2025
abdulrahman305 bot pushed a commit to abdulrahman305/airflow that referenced this pull request Oct 19, 2025
kaxil added a commit that referenced this pull request Oct 21, 2025
(cherry picked from commit 17037e6)
TyrellHaywood pushed a commit to TyrellHaywood/airflow that referenced this pull request Oct 22, 2025