Commit 88681bd
[SPARK-54134][SQL] Optimize Arrow memory usage
### What changes were proposed in this pull request?
This patch proposes some changes to optimize memory usage on Arrow in Spark. It compress Arrow IPC data when serializing.
### Why are the changes needed?
We have encountered OOM when loading data and processing them in PySpark through `toArrow` or `toPandas`. The same data could be loaded by PyArrow directly but fails to load through `toArrow` or `toPandas` into PySpark due to OOM issues.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Unit tests. Manually test it locally.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code v2.0.13
Closes apache#52747 from viirya/release_buffers.
Authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>1 parent 7247261 commit 88681bd
File tree
6 files changed
+137
-2
lines changed- dev/deps
- python/pyspark/sql/tests/arrow
- sql
- catalyst/src/main/scala/org/apache/spark/sql/internal
- core
- src/main/scala/org/apache/spark/sql/execution/arrow
6 files changed
+137
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2519 | 2519 | | |
2520 | 2520 | | |
2521 | 2521 | | |
| 2522 | + | |
| 2523 | + | |
| 2524 | + | |
| 2525 | + | |
| 2526 | + | |
| 2527 | + | |
| 2528 | + | |
| 2529 | + | |
| 2530 | + | |
| 2531 | + | |
| 2532 | + | |
| 2533 | + | |
| 2534 | + | |
| 2535 | + | |
| 2536 | + | |
| 2537 | + | |
| 2538 | + | |
| 2539 | + | |
| 2540 | + | |
2522 | 2541 | | |
2523 | 2542 | | |
2524 | 2543 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1810 | 1810 | | |
1811 | 1811 | | |
1812 | 1812 | | |
| 1813 | + | |
| 1814 | + | |
| 1815 | + | |
| 1816 | + | |
| 1817 | + | |
| 1818 | + | |
| 1819 | + | |
| 1820 | + | |
| 1821 | + | |
| 1822 | + | |
| 1823 | + | |
| 1824 | + | |
| 1825 | + | |
| 1826 | + | |
| 1827 | + | |
| 1828 | + | |
| 1829 | + | |
| 1830 | + | |
| 1831 | + | |
| 1832 | + | |
| 1833 | + | |
| 1834 | + | |
| 1835 | + | |
| 1836 | + | |
| 1837 | + | |
| 1838 | + | |
| 1839 | + | |
| 1840 | + | |
| 1841 | + | |
| 1842 | + | |
| 1843 | + | |
| 1844 | + | |
| 1845 | + | |
| 1846 | + | |
| 1847 | + | |
| 1848 | + | |
| 1849 | + | |
| 1850 | + | |
| 1851 | + | |
| 1852 | + | |
| 1853 | + | |
| 1854 | + | |
| 1855 | + | |
| 1856 | + | |
| 1857 | + | |
| 1858 | + | |
| 1859 | + | |
| 1860 | + | |
| 1861 | + | |
| 1862 | + | |
| 1863 | + | |
| 1864 | + | |
| 1865 | + | |
| 1866 | + | |
| 1867 | + | |
| 1868 | + | |
| 1869 | + | |
| 1870 | + | |
| 1871 | + | |
| 1872 | + | |
| 1873 | + | |
| 1874 | + | |
| 1875 | + | |
| 1876 | + | |
| 1877 | + | |
| 1878 | + | |
| 1879 | + | |
| 1880 | + | |
| 1881 | + | |
| 1882 | + | |
| 1883 | + | |
| 1884 | + | |
| 1885 | + | |
| 1886 | + | |
| 1887 | + | |
1813 | 1888 | | |
1814 | 1889 | | |
1815 | 1890 | | |
| |||
Lines changed: 16 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3987 | 3987 | | |
3988 | 3988 | | |
3989 | 3989 | | |
| 3990 | + | |
| 3991 | + | |
| 3992 | + | |
| 3993 | + | |
| 3994 | + | |
| 3995 | + | |
| 3996 | + | |
| 3997 | + | |
| 3998 | + | |
| 3999 | + | |
| 4000 | + | |
| 4001 | + | |
| 4002 | + | |
| 4003 | + | |
3990 | 4004 | | |
3991 | 4005 | | |
3992 | 4006 | | |
| |||
7332 | 7346 | | |
7333 | 7347 | | |
7334 | 7348 | | |
| 7349 | + | |
| 7350 | + | |
7335 | 7351 | | |
7336 | 7352 | | |
7337 | 7353 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
279 | 279 | | |
280 | 280 | | |
281 | 281 | | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
282 | 286 | | |
283 | 287 | | |
284 | 288 | | |
| |||
Lines changed: 22 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| 26 | + | |
26 | 27 | | |
27 | 28 | | |
28 | 29 | | |
| 30 | + | |
29 | 31 | | |
30 | 32 | | |
31 | 33 | | |
| |||
37 | 39 | | |
38 | 40 | | |
39 | 41 | | |
| 42 | + | |
40 | 43 | | |
41 | 44 | | |
42 | 45 | | |
| |||
92 | 95 | | |
93 | 96 | | |
94 | 97 | | |
95 | | - | |
96 | | - | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
97 | 117 | | |
98 | 118 | | |
99 | 119 | | |
| |||
0 commit comments