Commit 11e5cd7
committed
[SPARK-19019] [PYTHON] Fix hijacked
## What changes were proposed in this pull request?
Currently, PySpark does not work with Python 3.6.0.
Running `./bin/pyspark` simply throws the error as below and PySpark does not work at all:
```
Traceback (most recent call last):
File ".../spark/python/pyspark/shell.py", line 30, in <module>
import pyspark
File ".../spark/python/pyspark/__init__.py", line 46, in <module>
from pyspark.context import SparkContext
File ".../spark/python/pyspark/context.py", line 36, in <module>
from pyspark.java_gateway import launch_gateway
File ".../spark/python/pyspark/java_gateway.py", line 31, in <module>
from py4j.java_gateway import java_import, JavaGateway, GatewayClient
File "<frozen importlib._bootstrap>", line 961, in _find_and_load
File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
File ".../spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 18, in <module>
File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pydoc.py", line 62, in <module>
import pkgutil
File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pkgutil.py", line 22, in <module>
ModuleInfo = namedtuple('ModuleInfo', 'module_finder name ispkg')
File ".../spark/python/pyspark/serializers.py", line 394, in namedtuple
cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
```
The root cause seems because some arguments of `namedtuple` are now completely keyword-only arguments from Python 3.6.0 (See https://bugs.python.org/issue25628).
We currently copy this function via `types.FunctionType` which does not set the default values of keyword-only arguments (meaning `namedtuple.__kwdefaults__`) and this seems causing internally missing values in the function (non-bound arguments).
This PR proposes to work around this by manually setting it via `kwargs` as `types.FunctionType` seems not supporting to set this.
Also, this PR ports the changes in cloudpickle for compatibility for Python 3.6.0.
## How was this patch tested?
Manually tested with Python 2.7.6 and Python 3.6.0.
```
./bin/pyspsark
```
, manual creation of `namedtuple` both in local and rdd with Python 3.6.0,
and Jenkins tests for other Python versions.
Also,
```
./run-tests --python-executables=python3.6
```
```
Will test against the following Python executables: ['python3.6']
Will test the following Python modules: ['pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
Finished test(python3.6): pyspark.sql.tests (192s)
Finished test(python3.6): pyspark.accumulators (3s)
Finished test(python3.6): pyspark.mllib.tests (198s)
Finished test(python3.6): pyspark.broadcast (3s)
Finished test(python3.6): pyspark.conf (2s)
Finished test(python3.6): pyspark.context (14s)
Finished test(python3.6): pyspark.ml.classification (21s)
Finished test(python3.6): pyspark.ml.evaluation (11s)
Finished test(python3.6): pyspark.ml.clustering (20s)
Finished test(python3.6): pyspark.ml.linalg.__init__ (0s)
Finished test(python3.6): pyspark.streaming.tests (240s)
Finished test(python3.6): pyspark.tests (240s)
Finished test(python3.6): pyspark.ml.recommendation (19s)
Finished test(python3.6): pyspark.ml.feature (36s)
Finished test(python3.6): pyspark.ml.regression (37s)
Finished test(python3.6): pyspark.ml.tuning (28s)
Finished test(python3.6): pyspark.mllib.classification (26s)
Finished test(python3.6): pyspark.mllib.evaluation (18s)
Finished test(python3.6): pyspark.mllib.clustering (44s)
Finished test(python3.6): pyspark.mllib.linalg.__init__ (0s)
Finished test(python3.6): pyspark.mllib.feature (26s)
Finished test(python3.6): pyspark.mllib.fpm (23s)
Finished test(python3.6): pyspark.mllib.random (8s)
Finished test(python3.6): pyspark.ml.tests (92s)
Finished test(python3.6): pyspark.mllib.stat.KernelDensity (0s)
Finished test(python3.6): pyspark.mllib.linalg.distributed (25s)
Finished test(python3.6): pyspark.mllib.stat._statistics (15s)
Finished test(python3.6): pyspark.mllib.recommendation (24s)
Finished test(python3.6): pyspark.mllib.regression (26s)
Finished test(python3.6): pyspark.profiler (9s)
Finished test(python3.6): pyspark.mllib.tree (16s)
Finished test(python3.6): pyspark.shuffle (1s)
Finished test(python3.6): pyspark.mllib.util (18s)
Finished test(python3.6): pyspark.serializers (11s)
Finished test(python3.6): pyspark.rdd (20s)
Finished test(python3.6): pyspark.sql.conf (8s)
Finished test(python3.6): pyspark.sql.catalog (17s)
Finished test(python3.6): pyspark.sql.column (18s)
Finished test(python3.6): pyspark.sql.context (18s)
Finished test(python3.6): pyspark.sql.group (27s)
Finished test(python3.6): pyspark.sql.dataframe (33s)
Finished test(python3.6): pyspark.sql.functions (35s)
Finished test(python3.6): pyspark.sql.types (6s)
Finished test(python3.6): pyspark.sql.streaming (13s)
Finished test(python3.6): pyspark.streaming.util (0s)
Finished test(python3.6): pyspark.sql.session (16s)
Finished test(python3.6): pyspark.sql.window (4s)
Finished test(python3.6): pyspark.sql.readwriter (35s)
Tests passed in 433 seconds
```
Author: hyukjinkwon <[email protected]>
Closes apache#16429 from HyukjinKwon/SPARK-19019.collections.namedtuple and port cloudpickle changes for PySpark to work with Python 3.6.01 parent abe36c5 commit 11e5cd7
2 files changed
+87
-31
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| 46 | + | |
46 | 47 | | |
47 | 48 | | |
48 | 49 | | |
| |||
53 | 54 | | |
54 | 55 | | |
55 | 56 | | |
| 57 | + | |
| 58 | + | |
56 | 59 | | |
57 | 60 | | |
58 | 61 | | |
| |||
68 | 71 | | |
69 | 72 | | |
70 | 73 | | |
71 | | - | |
72 | | - | |
73 | | - | |
74 | | - | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
75 | 78 | | |
76 | 79 | | |
77 | 80 | | |
| |||
90 | 93 | | |
91 | 94 | | |
92 | 95 | | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
93 | 133 | | |
94 | 134 | | |
95 | 135 | | |
| |||
245 | 285 | | |
246 | 286 | | |
247 | 287 | | |
248 | | - | |
249 | | - | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
250 | 295 | | |
251 | 296 | | |
252 | 297 | | |
253 | | - | |
254 | | - | |
255 | | - | |
256 | | - | |
257 | | - | |
258 | | - | |
259 | | - | |
260 | | - | |
261 | | - | |
262 | | - | |
263 | | - | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
264 | 308 | | |
265 | | - | |
266 | | - | |
267 | | - | |
268 | | - | |
269 | | - | |
270 | | - | |
271 | | - | |
272 | | - | |
273 | | - | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
274 | 314 | | |
275 | | - | |
276 | | - | |
277 | | - | |
278 | | - | |
279 | | - | |
| 315 | + | |
280 | 316 | | |
281 | 317 | | |
282 | 318 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
370 | 370 | | |
371 | 371 | | |
372 | 372 | | |
| 373 | + | |
373 | 374 | | |
374 | 375 | | |
375 | 376 | | |
376 | 377 | | |
377 | 378 | | |
| 379 | + | |
| 380 | + | |
| 381 | + | |
| 382 | + | |
| 383 | + | |
| 384 | + | |
| 385 | + | |
| 386 | + | |
| 387 | + | |
| 388 | + | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
378 | 394 | | |
| 395 | + | |
379 | 396 | | |
380 | 397 | | |
| 398 | + | |
| 399 | + | |
381 | 400 | | |
382 | 401 | | |
383 | 402 | | |
384 | 403 | | |
| 404 | + | |
385 | 405 | | |
386 | 406 | | |
387 | 407 | | |
| |||
0 commit comments