Commit 4f9b148

Address comment.
1 parent 1d03d3b commit 4f9b148

4 files changed

Lines changed: 18 additions & 24 deletions

python/pyspark/sql/readwriter.py

Lines changed: 4 additions & 5 deletions
@@ -211,13 +211,12 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 
 * ``PERMISSIVE`` : when it meets a corrupted record, puts the malformed string \
 into a field configured by ``columnNameOfCorruptRecord``, and sets other \
-fields to ``null``. To keep corrupt records, an user can set a string type \
-field named ``columnNameOfCorruptRecord`` in an user-defined schema. If a \
+fields to ``null``. It does not support partial results. To keep corrupt \
+records, an user can set a string type field named \
+``columnNameOfCorruptRecord`` in an user-defined schema. If a \
 schema does not have the field, it drops corrupt records during parsing. \
 When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
-field in an output schema. It does not support partial results. Even just one \
-field can not be correctly parsed, all fields except for the field of \
-``columnNameOfCorruptRecord`` will be set to ``null``.
+field in an output schema.
 * ``DROPMALFORMED`` : ignores the whole corrupted records.
 * ``FAILFAST`` : throws an exception when it meets corrupted records.
 

python/pyspark/sql/streaming.py

Lines changed: 4 additions & 5 deletions
@@ -444,13 +444,12 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 
 * ``PERMISSIVE`` : when it meets a corrupted record, puts the malformed string \
 into a field configured by ``columnNameOfCorruptRecord``, and sets other \
-fields to ``null``. To keep corrupt records, an user can set a string type \
-field named ``columnNameOfCorruptRecord`` in an user-defined schema. If a \
+fields to ``null``. It does not support partial results. To keep corrupt \
+records, an user can set a string type field named \
+``columnNameOfCorruptRecord`` in an user-defined schema. If a \
 schema does not have the field, it drops corrupt records during parsing. \
 When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
-field in an output schema. It does not support partial results. Even just one \
-field can not be correctly parsed, all fields except for the field of \
-``columnNameOfCorruptRecord`` will be set to ``null``.
+field in an output schema.
 * ``DROPMALFORMED`` : ignores the whole corrupted records.
 * ``FAILFAST`` : throws an exception when it meets corrupted records.
 

sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala

Lines changed: 5 additions & 7 deletions
@@ -346,13 +346,11 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
 * during parsing.
 * <ul>
 * <li>`PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a
-* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To keep
-* corrupt records, an user can set a string type field named `columnNameOfCorruptRecord`
-* in an user-defined schema. If a schema does not have the field, it drops corrupt records
-* during parsing. When inferring a schema, it implicitly adds a `columnNameOfCorruptRecord`
-* field in an output schema. It does not support partial results. Even just one field can not
-* be correctly parsed, all fields except for the field of `columnNameOfCorruptRecord` will
-* be set to `null`.</li>
+* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. It
+* does not support partial results. To keep corrupt records, an user can set a string
+* type field named `columnNameOfCorruptRecord` in an user-defined schema. If a schema
+* does not have the field, it drops corrupt records during parsing. When inferring a schema,
+* it implicitly adds a `columnNameOfCorruptRecord` field in an output schema.</li>
 * <li>`DROPMALFORMED` : ignores the whole corrupted records.</li>
 * <li>`FAILFAST` : throws an exception when it meets corrupted records.</li>
 * </ul>

sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala

Lines changed: 5 additions & 7 deletions
@@ -237,13 +237,11 @@ final class DataStreamReader private[sql](sparkSession: SparkSession) extends Lo
 * during parsing.
 * <ul>
 * <li>`PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a
-* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To keep
-* corrupt records, an user can set a string type field named `columnNameOfCorruptRecord`
-* in an user-defined schema. If a schema does not have the field, it drops corrupt records
-* during parsing. When inferring a schema, it implicitly adds a `columnNameOfCorruptRecord`
-* field in an output schema. It does not support partial results. Even just one field can not
-* be correctly parsed, all fields except for the field of `columnNameOfCorruptRecord` will
-* be set to `null`.</li>
+* field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. It
+* does not support partial results. To keep corrupt records, an user can set a string
+* type field named `columnNameOfCorruptRecord` in an user-defined schema. If a schema
+* does not have the field, it drops corrupt records during parsing. When inferring a schema,
+* it implicitly adds a `columnNameOfCorruptRecord` field in an output schema.</li>
 * <li>`DROPMALFORMED` : ignores the whole corrupted records.</li>
 * <li>`FAILFAST` : throws an exception when it meets corrupted records.</li>
 * </ul>
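
The three modes described in these docstrings can be illustrated with a small pure-Python model. This is only a sketch, not Spark's implementation: the function `parse_lines`, the list-of-names `schema`, and the column name `_corrupt_record` are all hypothetical, and only JSON syntax errors are treated as corruption here. It also assumes that "drops corrupt records" means the raw text is discarded while the row is still emitted with every field null; the docstring does not spell that out.

```python
import json

# Assumed column name for this sketch; in Spark it is configurable
# through the columnNameOfCorruptRecord option.
CORRUPT_COL = "_corrupt_record"


def parse_lines(lines, schema, mode="PERMISSIVE", corrupt_col=CORRUPT_COL):
    """Toy model of the documented modes (hypothetical helper, not Spark code)."""
    if mode not in ("PERMISSIVE", "DROPMALFORMED", "FAILFAST"):
        raise ValueError("unknown mode: %s" % mode)
    keeps_corrupt = corrupt_col in schema
    data_fields = [name for name in schema if name != corrupt_col]
    rows = []
    for line in lines:
        try:
            obj = json.loads(line)
            if not isinstance(obj, dict):
                raise ValueError("record is not a JSON object")
            row = {name: obj.get(name) for name in data_fields}
            if keeps_corrupt:
                row[corrupt_col] = None
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            if mode == "FAILFAST":
                raise  # throw on the first corrupted record
            if mode == "DROPMALFORMED":
                continue  # ignore the whole corrupted record
            # PERMISSIVE: no partial results, so every data field is None.
            row = {name: None for name in data_fields}
            if keeps_corrupt:
                # The raw text survives only if the schema has the column.
                row[corrupt_col] = line
        rows.append(row)
    return rows
```

Calling `parse_lines(lines, ["a", "b", "_corrupt_record"])` on input containing a truncated line keeps that line's raw text in `_corrupt_record` and sets `a` and `b` to `None`; with the schema `["a", "b"]` the raw text is lost under the assumption stated above.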
