Commit 96c9589
[SPARK-35045][SQL] Add an internal option to control input buffer in univocity
### What changes were proposed in this pull request?
This PR makes the input buffer configurable (as an internal option). This is mainly to work around uniVocity/univocity-parsers#449.
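
As an illustration only, here is a minimal sketch of how an internal reader option like this could be resolved before it is handed to the parser settings (univocity exposes `setInputBufferSize` for that purpose). The object name `InputBufferOption`, the fallback of 128, and the validation rules are assumptions made for this example; they are not taken from the patch itself.

```scala
import java.util.Locale

// Hypothetical helper: the name and the validation are illustrative,
// not Spark's actual implementation.
object InputBufferOption {
  // Assumed fallback used when the option is not supplied.
  val DefaultInputBufferSize: Int = 128

  /** Resolves "inputBufferSize" from reader options, ignoring key case. */
  def inputBufferSize(options: Map[String, String]): Int = {
    // Normalize keys so "inputBufferSize" and "INPUTBUFFERSIZE" both match.
    val normalized = options.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }
    normalized.get("inputbuffersize") match {
      case Some(raw) =>
        val size =
          try raw.trim.toInt
          catch {
            case _: NumberFormatException =>
              throw new IllegalArgumentException(
                s"inputBufferSize must be an integer, got '$raw'")
          }
        // require throws IllegalArgumentException for non-positive sizes.
        require(size > 0, s"inputBufferSize must be positive, got $size")
        size
      case None => DefaultInputBufferSize
    }
  }
}
```

With this sketch, `InputBufferOption.inputBufferSize(Map("inputBufferSize" -> "128"))` yields the value that would then be passed to the parser settings, and callers that never set the option get the assumed default.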
### Why are the changes needed?
To work around uniVocity/univocity-parsers#449.
### Does this PR introduce _any_ user-facing change?
No, it is only an internal option.
### How was this patch tested?
Manually tested by modifying the unit test added in apache#31858 as below:
```diff
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
index fd25a79..b58f0bd3661 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
@@ -2460,6 +2460,7 @@ abstract class CSVSuite
       Seq(line).toDF.write.text(path.getAbsolutePath)
       assert(spark.read.format("csv")
         .option("delimiter", "|")
+        .option("inputBufferSize", "128")
         .option("ignoreTrailingWhiteSpace", "true").load(path.getAbsolutePath).count() == 1)
     }
   }
```
Closes apache#32145 from HyukjinKwon/SPARK-35045.
Lead-authored-by: Hyukjin Kwon <[email protected]>
Co-authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit 1f56215)
Signed-off-by: Max Gekk <[email protected]>
1 parent 8494c14 commit 96c9589
1 file changed
Lines changed: 3 additions & 0 deletions
Additions at new-file lines 214-215 and 262; the diff body was not preserved.