Commit 714434b

srowen authored and kai-chi committed
[MINOR][DOCS] Clarify that Spark apps should mark Spark as a 'provided' dependency, not package it
## What changes were proposed in this pull request?

Spark apps do not need to package Spark. In fact it can cause problems in some cases. Our examples should show depending on Spark as a 'provided' dependency.

Packaging Spark makes the app much bigger by tens of megabytes. It can also bring in conflicting dependencies that wouldn't otherwise be a problem. https://issues.apache.org/jira/browse/SPARK-26146 was what reminded me of this.

## How was this patch tested?

Doc build

Closes apache#23938 from srowen/Provided.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 3909223)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
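For illustration only, here is a minimal sketch of an sbt build definition that follows this advice. The project name, Scala version, and Spark version are hypothetical placeholders, not values taken from this commit. The `"provided"` configuration keeps the Spark artifacts on the compile classpath but leaves them out of the packaged application, on the assumption that the cluster supplies Spark's classes when the app is launched with `spark-submit`.

```scala
// build.sbt (hypothetical example; name and versions are placeholders)
name := "my-spark-app"

scalaVersion := "2.11.12"

// Assumed Spark version; it should match the version running on the cluster.
val sparkVersion = "2.4.0"

libraryDependencies ++= Seq(
  // "provided": compile against Spark, but do not bundle it into the app jar.
  "org.apache.spark" %% "spark-sql"       % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"
)
```

The `%%` operator appends the Scala binary version to the artifact name, which is equivalent to spelling out `spark-sql_2.11` by hand as the documentation snippets do.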
1 parent 8315029 commit 714434b

3 files changed

Lines changed: 4 additions & 1 deletion

docs/cloud-integration.md

Lines changed: 1 addition & 0 deletions
@@ -87,6 +87,7 @@ is set to the chosen version of Spark:
 <groupId>org.apache.spark</groupId>
 <artifactId>hadoop-cloud_2.11</artifactId>
 <version>${spark.version}</version>
+<scope>provided</scope>
 </dependency>
 ...
 </dependencyManagement>

docs/quick-start.md

Lines changed: 1 addition & 0 deletions
@@ -341,6 +341,7 @@ Note that Spark artifacts are tagged with a Scala version.
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-sql_{{site.SCALA_BINARY_VERSION}}</artifactId>
 <version>{{site.SPARK_VERSION}}</version>
+<scope>provided</scope>
 </dependency>
 </dependencies>
 </project>

docs/streaming-programming-guide.md

Lines changed: 2 additions & 1 deletion
@@ -385,11 +385,12 @@ Similar to Spark, Spark Streaming is available through Maven Central. To write y
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-streaming_{{site.SCALA_BINARY_VERSION}}</artifactId>
 <version>{{site.SPARK_VERSION}}</version>
+<scope>provided</scope>
 </dependency>
 </div>
 <div data-lang="SBT" markdown="1">

-libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}"
+libraryDependencies += "org.apache.spark" % "spark-streaming_{{site.SCALA_BINARY_VERSION}}" % "{{site.SPARK_VERSION}}" % "provided"
 </div>
 </div>
