-
Notifications
You must be signed in to change notification settings - Fork 29k
Improve maven plugin configuration #590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 7 commits
4e96c01
1342886
ddf2547
2204539
889bf4e
460510a
f1c7535
8540e83
c4c6e45
9f08e80
e1a7e00
effe79c
9ea1af9
0ed124d
03b136f
d3488c6
779ae5d
27bd426
54a86b0
882e35d
31451df
5fb961f
ea53549
a5ff7d1
17f6e7d
3218d3b
e788690
8b0c63f
427d499
f1eb268
4cc0c90
4277fed
31c6409
7d8cabf
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -45,17 +45,20 @@ For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop versions wit | |
| For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions with YARN, you can enable the "yarn-alpha" or "yarn" profile and set the "hadoop.version", "yarn.version" property. Note that Hadoop 0.23.X requires a special `-Phadoop-0.23` profile: | ||
|
|
||
| # Apache Hadoop 2.0.5-alpha | ||
| $ mvn -Pyarn-alpha -Dhadoop.version=2.0.5-alpha -Dyarn.version=2.0.5-alpha -DskipTests clean package | ||
| $ mvn -Pyarn-alpha -Dhadoop.version=2.0.5-alpha -DskipTests clean package | ||
|
|
||
| # Cloudera CDH 4.2.0 with MapReduce v2 | ||
| $ mvn -Pyarn-alpha -Dhadoop.version=2.0.0-cdh4.2.0 -Dyarn.version=2.0.0-cdh4.2.0 -DskipTests clean package | ||
| $ mvn -Pyarn-alpha -Dhadoop.version=2.0.0-cdh4.2.0 -DskipTests clean package | ||
|
|
||
| # Apache Hadoop 2.2.X (e.g. 2.2.0 as below) and newer | ||
| $ mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package | ||
| $ mvn -Pyarn -Dhadoop.version=2.2.0 -DskipTests clean package | ||
|
|
||
| # Apache Hadoop 0.23.x | ||
| $ mvn -Pyarn-alpha -Phadoop-0.23 -Dhadoop.version=0.23.7 -Dyarn.version=0.23.7 -DskipTests clean package | ||
|
|
||
| # Different versions of HDFS vs YARN. | ||
| $ mvn -Pyarn-alpha -Dhadoop.version=2.3.0 -Dyarn.version= 0.23.7 -DskipTests clean package | ||
|
||
|
|
||
| ## Spark Tests in Maven ## | ||
|
|
||
| Tests are run by default via the [ScalaTest Maven plugin](http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin). Some of the require Spark to be packaged first, so always run `mvn package` with `-DskipTests` the first time. You can then run the tests with `mvn -Dhadoop.version=... test`. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,7 +16,8 @@ | |
| ~ limitations under the License. | ||
| --> | ||
|
|
||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
| xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
| <modelVersion>4.0.0</modelVersion> | ||
| <parent> | ||
| <groupId>org.apache</groupId> | ||
|
|
@@ -119,7 +120,7 @@ | |
| <log4j.version>1.2.17</log4j.version> | ||
| <hadoop.version>1.0.4</hadoop.version> | ||
| <protobuf.version>2.4.1</protobuf.version> | ||
| <yarn.version>0.23.7</yarn.version> | ||
| <yarn.version>${hadoop.version}</yarn.version> | ||
| <hbase.version>0.94.6</hbase.version> | ||
| <hive.version>0.12.0</hive.version> | ||
| <parquet.version>1.3.2</parquet.version> | ||
|
|
@@ -135,7 +136,8 @@ | |
|
|
||
| <repositories> | ||
| <repository> | ||
| <id>maven-repo</id> <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution --> | ||
| <id>maven-repo</id> | ||
| <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution --> | ||
| <name>Maven Repository</name> | ||
| <!-- HTTPS is unavailable for Maven Central --> | ||
| <url>http://repo.maven.apache.org/maven2</url> | ||
|
|
@@ -558,64 +560,7 @@ | |
| <artifactId>jets3t</artifactId> | ||
| <version>0.7.1</version> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-api</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-common</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-client</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| <dependency> | ||
| <!-- Matches the version of jackson-core-asl pulled in by avro --> | ||
| <groupId>org.codehaus.jackson</groupId> | ||
|
|
@@ -850,12 +795,6 @@ | |
| <modules> | ||
| <module>yarn</module> | ||
| </modules> | ||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.avro</groupId> | ||
| <artifactId>avro</artifactId> | ||
| </dependency> | ||
| </dependencies> | ||
| </profile> | ||
|
|
||
| <!-- Ganglia integration is not included by default due to LGPL-licensed code --> | ||
|
|
@@ -895,13 +834,74 @@ | |
| <id>yarn</id> | ||
| <properties> | ||
| <hadoop.major.version>2</hadoop.major.version> | ||
| <hadoop.version>2.2.0</hadoop.version> | ||
| <hadoop.version>2.3.0</hadoop.version> | ||
|
||
| <protobuf.version>2.5.0</protobuf.version> | ||
| </properties> | ||
| <modules> | ||
| <module>yarn</module> | ||
| </modules> | ||
| <dependencyManagement> | ||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-api</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-common</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.apache.hadoop</groupId> | ||
| <artifactId>hadoop-yarn-client</artifactId> | ||
| <version>${yarn.version}</version> | ||
| <exclusions> | ||
| <exclusion> | ||
| <groupId>asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.ow2.asm</groupId> | ||
| <artifactId>asm</artifactId> | ||
| </exclusion> | ||
| <exclusion> | ||
| <groupId>org.jboss.netty</groupId> | ||
| <artifactId>netty</artifactId> | ||
| </exclusion> | ||
| </exclusions> | ||
| </dependency> | ||
| </dependencies> | ||
| </dependencyManagement> | ||
| </profile> | ||
|
|
||
| <!-- Build without Hadoop dependencies that are included in some runtime environments. --> | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -55,7 +55,7 @@ object SparkBuild extends Build { | |
| val SCALAC_JVM_VERSION = "jvm-1.6" | ||
| val JAVAC_JVM_VERSION = "1.6" | ||
|
|
||
| lazy val root = Project("root", file("."), settings = rootSettings) aggregate(allProjects: _*) | ||
| lazy val root = Project("spark", file("."), settings = rootSettings) aggregate(allProjects: _*) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just wondering - what is the benefit of this change?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to increase readability. |
||
|
|
||
| lazy val core = Project("core", file("core"), settings = coreSettings) | ||
|
|
||
|
|
@@ -569,7 +569,7 @@ object SparkBuild extends Build { | |
| libraryDependencies += "net.sf.py4j" % "py4j" % "0.8.1", | ||
| name := "spark-assembly", | ||
| assembleDeps in Compile <<= (packageProjects.map(packageBin in Compile in _) ++ Seq(packageDependency in Compile)).dependOn, | ||
| jarName in assembly <<= version map { v => "spark-assembly-" + v + "-hadoop" + hadoopVersion + ".jar" }, | ||
| jarName in assembly <<= version map { v => s"spark-assembly-${v}-hadoop${hadoopVersion}${if (isYarnEnabled) "-yarn" else ""}.jar" }, | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer to keep this as-is for now. We have many parameters in the build (hadoop version, hive support, ganglia, yarn etc) and I don't think it will scale to add new suffixes for each one.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also this doesn't seem consistent with the maven build.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added this because it was getting a bit confusing if some of the assembled artifacts were built with YARN or not. If we can't qualify how about creating a Manifest of some sort the deployment?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For now I'd just prefer to keep it simple and use uniform naming. One thing we should probably do is in
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The |
||
| jarName in packageDependency <<= version map { v => "spark-assembly-" + v + "-hadoop" + hadoopVersion + "-deps.jar" } | ||
| ) ++ assemblySettings ++ extraAssemblySettings | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,6 @@ | |
| <groupId>org.apache.spark</groupId> | ||
| <artifactId>yarn-parent_2.10</artifactId> | ||
| <version>1.0.0-SNAPSHOT</version> | ||
| <relativePath>../pom.xml</relativePath> | ||
| </parent> | ||
|
||
|
|
||
| <groupId>org.apache.spark</groupId> | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but could this say:
Differing versions of HDFS and YARN(no period)