84 changes: 70 additions & 14 deletions pom.xml
@@ -128,6 +128,7 @@
<hive.classifier></hive.classifier>
<!-- Version used in Maven Hive dependency -->
<hive.version>1.2.1.spark2</hive.version>
<hive23.version>2.3.4</hive23.version>
<!-- Version used for internal directory structure -->
<hive.version.short>1.2.1</hive.version.short>
<!-- note that this should be compatible with Kafka brokers version 0.10 and up -->
@@ -1414,7 +1415,7 @@
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
<!-- Begin of Hive 2.3.4 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- jetty-all conflict with jetty 9.4.12.v20180830 -->
<exclusion>
@wangyum (Member Author) commented on Mar 28, 2019:
Exclude jetty-all; it conflicts with jetty 9.4.12.v20180830:

build/sbt clean package -Phadoop-3.2 -Phive
...
[error] /home/yumwang/opensource/spark/core/src/main/scala/org/apache/spark/SSLOptions.scala:78: value setTrustStorePath is not a member of org.eclipse.jetty.util.ssl.SslContextFactory
[error]         trustStore.foreach(file => sslContextFactory.setTrustStorePath(file.getAbsolutePath))
[error]

<groupId>org.eclipse.jetty.aggregate</groupId>
@@ -1425,7 +1426,17 @@
<groupId>org.apache.logging.log4j</groupId>
<artifactId>*</artifactId>
</exclusion>
<!-- End of Hive 2.3.4 exclusion -->
<!-- Hive includes javax.servlet to fix the Hive on Spark test failure; see HIVE-12783 -->
<exclusion>
<groupId>org.eclipse.jetty.orbit</groupId>
<artifactId>javax.servlet</artifactId>
</exclusion>
<!-- hive-storage-api is needed and must be explicitly included later -->
<exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-storage-api</artifactId>
</exclusion>
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>
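Assembled in one place, the Hive 2.3 exclusions added to this dependency read as follows. This is a sketch reconstructed from the hunks above: the enclosing `<dependency>` coordinates fall outside the visible hunks and are omitted, and the `jetty-all` artifactId is inferred from the review comment rather than shown in the diff.

```xml
<exclusions>
  <!-- jetty-all conflicts with jetty 9.4.12.v20180830 -->
  <exclusion>
    <groupId>org.eclipse.jetty.aggregate</groupId>
    <artifactId>jetty-all</artifactId>
  </exclusion>
  <!-- log4j 2.x artifacts pulled in transitively by Hive 2.3 -->
  <exclusion>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>*</artifactId>
  </exclusion>
  <!-- Hive bundles javax.servlet (HIVE-12783) -->
  <exclusion>
    <groupId>org.eclipse.jetty.orbit</groupId>
    <artifactId>javax.servlet</artifactId>
  </exclusion>
  <!-- hive-storage-api is re-added explicitly elsewhere -->
  <exclusion>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-storage-api</artifactId>
  </exclusion>
</exclusions>
```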

@@ -1544,7 +1555,7 @@
<groupId>org.json</groupId>
<artifactId>json</artifactId>
</exclusion>
<!-- Begin of Hive 2.3.4 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- Do not need Tez -->
<exclusion>
<groupId>${hive.group}</groupId>
@@ -1564,7 +1575,7 @@
<groupId>org.apache.logging.log4j</groupId>
<artifactId>*</artifactId>
</exclusion>
<!-- End of Hive 2.3.4 exclusion -->
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>
<dependency>
@@ -1673,6 +1684,17 @@
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
<!-- Begin of Hive 2.3 exclusion -->
<!-- Hive removes the HBase Metastore; see HIVE-17234 -->
<exclusion>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
</exclusion>
<exclusion>
<groupId>co.cask.tephra</groupId>
<artifactId>*</artifactId>
</exclusion>
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>
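The co.cask.tephra entry above uses a wildcard artifactId. As a general Maven note (not specific to this diff), `*` in an exclusion drops every transitive artifact under that groupId; wildcard exclusions are supported since Maven 3.2.1:

```xml
<!-- Wildcard exclusion: removes all transitive artifacts in the co.cask.tephra group -->
<exclusion>
  <groupId>co.cask.tephra</groupId>
  <artifactId>*</artifactId>
</exclusion>
```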

@@ -1730,7 +1752,7 @@
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy-all</artifactId>
</exclusion>
<!-- Begin of Hive 2.3.4 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- parquet-hadoop-bundle:1.8.1 conflict with 1.10.1 -->
<exclusion>
@wangyum (Member Author):
Exclude parquet-hadoop-bundle, otherwise:

build/sbt clean package -Phadoop-3.2 -Phive
...
[error] /home/yumwang/opensource/spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala:36: value JobSummaryLevel is not a member of object org.apache.parquet.hadoop.ParquetOutputFormat
[error] import org.apache.parquet.hadoop.ParquetOutputFormat.JobSummaryLevel

Member:
These several exclusions would apply to both Hive 2 and Hive 1 in the build as it is now. That's probably OK; maybe they don't even exist in Hive 1. But I'm not as sure about some, like this one?

@wangyum (Member Author):
Yes. org.apache.parquet:parquet-hadoop-bundle doesn't exist in Hive 1. It should be com.twitter:parquet-hadoop-bundle in Hive 1: https://github.com/apache/hive/blob/release-1.2.1/pom.xml#L256-L260

<groupId>org.apache.parquet</groupId>
@@ -1745,7 +1767,7 @@
<groupId>tomcat</groupId>
<artifactId>jasper-runtime</artifactId>
</exclusion>
<!-- End of Hive 2.3.4 exclusion -->
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>

@@ -1811,21 +1833,22 @@
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy-all</artifactId>
</exclusion>
<!-- Begin of Hive 2.3.4 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- Exclude log4j-slf4j-impl, otherwise spark-shell throws NoClassDefFoundError at startup -->
<exclusion>
@wangyum (Member Author):
Exclude log4j-slf4j-impl, otherwise:

$ build/sbt clean package -Phadoop-3.2 -Phive
$ export SPARK_PREPEND_CLASSES=true
$ bin/spark-shell
NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/logging/log4j/spi/AbstractLoggerAdapter
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:36)
	at org.apache.spark.internal.Logging$.org$apache$spark$internal$Logging$$isLog4j12(Logging.scala:217)
	at org.apache.spark.internal.Logging.initializeLogging(Logging.scala:122)
	at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:111)
	at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:105)
	at org.apache.spark.deploy.SparkSubmit.initializeLogIfNecessary(SparkSubmit.scala:73)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:939)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:948)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.spi.AbstractLoggerAdapter
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 22 more

<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j-impl</artifactId>
</exclusion>
<!-- End of Hive 2.3.4 exclusion -->
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>

<!-- Hive 2.3 needs hive-llap-client; we add it here, otherwise the scope won't work -->
<!-- hive-llap-common is needed when registering UDFs in Hive 2.3.
We add it here, otherwise -Phive-provided won't work. -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-client</artifactId>
<version>2.3.4</version>
<artifactId>hive-llap-common</artifactId>
<version>${hive23.version}</version>
<scope>${hive.deps.scope}</scope>
<exclusions>
<exclusion>
@@ -1836,6 +1859,31 @@
<groupId>org.apache.hive</groupId>
<artifactId>hive-serde</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- hive-llap-client is needed when running MapReduce tests in Hive 2.3. -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-client</artifactId>
<version>${hive23.version}</version>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-serde</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.curator</groupId>
<artifactId>curator-framework</artifactId>
@@ -1844,6 +1892,14 @@
<groupId>org.apache.curator</groupId>
<artifactId>apache-curator</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</exclusion>
</exclusions>
</dependency>
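Taken together, the two LLAP dependencies end up with different scopes. A condensed sketch of the resulting setup (exclusions elided; comments paraphrased from the diff):

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-llap-common</artifactId>
  <version>${hive23.version}</version>
  <scope>${hive.deps.scope}</scope>  <!-- needed at runtime when registering UDFs -->
</dependency>
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-llap-client</artifactId>
  <version>${hive23.version}</version>
  <scope>test</scope>  <!-- only needed by MapReduce tests -->
</dependency>
```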

@@ -2741,11 +2797,11 @@
<zookeeper.version>3.4.13</zookeeper.version>
<hive.group>org.apache.hive</hive.group>
<hive.classifier>core</hive.classifier>
<hive.version>2.3.4</hive.version>
<hive.version.short>${hive.version}</hive.version.short>
<hive.version>${hive23.version}</hive.version>
<hive.version.short>2.3.4</hive.version.short>
<hive.parquet.version>${parquet.version}</hive.parquet.version>
<orc.classifier></orc.classifier>
<hive.parquet.group>org.apache.parquet</hive.parquet.group>
<orc.classifier></orc.classifier>
<datanucleus-core.version>4.1.17</datanucleus-core.version>
</properties>
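These properties live in the `hadoop-3.2` profile (the profile id is taken from the `-Phadoop-3.2` build commands in the comments above; the exact surrounding structure is assumed). Structurally, the override looks like:

```xml
<profile>
  <id>hadoop-3.2</id>
  <properties>
    <hive.group>org.apache.hive</hive.group>
    <hive.classifier>core</hive.classifier>
    <!-- Hive 2.3.4 replaces the default 1.2.1.spark2 -->
    <hive.version>${hive23.version}</hive.version>
    <hive.version.short>2.3.4</hive.version.short>
  </properties>
</profile>
```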
<dependencies>
4 changes: 4 additions & 0 deletions sql/hive/pom.xml
@@ -223,6 +223,10 @@
<groupId>${hive.group}</groupId>
<artifactId>hive-shims</artifactId>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-common</artifactId>
</dependency>
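The new hive-llap-common entry in sql/hive/pom.xml carries no `<version>` or `<scope>`; those are expected to come from the parent pom's `<dependencyManagement>` edited above. A minimal sketch of that pairing (structure assumed, following standard Maven dependency management):

```xml
<!-- parent pom.xml -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-llap-common</artifactId>
      <version>${hive23.version}</version>
      <scope>${hive.deps.scope}</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- sql/hive/pom.xml: version and scope are inherited from the parent -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-llap-common</artifactId>
</dependency>
```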
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-client</artifactId>
@@ -710,6 +710,8 @@ private[hive] class HiveClientImpl(
/**
* Execute the command using Hive and return the results as a sequence. Each element
* in the sequence is one row.
* Since upgrading the built-in Hive to 2.3, hive-llap-client is needed when
* running MapReduce jobs with `runHive`.
*/
protected def runHive(cmd: String, maxRows: Int = 1000): Seq[String] = withHiveState {
logDebug(s"Running hiveql '$cmd'")