@lianhuiwang
Contributor

What changes were proposed in this pull request?

Currently, if only some partitions of a partitioned table are used in a join, we still rely on the table size returned by the Metastore to decide whether the join can be converted to a broadcast join.
When a Filter can prune some partitions, Hive prunes them before deciding whether to use a broadcast join, based on the HDFS size of the partitions that are actually involved in the query. Spark SQL should do the same, since it can improve join performance for partitioned tables.
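
To illustrate the idea described above, here is a minimal sketch in plain Scala with no Spark dependency. It is not the code from this PR: the names `Partition`, `Table`, `prunedSizeInBytes`, and `canBroadcast`, as well as the `sales` table and its partition sizes, are hypothetical and chosen only for illustration. The 10 MB threshold is assumed to match the default of `spark.sql.autoBroadcastJoinThreshold`.

```scala
// Minimal, Spark-free sketch. All names and sizes are hypothetical, not the PR's code.
object BroadcastSizeSketch {
  // One partition of a partitioned table: its partition-column values and on-disk size.
  final case class Partition(spec: Map[String, String], sizeInBytes: Long)

  // A partitioned table; the whole-table size is the sum over all partitions.
  final case class Table(name: String, partitions: Seq[Partition]) {
    def totalSizeInBytes: Long = partitions.map(_.sizeInBytes).sum
  }

  // Size estimate after partition pruning: only partitions kept by the filter count.
  def prunedSizeInBytes(table: Table, keep: Map[String, String] => Boolean): Long =
    table.partitions.filter(p => keep(p.spec)).map(_.sizeInBytes).sum

  // A join side is a broadcast candidate when its estimated size fits the threshold.
  def canBroadcast(sizeInBytes: Long, thresholdInBytes: Long): Boolean =
    sizeInBytes <= thresholdInBytes

  val MiB: Long = 1024L * 1024L

  // Hypothetical table: two large partitions and one small one.
  val sales: Table = Table("sales", Seq(
    Partition(Map("dt" -> "2016-05-26"), 400 * MiB),
    Partition(Map("dt" -> "2016-05-27"), 400 * MiB),
    Partition(Map("dt" -> "2016-05-28"), 4 * MiB)))

  // Assumed to match the default of spark.sql.autoBroadcastJoinThreshold (10 MB).
  val threshold: Long = 10 * MiB

  def main(args: Array[String]): Unit = {
    // The filter dt = '2016-05-28' keeps a single small partition.
    val pruned = prunedSizeInBytes(sales, spec => spec.get("dt").contains("2016-05-28"))
    println(s"whole table: broadcast = ${canBroadcast(sales.totalSizeInBytes, threshold)}")
    println(s"pruned partitions: broadcast = ${canBroadcast(pruned, threshold)}")
  }
}
```

In this sketch the whole table is far above the threshold, so a size estimate based on the full table rules out a broadcast join, while the estimate based on the single remaining partition allows it.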

How was this patch tested?

Integration tests.

@SparkQA

SparkQA commented May 28, 2016

Test build #59552 has finished for PR 13373 at commit 77737f1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 28, 2016

Test build #59555 has finished for PR 13373 at commit ca78723.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 6, 2016

Test build #60017 has finished for PR 13373 at commit dd6bdf0.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class PushFilterIntoRelation(conf: SQLConf) extends Rule[LogicalPlan] with PredicateHelper
    • case class PushProjectIntoRelation(conf: SQLConf) extends Rule[LogicalPlan]

@SparkQA

SparkQA commented Jun 6, 2016

Test build #60018 has finished for PR 13373 at commit 8b9b07d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 6, 2016

Test build #60044 has finished for PR 13373 at commit ab86f14.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 7, 2016

Test build #60094 has finished for PR 13373 at commit fc25f72.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 7, 2016

Test build #60110 has finished for PR 13373 at commit c4f9bc6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lianhuiwang lianhuiwang changed the title [SPARK-15616] [SQL] Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available. [SPARK-15616] [SQL] Metastore relation should fallback to HDFS size of partitions that are involved in Query for JoinSelection. Jul 22, 2016
@SparkQA

SparkQA commented Jul 22, 2016

Test build #62707 has finished for PR 13373 at commit f3da998.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 22, 2016

Test build #62710 has finished for PR 13373 at commit 2d9e321.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 22, 2016

Test build #62711 has finished for PR 13373 at commit 1e0a6f2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lianhuiwang
Contributor Author

cc @cloud-fan @rxin @hvanhovell

@SparkQA

SparkQA commented Jul 22, 2016

Test build #62704 has finished for PR 13373 at commit 4a3e72e.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 22, 2016

Test build #62713 has finished for PR 13373 at commit b573919.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 22, 2016

Test build #62715 has finished for PR 13373 at commit bf74b0e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 11, 2016

Test build #63612 has finished for PR 13373 at commit c7b181e.

  • This patch fails to build.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 11, 2016

Test build #63614 has finished for PR 13373 at commit 7d5371a.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

…on_broadcast

# Conflicts:
#	sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala
@SparkQA

SparkQA commented Oct 29, 2016

Test build #67765 has finished for PR 13373 at commit 76a847c.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67768 has finished for PR 13373 at commit 5d90974.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67771 has finished for PR 13373 at commit 1a63649.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2016

Test build #67777 has finished for PR 13373 at commit a6e2c57.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

@lianhuiwang, I understand it is painful to keep the PR up to date. However, shouldn't the most recent Jenkins build at least pass, even if the PR has conflicts?

@cloud-fan
Contributor

Sorry, but I think this is no longer valid now that we have the PruneFileSourcePartitions rule.

@lianhuiwang
Contributor Author

@cloud-fan I do not think the PruneFileSourcePartitions rule applies to Hive's CatalogRelation. The example in this PR does not produce the expected result on the master branch, so I will update the PR with the latest code.

@lianhuiwang
Contributor Author

@HyukjinKwon @cloud-fan I will close this PR and create a new PR, #18193, for it. Thanks.

@lianhuiwang lianhuiwang closed this Jun 4, 2017