@@ -613,7 +613,7 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging {

       isStatement = statementInProgress(index)
     }
-    if (isStatement) {
+    if (beginIndex < line.length()) {
Contributor

What does this condition mean?

Contributor Author

It avoids passing a blank string to processLine, although processLine would handle a blank string anyway:

      // we can not use "split" function directly as ";" may be quoted
      val commands = splitSemiColon(line).asScala
      var command: String = ""
      for (oneCmd <- commands) {
        if (StringUtils.endsWith(oneCmd, "\\")) {
          command += StringUtils.chop(oneCmd) + ";"
        } else {
          command += oneCmd
          if (!StringUtils.isBlank(command)) {
            val ret = processCmd(command)
            command = ""
            lastRet = ret
            val ignoreErrors = HiveConf.getBoolVar(conf, HiveConf.ConfVars.CLIIGNOREERRORS)
            if (ret != 0 && !ignoreErrors) {
              CommandProcessorFactory.clean(conf.asInstanceOf[HiveConf])
              return ret
            }
          }
        }
      }
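
The loop quoted above can be reduced to a dependency-free sketch that shows why a blank fragment is harmless. `ProcessLineSketch`, `commandsToRun`, and `isBlank` are hypothetical names for illustration only. The sketch collects the commands that would be executed instead of calling `processCmd`, and it leaves out the error handling.

```scala
object ProcessLineSketch {
  // Stand-in for StringUtils.isBlank: true for empty or whitespace-only strings.
  private def isBlank(s: String): Boolean = s.forall(_.isWhitespace)

  /** Re-joins fragments that were split on ';' and returns the commands that would run. */
  def commandsToRun(fragments: Seq[String]): Seq[String] = {
    val run = Seq.newBuilder[String]
    var command = ""
    for (oneCmd <- fragments) {
      if (oneCmd.endsWith("\\")) {
        // A trailing backslash escaped the ';', so restore it and keep accumulating.
        command += oneCmd.dropRight(1) + ";"
      } else {
        command += oneCmd
        // Blank commands are skipped here, so an empty fragment never reaches the backend.
        if (!isBlank(command)) {
          run += command
          command = ""
        }
      }
    }
    run.result()
  }
}
```

For example, `ProcessLineSketch.commandsToRun(Seq("SELECT 1", ""))` yields only `SELECT 1`; the empty second fragment is dropped.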

Contributor Author

It also keeps the code the same as Hive's.

Contributor

@AngersZhuuuu @cloud-fan Wondering why we keep the CLI in the hive-thriftserver package. It has nothing to do with ThriftServer.

       ret.add(line.substring(beginIndex))
     }
     ret
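
To make the `beginIndex < line.length()` guard concrete, here is a much simplified splitter. It is a sketch, not the real `splitSemiColon`: it tracks only single and double quotes and ignores comments, escapes, and hints. `SplitSketch` and `splitOnSemicolon` are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitSketch {
  /** Splits `line` on ';' characters that are outside single or double quotes. */
  def splitOnSemicolon(line: String): Seq[String] = {
    val ret = ArrayBuffer.empty[String]
    var quote: Option[Char] = None // the quote character we are currently inside, if any
    var beginIndex = 0
    for (index <- 0 until line.length) {
      val c = line.charAt(index)
      quote match {
        case Some(q) =>
          if (c == q) quote = None
        case None =>
          if (c == '\'' || c == '"') quote = Some(c)
          else if (c == ';') {
            ret += line.substring(beginIndex, index)
            beginIndex = index + 1
          }
      }
    }
    // The guard under discussion: when the line ends with ';' there is nothing left,
    // so no empty trailing fragment is appended.
    if (beginIndex < line.length) ret += line.substring(beginIndex)
    ret.toSeq
  }
}
```

With the guard, `SplitSketch.splitOnSemicolon("SELECT 1;")` returns a single fragment. Without it, an extra empty string would be appended and passed on to the caller.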
@@ -620,4 +620,17 @@ class CliSuite extends SparkFunSuite with BeforeAndAfterAll with Logging {
         |""".stripMargin -> "SELECT 1"
     )
   }
+
+  test("SPARK-37555: spark-sql should pass last unclosed comment to backend") {
Contributor

@AngersZhuuuu This test is flaky; it fails quite often on repeated runs. Here's a sample error:

2021-12-08 12:01:27.68 - stderr> Setting default log level to "WARN".
2021-12-08 12:01:27.68 - stderr> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2021-12-08 12:01:39.459 - stderr> Spark master: local, Application Id: local-1638993689929
2021-12-08 12:01:40.688 - stdout> spark-sql> /* SELECT /*+ HINT() 4; */;
2021-12-08 12:01:41.299 - stdout> spark-sql> /* SELECT /*+ HINT() 4; */ SELECT 1;
2021-12-08 12:01:41.56 - stderr> Error in query: 
2021-12-08 12:01:41.56 - stderr> mismatched input ';' expecting {'(', 'APPLY', 'CONVERT', 'COPY', 'OPTIMIZE', 'RESTORE', 'ADD', 'ALTER', 'ANALYZE', 'CACHE', 'CLEAR', 'COMMENT', 'COMMIT', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DFS', 'DROP', 'EXPLAIN', 'EXPORT', 'FROM', 'GRANT', 'IMPORT', 'INSERT', 'LIST', 'LOAD', 'LOCK', 'MAP', 'MERGE', 'MSCK', 'REDUCE', 'REFRESH', 'REPLACE', 'RESET', 'REVOKE', 'ROLLBACK', 'SELECT', 'SET', 'SHOW', 'START', 'TABLE', 'TRUNCATE', 'UNCACHE', 'UNLOCK', 'UPDATE', 'USE', 'VALUES', 'WITH'}(line 1, pos 26)
2021-12-08 12:01:41.56 - stderr> 
2021-12-08 12:01:41.56 - stderr> == SQL ==
2021-12-08 12:01:41.56 - stderr> /* SELECT /*+ HINT() 4; */;
2021-12-08 12:01:41.56 - stderr> --------------------------^^^
2021-12-08 12:01:41.56 - stderr> 
2021-12-08 12:01:47.573 - stdout> 1
2021-12-08 12:01:47.573 - stderr> Time taken: 6.272 seconds, Fetched 1 row(s)
2021-12-08 12:01:47.592 - stdout> spark-sql> /* Here is a unclosed bracketed comment SELECT 1;
2021-12-08 12:01:47.601 - stderr> Error in query: 
2021-12-08 12:01:47.601 - stderr> Unclosed bracketed comment(line 1, pos 0)
2021-12-08 12:01:47.601 - stderr> 
2021-12-08 12:01:47.601 - stderr> == SQL ==
2021-12-08 12:01:47.601 - stderr> /* Here is a unclosed bracketed comment SELECT 1;
2021-12-08 12:01:47.601 - stderr> ^^^
2021-12-08 12:01:47.601 - stderr> 
2021-12-08 12:01:47.612 - stdout> spark-sql> /* SELECT /*+ HINT() */ 4; */;
2021-12-08 12:01:49.552 - stdout> spark-sql> 

Contributor Author

Let me check it.

Contributor Author

I have run this many times and it has not failed.

Member

It fails on my PRs periodically too. For instance, see https://github.com/MaxGekk/spark/actions/runs/3383705774/jobs/5619867814:

[info] - SPARK-37555: spark-sql should pass last unclosed comment to backend *** FAILED *** (2 minutes, 8 seconds)
[info]   =======================
[info]   CliSuite failure output
[info]   =======================
[info]   Spark SQL CLI command line: ../../bin/spark-sql --master local --driver-java-options -Dderby.system.durability=test --conf spark.ui.enabled=false --hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=/home/runner/work/spark/spark/target/tmp/spark-16a98152-adaa-4653-93d1-82425ddc6ed5;create=true --hiveconf hive.exec.scratchdir=/home/runner/work/spark/spark/target/tmp/spark-d3ccd810-41f0-4a29-9592-f8c08ad7f50a --hiveconf conf1=conftest --hiveconf conf2=1 --hiveconf hive.metastore.warehouse.dir=/home/runner/work/spark/spark/target/tmp/spark-c80a8015-a17d-422d-aeff-11be2a4b9986
[info]   Exception: java.util.concurrent.TimeoutException: Futures timed out after [2 minutes]
[info]   Failed to capture next expected output "Unclosed bracketed comment" within 2 minutes.
[info]   
[info]   2022-11-03 01:00:52.245 - stderr> Setting default log level to "WARN".
[info]   2022-11-03 01:00:52.245 - stderr> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[info]   2022-11-03 01:00:59.101 - stderr> Spark master: local, Application Id: local-1667462453894
[info]   2022-11-03 01:00:59.939 - stdout> spark-sql> /* SELECT /*+ HINT() 4; */;
[info]   2022-11-03 01:01:00.378 - stderr> 
[info]   2022-11-03 01:01:00.379 - stderr> [PARSE_SYNTAX_ERROR] Syntax error at or near ';'(line 1, pos 26)
[info]   2022-11-03 01:01:00.379 - stderr> 
[info]   2022-11-03 01:01:00.379 - stderr> == SQL ==
[info]   2022-11-03 01:01:00.379 - stderr> /* SELECT /*+ HINT() 4; */;
[info]   2022-11-03 01:01:00.379 - stderr> --------------------------^^^
[info]   2022-11-03 01:01:00.379 - stderr> 
[info]   2022-11-03 01:01:00.401 - stdout> spark-sql> /* SELECT /*+ HINT() 4; */ SELECT 1;
[info]   2022-11-03 01:01:02.78 - stdout> 1
[info]   2022-11-03 01:01:02.78 - stderr> Time taken: 2.384 seconds, Fetched 1 row(s)
[info]   2022-11-03 01:01:02.791 - stderr> 
[info]   2022-11-03 01:01:02.791 - stderr> Unclosed bracketed comment(line 1, pos 0)
[info]   2022-11-03 01:01:02.791 - stderr> 
[info]   2022-11-03 01:01:02.791 - stderr> == SQL ==
[info]   2022-11-03 01:01:02.791 - stderr> /* Here is a unclosed bracketed comment SELECT 1;
[info]   2022-11-03 01:01:02.791 - stderr> ^^^
[info]   2022-11-03 01:01:02.791 - stdout> spark-sql> /* Here is a unclosed bracketed comment SELECT 1;
[info]   2022-11-03 01:01:02.792 - stderr> 
[info]   2022-11-03 01:01:02.795 - stdout> spark-sql> /* SELECT /*+ HINT() */ 4; */;
[info]   2022-11-03 01:01:02.995 - stdout> spark-sql> 
[info]   ===========================
[info]   End CliSuite failure output
[info]   =========================== (CliSuite.scala:213)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)

Maybe we don't wait for 2 minutes in real life.
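
One possible reading of the log above, offered as an assumption and not as a confirmed diagnosis: the `Unclosed bracketed comment` error on stderr is logged just before the stdout echo of the query that caused it. If the harness matches expected strings strictly in order against lines merged from both streams, and expects each query's echo before its answer, an answer that arrives early is never matched, however long the timeout. The sketch below models only that matching rule. `OrderedMatchSketch` and `matched` are hypothetical names, not CliSuite code.

```scala
object OrderedMatchSketch {
  /** Counts how many `expected` strings are matched, in order, by the incoming lines. */
  def matched(lines: Seq[String], expected: Seq[String]): Int =
    lines.foldLeft(0) { (next, line) =>
      // Only the next pending expectation is checked against each line.
      if (next < expected.size && line.contains(expected(next))) next + 1 else next
    }

  // Line contents taken from the log above.
  val echo = "spark-sql> /* Here is a unclosed bracketed comment SELECT 1;"
  val error = "Unclosed bracketed comment(line 1, pos 0)"

  // Assumed expectation order: echo of the query first, then its answer.
  val expected = Seq(echo, "Unclosed bracketed comment")
}
```

Under these assumptions, the order `Seq(echo, error)` matches both expectations, while `Seq(error, echo)` matches only the echo and leaves the harness waiting for an error line that has already gone by.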

Contributor Author

ok, let me check again.

Contributor

Hey, this also fails on my PR. Just FYI, I didn't mean to push!

Contributor

@AngersZhuuuu Shall we increase the timeout? The GitHub Actions machines are not stable.

Contributor Author

> @AngersZhuuuu shall we increase the timeout? The github action machines are not stable.

Yeah, the logic should be OK.
How about the PR below, which we can trigger many times to see whether the test is stable?
#38571
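
The `TimeoutException` in the failure output suggests the harness blocks on a future with a fixed timeout. Here is a minimal sketch of that pattern, assuming a `Promise` that is completed once all expected output has been seen. `TimeoutSketch` and `completedWithin` are hypothetical names, not the suite's actual code.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

object TimeoutSketch {
  /** Returns true if `promise` completes within `timeout`, false if the wait times out. */
  def completedWithin(promise: Promise[Unit], timeout: FiniteDuration): Boolean =
    try {
      Await.result(promise.future, timeout)
      true
    } catch {
      // Await.result throws this when the future is still pending at the deadline.
      case _: TimeoutException => false
    }
}
```

A longer timeout helps only when the expected output arrives late. If the output was emitted but never matched, the promise stays pending and the wait fails whatever the limit.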

+    runCliWithin(2.minute)(
+      // Only unclosed comment.
+      "/* SELECT /*+ HINT() 4; */;".stripMargin -> "mismatched input ';'",
+      // Unclosed nested bracketed comment.
+      "/* SELECT /*+ HINT() 4; */ SELECT 1;".stripMargin -> "1",
+      // Unclosed comment with query.
+      "/* Here is a unclosed bracketed comment SELECT 1;" -> "Unclosed bracketed comment",
+      // Whole comment.
+      "/* SELECT /*+ HINT() */ 4; */;".stripMargin -> ""
+    )
+  }
 }