@@ -1903,6 +1903,25 @@ releases of Spark SQL.
19031903 Hive can optionally merge the small files into fewer large files to avoid overflowing the HDFS
19041904 metadata. Spark SQL does not support that.
19051905
1906+ **Hive UDF/UDTF/UDAF**
1907+
1908+ Spark SQL implements the basic functionality of Hive UDFs, UDTFs, and UDAFs, but does not support all of their APIs.
1909+ Some of these APIs are meaningless in Spark, and the others are rarely used.
1910+ The major APIs that Spark SQL does not support are:
1911+
1912+ * `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions that automatically
1913+ include additional resources required by the UDF.
1914+ * `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
1915+ only the deprecated interface `initialize(ObjectInspector[])`.
1916+ * `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
1917+ functions with `MapredContext`. However, Spark SQL does not use `MapredContext` internally.
1918+ * `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
1919+ Spark SQL does not call this function when tasks finish.
1920+ * `reset` (`GenericUDAFEvaluator`) is a function to re-initialize an aggregation so that the same aggregation can be reused.
1921+ Spark SQL currently does not support the reuse of aggregations.
1922+ * `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating
1923+ an aggregate over a fixed window. Spark SQL does not support this optimization yet.
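
Because `close` is not called, a function that depends on it for cleanup may hold its resources longer than intended. The following is a minimal sketch of one possible workaround, not Hive's or Spark's actual API: `SketchUdf`, `TrackedResource`, `SelfCleaningUdf`, and `runWithoutClose` are hypothetical names introduced only for illustration. It assumes the resource is cheap enough to open per call and scopes it to `evaluate` instead of relying on `close`.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Hive-style UDF; this is NOT Hive's GenericUDF API.
interface SketchUdf {
    String evaluate(String input);
    void close(); // an engine may never call this, as described for Spark SQL above
}

// Hypothetical resource that counts how many instances are still open.
final class TrackedResource implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    TrackedResource() { OPEN.incrementAndGet(); }

    String lookup(String key) { return key.toUpperCase(Locale.ROOT); }

    @Override
    public void close() { OPEN.decrementAndGet(); }
}

// UDF that releases its resource inside each call rather than in close().
final class SelfCleaningUdf implements SketchUdf {
    @Override
    public String evaluate(String input) {
        // try-with-resources closes the resource before evaluate returns
        try (TrackedResource resource = new TrackedResource()) {
            return resource.lookup(input);
        }
    }

    @Override
    public void close() {
        // nothing left to release, so skipping this call is harmless
    }
}

final class CloseFreeUdfSketch {
    // Mimics an engine that evaluates every row and never invokes close().
    static String[] runWithoutClose(SketchUdf udf, String... rows) {
        String[] out = new String[rows.length];
        for (int i = 0; i < rows.length; i++) {
            out[i] = udf.evaluate(rows[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] out = runWithoutClose(new SelfCleaningUdf(), "a", "b");
        System.out.println(String.join(",", out)
                + " open=" + TrackedResource.OPEN.get());
    }
}
```

Opening the resource on every call trades throughput for safety, so this pattern is probably only reasonable when acquisition is cheap; for expensive resources a different cleanup strategy would be needed.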
1924+
19061925# Reference
19071926
19081927## Data Types