Commit 3e09a0f

huaxingao authored and gatorsmile committed
[SPARK-28786][DOC][SQL] Document INSERT statement in SQL Reference
### What changes were proposed in this pull request?
Document INSERT statement in SQL Reference

### Why are the changes needed?
To complete SQL reference.

### Does this PR introduce any user-facing change?
Yes.

### How was this patch tested?
Manually checked newly added doc.

Here are the screen shots:

![image](https://user-images.githubusercontent.com/13592258/63490232-0a01a180-c469-11e9-82de-cfdc7c2343e7.png)
![image](https://user-images.githubusercontent.com/13592258/63903006-cce56400-c9c0-11e9-9f24-badd586227a2.png)
<img width="1100" alt="Screen Shot 2019-08-27 at 5 01 48 PM" src="https://user-images.githubusercontent.com/13592258/63816303-845c7680-c8ec-11e9-8c36-1b8e4d3e6286.png">
<img width="1100" alt="Screen Shot 2019-08-27 at 5 03 22 PM" src="https://user-images.githubusercontent.com/13592258/63816347-ac4bda00-c8ec-11e9-9470-fa99522e6f14.png">
![image](https://user-images.githubusercontent.com/13592258/63817393-fc2ca000-c8f0-11e9-9d66-dd9b22a9d900.png)
<img width="1102" alt="Screen Shot 2019-08-27 at 5 05 13 PM" src="https://user-images.githubusercontent.com/13592258/63816423-ea48fe00-c8ec-11e9-8f66-5b226a1ff693.png">
![image](https://user-images.githubusercontent.com/13592258/63903080-0e760f00-c9c1-11e9-966a-f45b0b1c1ea6.png)
<img width="1100" alt="Screen Shot 2019-08-27 at 5 07 19 PM" src="https://user-images.githubusercontent.com/13592258/63816494-37c56b00-c8ed-11e9-88e1-27a9101eb09d.png">
![image](https://user-images.githubusercontent.com/13592258/63816712-131dc300-c8ee-11e9-8ee7-d83b8ad07bf2.png)
![image](https://user-images.githubusercontent.com/13592258/63817479-5a598300-c8f1-11e9-8789-adae7df5535a.png)
![image](https://user-images.githubusercontent.com/13592258/63817900-4adb3980-c8f3-11e9-94fe-d60f7d61c4b4.png)
![image](https://user-images.githubusercontent.com/13592258/63903155-4da46000-c9c1-11e9-88dd-609d4fe685a9.png)
![image](https://user-images.githubusercontent.com/13592258/63817157-d652cb80-c8ef-11e9-944c-99391cf2fb0a.png)
![image](https://user-images.githubusercontent.com/13592258/63903259-aa077f80-c9c1-11e9-982f-b8590ce0270d.png)
![image](https://user-images.githubusercontent.com/13592258/63903270-b1c72400-c9c1-11e9-85c6-6d8e8cd7f006.png)

Closes #25525 from huaxingao/spark-28786.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
1 parent 2465558 commit 3e09a0f

5 files changed

Lines changed: 580 additions & 4 deletions
Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
1+
---
2+
layout: global
3+
title: INSERT INTO
4+
displayTitle: INSERT INTO
5+
license: |
6+
Licensed to the Apache Software Foundation (ASF) under one or more
7+
contributor license agreements. See the NOTICE file distributed with
8+
this work for additional information regarding copyright ownership.
9+
The ASF licenses this file to You under the Apache License, Version 2.0
10+
(the "License"); you may not use this file except in compliance with
11+
the License. You may obtain a copy of the License at
12+
13+
http://www.apache.org/licenses/LICENSE-2.0
14+
15+
Unless required by applicable law or agreed to in writing, software
16+
distributed under the License is distributed on an "AS IS" BASIS,
17+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
18+
See the License for the specific language governing permissions and
19+
limitations under the License.
20+
---
21+
22+
### Description
23+
24+
The `INSERT INTO` statement inserts new rows into a table. The inserted rows can be specified by value expressions or result from a query.
25+
26+
### Syntax
27+
{% highlight sql %}
28+
INSERT INTO [ TABLE ] table_name
29+
[ PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] ) ]
30+
{ { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query }
31+
{% endhighlight %}
32+
33+
### Parameters
34+
<dl>
35+
<dt><code><em>table_name</em></code></dt>
36+
<dd>The name of an existing table.</dd>
37+
</dl>
38+
39+
<dl>
40+
<dt><code><em>PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )</em></code></dt>
41+
<dd>Specifies one or more partition column and value pairs. The partition value is optional.</dd>
42+
</dl>
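A minimal sketch of the two forms of the partition clause, assuming the partitioned `students` table created in the examples below. The names, addresses, and IDs are hypothetical. When a partition value is given, the `VALUES` list supplies only the remaining columns; when it is omitted, the partition column value is expected to come last in each row. Depending on the table type and configuration, the second form may require dynamic partitioning to be enabled.

```sql
-- Static partition: student_id is fixed by the PARTITION clause,
-- so VALUES lists only name and address.
INSERT INTO students PARTITION (student_id = 555555)
    VALUES ('Lena Ortiz', '12 Elm St, Fremont');

-- Dynamic partition: no value in the PARTITION clause,
-- so student_id is taken from the last value of each row.
INSERT INTO students PARTITION (student_id)
    VALUES ('Marco Reyes', '98 Oak Ave, Sunnyvale', 666666);
```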
43+
44+
<dl>
45+
<dt><code><em>VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ]</em></code></dt>
46+
<dd>Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted. A comma must be used to separate each value in the clause. More than one set of values can be specified to insert multiple rows.</dd>
47+
</dl>
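As an illustration of the `VALUES` clause described above, the following sketch inserts two rows in one statement and uses `NULL` for an unknown address. It assumes the `students` table from the examples below; the row contents are hypothetical.

```sql
-- Each parenthesized group is one row; groups are separated by commas.
INSERT INTO students VALUES
    ('Nina Patel', '77 Pine St, Santa Cruz', 141414),
    ('Omar Haddad', NULL, 151515);  -- NULL stands in for a missing address
```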
48+
49+
<dl>
50+
<dt><code><em>query</em></code></dt>
51+
<dd>A query that produces the rows to be inserted. It can be in one of the following formats:
52+
<ul>
53+
<li>a <code>SELECT</code> statement</li>
54+
<li>a <code>TABLE</code> statement</li>
55+
<li>a <code>FROM</code> statement</li>
56+
</ul>
57+
</dd>
58+
</dl>
59+
60+
### Examples
61+
#### Single Row Insert Using a VALUES Clause
62+
{% highlight sql %}
63+
CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT)
64+
USING PARQUET PARTITIONED BY (student_id);
65+
66+
INSERT INTO students
67+
VALUES ('Amy Smith', '123 Park Ave, San Jose', 111111);
68+
69+
SELECT * FROM students;
70+
71+
+ -------------- + ------------------------------ + -------------- +
72+
| name | address | student_id |
73+
+ -------------- + ------------------------------ + -------------- +
74+
| Amy Smith | 123 Park Ave, San Jose | 111111 |
75+
+ -------------- + ------------------------------ + -------------- +
76+
{% endhighlight %}
77+
78+
#### Multi-Row Insert Using a VALUES Clause
79+
{% highlight sql %}
80+
INSERT INTO students
81+
VALUES ('Bob Brown', '456 Taylor St, Cupertino', 222222),
82+
('Cathy Johnson', '789 Race Ave, Palo Alto', 333333);
83+
84+
SELECT * FROM students;
85+
86+
+ -------------- + ------------------------------ + -------------- +
87+
| name | address | student_id |
88+
+ -------------- + ------------------------------ + -------------- +
89+
| Amy Smith | 123 Park Ave, San Jose | 111111 |
90+
+ -------------- + ------------------------------ + -------------- +
91+
| Bob Brown | 456 Taylor St, Cupertino | 222222 |
92+
+ -------------- + ------------------------------ + -------------- +
93+
| Cathy Johnson | 789 Race Ave, Palo Alto | 333333 |
94+
+ -------------- + ------------------------------ + -------------- +
95+
{% endhighlight %}
96+
97+
#### Insert Using a SELECT Statement
98+
Assume the `persons` table has already been created and populated.
99+
100+
{% highlight sql %}
101+
SELECT * FROM persons;
102+
103+
+ -------------- + ------------------------------ + -------------- +
104+
| name | address | ssn |
105+
+ -------------- + ------------------------------ + -------------- +
106+
| Dora Williams | 134 Forest Ave, Melo Park | 123456789 |
107+
+ -------------- + ------------------------------ + -------------- +
108+
| Eddie Davis | 245 Market St, Milpitas | 345678901 |
109+
+ -------------- + ------------------------------ + -------------- +
110+
111+
INSERT INTO students PARTITION (student_id = 444444)
112+
SELECT name, address FROM persons WHERE name = "Dora Williams";
113+
114+
SELECT * FROM students;
115+
116+
+ -------------- + ------------------------------ + -------------- +
117+
| name | address | student_id |
118+
+ -------------- + ------------------------------ + -------------- +
119+
| Amy Smith | 123 Park Ave, San Jose | 111111 |
120+
+ -------------- + ------------------------------ + -------------- +
121+
| Bob Brown | 456 Taylor St, Cupertino | 222222 |
122+
+ -------------- + ------------------------------ + -------------- +
123+
| Cathy Johnson | 789 Race Ave, Palo Alto | 333333 |
124+
+ -------------- + ------------------------------ + -------------- +
125+
| Dora Williams | 134 Forest Ave, Melo Park | 444444 |
126+
+ -------------- + ------------------------------ + -------------- +
127+
{% endhighlight %}
128+
129+
#### Insert Using a TABLE Statement
130+
Assume the `visiting_students` table has already been created and populated.
131+
132+
{% highlight sql %}
133+
SELECT * FROM visiting_students;
134+
135+
+ -------------- + ------------------------------ + -------------- +
136+
| name | address | student_id |
137+
+ -------------- + ------------------------------ + -------------- +
138+
| Fleur Laurent | 345 Copper St, London | 777777 |
139+
+ -------------- + ------------------------------ + -------------- +
140+
| Gordon Martin | 779 Lake Ave, Oxford | 888888 |
141+
+ -------------- + ------------------------------ + -------------- +
142+
143+
INSERT INTO students TABLE visiting_students;
144+
145+
SELECT * FROM students;
146+
147+
+ -------------- + ------------------------------ + -------------- +
148+
| name | address | student_id |
149+
+ -------------- + ------------------------------ + -------------- +
150+
| Amy Smith | 123 Park Ave, San Jose | 111111 |
151+
+ -------------- + ------------------------------ + -------------- +
152+
| Bob Brown | 456 Taylor St, Cupertino | 222222 |
153+
+ -------------- + ------------------------------ + -------------- +
154+
| Cathy Johnson | 789 Race Ave, Palo Alto | 333333 |
155+
+ -------------- + ------------------------------ + -------------- +
156+
| Dora Williams | 134 Forest Ave, Melo Park | 444444 |
157+
+ -------------- + ------------------------------ + -------------- +
158+
| Fleur Laurent | 345 Copper St, London | 777777 |
159+
+ -------------- + ------------------------------ + -------------- +
160+
| Gordon Martin | 779 Lake Ave, Oxford | 888888 |
161+
+ -------------- + ------------------------------ + -------------- +
162+
{% endhighlight %}
163+
164+
#### Insert Using a FROM Statement
165+
Assume the `applicants` table has already been created and populated.
166+
167+
{% highlight sql %}
168+
SELECT * FROM applicants;
169+
170+
+ -------------- + ------------------------------ + -------------- + -------------- +
171+
| name | address | student_id | qualified |
172+
+ -------------- + ------------------------------ + -------------- + -------------- +
173+
| Helen Davis | 469 Mission St, San Diego | 999999 | true |
174+
+ -------------- + ------------------------------ + -------------- + -------------- +
175+
| Ivy King | 367 Leigh Ave, Santa Clara | 101010 | false |
176+
+ -------------- + ------------------------------ + -------------- + -------------- +
177+
| Jason Wang | 908 Bird St, Saratoga | 121212 | true |
178+
+ -------------- + ------------------------------ + -------------- + -------------- +
179+
180+
INSERT INTO students
181+
FROM applicants SELECT name, address, student_id WHERE qualified = true;
182+
183+
SELECT * FROM students;
184+
185+
+ -------------- + ------------------------------ + -------------- +
186+
| name | address | student_id |
187+
+ -------------- + ------------------------------ + -------------- +
188+
| Amy Smith | 123 Park Ave, San Jose | 111111 |
189+
+ -------------- + ------------------------------ + -------------- +
190+
| Bob Brown | 456 Taylor St, Cupertino | 222222 |
191+
+ -------------- + ------------------------------ + -------------- +
192+
| Cathy Johnson | 789 Race Ave, Palo Alto | 333333 |
193+
+ -------------- + ------------------------------ + -------------- +
194+
| Dora Williams | 134 Forest Ave, Melo Park | 444444 |
195+
+ -------------- + ------------------------------ + -------------- +
196+
| Fleur Laurent | 345 Copper St, London | 777777 |
197+
+ -------------- + ------------------------------ + -------------- +
198+
| Gordon Martin | 779 Lake Ave, Oxford | 888888 |
199+
+ -------------- + ------------------------------ + -------------- +
200+
| Helen Davis | 469 Mission St, San Diego | 999999 |
201+
+ -------------- + ------------------------------ + -------------- +
202+
| Jason Wang | 908 Bird St, Saratoga | 121212 |
203+
+ -------------- + ------------------------------ + -------------- +
204+
{% endhighlight %}
205+
206+
Related Statements:
207+
* [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html)
208+
* [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html)
209+
* [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html)
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
---
2+
layout: global
3+
title: INSERT OVERWRITE DIRECTORY with Hive format
4+
displayTitle: INSERT OVERWRITE DIRECTORY with Hive format
5+
license: |
6+
Licensed to the Apache Software Foundation (ASF) under one or more
7+
contributor license agreements. See the NOTICE file distributed with
8+
this work for additional information regarding copyright ownership.
9+
The ASF licenses this file to You under the Apache License, Version 2.0
10+
(the "License"); you may not use this file except in compliance with
11+
the License. You may obtain a copy of the License at
12+
13+
http://www.apache.org/licenses/LICENSE-2.0
14+
15+
Unless required by applicable law or agreed to in writing, software
16+
distributed under the License is distributed on an "AS IS" BASIS,
17+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
18+
See the License for the specific language governing permissions and
19+
limitations under the License.
20+
---
21+
22+
### Description
23+
The `INSERT OVERWRITE DIRECTORY` statement with Hive format overwrites the existing data in the directory with the new values using a Hive `SerDe`.
24+
Hive support must be enabled to use this command. The inserted rows can be specified by value expressions or result from a query.
25+
26+
### Syntax
27+
{% highlight sql %}
28+
INSERT OVERWRITE [ LOCAL ] DIRECTORY directory_path
29+
[ ROW FORMAT row_format ] [ STORED AS file_format ]
30+
{ { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query }
31+
{% endhighlight %}
32+
33+
### Parameters
34+
<dl>
35+
<dt><code><em>directory_path</em></code></dt>
36+
<dd>
37+
Specifies the destination directory. The <code>LOCAL</code> keyword is used to specify that the directory is on the local file system.
38+
</dd>
39+
</dl>
40+
41+
<dl>
42+
<dt><code><em>row_format</em></code></dt>
43+
<dd>
44+
Specifies the row format for this insert. Valid options are the <code>SERDE</code> clause and the <code>DELIMITED</code> clause. The <code>SERDE</code> clause can be used to specify a custom <code>SerDe</code> for this insert. Alternatively, the <code>DELIMITED</code> clause can be used to specify the native <code>SerDe</code> and state the delimiter, escape character, null character, and so on.
45+
</dd>
46+
</dl>
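The two `row_format` alternatives might look as follows. This is a sketch only: `test_table` and the destination path are the placeholders used in the examples below, and `LazySimpleSerDe` is named here simply as one SerDe class that ships with Hive, not as a recommendation.

```sql
-- SERDE clause: name the SerDe class explicitly.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination'
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    STORED AS TEXTFILE
    SELECT * FROM test_table;

-- DELIMITED clause: use the native SerDe, with a tab as the field
-- delimiter and a custom marker written in place of NULL values.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination'
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY '\t'
        NULL DEFINED AS 'N/A'
    STORED AS TEXTFILE
    SELECT * FROM test_table;
```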
47+
48+
<dl>
49+
<dt><code><em>file_format</em></code></dt>
50+
<dd>
51+
Specifies the file format for this insert. Valid options are <code>TEXTFILE</code>, <code>SEQUENCEFILE</code>, <code>RCFILE</code>, <code>ORC</code>, <code>PARQUET</code>, and <code>AVRO</code>. You can also specify your own input and output format using <code>INPUTFORMAT</code> and <code>OUTPUTFORMAT</code>. <code>ROW FORMAT SERDE</code> can only be used with <code>TEXTFILE</code>, <code>SEQUENCEFILE</code>, or <code>RCFILE</code>, while <code>ROW FORMAT DELIMITED</code> can only be used with <code>TEXTFILE</code>.
52+
</dd>
53+
</dl>
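A sketch of the `INPUTFORMAT` / `OUTPUTFORMAT` form mentioned above, assuming the Hive text input and output format classes are available on the classpath. The class names are chosen for illustration, and `test_table` is the placeholder table used in the examples below.

```sql
-- Spell out the input and output format classes instead of
-- using a named file format such as TEXTFILE.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination'
    STORED AS
        INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
        OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    SELECT * FROM test_table;
```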
54+
55+
<dl>
56+
<dt><code><em>VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ]</em></code></dt>
57+
<dd>
58+
Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted. A comma must be used to separate each value in the clause. More than one set of values can be specified to insert multiple rows.
59+
</dd>
60+
</dl>
61+
62+
<dl>
63+
<dt><code><em>query</em></code></dt>
64+
<dd>A query that produces the rows to be inserted. It can be in one of the following formats:
65+
<ul>
66+
<li>a <code>SELECT</code> statement</li>
67+
<li>a <code>TABLE</code> statement</li>
68+
<li>a <code>FROM</code> statement</li>
69+
</ul>
70+
</dd>
71+
</dl>
72+
73+
### Examples
74+
{% highlight sql %}
75+
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination'
76+
STORED AS orc
77+
SELECT * FROM test_table;
78+
79+
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination'
80+
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
81+
SELECT * FROM test_table;
82+
{% endhighlight %}
83+
84+
Related Statements:
85+
* [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html)
86+
* [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html)
87+
* [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html)
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
1+
---
2+
layout: global
3+
title: INSERT OVERWRITE DIRECTORY
4+
displayTitle: INSERT OVERWRITE DIRECTORY
5+
license: |
6+
Licensed to the Apache Software Foundation (ASF) under one or more
7+
contributor license agreements. See the NOTICE file distributed with
8+
this work for additional information regarding copyright ownership.
9+
The ASF licenses this file to You under the Apache License, Version 2.0
10+
(the "License"); you may not use this file except in compliance with
11+
the License. You may obtain a copy of the License at
12+
13+
http://www.apache.org/licenses/LICENSE-2.0
14+
15+
Unless required by applicable law or agreed to in writing, software
16+
distributed under the License is distributed on an "AS IS" BASIS,
17+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
18+
See the License for the specific language governing permissions and
19+
limitations under the License.
20+
---
21+
### Description
22+
The `INSERT OVERWRITE DIRECTORY` statement overwrites the existing data in the directory with the new values using a Spark native format. The inserted rows can be specified by value expressions or result from a query.
23+
24+
### Syntax
25+
{% highlight sql %}
26+
INSERT OVERWRITE [ LOCAL ] DIRECTORY [ directory_path ]
27+
USING file_format [ OPTIONS ( key = val [ , ... ] ) ]
28+
{ { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query }
29+
{% endhighlight %}
30+
31+
### Parameters
32+
<dl>
33+
<dt><code><em>directory_path</em></code></dt>
34+
<dd>
35+
Specifies the destination directory. It can also be specified in <code>OPTIONS</code> using <code>path</code>. The <code>LOCAL</code> keyword is used to specify that the directory is on the local file system.
36+
</dd>
37+
</dl>
38+
39+
<dl>
40+
<dt><code><em>file_format</em></code></dt>
41+
<dd>
42+
Specifies the file format to use for the insert. Valid options are <code>TEXT</code>, <code>CSV</code>, <code>JSON</code>, <code>JDBC</code>, <code>PARQUET</code>, <code>ORC</code>, <code>HIVE</code>, <code>DELTA</code>, <code>LIBSVM</code>, or a fully qualified class name of a custom implementation of <code>org.apache.spark.sql.sources.DataSourceRegister</code>.
43+
</dd>
44+
</dl>
45+
46+
<dl>
47+
<dt><code><em>OPTIONS ( key = val [ , ... ] )</em></code></dt>
48+
<dd>Specifies one or more option key and value pairs that are passed to the file format.</dd>
49+
</dl>
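As a sketch of how `file_format` and `OPTIONS` combine, the statement below writes CSV files with a header line and a pipe separator. `header` and `sep` are options of the CSV data source; `test_table` and the path are the placeholders used in the examples below.

```sql
-- Options are passed to the chosen file format, here the CSV writer.
INSERT OVERWRITE DIRECTORY '/tmp/destination'
    USING csv
    OPTIONS (header = 'true', sep = '|')
    SELECT * FROM test_table;
```

Which keys are meaningful depends on the file format, so the options shown here would not necessarily apply to, say, `PARQUET`.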
50+
51+
<dl>
52+
<dt><code><em>VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ]</em></code></dt>
53+
<dd>
54+
Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted. A comma must be used to separate each value in the clause. More than one set of values can be specified to insert multiple rows.
55+
</dd>
56+
</dl>
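The `VALUES` form can also feed a directory insert directly, without a source table. A minimal sketch, with hypothetical row contents and the same placeholder path as the examples below:

```sql
-- Write two literal rows to the directory as JSON.
INSERT OVERWRITE DIRECTORY '/tmp/destination'
    USING json
    VALUES ('Amy Smith', 111111),
           ('Bob Brown', 222222);
```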
57+
58+
<dl>
59+
<dt><code><em>query</em></code></dt>
60+
<dd>A query that produces the rows to be inserted. It can be in one of the following formats:
61+
<ul>
62+
<li>a <code>SELECT</code> statement</li>
63+
<li>a <code>TABLE</code> statement</li>
64+
<li>a <code>FROM</code> statement</li>
65+
</ul>
66+
</dd>
67+
</dl>
68+
69+
### Examples
70+
{% highlight sql %}
71+
INSERT OVERWRITE DIRECTORY '/tmp/destination'
72+
USING parquet
73+
OPTIONS (col1 1, col2 2, col3 'test')
74+
SELECT * FROM test_table;
75+
76+
INSERT OVERWRITE DIRECTORY
77+
USING parquet
78+
OPTIONS ('path' '/tmp/destination', col1 1, col2 2, col3 'test')
79+
SELECT * FROM test_table;
80+
{% endhighlight %}
81+
82+
Related Statements:
83+
* [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html)
84+
* [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html)
85+
* [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html)
