Commit d1a03da

airflow-provider-vdk: Initial Airflow provider structure (#772)
Versatile Data Kit allows users to schedule job executions using a cron-like interface; however, there is no way to express job dependencies through VDK. For example, a user might want to run two different ingestion jobs and, when both complete successfully, run a transformation job on the ingested data.

Apache Airflow is an open-source workload scheduling framework that allows exactly that sort of job dependency specification. To take advantage of this, this PR sets up the initial structure of the VDK Airflow provider, which will integrate Airflow's scheduling capabilities with VDK.

Signed-off-by: Gabriel Georgiev <gageorgiev@vmware.com>
1 parent bceee9d commit d1a03da
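
The dependency scenario from the commit message (two ingestion jobs followed by a transformation job) can be illustrated with a minimal sketch. It uses only the standard-library `graphlib` module rather than Airflow or the provider's operators, which this commit does not yet implement. The job names and the `run_job` callback are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each key maps to the set of jobs it depends on.
dependencies = {
    "ingest-orders": set(),
    "ingest-customers": set(),
    "transform-sales": {"ingest-orders", "ingest-customers"},
}


def run_in_dependency_order(deps, run_job):
    """Run each job only after all of its upstream jobs have succeeded."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    completed = []
    while sorter.is_active():
        # get_ready() returns the jobs whose dependencies are all done.
        for job in sorted(sorter.get_ready()):
            if not run_job(job):
                raise RuntimeError(f"job {job!r} failed; downstream jobs are skipped")
            completed.append(job)
            sorter.done(job)
    return completed


# run_job is a stand-in that always reports success.
order = run_in_dependency_order(dependencies, run_job=lambda name: True)
print(order)  # ['ingest-customers', 'ingest-orders', 'transform-sales']
```

A cron-like schedule cannot express the "only after both succeed" constraint, since it only fixes start times. An explicit dependency graph, as Airflow provides, can.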

20 files changed, with 415 additions and 0 deletions.

.gitlab-ci.yml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ include:
   - "projects/control-service/cicd/.gitlab-ci.yml"
   - "projects/vdk-plugins/.plugin-common.yml"
   - "projects/vdk-plugins/*/.plugin-ci.yml"
+  - "projects/vdk-plugins/airflow-provider-vdk/.airflow-ci.yml"
 
 stages:
   - build

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
+
+.retry:
+  retry_options:
+    max: 1
+    when:
+      - always
+
+
+build-airflow-provider-vdk:
+  stage: build
+  image: "python:3.7"
+  script:
+    - cd projects/vdk-plugins/airflow-provider-vdk
+    - echo "Building VDK Airflow provider..."
+    - export PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL:-'https://pypi.org/simple'}
+    - pip install -U pip setuptools pre-commit
+    - pre-commit install --hook-type commit-msg --hook-type pre-commit
+    - pip install -e . --extra-index-url $PIP_EXTRA_INDEX_URL
+    - pip install pytest-cov
+    - pytest --junitxml=tests.xml --cov vdk_airflow --cov-report term-missing --cov-report xml:coverage.xml
+  retry: !reference [.retry, retry_options]
+  rules: # we want to trigger build jobs if there are changes to this package,
+    # but not if there are changes to VDK plugins or the main directory
+    - if: '$CI_COMMIT_BRANCH == "main" || $CI_PIPELINE_SOURCE == "external_pull_request_event"'
+      changes:
+        - "projects/vdk-plugins/airflow-provider-vdk/**/*"
+  artifacts:
+    when: always
+    reports:
+      junit: tests.xml
+
+
+release-airflow-provider-vdk:
+  stage: release
+  image: "python:3.7"
+  script:
+    - cd projects/vdk-plugins/
+    - echo "Releasing airflow-provider-vdk..."
+    - cd airflow-provider-vdk/ || exit 1
+    - pip install -U pip setuptools wheel twine
+    - python setup.py sdist --formats=gztar
+    # provide the credentials as Gitlab variables
+    - twine upload --repository-url $PIP_REPO_UPLOAD_URL -u "$PIP_REPO_UPLOAD_USER_NAME" -p "$PIP_REPO_UPLOAD_USER_PASSWORD" dist/*tar.gz --verbose
+  retry: !reference [.retry, retry_options]
+  rules:
+    - if: '$CI_COMMIT_BRANCH == "main"'
+      changes:
+        - "projects/vdk-plugins/airflow-provider-vdk/**/*"
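
The build job's `rules:` entry combines an `if` expression with a `changes` list, and the rule matches only when both hold. The sketch below models that decision in plain Python to make the trigger condition explicit. The function name is hypothetical, and `fnmatchcase` only approximates GitLab's own glob matching.

```python
from fnmatch import fnmatchcase

# Approximation of "projects/vdk-plugins/airflow-provider-vdk/**/*":
# fnmatch's "*" also matches "/" characters, so nested paths are covered.
PACKAGE_GLOB = "projects/vdk-plugins/airflow-provider-vdk/*"


def build_job_should_run(branch, pipeline_source, changed_files):
    """Model the build rule: the `if` clause AND the `changes` clause must both match."""
    condition = branch == "main" or pipeline_source == "external_pull_request_event"
    touched = any(fnmatchcase(path, PACKAGE_GLOB) for path in changed_files)
    return condition and touched


# A push to main that touches the provider package triggers the build.
print(build_job_should_run(
    "main", "push", ["projects/vdk-plugins/airflow-provider-vdk/setup.py"]
))  # True

# A push to main that only touches another project does not.
print(build_job_should_run("main", "push", ["projects/vdk-core/setup.py"]))  # False
```

The release job follows the same shape with a narrower `if` clause that accepts only the `main` branch.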
Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
+Apache License
+
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+1. Definitions.
+
+"License" shall mean the terms and conditions for use, reproduction,
+and distribution as defined by Sections 1 through 9 of this document.
+
+"Licensor" shall mean the copyright owner or entity authorized by the
+copyright owner that is granting the License.
+
+"Legal Entity" shall mean the union of the acting entity and all other
+entities that control, are controlled by, or are under common control
+with that entity. For the purposes of this definition, "control" means
+(i) the power, direct or indirect, to cause the direction or management
+of such entity, whether by contract or otherwise, or (ii) ownership
+of fifty percent (50%) or more of the outstanding shares, or (iii)
+beneficial ownership of such entity.
+
+"You" (or "Your") shall mean an individual or Legal Entity exercising
+permissions granted by this License.
+
+"Source" form shall mean the preferred form for making modifications,
+including but not limited to software source code, documentation source,
+and configuration files.
+
+"Object" form shall mean any form resulting from mechanical transformation
+or translation of a Source form, including but not limited to compiled
+object code, generated documentation, and conversions to other media
+types.
+
+"Work" shall mean the work of authorship, whether in Source or
+Object form, made available under the License, as indicated by a copyright
+notice that is included in or attached to the work (an example is provided
+in the Appendix below).
+
+"Derivative Works" shall mean any work, whether in Source or Object form,
+that is based on (or derived from) the Work and for which the editorial
+revisions, annotations, elaborations, or other modifications represent,
+as a whole, an original work of authorship. For the purposes of this
+License, Derivative Works shall not include works that remain separable
+from, or merely link (or bind by name) to the interfaces of, the Work
+and Derivative Works thereof.
+
+"Contribution" shall mean any work of authorship, including the
+original version of the Work and any modifications or additions to
+that Work or Derivative Works thereof, that is intentionally submitted
+to Licensor for inclusion in the Work by the copyright owner or by an
+individual or Legal Entity authorized to submit on behalf of the copyright
+owner. For the purposes of this definition, "submitted" means any form of
+electronic, verbal, or written communication sent to the Licensor or its
+representatives, including but not limited to communication on electronic
+mailing lists, source code control systems, and issue tracking systems
+that are managed by, or on behalf of, the Licensor for the purpose of
+discussing and improving the Work, but excluding communication that is
+conspicuously marked or otherwise designated in writing by the copyright
+owner as "Not a Contribution."
+
+"Contributor" shall mean Licensor and any individual or Legal Entity
+on behalf of whom a Contribution has been received by Licensor and
+subsequently incorporated within the Work.
+
+2. Grant of Copyright License.
+Subject to the terms and conditions of this License, each Contributor
+hereby grants to You a perpetual, worldwide, non-exclusive, no-charge,
+royalty-free, irrevocable copyright license to reproduce, prepare
+Derivative Works of, publicly display, publicly perform, sublicense, and
+distribute the Work and such Derivative Works in Source or Object form.
+
+3. Grant of Patent License.
+Subject to the terms and conditions of this License, each Contributor
+hereby grants to You a perpetual, worldwide, non-exclusive, no-charge,
+royalty-free, irrevocable (except as stated in this section) patent
+license to make, have made, use, offer to sell, sell, import, and
+otherwise transfer the Work, where such license applies only to those
+patent claims licensable by such Contributor that are necessarily
+infringed by their Contribution(s) alone or by combination of
+their Contribution(s) with the Work to which such Contribution(s)
+was submitted. If You institute patent litigation against any entity
+(including a cross-claim or counterclaim in a lawsuit) alleging that the
+Work or a Contribution incorporated within the Work constitutes direct
+or contributory patent infringement, then any patent licenses granted
+to You under this License for that Work shall terminate as of the date
+such litigation is filed.
+
+4. Redistribution.
+You may reproduce and distribute copies of the Work or Derivative Works
+thereof in any medium, with or without modifications, and in Source or
+Object form, provided that You meet the following conditions:
+
+a. You must give any other recipients of the Work or Derivative Works
+a copy of this License; and
+
+b. You must cause any modified files to carry prominent notices stating
+that You changed the files; and
+
+c. You must retain, in the Source form of any Derivative Works that
+You distribute, all copyright, patent, trademark, and attribution
+notices from the Source form of the Work, excluding those notices
+that do not pertain to any part of the Derivative Works; and
+
+d. If the Work includes a "NOTICE" text file as part of its
+distribution, then any Derivative Works that You distribute must
+include a readable copy of the attribution notices contained
+within such NOTICE file, excluding those notices that do not
+pertain to any part of the Derivative Works, in at least one of
+the following places: within a NOTICE text file distributed as part
+of the Derivative Works; within the Source form or documentation,
+if provided along with the Derivative Works; or, within a display
+generated by the Derivative Works, if and wherever such third-party
+notices normally appear. The contents of the NOTICE file are for
+informational purposes only and do not modify the License. You
+may add Your own attribution notices within Derivative Works that
+You distribute, alongside or as an addendum to the NOTICE text
+from the Work, provided that such additional attribution notices
+cannot be construed as modifying the License. You may add Your own
+copyright statement to Your modifications and may provide additional
+or different license terms and conditions for use, reproduction, or
+distribution of Your modifications, or for any such Derivative Works
+as a whole, provided Your use, reproduction, and distribution of the
+Work otherwise complies with the conditions stated in this License.
+
+5. Submission of Contributions.
+Unless You explicitly state otherwise, any Contribution intentionally
+submitted for inclusion in the Work by You to the Licensor shall be
+under the terms and conditions of this License, without any additional
+terms or conditions. Notwithstanding the above, nothing herein shall
+supersede or modify the terms of any separate license agreement you may
+have executed with Licensor regarding such Contributions.
+
+6. Trademarks.
+This License does not grant permission to use the trade names, trademarks,
+service marks, or product names of the Licensor, except as required for
+reasonable and customary use in describing the origin of the Work and
+reproducing the content of the NOTICE file.
+
+7. Disclaimer of Warranty.
+Unless required by applicable law or agreed to in writing, Licensor
+provides the Work (and each Contributor provides its Contributions) on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+express or implied, including, without limitation, any warranties or
+conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR
+A PARTICULAR PURPOSE. You are solely responsible for determining the
+appropriateness of using or redistributing the Work and assume any risks
+associated with Your exercise of permissions under this License.
+
+8. Limitation of Liability.
+In no event and under no legal theory, whether in tort (including
+negligence), contract, or otherwise, unless required by applicable law
+(such as deliberate and grossly negligent acts) or agreed to in writing,
+shall any Contributor be liable to You for damages, including any direct,
+indirect, special, incidental, or consequential damages of any character
+arising as a result of this License or out of the use or inability to
+use the Work (including but not limited to damages for loss of goodwill,
+work stoppage, computer failure or malfunction, or any and all other
+commercial damages or losses), even if such Contributor has been advised
+of the possibility of such damages.
+
+9. Accepting Warranty or Additional Liability.
+While redistributing the Work or Derivative Works thereof, You may
+choose to offer, and charge a fee for, acceptance of support, warranty,
+indemnity, or other liability obligations and/or rights consistent with
+this License. However, in accepting such obligations, You may act only
+on Your own behalf and on Your sole responsibility, not on behalf of
+any other Contributor, and only if You agree to indemnify, defend, and
+hold each Contributor harmless for any liability incurred by, or claims
+asserted against, such Contributor by reason of your accepting any such
+warranty or additional liability.
+
+END OF TERMS AND CONDITIONS
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+## Versatile Data Kit Airflow provider
+
+A set of Airflow operators, sensors and a connection hook intended to help schedule Versatile Data Kit jobs using Apache Airflow.
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
+from setuptools import setup
+
+__version__ = "0.0.1"
+
+with open("README.md") as fh:
+    long_description = fh.read()
+
+setup(
+    name="airflow-provider-vdk",
+    version=__version__,
+    description="Airflow provider for Versatile Data Kit.",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    entry_points={
+        "apache_airflow_provider": [
+            "provider_info=vdk_provider.__init__:get_provider_info"
+        ]
+    },
+    license="Apache License 2.0",
+    packages=[
+        "vdk_provider",
+        "vdk_provider.hooks",
+        "vdk_provider.sensors",
+        "vdk_provider.operators",
+    ],
+    install_requires=[
+        "apache-airflow>=2.0",
+        "tenacity>=6.2.0",
+        "vdk-core",
+        "vdk-control-cli",
+    ],
+    setup_requires=["setuptools", "wheel"],
+    author="Versatile Data Kit Development Team",
+    author_email="versatile-data-kit@vmware.com",
+    url="https://github.com/vmware/versatile-data-kit",
+    python_requires="~=3.7",
+)
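
The `apache_airflow_provider` entry point in `setup.py` points at a `get_provider_info` function, whose body does not appear in this view of the commit. As a rough sketch only, such a function typically returns a dictionary of provider metadata. The keys below follow the commonly used Airflow provider metadata layout, and the display name is an assumption rather than something taken from this commit.

```python
__version__ = "0.0.1"  # mirrors the version declared in setup.py


def get_provider_info():
    """Return provider metadata for Airflow's provider discovery (illustrative sketch)."""
    return {
        "package-name": "airflow-provider-vdk",  # matches name= in setup.py
        "name": "Versatile Data Kit Airflow Provider",  # assumed display name
        "description": "Airflow provider for Versatile Data Kit.",
        "versions": [__version__],
    }


info = get_provider_info()
print(info["package-name"], info["versions"])
```

Airflow reads this dictionary when it discovers installed providers, so keeping `package-name` aligned with the distribution name in `setup.py` avoids a mismatch between the two.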
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
+from vdk_provider.hooks.vdk import VDKHook
+
+
+def test_dummy():
+    hook = VDKHook("conn_id", "job_name", "team_name")
+
+    assert hook
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Copyright 2021 VMware, Inc.
+# SPDX-License-Identifier: Apache-2.0
