Merged: changes from all commits
README.md: 4 changes (1 addition, 3 deletions)

@@ -22,6 +22,7 @@ if you need it.
 * **Your data - your queries**: Use Python user-defined functions (UDFs) in SQL without any performance drawback and extend your SQL queries with the large number of Python libraries, e.g. machine learning, different complicated input formats, complex statistics.
 * **Easy to install and maintain**: `dask-sql` is just a pip/conda install away (or a docker run if you prefer). No need for complicated cluster setups - `dask-sql` will run out of the box on your machine and can be easily connected to your computing cluster.
 * **Use SQL from wherever you like**: `dask-sql` integrates with your jupyter notebook, your normal Python module or can be used as a standalone SQL server from any BI tool. It even integrates natively with [Apache Hue](https://gethue.com/).
+* **GPU Support**: `dask-sql` has _experimental_ support for running SQL queries on CUDA-enabled GPUs by utilizing [RAPIDS](https://rapids.ai) libraries like [`cuDF`](https://github.com/rapidsai/cudf), enabling accelerated compute for SQL.

 Read more in the [documentation](https://dask-sql.readthedocs.io/en/latest/).

@@ -71,9 +72,6 @@ Have a look into the [documentation](https://dask-sql.readthedocs.io/en/latest/)
 > `dask-sql` is currently under development and does so far not understand all SQL commands (but a large fraction).
 We are actively looking for feedback, improvements and contributors!

-If you would like to utilize GPUs for your SQL queries, have a look into the [blazingSQL](https://github.com/BlazingDB/blazingsql) project.
-
-
 ## Installation

 `dask-sql` can be installed via `conda` (preferred) or `pip` - or in a development environment.
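To make the new GPU bullet concrete, here is a minimal sketch (not part of the diff) of what querying a cuDF-backed dataframe can look like. It assumes a CUDA-capable machine with the RAPIDS packages `cudf` and `dask_cudf` installed; the table and column names are illustrative, and since the support is experimental, not every query is guaranteed to run on the GPU.

```python
import cudf
import dask_cudf
from dask_sql import Context

# Build a GPU-backed dask dataframe; this requires a CUDA-capable GPU
# and the RAPIDS packages cudf and dask_cudf.
gdf = cudf.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
ddf = dask_cudf.from_cudf(gdf, npartitions=1)

# Register the GPU dataframe like any other table and query it with SQL;
# the computation is then carried out by cuDF rather than pandas.
c = Context()
c.create_table("gpu_table", ddf)
result = c.sql("SELECT x + y AS s FROM gpu_table")
print(result.compute())
```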
docs/index.rst: 4 changes (1 addition, 3 deletions)

@@ -13,6 +13,7 @@ if you need it.
 * **Your data - your queries**: Use Python user-defined functions (UDFs) in SQL without any performance drawback and extend your SQL queries with the large number of Python libraries, e.g. machine learning, different complicated input formats, complex statistics.
 * **Easy to install and maintain**: ``dask-sql`` is just a pip/conda install away (or a docker run if you prefer). No need for complicated cluster setups - ``dask-sql`` will run out of the box on your machine and can be easily connected to your computing cluster.
 * **Use SQL from wherever you like**: ``dask-sql`` integrates with your jupyter notebook, your normal Python module or can be used as a standalone SQL server from any BI tool. It even integrates natively with `Apache Hue <https://gethue.com/>`_.
+* **GPU Support**: ``dask-sql`` has `experimental` support for running SQL queries on CUDA-enabled GPUs by utilizing `RAPIDS <https://rapids.ai>`_ libraries like `cuDF <https://github.com/rapidsai/cudf>`_, enabling accelerated compute for SQL.


 Example

@@ -54,9 +55,6 @@ Any pandas or dask dataframe can be used as input and ``dask-sql`` understands a
     # ... or use it for any other dask calculation
     print(result.x.mean().compute())

-The API of ``dask-sql`` is very similar to the one from `blazingsql <http://blazingsql.com/>`_,
-which makes interchanging distributed CPU and GPU calculation easy.
-

 .. toctree::
    :maxdepth: 1
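The docs example that the second hunk trims ends mid-snippet with ``result.x.mean().compute()``. For context, a self-contained version of that workflow might look roughly like the following sketch; the table and column names are illustrative, not taken from the diff.

```python
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

# Any pandas or dask dataframe can be used as input.
df = dd.from_pandas(pd.DataFrame({"x": [0.5, 1.5, 2.5]}), npartitions=1)

c = Context()
c.create_table("my_data", df)

# Queries return lazy dask dataframes that only run on .compute() ...
result = c.sql("SELECT x FROM my_data WHERE x > 1")

# ... or can be used for any other dask calculation, as in the docs example:
print(result.x.mean().compute())
```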
docs/pages/custom.rst: 2 changes (1 addition, 1 deletion)

@@ -34,7 +34,7 @@ After registration, the function can be used as any other usual SQL function:
 Scalar functions can have one or more input parameters and can combine columns and literal values.

 Row-Wise Pandas UDFs
-----------------
+--------------------
 In some cases it may be easier to write custom functions which process a dict like row object, such as those consumed by ``pandas.DataFrame.apply``.
 These functions may be registered as above and flagged as row UDFs using the `row_udf` keyword argument:
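The section touched by this hunk covers both scalar and row-wise UDFs. A minimal sketch of the two registration styles, assuming the ``Context.register_function`` signature from dask-sql's custom-operations docs; the function names, table, and types are illustrative:

```python
import numpy as np
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()
c.create_table("df", dd.from_pandas(pd.DataFrame({"x": [0, 1, 2]}), npartitions=1))

# A scalar UDF operates on whole columns and can mix columns and literals.
def times_two(x):
    return 2 * x

c.register_function(times_two, "times_two", [("x", np.int64)], np.int64)

# A row-wise UDF instead receives a dict-like row object, as with
# pandas.DataFrame.apply, and is flagged with row_udf=True on registration.
def x_plus_one(row):
    return row["x"] + 1

c.register_function(x_plus_one, "x_plus_one", [("x", np.int64)], np.int64, row_udf=True)

print(c.sql("SELECT times_two(x) AS t, x_plus_one(x) AS p FROM df").compute())
```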