diff --git a/README.md b/README.md
index 26fc4c6da..02dc2c206 100644
--- a/README.md
+++ b/README.md
@@ -22,6 +22,7 @@ if you need it.
 * **Your data - your queries**: Use Python user-defined functions (UDFs) in SQL without any performance drawback and extend your SQL queries with the large number of Python libraries, e.g. machine learning, different complicated input formats, complex statistics.
 * **Easy to install and maintain**: `dask-sql` is just a pip/conda install away (or a docker run if you prefer). No need for complicated cluster setups - `dask-sql` will run out of the box on your machine and can be easily connected to your computing cluster.
 * **Use SQL from wherever you like**: `dask-sql` integrates with your jupyter notebook, your normal Python module or can be used as a standalone SQL server from any BI tool. It even integrates natively with [Apache Hue](https://gethue.com/).
+* **GPU Support**: `dask-sql` has _experimental_ support for running SQL queries on CUDA-enabled GPUs by utilizing [RAPIDS](https://rapids.ai) libraries like [`cuDF`](https://github.com/rapidsai/cudf), enabling accelerated compute for SQL.
 
 Read more in the [documentation](https://dask-sql.readthedocs.io/en/latest/).
 
@@ -71,9 +72,6 @@ Have a look into the [documentation](https://dask-sql.readthedocs.io/en/latest/)
 
 > `dask-sql` is currently under development and does so far not understand all SQL commands (but a large fraction). We are actively looking for feedback, improvements and contributors!
 
-If you would like to utilize GPUs for your SQL queries, have a look into the [blazingSQL](https://github.com/BlazingDB/blazingsql) project.
-
-
 ## Installation
 
 `dask-sql` can be installed via `conda` (preferred) or `pip` - or in a development environment.
diff --git a/docs/index.rst b/docs/index.rst
index 2dfeb9d96..f9fbeeba1 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -13,6 +13,7 @@ if you need it.
 * **Your data - your queries**: Use Python user-defined functions (UDFs) in SQL without any performance drawback and extend your SQL queries with the large number of Python libraries, e.g. machine learning, different complicated input formats, complex statistics.
 * **Easy to install and maintain**: ``dask-sql`` is just a pip/conda install away (or a docker run if you prefer). No need for complicated cluster setups - ``dask-sql`` will run out of the box on your machine and can be easily connected to your computing cluster.
 * **Use SQL from wherever you like**: ``dask-sql`` integrates with your jupyter notebook, your normal Python module or can be used as a standalone SQL server from any BI tool. It even integrates natively with `Apache Hue <https://gethue.com/>`_.
+* **GPU Support**: ``dask-sql`` has `experimental` support for running SQL queries on CUDA-enabled GPUs by utilizing `RAPIDS <https://rapids.ai>`_ libraries like `cuDF <https://github.com/rapidsai/cudf>`_ , enabling accelerated compute for SQL.
 
 Example
@@ -54,9 +55,6 @@ Any pandas or dask dataframe can be used as input and ``dask-sql`` understands a
     # ... or use it for any other dask calculation
     print(result.x.mean().compute())
 
-The API of ``dask-sql`` is very similar to the one from `blazingsql <https://github.com/BlazingDB/blazingsql>`_,
-which makes interchanging distributed CPU and GPU calculation easy.
-
 .. toctree::
    :maxdepth: 1
 
diff --git a/docs/pages/custom.rst b/docs/pages/custom.rst
index 889adf3bc..2eefe6813 100644
--- a/docs/pages/custom.rst
+++ b/docs/pages/custom.rst
@@ -34,7 +34,7 @@ After registration, the function can be used as any other usual SQL function:
 Scalar functions can have one or more input parameters and can combine columns and literal values.
 
 Row-Wise Pandas UDFs
-----------------
+--------------------
 
 In some cases it may be easier to write custom functions which process a dict like row object, such as those consumed by ``pandas.DataFrame.apply``.
 These functions may be registered as above and flagged as row UDFs using the `row_udf` keyword argument: