15 changes: 8 additions & 7 deletions docs/source/machine_learning.rst
@@ -75,34 +75,35 @@ create and train a model from SQL:
.. code-block:: sql

CREATE MODEL my_model WITH (
model_class = 'sklearn.ensemble.GradientBoostingClassifier',
model_class = 'sklearn.linear_model.LogisticRegression',
wrap_predict = True,
target_column = 'target'
) AS (
SELECT x, y, target
SELECT x, y, x*y > 0 as target
FROM timeseries
LIMIT 100
)

.. group-tab:: GPU

.. code-block:: sql

CREATE MODEL my_model WITH (
model_class = 'cuml.linear_model.LogisticRegression',
wrap_predict = True,
wrap_fit = False, -- can we explain why this needs to be added?
target_column = 'target'
) AS (
SELECT x, y, target
SELECT x, y, x*y > 0 as target
FROM timeseries
LIMIT 100
)

This call will create a new instance of ``sklearn.ensemble.GradientBoostingClassifier``
This call will create a new instance of ``linear_model.LogisticRegression``
and train it with the data collected from the ``SELECT`` call (again, every valid ``SELECT``
query can be given). The model can then be used in subsequent calls to ``PREDICT``
using the given name.
using the given name.
We set ``wrap_predict = True`` here to parallelize the post-fit prediction step of non-distributed models (for example scikit-learn or cuML estimators) across workers.
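
The idea behind ``wrap_predict`` can be illustrated with a rough Python sketch. It rests on assumptions that are not taken from this page: the model is fitted once on a small in-memory sample, and prediction is then applied independently to each partition of a Dask DataFrame via ``map_partitions``. The synthetic ``timeseries`` data and the ``predict_partition`` helper are hypothetical, and the wrapper the library actually uses may differ.

.. code-block:: python

    import dask.dataframe as dd
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Hypothetical stand-in for the ``timeseries`` table from the SQL example.
    rng = np.random.default_rng(0)
    pdf = pd.DataFrame(
        {"x": rng.uniform(-1, 1, 1000), "y": rng.uniform(-1, 1, 1000)}
    )
    pdf["target"] = pdf["x"] * pdf["y"] > 0

    # Fit a plain (non-distributed) model on a small sample, mirroring LIMIT 100.
    train = pdf.head(100)
    model = LogisticRegression().fit(train[["x", "y"]], train["target"])


    def predict_partition(part: pd.DataFrame) -> pd.Series:
        """Run the already fitted model on a single partition (hypothetical helper)."""
        return pd.Series(
            model.predict(part[["x", "y"]]), index=part.index, name="prediction"
        )


    # Each partition is predicted independently, so workers can share the load.
    ddf = dd.from_pandas(pdf[["x", "y"]], npartitions=4)
    predictions = ddf.map_partitions(
        predict_partition, meta=("prediction", "bool")
    ).compute()

Only the prediction step is spread over partitions in this sketch; training still happens in a single process on the sampled rows.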

Have a look at :ref:`ml` for more information.

4. Check Model parameters - Model meta data