diff --git a/doc/getting_started/index.rst b/doc/getting_started/index.rst index 947bf24dd..36b3ea766 100644 --- a/doc/getting_started/index.rst +++ b/doc/getting_started/index.rst @@ -5,7 +5,7 @@ Getting Started Installation ------------ -Datashader supports Python 3.9, 3.10, 3.11 and 3.12 on Linux, Windows, or Mac +Datashader supports Python 3.10, 3.11, 3.12 and 3.13 on Linux, Windows, or Mac and can be installed with conda:: conda install datashader diff --git a/examples/FAQ.ipynb b/examples/FAQ.ipynb index a6b47f312..db0e786ff 100644 --- a/examples/FAQ.ipynb +++ b/examples/FAQ.ipynb @@ -26,7 +26,7 @@ "If you have a very small number of data points (in the hundreds\n", "or thousands) or curves (in the tens or several tens, each with\n", "hundreds or thousands of points), then conventional plotting packages\n", - "like [Bokeh](https://bokeh.pydata.org) may be more suitable. With conventional browser-based\n", + "like [Bokeh](https://bokeh.org) may be more suitable. With conventional browser-based\n", "packages, all of the data points are passed directly to the browser for\n", "display, allowing specific interaction with each curve or point,\n", "including display of metadata, linking to sources, etc. This approach\n", @@ -100,5 +100,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/getting_started/1_Introduction.ipynb b/examples/getting_started/1_Introduction.ipynb index 72f337e66..45339fac0 100644 --- a/examples/getting_started/1_Introduction.ipynb +++ b/examples/getting_started/1_Introduction.ipynb @@ -8,10 +8,10 @@ "\n", "**Datashader turns even the largest datasets into images, faithfully preserving the data's distribution.**\n", "\n", - "Datashader is an [open-source](https://github.com/bokeh/datashader/) Python library for analyzing and visualizing large datasets. 
Specifically, Datashader is designed to \"rasterize\" or \"aggregate\" datasets into regular grids that can be analyzed further or viewed as images, making it simple and quick to see the properties and patterns of your data. Datashader can plot a billion points in a second or so on a 16GB laptop, and scales up easily to out-of-core, distributed, or GPU processing for even larger datasets.\n", + "Datashader is an [open-source](https://github.com/holoviz/datashader/) Python library for analyzing and visualizing large datasets. Specifically, Datashader is designed to \"rasterize\" or \"aggregate\" datasets into regular grids that can be analyzed further or viewed as images, making it simple and quick to see the properties and patterns of your data. Datashader can plot a billion points in a second or so on a 16GB laptop, and scales up easily to out-of-core, distributed, or GPU processing for even larger datasets.\n", "\n", "This page of the getting-started guide will give a simple example to show how it works, and the following page will show how to use Datashader as a standalone library for generating arrays or images directly\n", - "([Pipeline](2_Pipeline.ipynb)). Next we'll show how to use Datashader as a component in a larger visualization system like [HoloViews](http://holoviews.org) or [Bokeh](http://bokeh.pydata.org) that provides interactive plots with dynamic zooming, labeled axes, and overlays and layouts ([3-Interactivity](3-Interactivity.ipynb)). More detailed information about each topic is then provided in the [User Guide](../user_guide/).\n", + "([Pipeline](2_Pipeline.ipynb)). Next we'll show how to use Datashader as a component in a larger visualization system like [HoloViews](https://holoviews.org) or [Bokeh](https://bokeh.org) that provides interactive plots with dynamic zooming, labeled axes, and overlays and layouts ([3-Interactivity](3-Interactivity.ipynb)). 
More detailed information about each topic is then provided in the [User Guide](../user_guide/).\n", "\n", "## Example: NYC taxi trips\n", "\n", @@ -91,5 +91,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/getting_started/2_Pipeline.ipynb b/examples/getting_started/2_Pipeline.ipynb index 6a2f69d81..e0951dcc0 100644 --- a/examples/getting_started/2_Pipeline.ipynb +++ b/examples/getting_started/2_Pipeline.ipynb @@ -56,7 +56,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Datashader can work many different data objects provided by different data libraries depending on the type of data involved, such as columnar data in [Pandas](http://pandas.pydata.org) or [Dask](http://dask.pydata.org) dataframes, gridded multidimensional array data using [xarray](http://xarray.pydata.org), columnar data on GPUs using [cuDF](https://github.com/rapidsai/cudf), multidimensional arrays on GPUs using [CuPy](https://cupy.chainer.org/), and ragged arrays using [SpatialPandas](https://github.com/holoviz/spatialpandas) (see the [Performance User Guide](../10_Performance.ipynb) for a guide to selecting an appropriate library). Here, we're using a Pandas dataframe, with 50,000 rows by default:" + "Datashader can work with many different data objects provided by different data libraries depending on the type of data involved, such as columnar data in [Pandas](https://pandas.pydata.org) or [Dask](https://dask.org) dataframes, gridded multidimensional array data using [xarray](https://xarray.dev), columnar data on GPUs using [cuDF](https://github.com/rapidsai/cudf), multidimensional arrays on GPUs using [CuPy](https://cupy.dev/), and ragged arrays using [SpatialPandas](https://github.com/holoviz/spatialpandas) (see the [Performance User Guide](../10_Performance.ipynb) for a guide to selecting an appropriate library). 
Here, we're using a Pandas dataframe, with 50,000 rows by default:" ] }, { @@ -177,7 +177,7 @@ "source": [ "### 2D Reductions\n", "\n", - "One you have determined your mapping, you'll next need to choose a reduction operator to use when aggregating multiple datapoints into a given pixel. For points, each datapoint is mapped into a single pixel, while the other glyphs have spatial extent and can thus map into multiple pixels, each of which operates the same way. All glyphs act like points if the entire glyph is contained within that pixel. Here we will talk only about \"datapoints\" for simplicity, which for an area-based glyph should be interpreted as \"the part of that glyph that falls into this pixel\".\n", + "Once you have determined your mapping, you'll next need to choose a reduction operator to use when aggregating multiple datapoints into a given pixel. For points, each datapoint is mapped into a single pixel, while the other glyphs have spatial extent and can thus map into multiple pixels, each of which operates the same way. All glyphs act like points if the entire glyph is contained within that pixel. Here we will talk only about \"datapoints\" for simplicity, which for an area-based glyph should be interpreted as \"the part of that glyph that falls into this pixel\".\n", "\n", "All of the currently supported reduction operators are incremental, which means that we can efficiently process datasets in a single pass. Given an aggregate bin to update (typically corresponding to one eventual pixel) and a new datapoint, the reduction operator updates the state of the bin in some way. (Actually, datapoints are normally processed in batches for efficiency, but it's simplest to think about the operator as being applied per data point, and the mathematical result should be the same.) 
A large number of useful [reduction operators](https://datashader.org/api.html#reductions) are supplied in `ds.reductions`, including:\n", "\n", @@ -213,7 +213,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The result of will be an [xarray](http://xarray.pydata.org) `DataArray` data structure containing the bin values (typically one value per bin, but more for multiple category or multiple-aggregate operators) along with axis range and type information.\n", + "The result will be an [xarray](https://xarray.dev) `DataArray` data structure containing the bin values (typically one value per bin, but more for multiple category or multiple-aggregate operators) along with axis range and type information.\n", "\n", "We can visualize this array in many different ways by customizing the pipeline stages described in the following sections, but for now we'll simply render images using the default parameters to show the effects of a few different aggregate operators:" ] }, { @@ -359,7 +359,7 @@ "\n", "Now that the data has been projected and aggregated into a 2D or 3D gridded data structure, it can be processed in any way you like, before converting it to an image as will be described in the following section. At this stage, the data is still stored as bin data, not pixels, which makes a very wide variety of operations and transformations simple to express. 
\n", - "For instance, instead of plotting all the data, we can easily plot only those bins in the 99th percentile by count (left), or apply any [NumPy ufunc](http://docs.scipy.org/doc/numpy/reference/ufuncs.html) to the bin values (whether or not it makes any sense!):" + "For instance, instead of plotting all the data, we can easily plot only those bins in the 99th percentile by count (left), or apply any [NumPy ufunc](https://numpy.org/doc/stable/reference/ufuncs.html) to the bin values (whether or not it makes any sense!):" ] }, { @@ -388,11 +388,11 @@ "metadata": {}, "outputs": [], "source": [ - "sel1 = agg_d3_d5.where(aggc.sel(cat='d3') == aggc.sel(cat='d5')).astype('uint32')\n", - "sel2 = agg.where(aggc.sel(cat='d3') == aggc.sel(cat='d5')).astype('uint32')\n", + "sel1 = agg_d3_d5.where(aggc.sel(cat='d3') == aggc.sel(cat='d5'), other=-1).astype('uint32')\n", + "sel2 = agg.where(aggc.sel(cat='d3') == aggc.sel(cat='d5'), other=-1).astype('uint32')\n", "\n", - "tf.Images(tf.shade(sel1, name=\"d3+d5 where d3==d5\"),\n", - " tf.shade(sel2, name=\"d1+d2+d3+d4+d5 where d3==d5\"))" + "tf.Images(tf.shade(sel1, name='d3+d5 where d3==d5'),\n", + " tf.shade(sel2, name='d1+d2+d3+d4+d5 where d3==d5'))" ] }, { @@ -408,7 +408,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The [xarray documentation](http://xarray.pydata.org/en/stable/computation.html) describes all the various transformations you can apply from within xarray, and of course you can always extract the data values and operate on them outside of xarray for any transformation not directly supported by xarray, then construct a suitable xarray object for use in the following stage. Once the data is in the aggregate array, you generally don't have to worry much about optimization, because it's a fixed-sized grid regardless of your data size, and so it is very straightforward to apply arbitrary transformations to the aggregates." 
+ "The [xarray documentation](https://docs.xarray.dev/en/stable/user-guide/computation.html) describes all the various transformations you can apply from within xarray, and of course you can always extract the data values and operate on them outside of xarray for any transformation not directly supported by xarray, then construct a suitable xarray object for use in the following stage. Once the data is in the aggregate array, you generally don't have to worry much about optimization, because it's a fixed-sized grid regardless of your data size, and so it is very straightforward to apply arbitrary transformations to the aggregates." ] }, { @@ -782,7 +782,7 @@ "source": [ "#### Colormapping with negative values\n", "\n", - "The above examples all use positive data values to avoid confusion when there is no colorbar or other explicit indication of a z (color) axis range. Negative values are also supported, in which case for a non-categorical plot you should normally use a [diverging colormap](https://colorcet.holoviz.org/user_guide/Continuous.html#Diverging-colormaps,-for-plotting-magnitudes-increasing-or-decreasing-from-a-central-point:):" + "The above examples all use positive data values to avoid confusion when there is no colorbar or other explicit indication of a z (color) axis range. 
Negative values are also supported, in which case for a non-categorical plot you should normally use a [diverging colormap](https://colorcet.holoviz.org/user_guide/Continuous.html#diverging-colormaps-for-plotting-magnitudes-increasing-or-decreasing-from-a-central-point):" ] }, { @@ -793,7 +793,7 @@ "source": [ "from colorcet import coolwarm, CET_D8\n", "dfn = df.copy()\n", - "dfn.val.replace({20:-20, 30:0, 40:-40}, inplace=True)\n", + "dfn[\"val\"] = dfn[\"val\"].replace({20: -20, 30: 0, 40: -40})\n", "aggn = ds.Canvas().points(dfn,'x','y', agg=ds.mean(\"val\"))\n", "\n", "tf.Images(tf.shade(aggn, name=\"Sequential\", cmap=[\"lightblue\",\"blue\"], how=\"linear\"),\n", @@ -881,7 +881,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "See [the API docs](https://datashader.org/api.html#transfer-functions) for more details. Image composition operators to provide for the `how` argument of `tf.stack` (e.g. 
`over` (default), `source`, `add`, and `saturate`) are listed in [composite.py](https://raw.githubusercontent.com/holoviz/datashader/main/datashader/composite.py) and illustrated [here](https://cairographics.org/operators).\n", "\n", "## Embedding\n", "\n", @@ -896,5 +896,5 @@ } }, "nbformat": 4, - "nbformat_minor": 1 + "nbformat_minor": 4 } diff --git a/examples/getting_started/3_Interactivity.ipynb b/examples/getting_started/3_Interactivity.ipynb index 81a77d3e6..d525d087e 100644 --- a/examples/getting_started/3_Interactivity.ipynb +++ b/examples/getting_started/3_Interactivity.ipynb @@ -6,7 +6,7 @@ "source": [ "The [previous notebook](2-Pipeline.ipynb) showed all the steps required to get a Datashader rendering of your dataset, yielding raster images displayed using [Jupyter](http://jupyter.org)'s \"rich display\" support. However, these bare images do not show the data ranges or axis labels, making them difficult to interpret. Moreover, they are only static images, and datasets often need to be explored at multiple scales, which is much easier to do in an interactive program. \n", "\n", - "To get axes and interactivity, the images generated by Datashader need to be embedded into a plot using an external library like [Matplotlib](http://matplotlib.org) or [Bokeh](http://bokeh.org). As we illustrate below, the most convenient way to make Datashader plots using these libraries is via the [HoloViews](http://holoviews.org) high-level data-science API, using either [Bokeh](http://holoviews.org/user_guide/Large_Data.html) or [Plotly](https://medium.com/plotly/introducing-dash-holoviews-6a05c088ebe5). 
HoloViews encapsulates the Datashader pipeline in a way that lets you combine interactive datashaded plots easily with other plots without having to write explicit callbacks or event-processing code.\n", + "To get axes and interactivity, the images generated by Datashader need to be embedded into a plot using an external library like [Matplotlib](https://matplotlib.org) or [Bokeh](https://bokeh.org). As we illustrate below, the most convenient way to make Datashader plots using these libraries is via the [HoloViews](https://holoviews.org) high-level data-science API, using either [Bokeh](https://holoviews.org/user_guide/Large_Data.html) or [Plotly](https://medium.com/plotly/introducing-dash-holoviews-6a05c088ebe5). HoloViews encapsulates the Datashader pipeline in a way that lets you combine interactive datashaded plots easily with other plots without having to write explicit callbacks or event-processing code.\n", "\n", "In this notebook, we will first look at the HoloViews API, then at Datashader's new native Matplotlib support." ] @@ -17,7 +17,7 @@ "source": [ "# Embedding Datashader with HoloViews\n", "\n", - "[HoloViews](http://holoviews.org) (1.7 and later) is a high-level data analysis and visualization library that makes it simple to generate interactive [Datashader](https://github.com/holoviz/datashader)-based plots. Here's an illustration of how this all fits together when using HoloViews+[Bokeh](http://bokeh.pydata.org):\n", + "[HoloViews](https://holoviews.org) (1.7 and later) is a high-level data analysis and visualization library that makes it simple to generate interactive [Datashader](https://github.com/holoviz/datashader)-based plots. 
Here's an illustration of how this all fits together when using HoloViews+[Bokeh](https://bokeh.org):\n", "\n", " ![Datashader+Holoviews+Bokeh](../assets/images/ds_hv_bokeh2.png)\n", "\n", @@ -80,7 +80,7 @@ "source": [ "### HoloViews+Bokeh\n", "\n", - "Rather than starting out by specifying a figure or plot, in HoloViews you specify an [``Element``](http://holoviews.org/reference/index.html#elements) object to contain your data, such as `Points` for a collection of 2D x,y points. To start, let's define a Points object wrapping around a small dataframe with 10,000 random samples from the ``df`` above:" + "Rather than starting out by specifying a figure or plot, in HoloViews you specify an [``Element``](https://holoviews.org/reference/index.html#elements) object to contain your data, such as `Points` for a collection of 2D x,y points. To start, let's define a Points object wrapping around a small dataframe with 10,000 random samples from the ``df`` above:" ] }, { @@ -144,7 +144,7 @@ "\n", "### HoloViews+Datashader+Bokeh\n", "\n", - "The Matplotlib interface only produces a static plot, i.e., a PNG or SVG image, but the [Bokeh](http://bokeh.pydata.org) and Plotly interfaces of HoloViews add the dynamic zooming and panning necessary to understand datasets across scales:" + "The Matplotlib interface only produces a static plot, i.e., a PNG or SVG image, but the [Bokeh](https://bokeh.org) and Plotly interfaces of HoloViews add the dynamic zooming and panning necessary to understand datasets across scales:" ] }, { @@ -187,7 +187,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can read more about HoloViews support for Datashader at [holoviews.org](http://holoviews.org/user_guide/Large_Data.html)." + "You can read more about HoloViews support for Datashader at [holoviews.org](https://holoviews.org/user_guide/Large_Data.html)." 
] }, { @@ -196,7 +196,7 @@ "source": [ "### HoloViews+Datashader+Bokeh Legends\n", "\n", - "As explained in the [HoloViews User Guide](http://holoviews.org/user_guide/Large_Data.html), you'll want to use the HoloViews `rasterize` operation whenever you can, instead of `datashade`, because `rasterize` lets the plotting library do the final colormapping stage, allowing it to provide colorbars, legends, and interactive features like hover that reveal the actual (aggregated) data. However, plotting libraries do not yet support all of Datashader's features, such as `shade`'s categorical color mixing, and in those cases you will need to use special techniques like those listed here. \n", + "As explained in the [HoloViews User Guide](https://holoviews.org/user_guide/Large_Data.html), you'll want to use the HoloViews `rasterize` operation whenever you can, instead of `datashade`, because `rasterize` lets the plotting library do the final colormapping stage, allowing it to provide colorbars, legends, and interactive features like hover that reveal the actual (aggregated) data. However, plotting libraries do not yet support all of Datashader's features, such as `shade`'s categorical color mixing, and in those cases you will need to use special techniques like those listed here. \n", "\n", "If you are using Datashader's shading, the underlying plotting library only ever sees an image, not the individual categorical data, and so it cannot automatically show a legend. 
But you can work around it by building your own categorical legend by adding a suitable collection of labeled dummy points:" ] @@ -257,7 +257,7 @@ "source": [ "In the above examples, the \"fixed square hover\" plot provides coarse hover information from a square patch at a fixed spatial scale, while the \"dynamic square hover\" plot reports on a square area that scales with the zoom level so that arbitrarily small regions of data space can be examined, which is generally more useful.\n", "\n", - "As you can see, HoloViews makes it just about as simple to work with Datashader-based plots as regular Bokeh plots (at least if you don't need color keys!), letting you visualize data of any size interactively in a browser using just a few lines of code. Because Datashader-based HoloViews plots are just one or two extra steps added on to regular HoloViews plots, they support all of the same features as regular HoloViews objects, and can freely be laid out, overlaid, and nested together with them. See [holoviews.org](http://holoviews.org) for examples and documentation for how to control the appearance of these plots and how to work with them in general.\n", + "As you can see, HoloViews makes it just about as simple to work with Datashader-based plots as regular Bokeh plots (at least if you don't need color keys!), letting you visualize data of any size interactively in a browser using just a few lines of code. Because Datashader-based HoloViews plots are just one or two extra steps added on to regular HoloViews plots, they support all of the same features as regular HoloViews objects, and can freely be laid out, overlaid, and nested together with them. 
See [holoviews.org](https://holoviews.org) for examples and documentation for how to control the appearance of these plots and how to work with them in general.\n", "\n", "## HoloViews+Datashader+Panel\n", "\n", @@ -496,5 +496,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/user_guide/10_Performance.ipynb b/examples/user_guide/10_Performance.ipynb index 1cf69aa12..bfac5f77b 100644 --- a/examples/user_guide/10_Performance.ipynb +++ b/examples/user_guide/10_Performance.ipynb @@ -53,7 +53,7 @@ "\n", "Datashader performance will vary significantly depending on the library and specific data object type used to represent the data in Python, because different libraries and data objects have very different abilities to use the available processing power and memory. Moreover, different libraries and objects are appropriate for different types of data, due to how they organize and store the data internally as well as the operations they provide for working with the data. The various data container objects available from the supported libraries all fall into one of the following three types of data structures:\n", "- **[Columnar (tabular) data](https://pandas.pydata.org/pandas-docs/stable/getting_started/overview.html)**: Relational, table-like data consisting of arbitrarily many rows, each with data for a fixed number of columns. For example, if you track the location of a particular cell phone over time, each time sampled would be a row, and for each time there could be columns for the latitude and longitude for the location at that time.\n", - "- **[n-D arrays (multidimensional data)](http://xarray.pydata.org/en/stable/why-xarray.html)**: Data laid out in _n_ dimensions, where _n_ is typically >1. For example, you might have the precipitation measured on a latitude and longitude grid covering the whole world, for every time at which precipitation was measured. 
Such data could be stored columnarly, but it would be very inefficient; instead it is stored as a three dimensional array of precipitation values, indexed with time, latitude, and longitude.\n", + "- **[n-D arrays (multidimensional data)](https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html)**: Data laid out in _n_ dimensions, where _n_ is typically >1. For example, you might have the precipitation measured on a latitude and longitude grid covering the whole world, for every time at which precipitation was measured. Such data could be stored columnarly, but it would be very inefficient; instead it is stored as a three dimensional array of precipitation values, indexed with time, latitude, and longitude.\n", "- **[Ragged arrays](https://en.wikipedia.org/wiki/Jagged_array)**: Relational/columnar data where the value of at least one column is a list of values that could vary in length for each row. For example, you may have a table with one row per US state and columns for population, land area, and the geographic shape of that state. Here the shape would be stored as a polygon consisting of an arbitrarily long list of latitude and longitude coordinates, which does not fit efficiently into a standard columnar data structure due to its ragged (variable length) nature.\n", "\n", "As you can see, all three examples include latitude and longitude values, but they are very different data structures that need to be stored differently for them to be processed efficiently. 
\n", @@ -184,7 +184,7 @@ " Yes\n", "\n", "\n", - " Xarray + NumPy\n", + " Xarray + NumPy\n", " n-D\n", " 1-core CPU\n", " in-core\n", @@ -277,5 +277,5 @@ } }, "nbformat": 4, - "nbformat_minor": 1 + "nbformat_minor": 4 } diff --git a/examples/user_guide/11_Geography.ipynb b/examples/user_guide/11_Geography.ipynb index 46a0a4590..c50af261e 100644 --- a/examples/user_guide/11_Geography.ipynb +++ b/examples/user_guide/11_Geography.ipynb @@ -16,7 +16,7 @@ "These functionality is provided in the [xarray-spatial](github.com/makepath/xarray-spatial) library.\n", "You can check out its [example notebooks](https://github.com/makepath/xarray-spatial/tree/master/examples/user_guide) to see how to use the functions.\n", "\n", - "See also [GeoViews](http://geoviews.org), which is designed to work with Datashader to provide a large range of additional geospatial functionality." + "See also [GeoViews](https://geoviews.org), which is designed to work with Datashader to provide a large range of additional geospatial functionality." ] } ], @@ -27,5 +27,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/user_guide/3_Timeseries.ipynb b/examples/user_guide/3_Timeseries.ipynb index 767929ecc..1efaaae7e 100644 --- a/examples/user_guide/3_Timeseries.ipynb +++ b/examples/user_guide/3_Timeseries.ipynb @@ -358,7 +358,7 @@ "source": [ "### Dynamic Plots\n", "\n", - "In practice, it might be difficult to cycle through each of the curves to find one that's different, as done above. Perhaps a criterion based on similarity could be devised, choosing the curve most dissimilar from the rest to plot in this way, which would be an interesting topic for future research. In any case, one thing that can be achieved with [HoloViews](http://holoviews.org/) is to make the plot fully interactive, with direct support for datetimes so that the viewer can zoom in and discover such patterns dynamically with correctly formatted axes." 
+ "In practice, it might be difficult to cycle through each of the curves to find one that's different, as done above. Perhaps a criterion based on similarity could be devised, choosing the curve most dissimilar from the rest to plot in this way, which would be an interesting topic for future research. In any case, one thing that can be achieved with [HoloViews](https://holoviews.org/) is to make the plot fully interactive, with direct support for datetimes so that the viewer can zoom in and discover such patterns dynamically with correctly formatted axes." ] }, { @@ -536,5 +536,5 @@ } }, "nbformat": 4, - "nbformat_minor": 1 + "nbformat_minor": 4 } diff --git a/examples/user_guide/4_Trajectories.ipynb b/examples/user_guide/4_Trajectories.ipynb index b1e356cb6..2a2249dc1 100644 --- a/examples/user_guide/4_Trajectories.ipynb +++ b/examples/user_guide/4_Trajectories.ipynb @@ -141,7 +141,7 @@ "\n", "### Dynamic Plots\n", "\n", - "Specifying hard-coded ranges as above is awkward, so it's much more natural to simply zoom in interactively, which can be done using the `datashade` operation imported from [HoloViews](http://holoviews.org/)." + "Specifying hard-coded ranges as above is awkward, so it's much more natural to simply zoom in interactively, which can be done using the `datashade` operation imported from [HoloViews](https://holoviews.org/)." ] }, { @@ -187,5 +187,5 @@ } }, "nbformat": 4, - "nbformat_minor": 1 + "nbformat_minor": 4 } diff --git a/examples/user_guide/5_Grids.ipynb b/examples/user_guide/5_Grids.ipynb index 504e64dfb..349a74222 100644 --- a/examples/user_guide/5_Grids.ipynb +++ b/examples/user_guide/5_Grids.ipynb @@ -4,15 +4,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "[Datashader](http://datashader.org) renders data into regularly sampled arrays, a process known as [rasterization](https://en.wikipedia.org/wiki/Rasterisation), and then optionally converts that array into a viewable image (with one pixel per element in that array). 
\n", + "[Datashader](https://datashader.org) renders data into regularly sampled arrays, a process known as [rasterization](https://en.wikipedia.org/wiki/Rasterisation), and then optionally converts that array into a viewable image (with one pixel per element in that array). \n", "\n", - "In some cases, your data is *already* rasterized, such as data from imaging experiments, simulations on a regular grid, or other regularly sampled processes. Even so, the rasters you have already are not always the ones you need for a given purpose, having the wrong shape, range, or size to be suitable for overlaying with or comparing against other data, maps, and so on. Datashader provides fast methods for [\"regridding\"](https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/regridding-overview)/[\"re-sampling\"](http://gisgeography.com/raster-resampling/)/\"re-rasterizing\" your regularly gridded datasets, generating new rasters on demand that can be used together with those it generates for any other data types. Rasterizing into a common grid can help you implement complex cross-datatype analyses or visualizations. \n", + "In some cases, your data is *already* rasterized, such as data from imaging experiments, simulations on a regular grid, or other regularly sampled processes. Even so, the rasters you have already are not always the ones you need for a given purpose, having the wrong shape, range, or size to be suitable for overlaying with or comparing against other data, maps, and so on. Datashader provides fast methods for [\"regridding\"](https://climatedataguide.ucar.edu/climate-data-tools-and-analysis/regridding-overview)/[\"re-sampling\"](https://gisgeography.com/raster-resampling/)/\"re-rasterizing\" your regularly gridded datasets, generating new rasters on demand that can be used together with those it generates for any other data types. Rasterizing into a common grid can help you implement complex cross-datatype analyses or visualizations. 
\n", "\n", "In other cases, your data is stored in a 2D array similar to a raster, but represents values that are not regularly sampled in the underlying coordinate space. Datashader also provides fast methods for rasterizing these more general rectilinear or curvilinear grids, known as [quadmeshes](#quadmesh-Rasterization) as described later below. Fully arbitrary unstructured grids ([Trimeshes](Trimesh.ipynb)) are discussed separately.\n", "\n", "## Re-rasterization\n", "\n", - "First, let's explore the regularly gridded case, declaring a small raster using Numpy and wrapping it up as an [xarray](http://xarray.pydata.org) DataArray for us to re-rasterize: " + "First, let's explore the regularly gridded case, declaring a small raster using Numpy and wrapping it up as an [xarray](https://xarray.dev) DataArray for us to re-rasterize: " ] }, { @@ -210,7 +210,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Here the factor \"k\" results in the function being evaluated at increasingly higher frequencies, eventually leading to complex [Moiré patterns](https://en.wikipedia.org/wiki/Moir%C3%A9_pattern) due to undersampling for k=33. 3D xarrays are useful for reading multi-layer image files, such as those from [xarray's rasterio-based support for reading from multilayer TIFFs](http://xarray.pydata.org/en/stable/io.html#rasterio).\n", + "Here the factor \"k\" results in the function being evaluated at increasingly higher frequencies, eventually leading to complex [Moiré patterns](https://en.wikipedia.org/wiki/Moir%C3%A9_pattern) due to undersampling for k=33. 
3D xarrays are useful for reading multi-layer image files, such as those from [xarray's rasterio-based support for reading from multilayer TIFFs](https://xarray.pydata.org/en/stable/io.html#rasterio).\n", "\n", "Similarly, Datashader can accept an xarray Dataset (a dictionary-like collection of aligned DataArrays), as long as the aggregation method specifies a suitable DataArray within the Dataset:" ] @@ -401,5 +401,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/user_guide/6_Trimesh.ipynb b/examples/user_guide/6_Trimesh.ipynb index e7ddbad89..07ce31e16 100644 --- a/examples/user_guide/6_Trimesh.ipynb +++ b/examples/user_guide/6_Trimesh.ipynb @@ -17,7 +17,7 @@ "\n", "This diagram uses \"pixels\" and colors (grayscale), but for datashader the generated raster is more precisely interpreted as a 2D array with bins, not pixels, because the values involved are numeric rather than colors. (With datashader, colors are assigned only in the later \"shading\" stage, not during rasterization itself.) As shown in the diagram, a pixel (bin) is treated as belonging to a given triangle if its center falls either inside that triangle or along its top or left edge.\n", "\n", - "The specific algorithm used to do so is based on the approach of [Pineda (1998)](http://people.csail.mit.edu/ericchan/bib/pdf/p17-pineda.pdf), which has the following features:\n", + "The specific algorithm used to do so is based on the approach of [Pineda (1998)](https://people.csail.mit.edu/ericchan/bib/pdf/p17-pineda.pdf), which has the following features:\n", " * Classification of pixels relies on triangle convexity\n", " * Embarrassingly parallel linear calculations\n", " * Inner loop can be calculated incrementally, i.e. 
with very \"cheap\" computations\n", @@ -26,7 +26,7 @@ " * Triangles should be non overlapping (to ensure repeatable results for different numbers of cores)\n", " * Triangles should be specified consistently either in clockwise or in counterclockwise order of vertices (winding). \n", " \n", - "Trimesh rasterization is not yet GPU-accelerated, but it's fast because of [Numba](http://numba.pydata.org) compiling Python into SIMD machine code instructions." + "Trimesh rasterization is not yet GPU-accelerated, but it's fast because of [Numba](https://numba.pydata.org) compiling Python into SIMD machine code instructions." ] }, { @@ -307,7 +307,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Notice that the central disk is being filled in above, even though the function is not defined in the center. That's a limitation of Delaunay triangulation, which will create convex regions covering the provided vertices. You can use other tools for creating triangulations that have holes, align along certain regions, have specified densities, etc., such as [MeshPy](https://mathema.tician.de/software/meshpy) (Python bindings for [Triangle](http://www.cs.cmu.edu/~quake/triangle.html)).\n", + "Notice that the central disk is being filled in above, even though the function is not defined in the center. That's a limitation of Delaunay triangulation, which will create convex regions covering the provided vertices. You can use other tools for creating triangulations that have holes, align along certain regions, have specified densities, etc., such as [MeshPy](https://mathema.tician.de/software/meshpy) (Python bindings for [Triangle](https://www.cs.cmu.edu/~quake/triangle.html)).\n", "\n", "\n", "### Aggregation functions\n", @@ -383,7 +383,7 @@ "source": [ "# Interactive plots\n", "\n", - "By their nature, fully exploring irregular grids needs to be interactive, because the resolution of the screen and the visual system are fixed. 
Trimesh renderings can be generated as above and then displayed interactively using the datashader support in [HoloViews](http://holoviews.org)." + "By their nature, fully exploring irregular grids needs to be interactive, because the resolution of the screen and the visual system are fixed. Trimesh renderings can be generated as above and then displayed interactively using the datashader support in [HoloViews](https://holoviews.org)." ] }, { @@ -436,5 +436,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/user_guide/7_Networks.ipynb b/examples/user_guide/7_Networks.ipynb index bba50d865..f7f9e9af6 100644 --- a/examples/user_guide/7_Networks.ipynb +++ b/examples/user_guide/7_Networks.ipynb @@ -33,7 +33,7 @@ "source": [ "## Graph (node) layout\n", "\n", - "Some graph data is inherently spatial, such as connections between geographic locations, and these graphs can simply be plotted by connecting each location with line segments. However, most graphs are more abstract, with nodes having no natural position in space, and so they require a \"layout\" operation to choose a 2D location for each node before the graph can be visualized. Unfortunately, choosing such locations is an [open-ended problem involving a complex set of tradeoffs and complications](http://www.hiveplot.com).\n", + "Some graph data is inherently spatial, such as connections between geographic locations, and these graphs can simply be plotted by connecting each location with line segments. However, most graphs are more abstract, with nodes having no natural position in space, and so they require a \"layout\" operation to choose a 2D location for each node before the graph can be visualized. Unfortunately, choosing such locations is an [open-ended problem involving a complex set of tradeoffs and complications](https://www.hiveplot.com).\n", "\n", "Datashader provides a few tools for doing graph layout, while also working with external layout tools. 
As a first example, let's generate a random graph, with 100 points normally distributed around the origin and 20000 random connections between them:" ] @@ -107,7 +107,7 @@ "source": [ "The circular layout provides an option to distribute the nodes randomly along the circle or evenly, and here we've chosen the former.\n", "\n", - "The two layouts above ignore the connectivity structure of the graph, focusing only on the nodes. The [ForceAtlas2](http://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0098679&type=printable) algorithm is a more complex approach that treats connections like physical forces (a force-directed approach) in order to construct a layout for the nodes based on the network connectivity:" + "The two layouts above ignore the connectivity structure of the graph, focusing only on the nodes. The [ForceAtlas2](https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0098679&type=printable) algorithm is a more complex approach that treats connections like physical forces (a force-directed approach) in order to construct a layout for the nodes based on the network connectivity:" ] }, { @@ -128,7 +128,7 @@ "\n", "## Edge rendering/bundling\n", "\n", - "Assuming that we have a suitable layout for the nodes, we can now plot the connections between them. There are currently two bundling algorithms provided: drawing a line directly between any connected nodes (``connect_edges``), and an iterative \"bundling\" algorithm ``hammer_bundle`` (a variant of [Hurter, Ersoy, & Telea, ECV-2012](http://www.cs.rug.nl/svcg/Shapes/KDEEB)) that allows edges to curve and then groups nearby ones together to help convey structure. Rendering direct connections should be very quick, even for large graphs, but bundling can be quite computationally intensive." + "Assuming that we have a suitable layout for the nodes, we can now plot the connections between them. 
There are currently two bundling algorithms provided: drawing a line directly between any connected nodes (``connect_edges``), and an iterative \"bundling\" algorithm ``hammer_bundle`` (a variant of [Hurter, Ersoy, & Telea, ECV-2012](https://www.cs.rug.nl/svcg/Shapes/KDEEB)) that allows edges to curve and then groups nearby ones together to help convey structure. Rendering direct connections should be very quick, even for large graphs, but bundling can be quite computationally intensive." ] }, { @@ -399,7 +399,7 @@ "\n", "The above plots all show static images of nodes and edges, with optional category information, but there's no way to see the specific identity of individual nodes. With small numbers of nodes you can try coloring them to convey identity, but in general the only practical way to reveal identity of nodes or edges is typically interactively, as a user inspects individual items. Thus interactive plots are often necessary for doing any exploration of real-world graph data.\n", "\n", - "The simplest way to work with interactive datashaded graphs is to use [HoloViews](http://holoviews.org), which includes specific support for [plotting graphs with and without Datashader](http://holoviews.org/user_guide/Network_Graphs.html):" + "The simplest way to work with interactive datashaded graphs is to use [HoloViews](https://holoviews.org), which includes specific support for [plotting graphs with and without Datashader](https://holoviews.org/user_guide/Network_Graphs.html):" ] }, { @@ -427,7 +427,7 @@ "\n", "You can try clicking and hovering on either plot to see what interactive features are available; in both cases the behavior for nodes should be the same (as the full set of nodes is being overlaid on both plots), while the edges also support interactivity in the pure-Bokeh version.\n", "\n", - "As you can see, the pure-Bokeh version provides more interactivity, but the datashaded version will let you see the patterns of connectivity better for large graphs. 
The datashader version will also work fine for arbitrarily large graphs that would overwhelm the browser if used with Bokeh directly. [HoloViews](http://holoviews.org/user_guide/Network_Graphs.html) makes it simple to switch between these two extremes as needed, using full-interactive plots for small datasets and adding whatever interactivity is required (as in the overlaid node plots on the right above) for larger datasets while rendering the full dataset as the main plot using datashader." + "As you can see, the pure-Bokeh version provides more interactivity, but the datashaded version will let you see the patterns of connectivity better for large graphs. The datashader version will also work fine for arbitrarily large graphs that would overwhelm the browser if used with Bokeh directly. [HoloViews](https://holoviews.org/user_guide/Network_Graphs.html) makes it simple to switch between these two extremes as needed, using fully interactive plots for small datasets and adding whatever interactivity is required (as in the overlaid node plots on the right above) for larger datasets while rendering the full dataset as the main plot using datashader." ] }, { @@ -458,5 +458,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/examples/user_guide/9_Extending.ipynb b/examples/user_guide/9_Extending.ipynb index 35c61125a..5b498b696 100644 --- a/examples/user_guide/9_Extending.ipynb +++ b/examples/user_guide/9_Extending.ipynb @@ -29,7 +29,7 @@ "\n", "## Transformation\n", "\n", - "Once you have an aggregate array, you can do anything you like! This array will have a fixed size regardless of your original dataset size, and so anything operating on the aggregate array need not be particularly well optimized or tuned for it to be practical for large datasets. 
The [xarray documentation](http://xarray.pydata.org/en/stable/computation.html) describes all the various transformations you can apply from within xarray, and of course you can always extract the data values and operate on them outside of xarray for any transformation not directly supported by xarray, then construct a suitable xarray object for use in the following stage. If there are transformations that seem particularly useful to other datashader users, we would be happy to consider including them, but generally these are very lightweight objects that you can simply create and discard as needed for your applications.\n", + "Once you have an aggregate array, you can do anything you like! This array will have a fixed size regardless of your original dataset size, and so anything operating on the aggregate array need not be particularly well optimized or tuned for it to be practical for large datasets. The [xarray documentation](https://docs.xarray.dev/en/stable/user-guide/computation.html) describes all the various transformations you can apply from within xarray, and of course you can always extract the data values and operate on them outside of xarray for any transformation not directly supported by xarray, then construct a suitable xarray object for use in the following stage. If there are transformations that seem particularly useful to other datashader users, we would be happy to consider including them, but generally these are very lightweight objects that you can simply create and discard as needed for your applications.\n", "\n", "## Colormapping\n", "\n", @@ -39,7 +39,7 @@ "\n", "## Embedding\n", "\n", - "Datashader is directly supported by [HoloViews](http://holoviews.org), with interactive exploration supported for its [Bokeh](http://bokeh.pydata.org) extension, and static plots supported for its [Matplotlib](http://matplotlib.org) extension. Plotly 3.0 now includes [Datashader support](https://plot.ly/python/change-callbacks-datashader/) as well. 
"Because Datashader simply creates arrays and RGBA images, it should be straightforward to add support for Datashader to any plotting package that can call a Python function. We would love to accept contributions of interfaces for other plotting packages, or for those packages to support rendering using Datashader directly. If you do add Datashader support, please open an issue describing what you've done, so that we can add your tool to our test suite for to validate our changes against as we further develop the library." + "Datashader is directly supported by [HoloViews](https://holoviews.org), with interactive exploration supported for its [Bokeh](https://bokeh.org) extension, and static plots supported for its [Matplotlib](https://matplotlib.org) extension. Plotly 3.0 now includes [Datashader support](https://plot.ly/python/change-callbacks-datashader/) as well. Because Datashader simply creates arrays and RGBA images, it should be straightforward to add support for Datashader to any plotting package that can call a Python function. We would love to accept contributions of interfaces for other plotting packages, or for those packages to support rendering using Datashader directly. If you do add Datashader support, please open an issue describing what you've done, so that we can add your tool to our test suite to validate our changes against as we further develop the library." ] } ], @@ -50,5 +50,5 @@ } }, "nbformat": 4, - "nbformat_minor": 1 + "nbformat_minor": 4 }