Merged
2 changes: 1 addition & 1 deletion .JuliaFormatter.toml
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
style="blue"
format_markdown=true
format_markdown=true
6 changes: 6 additions & 0 deletions HISTORY.md
@@ -1,3 +1,9 @@
# 0.15.13

Exports extra functionality that should probably have been exported already, namely `ordered`, `isinvertible`, and `columnwise`, from Bijectors.jl.

The docs have been thoroughly restructured.

# 0.15.12

Improved implementation of the Enzyme rule for `Bijectors.find_alpha`.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,6 +1,6 @@
name = "Bijectors"
uuid = "76274a88-744f-5084-9051-94815aaf08c4"
version = "0.15.12"
version = "0.15.13"

[deps]
ArgCheck = "dce04be8-c92d-5529-be00-80e4d2c0e197"
41 changes: 28 additions & 13 deletions README.md
@@ -1,20 +1,35 @@
# Bijectors.jl

[![Docs - Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://turinglang.github.io/Bijectors.jl/stable)
[![Docs - Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://turinglang.github.io/Bijectors.jl/dev)
[![Interface tests](https://github.com/TuringLang/Bijectors.jl/workflows/Interface%20tests/badge.svg?branch=main)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22Interface+tests%22+branch%3Amain)
[![AD tests](https://github.com/TuringLang/Bijectors.jl/workflows/AD%20tests/badge.svg?branch=main)](https://github.com/TuringLang/Bijectors.jl/actions?query=workflow%3A%22AD+tests%22+branch%3Amain)
[![Documentation for latest stable release](https://img.shields.io/badge/docs-stable-blue.svg)](https://turinglang.github.io/Bijectors.jl)
[![Documentation for development version](https://img.shields.io/badge/docs-dev-blue.svg)](https://turinglang.github.io/Bijectors.jl/dev)
[![CI](https://github.com/TuringLang/Bijectors.jl/actions/workflows/CI.yml/badge.svg)](https://github.com/TuringLang/Bijectors.jl/actions/workflows/CI.yml)

*A package for transforming distributions, used by [Turing.jl](https://github.com/TuringLang/Turing.jl).*
Bijectors.jl implements functions for transforming random variables and probability distributions.

Bijectors.jl implements both an interface for transforming distributions from Distributions.jl and many transformations needed in this context.
This package is used heavily in the probabilistic programming language Turing.jl.
A quick overview of some of the key functionality is provided below:

See the [documentation](https://turinglang.github.io/Bijectors.jl) for more.
```julia
julia> using Bijectors;
dist = LogNormal();
LogNormal{Float64}(μ=0.0, σ=1.0)

## Do you want to contribute?
julia> x = rand(dist) # Constrained to (0, ∞)
0.6471106974390148

If you feel you have some relevant skills and are interested in contributing, please get in touch!
You can find us in the #turing channel on the [Julia Slack](https://julialang.org/slack/) or [Discourse](https://discourse.julialang.org).
If you're having any problems, please open a Github issue, even if the problem seems small (like help figuring out an error message).
Every issue you open helps us to improve the library!
julia> b = bijector(dist) # This maps from (0, ∞) to ℝ
(::Base.Fix1{typeof(broadcast), typeof(log)}) (generic function with 1 method)

julia> y = b(x) # Unconstrained value in ℝ
-0.43523790570180304

julia> # Transformed value and log-absolute determinant of the Jacobian at x.
with_logabsdet_jacobian(b, x)
(-0.43523790570180304, 0.43523790570180304)
```

Please see the [documentation](https://turinglang.github.io/Bijectors.jl) for more information.

## Get in touch

If you have any questions, please feel free to [post on Julia Slack](https://julialang.slack.com/archives/CCYDC34A0) or [Discourse](https://discourse.julialang.org/).
We also very much welcome GitHub issues or pull requests!
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -1,6 +1,7 @@
[deps]
Bijectors = "76274a88-744f-5084-9051-94815aaf08c4"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
Functors = "d9f16b24-f501-4c13-a1f2-28368ffc5196"
StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"

11 changes: 7 additions & 4 deletions docs/make.jl
@@ -9,10 +9,13 @@ makedocs(;
format=Documenter.HTML(),
modules=[Bijectors],
pages=[
"Home" => "index.md",
"Transforms" => "transforms.md",
"Distributions.jl integration" => "distributions.md",
"Examples" => "examples.md",
"index.md",
"interface.md",
"defining.md",
"distributions.md",
"types.md",
"advi.md",
"flows.md",
],
checkdocs=:exports,
doctest=false,
94 changes: 94 additions & 0 deletions docs/src/advi.md
@@ -0,0 +1,94 @@
# Example: Variational inference

The real utility of `TransformedDistribution` becomes more apparent when using `transformed(dist, b)` for any bijector `b`.
To get the transformed distribution corresponding to the `Beta(2, 2)`, we called `transformed(dist)` before.
This is an alias for `transformed(dist, bijector(dist))`.
Remember that `bijector(dist)` returns the constrained-to-unconstrained bijector for that particular `Distribution`.
But we can of course construct a `TransformedDistribution` using different bijectors with the same `dist`.

This is particularly useful in _Automatic Differentiation Variational Inference (ADVI)_.

## Univariate ADVI

An important part of ADVI is to approximate a constrained distribution, e.g. `Beta`, as follows:

1. Sample `x` from a `Normal` with parameters `μ` and `σ`, i.e. `x ~ Normal(μ, σ)`.
2. Transform `x` to `y` s.t. `y ∈ support(Beta)`, with the transform being a differentiable bijection with a differentiable inverse (a "bijector").

This then defines a probability density with the same _support_ as `Beta`!
Of course, it's unlikely that it will be the same density, but it's an _approximation_.

Creating such a distribution can be done with `Bijector` and `TransformedDistribution`:

```@example advi
using Bijectors
using StableRNGs: StableRNG
rng = StableRNG(42)

dist = Beta(2, 2)
b = bijector(dist) # (0, 1) → ℝ
b⁻¹ = inverse(b) # ℝ → (0, 1)
td = transformed(Normal(), b⁻¹) # x ∼ 𝓝(0, 1) then b⁻¹(x) ∈ (0, 1)
x = rand(rng, td) # ∈ (0, 1)
```
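
The density of this approximation follows from the change-of-variables formula. As a minimal sketch (it reuses the construction above; `lhs` and `rhs` are illustrative variable names), one can check that the log-density of `td` at `y` equals the `Normal` log-density at `b(y)` plus the log-absolute determinant of the Jacobian of `b` at `y`:

```julia
using Bijectors
using StableRNGs: StableRNG

rng = StableRNG(42)

b = bijector(Beta(2, 2))                # (0, 1) → ℝ
td = transformed(Normal(), inverse(b))  # distribution with support on (0, 1)
y = rand(rng, td)

# Change of variables: map y back to ℝ with b, then add the Jacobian correction.
lhs = logpdf(td, y)
rhs = logpdf(Normal(), b(y)) + logabsdetjac(b, y)
lhs ≈ rhs
```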

It's worth noting that `support(Beta)` is the _closed_ interval `[0, 1]`, while the constrained-to-unconstrained bijection, `Logit` in this case, is only well-defined as a map `(0, 1) → ℝ` for the _open_ interval `(0, 1)`.
This is of course not an implementation detail.
`ℝ` is itself open, thus no continuous bijection exists from a _closed_ interval to `ℝ`.
But since the boundaries of a closed interval have what's known as measure zero, this doesn't end up affecting the resulting density with support on the entire real line.
In practice, this means that

```@example advi
td = transformed(Beta())
inverse(td.transform)(rand(rng, td))
```

will never result in `0` or `1`, though any sample arbitrarily close to either `0` or `1` is possible.
_Disclaimer: numerical accuracy is limited, so you might still see `0` and `1` if you're 'lucky'._

## Multivariate ADVI example

We can also do _multivariate_ ADVI using the `Stacked` bijector.
`Stacked` gives us a way to combine univariate and/or multivariate bijectors into a single multivariate bijector.
Say you have a vector `x` of length 2 and you want to transform the first entry using `Exp` and the second entry using `Log`.
`Stacked` gives you an easy and efficient way of representing such a bijector.
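
A minimal sketch of that two-entry case is shown below. It assumes that `Bijectors.elementwise(exp)` and `Bijectors.elementwise(log)` can stand in for the `Exp` and `Log` transforms named above, and that `Stacked` accepts a tuple of bijectors together with a tuple of index ranges; the input values are arbitrary.

```julia
using Bijectors

# `Bijectors.elementwise(f)` wraps `f` so that it broadcasts over its input.
# Entry 1 goes through exp, entry 2 goes through log.
sb = Stacked((Bijectors.elementwise(exp), Bijectors.elementwise(log)), (1:1, 2:2))

x = [0.5, 2.0]
y = sb(x)

# Each range of `x` is transformed by the bijector at the same position.
y ≈ [exp(0.5), log(2.0)]
```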

```@example advi
using Bijectors: SimplexBijector

# Original distributions
dists = (Beta(), InverseGamma(), Dirichlet(2, 3))

# Construct the corresponding ranges
function make_ranges(dists)
    ranges = []
    idx = 1
    for i in 1:length(dists)
        d = dists[i]
        push!(ranges, idx:(idx + length(d) - 1))
        idx += length(d)
    end
    return ranges
end

ranges = make_ranges(dists)
ranges
```

```@example advi
# Base distribution; mean-field normal
num_params = ranges[end][end]

d = MvNormal(zeros(num_params), ones(num_params));

# Construct the transform
bs = bijector.(dists) # constrained-to-unconstrained bijectors for dists
ibs = inverse.(bs) # invert, so we get unconstrained-to-constrained
sb = Stacked(ibs, ranges) # => Stacked <: Bijector

# Mean-field normal with unconstrained-to-constrained stacked bijector
td = transformed(d, sb)
y = rand(td)
```

As can be seen from this, we now have a `y` for which `0.0 ≤ y[1] ≤ 1.0`, `0.0 < y[2]`, and `sum(y[3:4]) ≈ 1.0`.
94 changes: 94 additions & 0 deletions docs/src/defining.md
@@ -0,0 +1,94 @@
# Defining a bijector

This page describes the minimum expected interface to implement a bijector.

In general, there are two pieces of information needed to define a bijector:

1. The transformation itself, i.e., the map $b: \mathbb{R}^d \to \mathbb{R}^d$.

2. The log-absolute determinant of the Jacobian of that transformation.
For a transformation $b: \mathbb{R}^d \to \mathbb{R}^d$, the Jacobian at point $x \in \mathbb{R}^d$ is defined as:

$$J_{b}(x) = \begin{bmatrix}
\partial y_1/\partial x_1 & \partial y_1/\partial x_2 & \cdots & \partial y_1/\partial x_d \\
\partial y_2/\partial x_1 & \partial y_2/\partial x_2 & \cdots & \partial y_2/\partial x_d \\
\vdots & \vdots & \ddots & \vdots \\
\partial y_d/\partial x_1 & \partial y_d/\partial x_2 & \cdots & \partial y_d/\partial x_d
\end{bmatrix}$$

where $y = b(x)$.
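
As a worked example (chosen here purely for illustration), consider the elementwise exponential $b(x) = (e^{x_1}, \ldots, e^{x_d})$. Since $\partial y_i/\partial x_j = 0$ whenever $i \neq j$, the Jacobian is diagonal and its log-absolute determinant reduces to a sum:

$$J_{b}(x) = \mathrm{diag}\left(e^{x_1}, \ldots, e^{x_d}\right), \qquad \log\left|\det J_{b}(x)\right| = \sum_{i=1}^{d} x_i$$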

## The transform itself

The most efficient way to implement a bijector is to provide an implementation of:

```@docs; canonical=false
Bijectors.with_logabsdet_jacobian
```

If you define `with_logabsdet_jacobian(b, x)`, then you will automatically get default implementations of both `transform(b, x)` and `logabsdetjac(b, x)`, which respectively return the first and second value of that tuple.
So, in fact, you can implement a bijector by defining only `with_logabsdet_jacobian`.

If you prefer, you can implement `transform` and `logabsdetjac` separately, as described below.
Having manual implementations of these may also be useful if you expect either to be used heavily without the other.
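
As a minimal sketch of this route, the block below defines a bijector through `with_logabsdet_jacobian` alone. The type `MyScale` and its field `a` are hypothetical names introduced for illustration; they are not part of Bijectors.jl.

```julia
using Bijectors

# Hypothetical bijector for illustration only (not part of Bijectors.jl):
# multiplies a real number by a fixed nonzero constant `a`.
struct MyScale{T<:Real} <: Bijectors.Bijector
    a::T
end

# Return the transformed value together with log|d(a*x)/dx| = log|a|.
function Bijectors.with_logabsdet_jacobian(b::MyScale, x::Real)
    return (b.a * x, log(abs(b.a)))
end

b = MyScale(2.0)
with_logabsdet_jacobian(b, 3.0)  # (6.0, log(2.0))
transform(b, 3.0)                # default: first element of the tuple
logabsdetjac(b, 3.0)             # default: second element of the tuple
```

Only one method is written by hand here; `transform` and `logabsdetjac` rely on the default implementations described above.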

### Transformation

```@docs; canonical=false
transform
```

If `transform(b, x)` is defined, then you will automatically get a default implementation of `b(x)` which calls that.

### Log-absolute determinant of the Jacobian

```@docs; canonical=false
Bijectors.logabsdetjac
```

## Inverse

Often you will want to define an inverse bijector as well.
To do so, you will have to implement:

```@docs; canonical=false
Bijectors.inverse
```

If `b` is a bijector, then `inverse(b)` should return the inverse bijector $b^{-1}$.

If your bijector subtypes `Bijectors.Bijector`, then you will get a default implementation of `inverse` which constructs `Bijectors.Inverse(b)`.
This may be easier than creating a second type for the inverse bijector.
Note that you will also need to implement the methods for `with_logabsdet_jacobian` (and/or `transform` and `logabsdetjac`) for the inverse bijector type.
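
A sketch of the `Bijectors.Inverse` route might look as follows. `MyShift` and its field `c` are hypothetical names, and the sketch assumes that `Bijectors.Inverse` stores the wrapped bijector in a field called `orig`.

```julia
using Bijectors

# Hypothetical bijector for illustration only: adds a constant `c`.
struct MyShift{T<:Real} <: Bijectors.Bijector
    c::T
end

# Forward map x ↦ x + c; the Jacobian is 1, so its log-absolute determinant is 0.
Bijectors.with_logabsdet_jacobian(b::MyShift, x::Real) = (x + b.c, zero(x))

# Inverse map y ↦ y - c, defined on the wrapper type returned by `inverse(b)`.
# Assumption: the wrapped bijector is available as `ib.orig`.
function Bijectors.with_logabsdet_jacobian(ib::Bijectors.Inverse{<:MyShift}, y::Real)
    return (y - ib.orig.c, zero(y))
end

b = MyShift(1.5)
ib = inverse(b)    # default implementation wraps `b` in `Bijectors.Inverse`
x = ib(b(2.0))     # the round trip should recover the input
x ≈ 2.0
```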

If your bijector is not invertible, you can specify this here:

```@docs; canonical=false
Bijectors.isinvertible
```

## Distributions

If your bijector is intended for use with a distribution, i.e., it transforms random variables drawn from that distribution to Euclidean space, then you should also implement:

```@docs; canonical=false
Bijectors.bijector
```

which should return your bijector.

On top of that, you should also implement a method for `Bijectors.output_size(b, dist::Distribution)`:

```@docs; canonical=false
Bijectors.output_size
```

## Closed-form

If your bijector does _not_ have a closed-form expression (e.g. if it uses an iterative procedure), then this should be set to false:

```@docs; canonical=false
Bijectors.isclosedform
```

The default is `true` so you only need to set this if your bijector is not closed-form.
87 changes: 55 additions & 32 deletions docs/src/distributions.md
@@ -1,55 +1,78 @@
## Basic usage
# Usage with distributions

Other than the `logpdf_with_trans` methods, the package also provides a more composable interface through the `Bijector` types. Consider for example the one from above with `Beta(2, 2)`.
Bijectors provides many utilities for working with probability distributions.

```julia
julia> using Random;
Random.seed!(42);
```@example distributions
using Bijectors

dist = LogNormal()
x = rand(dist)
b = bijector(dist) # bijection (0, ∞) → ℝ

y = b(x)
```

julia> using Bijectors;
using Bijectors: Logit;
Here, `bijector(d::Distribution)` returns the corresponding constrained-to-unconstrained bijection for `LogNormal`, which is a log function.
The resulting bijector can be called, just like any other function, to transform samples from the distribution to the unconstrained space.

julia> dist = Beta(2, 2)
Beta{Float64}(α=2.0, β=2.0)
The function [`link`](@ref) provides a short way of doing the above:

julia> x = rand(dist)
0.36888689965963756
```@example distributions
link(dist, x) ≈ b(x)
```
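
The reverse direction can be sketched in the same way. This assumes the exported function `invlink`, which is not otherwise discussed on this page, as the counterpart of `link`:

```julia
using Bijectors

dist = LogNormal()
x = rand(dist)

y = link(dist, x)      # constrained → unconstrained, same as bijector(dist)(x)
invlink(dist, y) ≈ x   # unconstrained → constrained recovers the sample
```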

See [the Turing.jl docs](https://turinglang.org/docs/developers/transforms/distributions/) for more information about how this is used in probabilistic programming.

## Transforming distributions

julia> b = bijector(dist) # bijection (0, 1) → ℝ
Logit{Float64}(0.0, 1.0)
We can also couple a distribution together with its bijector to create a _transformed_ `Distribution`, i.e. a `Distribution` defined by sampling from a given `Distribution` and then transforming using a given transformation:

julia> y = b(x)
-0.5369949942509267
```@example distributions
dist = LogNormal() # support on (0, ∞)
tdist = transformed(dist) # support on ℝ
```

In this case we see that `bijector(d::Distribution)` returns the corresponding constrained-to-unconstrained bijection for `Beta`, which indeed is a `Logit` with `a = 0.0` and `b = 1.0`. The resulting `Logit <: Bijector` has a method `(b::Logit)(x)` defined, allowing us to call it just like any other function. Comparing with the above example, `b(x) ≈ link(dist, x)`. Just to convince ourselves:
We can then sample from, and compute the `logpdf` for, the resulting distribution:

```@example distributions
y = rand(tdist)
```

```@example distributions
logpdf(tdist, y)
```

We should expect here that

```julia
julia> b(x) ≈ link(dist, x)
true
logpdf(tdist, y) ≈ logpdf(dist, x) - logabsdetjac(b, x)
```

## Transforming distributions
where `b = bijector(dist)` and `y = b(x)`.
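
This is the usual change-of-variables identity. Writing $p_X$ for the density of `dist` and $p_Y$ for the density of `tdist` (notation introduced here for illustration), with $y = b(x)$:

$$\log p_Y(y) = \log p_X(x) - \log\left|\det J_{b}(x)\right|, \qquad x = b^{-1}(y)$$

For `LogNormal`, $b = \log$, so $\log\left|\det J_{b}(x)\right| = -\log x$ and the identity becomes $\log p_Y(y) = \log p_X(e^{y}) + y$.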

```@setup transformed-dist-simple
using Bijectors
To verify this, we can calculate the value of `x` using the inverse bijector:

```@example distributions
b = bijector(dist)
binv = inverse(b)

x = binv(y)
```

We can create a _transformed_ `Distribution`, i.e. a `Distribution` defined by sampling from a given `Distribution` and then transforming using a given transformation:
(Because `b` is just a log function, `binv` is an exponential function, i.e. `x = exp(y)`.)

```@repl transformed-dist-simple
dist = Beta(2, 2) # support on (0, 1)
tdist = transformed(dist) # support on ℝ
Then we can check the equality:

tdist isa UnivariateDistribution
```@example distributions
logpdf(tdist, y) ≈ logpdf(dist, x) - logabsdetjac(b, x)
```

We can the then compute the `logpdf` for the resulting distribution:
You can also use [`Bijectors.logpdf_with_trans`](@ref) with the original distribution:

```@repl transformed-dist-simple
# Some example values
x = rand(dist)
y = tdist.transform(x)
```@example distributions
logpdf_with_trans(dist, x, false) ≈ logpdf(dist, x)
```

logpdf(tdist, y)
```@example distributions
logpdf_with_trans(dist, x, true) ≈ logpdf(tdist, y)
```