
Conversation

Contributor

@joshdunnlime joshdunnlime commented Oct 3, 2025

Reference Issues/PRs

#605 and #601

What does this implement/fix? Explain your changes.

Implements the Differentiable Transformer as discussed in #605.

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

See #605 for details.

Did you add any tests for the change?

Not yet. Tests to follow.

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug or adding code logic; doc - writing or improving documentation or docstrings; bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR); maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

docs/api/*
docs/_build/*
cover/*
.coverage.*
Contributor Author
Added as tests create .coverage.Mac for each process.

Collaborator

@fkiraly fkiraly left a comment

Great!

Would this solve the fitting/reset problem?

@joshdunnlime
Contributor Author

I had already made some changes yesterday, before your reply stating your preference for just passing the additional functions (option 2).

In the BaseTransformer, I have implemented an sklearn -> skpro transformer. The main idea is that the (skpro) BaseTransformer (BaseT) can take an sklearn transformer and wrap it in one of two ways:

  1. The typical use case: an unfitted sklearn transformer is passed to BaseT at initialisation and is then fitted. We can then extend BaseT for DiffBaseT and DiffT as expected.
  2. The less typical case: a fitted sklearn transformer is passed to BaseT at initialisation. BaseT then uses this instance as if it had fitted it itself; this only happens when the _fit_with_fitted method is called. If we want to refit, calling fit does that as normal.
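The two wrapping modes can be sketched as follows. This is a hypothetical illustration, not the actual skpro API: the class name DiffTSketch and its methods are made up for this example, with only _fit_with_fitted taken from the description above.

```python
# Hypothetical sketch of the two wrapping modes; class and method
# names are illustrative, not the real skpro BaseTransformer API.
import numpy as np
from sklearn.base import clone
from sklearn.preprocessing import MinMaxScaler


class DiffTSketch:
    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, X):
        # case 1: clone the unfitted sklearn transformer and fit a fresh copy
        self.transformer_ = clone(self.transformer).fit(X)
        return self

    def _fit_with_fitted(self):
        # case 2: adopt an already-fitted transformer instance as-is,
        # without refitting; a later call to fit would refit as normal
        self.transformer_ = self.transformer
        return self


X = np.array([[0.0], [1.0], [2.0]])

# case 1: unfitted transformer, fitted inside the wrapper
t1 = DiffTSketch(MinMaxScaler()).fit(X)

# case 2: pre-fitted transformer, used without refitting
mms = MinMaxScaler().fit(X)
t2 = DiffTSketch(mms)._fit_with_fitted()

# both wrappers now behave identically on X
assert t2.transformer_ is mms
assert np.allclose(t1.transformer_.transform(X), t2.transformer_.transform(X))
```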

Why is this needed?
Whether we pass all three functions/methods (transform, inverse and inverse_diff) or the transformer class itself, we would like to separate the differentiation logic out from the TransformedDistribution (TD). When the TransformedTargetRegressor (TTR) passes a transform (either a function or a class) off to the TD, the TD has no way of accessing y (named X internally when working with transformers) and therefore cannot fit the DiffT.

Reset Issue
I don't think the reset issue is a problem here. predict_proba returns the TD but doesn't ever seem to reset the DiffT. Perhaps this is because, inside the TD, fit is never called (either on the TD or on the DiffT).

@joshdunnlime
Contributor Author

This gives us the following options for using a TTR that returns a TD that can use log_loss:

Pass sklearn transformer

# wraps the fitted MMS in the DiffT internally
xgb = XGBoostLSS(**params)
mms = MinMaxScaler()
pipe = TransformedTargetRegressor(regressor=xgb, transformer=mms)

pipe.fit(X=Xy[["feature"]], y=Xy["target"])
td = pipe.predict_proba(Xy[["feature"]])
LogLoss()(y_true=Xy["target"], y_pred=td)

Pass skpro DiffT

# this calls fit on the DiffT and subsequent MMS
xgb = XGBoostLSS(**params)
mms = DifferentiableTransformer(transformer=MinMaxScaler())
pipe = TransformedTargetRegressor(regressor=xgb, transformer=mms)

pipe.fit(X=Xy[["feature"]], y=Xy["target"])
td = pipe.predict_proba(Xy[["feature"]])
LogLoss()(y_true=Xy["target"], y_pred=td)

The transformer that is now used in the TD td is always a DifferentiableTransformer, so LogLoss can always be called (for any sufficiently well-behaved transform).

This also means we can still pass mms.transform directly and wrap it in a DiffT:

mms = MinMaxScaler()
mms.fit(Xy[["target"]])

trafo = TransformedDistribution(
    distribution=Normal(),
    transform=mms.transform,
    assume_monotonic=True,
    index=Xy.index,
    columns=['target'],
)

trafo.log_pdf(Xy[["target"]])

@fkiraly
Collaborator

fkiraly commented Oct 4, 2025

we would like to separate the differentiation logic out from the TransformedDistribution (TD)

Just to clarify, I think it would be good if TD had:

  • some default for numerical differentiation in it.
  • the option to pass inverse transform.

What makes sense is to separate differentiation logic that is at least one of:

  • customisable, e.g., allow the user to pick the degree of approximation or similar parameters determining numerical strategy
  • auto-diff-like, e.g., dispatching on arguments to add auto-diff-like features to an sklearn transformer
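The "default for numerical differentiation" could be as small as a central-difference rule with a user-settable step size, which also covers the "customisable" point. A minimal sketch under that assumption; the function name and the softplus example are hypothetical, not a proposed skpro API.

```python
# Sketch of a central-difference default for differentiating a
# monotonic transform numerically; h is the user-tunable step size.
import numpy as np

def num_deriv(f, x, h=1e-6):
    # second-order central difference: error is O(h^2)
    return (f(x + h) - f(x - h)) / (2 * h)

# example: softplus, a smooth monotonic transform
f = lambda x: np.log1p(np.exp(x))
x = np.array([0.0, 1.0])

approx = num_deriv(f, x)
exact = 1 / (1 + np.exp(-x))  # d/dx softplus(x) = sigmoid(x)
assert np.allclose(approx, exact, atol=1e-5)
```

Exposing h (or the difference scheme) as a parameter would give users control over the numerical strategy without touching the TD itself.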

@joshdunnlime
Contributor Author

we would like to separate the differentiation logic out from the TransformedDistribution (TD)

Just to clarify, I think it would be good if TD had:

  • some default for numerical differentiation in it.
  • the option to pass inverse transform.

Done in #612

What makes sense is to separate differentiation logic that is at least one of:

  • customisable, e.g., allow the user to pick the degree of approximation or similar parameters determining numerical strategy
  • auto-diff-like, e.g., dispatching on arguments to add auto-diff-like features to an sklearn transformer

The solution in #612 makes it very easy to add this. Closing this in favour of #612.
