PyTorch implementation of the Penalized Exponential Loss (PENEX): A margin-based loss function for neural networks inspired by AdaBoost, designed to improve generalization.

Penalized Exponential Loss (PENEX)

Do you still train your neural networks with cross-entropy loss? How about giving PENEX a try? It is a new loss function inspired by the famous AdaBoost algorithm:

$$ \mathcal{L}_{\mathrm{\scriptscriptstyle PENEX}}(f; \alpha) = \hat{\mathbb{E}} \Bigg[ \exp \Bigl( -\alpha f^{(y)}(\mathbf{x}) \Bigr) + \rho(\alpha) \sum_{j=1}^K \exp \Bigl( f^{(j)}(\mathbf{x}) \Bigr) \Bigg]. $$
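
Here, $f^{(j)}(\mathbf{x})$ denotes the model output (logit) for class $j$, $\alpha$ is the sensitivity parameter discussed below, and $\rho(\alpha)$ weights the penalty term. To make the formula concrete, here is a minimal sketch of how it could be computed from a batch of logits; the function name and the treatment of rho_alpha as a plain number are illustrative assumptions, not the implementation in penex.losses:

import torch

def penex_loss_sketch(logits, targets, alpha, rho_alpha):
    # logits: (N, K) model outputs f(x); targets: (N,) integer class labels y
    # first term: exp(-alpha * f^(y)(x)) for the true class of each sample
    margin_term = torch.exp(-alpha * logits.gather(1, targets.unsqueeze(1)).squeeze(1))
    # second term: rho(alpha) * sum_j exp(f^(j)(x)) over all K classes
    penalty_term = rho_alpha * torch.exp(logits).sum(dim=1)
    # empirical expectation over the batch
    return (margin_term + penalty_term).mean()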

Still not convinced that you need PENEX? Maybe these CIFAR-100 results will change your mind:

Our implementation is based on PyTorch.

Setup 💻

First, clone the repository and change into it:

git clone https://github.com/rudolfwilliam/penex
cd penex

Then, install the package via

pip install .
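
A quick sanity check that the installation worked is to run the import used throughout this README in a Python shell:

from penex.losses import PENEX  # should import without errors after installation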

Now, let us have some fun! 🚀

Minimal Training and Inference Examples

The best part is that integrating PENEX into your training loop requires almost no effort.

Training

During training, just replace nn.CrossEntropyLoss with PENEX:

import torch
import torch.nn as nn
import torch.optim as optim

from penex.losses import PENEX

criterion = PENEX() # PENEX instead of nn.CrossEntropyLoss

# Dummy dataset
X = torch.randn(100, 10)
y = torch.randint(0, 2, (100,))

# Simple model
model = nn.Sequential(
    nn.Linear(10, 50),
    nn.ReLU(),
    nn.Linear(50, 2)
)

optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    optimizer.zero_grad()
    logits = model(X)
    loss = criterion(logits, y)    # Compute loss
    loss.backward()
    optimizer.step()
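
The snippet above uses full-batch updates on the dummy data for brevity. In practice you would iterate over mini-batches; below is a minimal sketch using standard torch.utils.data utilities (nothing PENEX-specific, and the batch size is an arbitrary choice):

from torch.utils.data import DataLoader, TensorDataset

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

for epoch in range(10):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)  # PENEX is a drop-in replacement here as well
        loss.backward()
        optimizer.step()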

Inference

Inference (that is, when you make predictions) is almost as simple as training.

Important

There is only one difference between cross-entropy and PENEX during inference: For PENEX, you need to perform inference at temperature $(1 + \alpha)^{-1}$, where $\alpha$ is the sensitivity parameter.

In code, this is just one extra line:

model.eval()

# New input sample
x_new = torch.randn(1, 10)

# Disable gradient calculation
with torch.no_grad():
    logits = model(x_new)
    logits_rescaled = logits * (1 + criterion.sensitivity) # IMPORTANT LINE
    probs = torch.softmax(logits_rescaled, dim=-1)
    pred_class = torch.argmax(probs, dim=-1)

print("Probabilities:", probs)
print("Predicted class:", pred_class.item())

That's it! Please check out project_files/scripts/plotting/simple_2D_example.ipynb for a slightly more extensive demo.

Practical Advice

PENEX can be more sensitive to the choice of learning rate than cross-entropy and may require a smaller one. If training becomes unstable after replacing cross-entropy with PENEX, lowering the learning rate should be your first step.
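
For example, starting from the Adam setup above, you might simply try a smaller step size first (the factor below is an arbitrary choice, not a recommendation from the paper):

# If lr=1e-3 turns out to be unstable with PENEX, try a smaller value, e.g.
optimizer = optim.Adam(model.parameters(), lr=3e-4)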

Citation

If you find PENEX useful, we would be happy if you could leave our repository a star ⭐ and cite our pre-print 📄. The BibTeX entry is:

@article{kladny2025penex,
  title={{PENEX: AdaBoost-Inspired Neural Network Regularization}},
  author={Kladny, Klaus-Rudolf and Sch{\"o}lkopf, Bernhard and Muehlebach, Michael},
  journal={arXiv preprint arXiv:2510.02107},
  year={2025}
}

Reproducing Paper Results

If you would like to reproduce the experiments from our paper, please take a look at our EXPERIMENTS.md.

Problems, Questions or Feedback?

Please open an issue or reach out via e-mail: kkladny [at] tuebingen [dot] mpg [dot] de

"Mathematical theory is not critical to the development of machine learning. But scientific inquiry is."

— Leo Breiman
