[SECURITY] Pickle allowlist in `utils/serialization.py` is bypassable (`builtins.getattr` allowed, duplicate keys drop `PredictionTask`) #814

### Bug description

The `BASE_SERIALIZATION_CLASSES` allowlist introduced by PR #802 (commit `ab7207cf`, "Sec pic fix") in `transformers4rec/utils/serialization.py` has three issues. Two of them defeat the intended restriction on `Unpickler.find_class`; the third breaks loading of legitimate checkpoints.

File (at `ab7207cf`):
https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/ab7207cf40c7960f7aa22c86ab232576aa8cf847/transformers4rec/utils/serialization.py#L11-L68

The file was subsequently removed in PR #808 (`41b14d7b`), but checkpoints produced during the `#802 → #807` window still exist in the wild and the design questions apply to any follow-up allowlist.

**(1) `builtins` key appears twice.** Python dict literals are last-write-wins, so only the second declaration survives:

```python
BASE_SERIALIZATION_CLASSES = {
    "builtins": [
        "Exception", "ValueError", "NotImplementedError", "AttributeError",
        "AssertionError"
    ],
    ...
    "builtins": ["getattr"],   # <-- this overrides the first "builtins" key
}
```

Net result at runtime: `builtins` maps to `["getattr"]`. The exception classes that were intentionally allowlisted are silently dropped, and `builtins.getattr`, a well-known primitive in pickle gadget chains (attribute traversal → code execution), is approved.
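The last-write-wins behaviour is easy to confirm in isolation. The snippet below is a minimal sketch with a made-up three-entry dict; only its shape mirrors the real allowlist.

```python
# Hypothetical allowlist; the entries are illustrative, not copied from the repo.
allowlist = {
    "builtins": ["Exception", "ValueError"],
    "collections": ["OrderedDict"],
    "builtins": ["getattr"],  # same key again: replaces the first list, no runtime warning
}

print(allowlist["builtins"])  # ['getattr']
print(len(allowlist))         # 2
```

Python accepts the literal without complaint, which is why the problem only shows up under a linter or at unpickling time.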

**(2) `torch.storage._load_from_bytes` is in the allowlist.** `_load_from_bytes` wraps `torch.load`, which itself performs unrestricted pickle deserialization when the installed PyTorch version predates the `weights_only=True` default. Combined with `getattr`, this provides a reachable path from the restricted unpickler to arbitrary code execution.

**(3) `transformers4rec.torch.model.base` is also duplicated.** The first declaration includes `PredictionTask`; the second (which wins) does not:

```python
"transformers4rec.torch.model.base": ["Model", "Head", "PredictionTask"],
...
"transformers4rec.torch.model.base": ["forward_to_prediction_fn", "Model", "Head"],   # wins; PredictionTask dropped
```

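So any checkpoint containing `PredictionTask` fails to deserialize with `ValueError` from the restricted unpickler. This is an availability regression, not a security one, but it indicates the dict literal was not reviewed carefully before merge. `transformers4rec.torch.block.base` is declared twice as well (see the full dict under "Additional context"); that duplicate happens to be harmless because the second list is a superset of the first.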
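To illustrate the availability failure without touching Transformers4Rec, here is a small sketch of a `find_class`-based restricted unpickler over standard-library classes. `ALLOWED`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical names, and raising `ValueError` is chosen only to mirror the behaviour described above; I have not copied the repository's implementation.

```python
import io
import pickle
from collections import OrderedDict, defaultdict

# Hypothetical allowlist. "collections" is repeated on purpose:
# only the second list survives, so "defaultdict" is silently dropped.
ALLOWED = {
    "collections": ["OrderedDict", "defaultdict"],
    "collections": ["OrderedDict"],
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve a global only if (module, name) is explicitly allowlisted.
        if name in ALLOWED.get(module, ()):
            return super().find_class(module, name)
        raise ValueError(f"{module}.{name} is not allowlisted")


def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Still allowlisted: loads normally.
ok = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# Dropped by the duplicate key: a legitimate payload is now rejected.
try:
    restricted_loads(pickle.dumps(defaultdict(list)))
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```

The same mechanism explains why `PredictionTask` checkpoints stop loading: the class was meant to be allowed, but the surviving list no longer names it.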

### Steps/Code to reproduce bug

Structural evidence (no exploit payload):
```bash
git show ab7207cf:transformers4rec/utils/serialization.py | grep -n '"builtins"'
# 12:    "builtins": [
# 66:    "builtins": ["getattr"],

git show ab7207cf:transformers4rec/utils/serialization.py | grep -n 'torch.storage'
# 49:    "torch.storage": ["_load_from_bytes"],
```

Evaluating the module confirms the surviving values:
```python
# The path is elided here; point it at a checkout of ab7207cf.
with open(".../serialization.py") as f:
    src = f.read()
ns = {}
exec(src, ns)  # executes the whole module, so its own imports must resolve
allowlist = ns["BASE_SERIALIZATION_CLASSES"]
print(allowlist["builtins"])         # ['getattr']
print(allowlist["torch.storage"])    # ['_load_from_bytes']
print("PredictionTask" in allowlist["transformers4rec.torch.model.base"])  # False
```

I am intentionally not attaching an end-to-end exploit payload to a public issue; the combination of `builtins.getattr` + `torch.storage._load_from_bytes` is sufficient for any reader familiar with pickle gadget chains to reproduce.

### Expected behavior

- The allowlist does not contain `builtins.getattr` or `torch.storage._load_from_bytes`.
- The dict has no duplicate keys (pyflakes/flake8 `F601` or pylint `duplicate-key` (`W0109`) would have caught this).
- The design shifts away from pickle-based serialization for model weights: either `safetensors`, or `torch.load(..., weights_only=True)` after bumping the minimum PyTorch version (see the companion `torch.load` issue I am filing).
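If adding a linter rule is not an option, the duplicate-key check can also be done statically with the standard `ast` module, without importing the file under test. This is a rough sketch under that assumption; `duplicate_dict_keys` is a hypothetical helper and the sample source is made up.

```python
import ast
from collections import Counter


def duplicate_dict_keys(source: str) -> dict:
    """Return {key: count} for constant keys repeated inside any dict literal."""
    dupes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Dict):
            # `**mapping` entries appear as None keys and are skipped here.
            keys = [k.value for k in node.keys if isinstance(k, ast.Constant)]
            for key, count in Counter(keys).items():
                if count > 1:
                    dupes[key] = count
    return dupes


# Made-up source with the same shape as the reported allowlist.
sample = '''
BASE = {
    "builtins": ["Exception"],
    "pkg.model.base": ["Model", "PredictionTask"],
    "pkg.model.base": ["Model"],
    "builtins": ["getattr"],
}
'''

print(duplicate_dict_keys(sample))  # {'builtins': 2, 'pkg.model.base': 2}
```

Because the source is only parsed, never executed, a check like this can run in CI even where PyTorch is not installed.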

### Environment details

- Transformers4Rec: commits `ab7207cf` (PR #802) through `0e31f575` (PR #807); file removed at `41b14d7b` (PR #808) but affected checkpoints persist.
- Python: any
- PyTorch: any (the `_load_from_bytes` gadget exists across all modern versions)

### Additional context

Suggested remediation order:
1. Restore allowlist-based protection only as a temporary measure — remove `builtins.getattr` and `torch.storage._load_from_bytes`, de-duplicate all keys.
2. Transition model-weight serialization to a format that does not execute code on load (`safetensors` is the common choice for HF-adjacent projects).
3. For any remaining pickle-based artifacts, wrap `torch.load` with an explicit `weights_only=True` (see the companion issue).
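For step 1, one way to make repeated keys harmless is to build the dict from pairs instead of a literal. The sketch below is an assumption about how that could look, not code from the repository: `build_allowlist` and `FORBIDDEN` are hypothetical names, and the forbidden set is illustrative rather than exhaustive.

```python
# Names that should never be resolvable, whatever else is allowed (illustrative only).
FORBIDDEN = {
    ("builtins", "getattr"),
    ("builtins", "eval"),
    ("builtins", "exec"),
    ("torch.storage", "_load_from_bytes"),
}


def build_allowlist(entries):
    """Merge (module, names) pairs, unioning repeats and rejecting forbidden names."""
    allow = {}
    for module, names in entries:
        for name in names:
            if (module, name) in FORBIDDEN:
                raise ValueError(f"{module}.{name} must not be allowlisted")
            allow.setdefault(module, set()).add(name)
    return {module: frozenset(names) for module, names in allow.items()}


# Repeated modules are merged instead of overwritten.
ALLOWLIST = build_allowlist([
    ("builtins", ["Exception", "ValueError"]),
    ("transformers4rec.torch.model.base", ["Model", "Head", "PredictionTask"]),
    ("transformers4rec.torch.model.base", ["forward_to_prediction_fn", "Model", "Head"]),
])
print("PredictionTask" in ALLOWLIST["transformers4rec.torch.model.base"])  # True

try:
    build_allowlist([("builtins", ["getattr"])])
except ValueError as exc:
    print(exc)  # builtins.getattr must not be allowlisted
```

Failing at import time when a forbidden name appears turns a silent review miss into a hard error, though it remains a stopgap until steps 2 and 3 land.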

I can prepare separate PRs for each step if the maintainers prefer.

	BASE_SERIALIZATION_CLASSES = {
	    "builtins": [
	        "Exception", "ValueError", "NotImplementedError", "AttributeError",
	        "AssertionError"
	    ], # each Exception Error class needs to be added explicitly
	    "collections": ["OrderedDict", "defaultdict"],
	    "datetime": ["timedelta"],
	    "pathlib": ["PosixPath"],
	    "functools": ["partial"],
	    "transformers4rec.torch.model.base": ["Model", "Head", "PredictionTask"],
	    "transformers4rec.torch.block.base": ["SequentialBlock"],
	    "transformers4rec.torch.features.sequence": ["TabularSequenceFeatures", "SequenceEmbeddingFeatures"],
	    "transformers4rec.torch.features.continuous": ["ContinuousFeatures"],
	    "transformers4rec.torch.features.embedding": ["FeatureConfig", "EmbeddingConfig", "TableConfig", "PretrainedEmbeddingFeatures"],
	    "transformers4rec.torch.features.sparse": ["SparseFeatures"],
	    "transformers4rec.torch.features.tabular": ["TabularFeatures"],
	    "transformers4rec.torch.tabular.base": ["FilterFeatures", "AsTabular"],
	    "transformers4rec.torch.tabular.aggregation": ["ConcatFeatures"],
	    "transformers4rec.torch.block.base": ["Block", "SequentialBlock"],
	    "transformers4rec.torch.block.mlp": ["DenseBlock"],
	    "transformers4rec.torch.block.transformer": ["TransformerBlock"],
	    "transformers4rec.torch.masking": ["CausalLanguageModeling"],
	    "transformers4rec.torch.model.base": ["forward_to_prediction_fn", "Model", "Head"],
	    "transformers4rec.torch.model.prediction_task": ["NextItemPredictionTask", "_NextItemPredictionTask"],
	    "transformers4rec.torch.ranking_metric": ["NDCGAt", "DCGAt", "AvgPrecisionAt", "PrecisionAt", "RecallAt"],
	    "transformers4rec.config.transformer": ["XLNetConfig"],
	    "torch.nn.modules.container": ["ModuleList","ModuleDict"],
	    "torch.nn.modules.loss": ["CrossEntropyLoss"],
	    "merlin_standard_lib.schema.schema": ["Schema", "ColumnSchema"],
	    "merlin_standard_lib.proto.schema_bp": [
	        "FeaturePresence", "FeaturePresenceWithinGroup", "FeatureType", "FixedShape", "ValueCount", "ValueCountList",
	        "IntDomain", "FloatDomain", "StringDomain", "BoolDomain", "StructDomain", "NaturalLanguageDomain",
	        "FeatureCoverageConstraints", "SequenceLengthConstraints", "ImageDomain", "MIDDomain", "URLDomain",
	        "TimeDomain", "TimeOfDayDomain", "DistributionConstraints", "Annotation", "FeatureComparator", "InfinityNorm",
	        "JensenShannonDivergence", "UniqueConstraints", "DatasetConstraints", "NumericValueComparator"],
	    "torch.nn.init": ["kaiming_normal_", "kaiming_uniform_", "xavier_normal_", "xavier_uniform_", "uniform_", "normal_", "zeros_", "ones_"],
	    "torch._utils": ["_rebuild_tensor_v2", "_rebuild_parameter"],
	    "torch": ["Size", "device"],
	    "torch.storage": ["_load_from_bytes"],
	    "torch._C._nn": ["gelu"],
	    "torch.nn.module": ["Module"],
	    "torch.nn.modules.activation": ["ReLU", "Sigmoid", "Tanh"],
	    "torch.nn.modules.linear": ["Linear", "Identity"],
	    "torch.nn.modules.conv": ["Conv1d", "Conv2d", "Conv3d"],
	    "torch.nn.modules.pooling": ["MaxPool1d", "MaxPool2d", "MaxPool3d"],
	    "torch.nn.modules.normalization": ["BatchNorm1d", "BatchNorm2d", "BatchNorm3d", "LayerNorm"],
	    "torch.nn.modules.dropout": ["Dropout", "Dropout2d", "Dropout3d"],
	    "torch.nn.modules.rnn": ["RNN", "LSTM", "GRU"],
	    "torch.nn.modules.sparse": ["Embedding"],
	    "torch.optim.adam": ["Adam"],
	    "torchmetrics.metric": ["jit_distributed_available"],
	    "torchmetrics.utilities.data": ["dim_zero_cat"],
	    "transformers.models.xlnet.modeling_xlnet": ["XLNetModel", "XLNetLayer", "XLNetRelativeAttention", "XLNetFeedForward"],
	    "transformers.activations": ["GELUActivation"],
	    "transformers.modeling_utils": ["SequenceSummary"],
	    "builtins": ["getattr"],
	}
