Utility library to persistently store transformer activations on disk as a Hugging Face dataset.

Because these activations can be quite large (layers x batch x sequence x hidden_size), writing them to disk as they are generated helps avoid out-of-memory errors.
Install using:

```sh
pip install git+https://github.com/wassname/activation_store.git
```
Full examples can be found in the nbs folder.
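The snippets below assume a loaded transformers model and a batched loader of tokenized inputs. A minimal setup sketch (the model name, the prompt construction, and the assumption that `loader=` takes a torch `DataLoader` are all illustrative, not part of the library):

```python
# Minimal setup sketch, not library code: any small causal LM plus a batched
# DataLoader of tokenized prompts to pass in as `model` and `ds` below.
from torch.utils.data import DataLoader
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # assumption: any small causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

prompts = ["The capital of France is", "Water boils at"]
enc = tokenizer(prompts, return_tensors="pt", padding=True)
tokenized = Dataset.from_dict({k: v.tolist() for k, v in enc.items()}).with_format("torch")
ds = DataLoader(tokenized, batch_size=2)  # passed as `loader=` below
```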
```python
# group the module paths whose outputs will be recorded
layer_groups = {
    'mlp.down_proj': [
        'model.layers.21.mlp.down_proj',
        'model.layers.22.mlp.down_proj',
        'model.layers.23.mlp.down_proj'],
    'self_attn': [
        'model.layers.21.self_attn',
        'model.layers.22.self_attn',
        'model.layers.23.self_attn'],
    'mlp.up_proj': [
        'model.layers.21.mlp.up_proj',
        'model.layers.22.mlp.up_proj',
        'model.layers.23.mlp.up_proj']}
```
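Listing module paths by hand gets tedious for deeper models. One way to build the same mapping from `model.named_modules()`; the `make_layer_groups` helper is a sketch, not part of the library:

```python
# Sketch of a hypothetical helper: pick the last few transformer blocks by
# matching module-name suffixes (assumes transformers-style module names).
def make_layer_groups(model, suffixes=("mlp.down_proj", "self_attn", "mlp.up_proj"), last_n=3):
    names = [name for name, _ in model.named_modules()]
    return {suffix: [n for n in names if n.endswith(suffix)][-last_n:] for suffix in suffixes}

layer_groups = make_layer_groups(model)  # same structure as the dict above
```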
```python
from datasets import Dataset
# the activation_store collection function comes from this library
# (see the nbs folder for the full imports and setup)

# collect activations into a huggingface dataset written to disk
f = activation_store(loader=ds, model=model, layers=layer_groups)
f
# > Generating train split: 0 examples [00:00, ? examples/s]
# Dataset({
#    features: ['mlp.down_proj', 'self_attn', 'mlp.up_proj', 'loss', 'logits', 'hidden_states'],
#    num_rows: 20
# })

# the stored hidden states have this shape
ds_a = Dataset.from_parquet(str(f)).with_format("torch")
ds_a[0:2]['hidden_states'].shape  # [batch, layers, tokens, hidden_states]
# torch.Size([2, 25, 1, 896])
```
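Once on disk, the activations can be streamed back in batches without reloading the model. A hedged sketch of a simple downstream use, continuing from `ds_a` above (the per-layer norm computation is illustrative, not part of the library):

```python
# Sketch: stream the stored activations in batches and compute a mean
# activation norm per layer. `ds_a` is the dataset loaded from parquet above.
from torch.utils.data import DataLoader

act_loader = DataLoader(ds_a, batch_size=4)
totals, n_batches = None, 0
for batch in act_loader:
    hs = batch["hidden_states"]               # [batch, layers, tokens, hidden_states]
    norms = hs.norm(dim=-1).mean(dim=(0, 2))  # mean L2 norm per layer
    totals = norms if totals is None else totals + norms
    n_batches += 1
print(totals / n_batches)  # one value per layer
```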
To develop on the library itself, clone the repository and sync dependencies with uv:

```sh
git clone https://github.com/wassname/activation_store.git
uv sync
```
TODO:

- test compression: it's not worth the complexity
- add examples
  - generate and collect activations
  - a manual loop of forward/generate passes, reusing the kv_cache and appending model outputs along the token dimension, saving the outputs too (see the sketch below)
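A rough sketch of what that manual loop might look like; this is illustrative, not library code, and assumes unpadded inputs (no attention-mask handling):

```python
# Rough sketch: greedy decoding one token at a time while reusing the KV cache,
# collecting the last-layer hidden state of each newly generated token.
import torch

@torch.no_grad()
def generate_and_collect(model, input_ids, max_new_tokens=8):
    collected = []          # one [batch, 1, hidden] tensor per generated token
    past_key_values = None  # KV cache; lets each step process only the newest token
    ids = input_ids
    for _ in range(max_new_tokens):
        out = model(
            input_ids=ids,
            past_key_values=past_key_values,
            use_cache=True,
            output_hidden_states=True,
        )
        past_key_values = out.past_key_values
        collected.append(out.hidden_states[-1][:, -1:, :])  # last layer, newest position
        next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = next_token  # with a cache, only the new token is fed back in
    # appended along the token dim -> [batch, new_tokens, hidden]
    return torch.cat(collected, dim=1)
```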