Convert HuggingFace model to FTPipe format.

Download the model

wget https://raw.githubusercontent.com/huggingface/transformers/v4.5.1/src/transformers/models/t5/modeling_t5.py

Add

from transformers.models.t5.modeling_t5 import *

to the top of the file to fix imports.
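The import fix can also be scripted. The following is a minimal sketch, not part of the FTPipe tooling: `prepend_import` is a hypothetical helper that adds the line above to the top of the downloaded file and does nothing if the file already starts with it.

```python
from pathlib import Path

IMPORT_LINE = "from transformers.models.t5.modeling_t5 import *"


def prepend_import(path, line=IMPORT_LINE):
    """Insert `line` at the top of `path` unless the file already starts with it.

    Hypothetical helper for illustration. Returns True if the file was changed.
    """
    file = Path(path)
    source = file.read_text(encoding="utf-8")
    if source.startswith(line):
        return False  # already patched, nothing to do
    file.write_text(line + "\n" + source, encoding="utf-8")
    return True
```

Calling `prepend_import("modeling_t5.py")` twice is safe, since the second call leaves the file unchanged.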

Run the conversion script to convert is None and is not None checks.

from autopipe.autopipe.utils import convert_none_checks
convert_none_checks(input_file="modeling_t5.py", output_file="modeling_t5.py")

Use "stateless" addons to allow shared linear weights.

Add

from models.normal.NLP_models.stateless import StatelessEmbedding

to the top of the file.

Then, add

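    def make_stateless(self):
        # Wrap the shared embedding: once to extract its weight, once per stack.
        stateless_shared = StatelessEmbedding(self.shared)
        self.encoder.embed_tokens = StatelessEmbedding(self.shared)
        self.decoder.embed_tokens = StatelessEmbedding(self.shared)

        # Drop the stateful module and the per-layer references to the weight.
        del self.shared
        self.encoder.embed_tokens.pop_weight()
        self.decoder.embed_tokens.pop_weight()

        # Keep a single reference to the shared weight on the model.
        self.shared_embed_weight = stateless_shared.pop_weight()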

to T5Model.

This makes calls to self.encoder.embed_tokens and self.decoder.embed_tokens accept the shared weight as the first parameter.
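To illustrate the pattern, here is a framework-free toy in plain Python. `ToyStatelessEmbedding` is a hypothetical stand-in; the real StatelessEmbedding is not shown in this README and may differ. The sketch only demonstrates the idea that the layer gives up its weight and then receives it as the first argument of every call.

```python
class ToyStatelessEmbedding:
    """Toy stand-in (assumption): the weight is an argument, not an attribute."""

    def __init__(self, table):
        self.weight = table  # held only until pop_weight() is called

    def pop_weight(self):
        """Detach the weight from the layer and hand it to the caller."""
        weight, self.weight = self.weight, None
        return weight

    def __call__(self, weight, input_ids):
        # Look up one row per token id in the weight passed by the caller.
        return [weight[i] for i in input_ids]


# A shared table with 3 tokens and 2 dimensions per token.
shared = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]

encoder_embed = ToyStatelessEmbedding(shared)
decoder_embed = ToyStatelessEmbedding(shared)

# Both layers drop their reference; the model keeps the only one.
encoder_embed.pop_weight()
shared_embed_weight = decoder_embed.pop_weight()

# Every call now passes the shared weight explicitly as the first argument.
enc_out = encoder_embed(shared_embed_weight, [2, 0])
dec_out = decoder_embed(shared_embed_weight, [1])
```

Because neither layer owns the weight any more, the sharing between encoder and decoder is visible at the call sites rather than hidden in module state.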

Then, make sure all calls get the new parameter self.shared_embed_weight. This requires the following changes in forward methods:

In T5Stack:

(1) Before:

            inputs_embeds = self.embed_tokens(input_ids)

(1) After:

            inputs_embeds = self.embed_tokens(shared_embedding, input_ids)

(2) Add shared_embedding as the first parameter in the forward declaration.

(3) In T5Model: in the callers self.decoder(...) and self.encoder(...), simply add self.shared_embed_weight as the first parameter.

Now, the model can be registered to the framework.

In addition:

Remove HuggingFace functions that are called at runtime but have not been converted, such as the head mask handling (to remove operator.is_).
Return a single value.
Check whether there are additional hidden operator.is_ calls.
training=self.training is traced as static; replace it.
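One way to look for leftover identity checks is to scan the source with Python's `ast` module. This is a minimal sketch under the assumption that the operator.is_ entries come from is / is not comparisons written in the file itself; `find_identity_checks` is a hypothetical helper, and it will not see checks performed inside library functions that the model calls at runtime.

```python
import ast


def find_identity_checks(source):
    """Return sorted (line, text) pairs for every `is` / `is not` comparison."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Compare) and any(
            isinstance(op, (ast.Is, ast.IsNot)) for op in node.ops
        ):
            # Recover the exact source text of the comparison (Python 3.8+).
            hits.append((node.lineno, ast.get_source_segment(source, node)))
    return sorted(hits)


sample = (
    "def f(x, mask):\n"
    "    if mask is not None:\n"
    "        x = x * mask\n"
    "    return x if x is None else x + 1\n"
)
remaining = find_identity_checks(sample)
```

For the converted file, the same function can be applied to the text read from modeling_t5.py; an empty result means no such comparison is left in that file.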

Explanation:

Conversion: done to help the tracer.
Stateless: this manually creates an edge from the shared weight to the new Stateless layers, which will accept it as a parameter. The rest will be handled by the framework.
Single value: currently only models with a single output value are supported (this can be easily changed).
