This repository was archived by the owner on Nov 1, 2024. It is now read-only.

Tokenization for downstream tasks #10

@danigoju

First of all, thank you very much for your work.

I am working on a long-text classification task, and given MEGA's spectacular results on long-sequence modelling, I would like to use it for this task. The one thing I have not figured out is how to tokenize my text samples. Could someone explain how to tokenize my text with the dictionary that comes with a checkpoint, such as the one for the LRA text task?

Thank you very much for your time.
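
For concreteness, here is a rough sketch of the kind of thing I am after. It assumes the checkpoint's dictionary is a fairseq-style `dict.txt` (one `symbol count` pair per line, with the four special symbols `<s>`, `<pad>`, `</s>`, `<unk>` reserved at indices 0 to 3) and that text is split on whitespace before lookup, as fairseq's default line tokenizer does. The names `load_vocab` and `encode` and the `max_len` parameter are hypothetical helpers of mine, not part of MEGA or fairseq. If the LRA dictionary was built over characters or bytes, the splitting step would have to emit those symbols instead, which is exactly the part I am unsure about.

```python
from typing import Dict, Iterable, List

# Assumption: fairseq's Dictionary reserves these four symbols, in this
# order, before any entry read from dict.txt.
SPECIALS = ["<s>", "<pad>", "</s>", "<unk>"]


def load_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Build a symbol -> index map from fairseq-style 'symbol count' lines."""
    vocab = {sym: idx for idx, sym in enumerate(SPECIALS)}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # The count is the last space-separated field; the rest is the symbol.
        symbol, _count = line.rsplit(" ", 1)
        vocab.setdefault(symbol, len(vocab))
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int = 4096) -> List[int]:
    """Map whitespace-separated symbols to ids, truncate, and append </s>."""
    unk, eos = vocab["<unk>"], vocab["</s>"]
    ids = [vocab.get(tok, unk) for tok in text.split()]
    # Leave room for the end-of-sentence id within max_len.
    return ids[: max_len - 1] + [eos]


if __name__ == "__main__":
    # Toy dictionary for illustration only; a real run would read the
    # dict.txt that accompanies the checkpoint, e.g. load_vocab(open(path)).
    toy_vocab = load_vocab(["hello 10", "world 5"])
    print(encode("hello there world", toy_vocab))
```

The resulting id lists would still need padding with the `<pad>` index and batching before they can be fed to the model.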
