@Ingvarstep commented on Jul 30, 2025

Implement a joint encoder–decoder architecture for NER.
The encoder models spans and conditions the label generation performed by the decoder. Additionally, prompt-based decoding is supported, enabling label generation based on prompt anchors.


Architecture

[Figure: architecture diagram]


Usage

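from gliner import GLiNER

# Replace 'model-path' with the path or name of your checkpoint.
model = GLiNER.from_pretrained('model-path')

text = "Apple was founded as Apple Computer Company on April 1, 1976, by Steve Wozniak, Steve Jobs (1955–2011) and Ronald Wayne to develop and sell Wozniak's Apple I personal computer."

# The call below takes `texts`; wrap the single string in a list
# (the nested example output suggests one result list per input text).
texts = [text]
labels = ["person", "other"]

model.run(texts, labels, threshold=0.3, num_gen_sequences=1)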

Example output:

[
  [
    {
      "start": 21,
      "end": 26,
      "text": "Apple",
      "label": "other",
      "score": 0.6795641779899597,
      "generated labels": ["Organization"]
    },
    {
      "start": 47,
      "end": 60,
      "text": "April 1, 1976",
      "label": "other",
      "score": 0.44296327233314514,
      "generated labels": ["Date"]
    },
    {
      "start": 65,
      "end": 78,
      "text": "Steve Wozniak",
      "label": "person",
      "score": 0.9934439659118652,
      "generated labels": ["Person"]
    },
    {
      "start": 80,
      "end": 90,
      "text": "Steve Jobs",
      "label": "person",
      "score": 0.9725918769836426,
      "generated labels": ["Person"]
    },
    {
      "start": 107,
      "end": 119,
      "text": "Ronald Wayne",
      "label": "person",
      "score": 0.9964536428451538,
      "generated labels": ["Person"]
    }
  ]
]
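
To make the output format concrete, here is a small post-processing sketch. It assumes, based only on the example above, that `start` and `end` are end-exclusive character offsets into the input string and that the decoder's labels live under the `"generated labels"` key. The helpers `check_offsets` and `group_by_generated_label` are hypothetical names for illustration and are not part of GLiNER.

```python
from collections import defaultdict

def check_offsets(text, entities):
    """Return the entities whose text[start:end] slice differs from their 'text' field."""
    return [e for e in entities if text[e["start"]:e["end"]] != e["text"]]

def group_by_generated_label(entities):
    """Map each decoder-generated label to the span texts it was produced for."""
    grouped = defaultdict(list)
    for e in entities:
        # Key name taken from the example output; treated as optional here.
        for generated in e.get("generated labels", []):
            grouped[generated].append(e["text"])
    return dict(grouped)

text = (
    "Apple was founded as Apple Computer Company on April 1, 1976, by "
    "Steve Wozniak, Steve Jobs (1955–2011) and Ronald Wayne to develop "
    "and sell Wozniak's Apple I personal computer."
)

# Entities copied from the example output above (scores omitted for brevity).
entities = [
    {"start": 21, "end": 26, "text": "Apple", "label": "other",
     "generated labels": ["Organization"]},
    {"start": 47, "end": 60, "text": "April 1, 1976", "label": "other",
     "generated labels": ["Date"]},
    {"start": 65, "end": 78, "text": "Steve Wozniak", "label": "person",
     "generated labels": ["Person"]},
    {"start": 80, "end": 90, "text": "Steve Jobs", "label": "person",
     "generated labels": ["Person"]},
    {"start": 107, "end": 119, "text": "Ronald Wayne", "label": "person",
     "generated labels": ["Person"]},
]

mismatched = check_offsets(text, entities)
grouped = group_by_generated_label(entities)
```

Grouping by the generated label rather than by `label` is useful when a coarse class such as "other" is passed in and the decoder is relied on for the finer-grained type.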

You can constrain the set of possible labels generated by the decoder:

model.run(
    texts, labels,
    threshold=0.3,
    num_gen_sequences=1,
    gen_constraints=[
        "organization type", "city", "organization",
        "technology", "date", "person"
    ]
)

Two implementations of the label trie are available. For the more efficient C++-based version, install Cython:

pip install cython

This improves memory efficiency and computational performance when working with millions of labels.
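
For readers unfamiliar with the idea, the following is a minimal pure-Python sketch of how a label trie can restrict generation to a fixed label set. It is not the implementation in this PR: the class `LabelTrie`, the `END` marker, and whitespace tokenization are all assumptions made for illustration, and a real decoder would work on tokenizer IDs instead of words.

```python
END = "</s>"  # hypothetical end-of-label marker, standing in for an EOS token

class LabelTrie:
    """Prefix tree over tokenized labels (illustrative only)."""

    def __init__(self, labels):
        self.root = {}
        for label in labels:
            node = self.root
            # Whitespace split stands in for a real subword tokenizer.
            for token in label.split():
                node = node.setdefault(token, {})
            node[END] = {}  # mark that a complete label ends here

    def allowed_next(self, prefix):
        """Return the set of tokens that may follow `prefix` (a list of tokens)."""
        node = self.root
        for token in prefix:
            if token not in node:
                return set()  # prefix matches no permitted label
            node = node[token]
        return set(node.keys())

trie = LabelTrie([
    "organization type", "city", "organization",
    "technology", "date", "person",
])

first_tokens = trie.allowed_next([])
after_organization = trie.allowed_next(["organization"])
after_city = trie.allowed_next(["city"])
unknown_prefix = trie.allowed_next(["country"])
```

At each decoding step, only the tokens returned by `allowed_next` would be kept as candidates, so every finished sequence is one of the permitted labels. A nested-dict trie like this is easy to read but stores one Python dict per node, which is where a compiled implementation would be expected to save memory for very large label sets.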

@Ingvarstep merged commit 3498055 into main on Aug 15, 2025
@Ingvarstep deleted the labels_decoder branch on August 15, 2025, 12:27