[BUG: Output] Incorrect word-level bboxes in >=v0.15.0 #450

@mzilinec

📝 Describe the Output Issue

Hi guys, I have just observed that in versions >=0.15.0, word-level (and character-level) bounding boxes seem to be much less accurate than before (or maybe they are scaled incorrectly?). This seems to happen across different documents.

I tried the minimal code below, which produces correct bboxes on 0.14.7 but incorrect ones on 0.16.3. I'm attaching some examples of the result.

[Images: example results]

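from random import randint

from PIL import ImageDraw
from surya.detection import DetectionPredictor
# from surya.foundation import FoundationPredictor
from surya.recognition import RecognitionPredictor

# `images` is a list of PIL images loaded beforehand.
# Uncomment the FoundationPredictor lines and pass it in on versions that need it.
# foundation_predictor = FoundationPredictor()
recognition_predictor = RecognitionPredictor()  # or RecognitionPredictor(foundation_predictor)
detection_predictor = DetectionPredictor()

predictions = recognition_predictor([images[0]], det_predictor=detection_predictor, return_words=True)

# Draw on a copy so the source image stays untouched.
drawn_im = images[0].copy()
draw = ImageDraw.Draw(drawn_im)

def random_color():
    return randint(0, 255), randint(0, 255), randint(0, 255)

for cell in predictions[0].text_lines:
    # Line-level box in a random color.
    draw.rectangle(cell.bbox, outline=random_color())
    # print(cell.original_text_good)  # False
    if cell.chars:
        # Character-level polygons in dark red.
        for char in cell.chars:
            poly = [(int(x), int(y)) for x, y in char.polygon]
            draw.polygon(poly, outline=(128, 0, 0))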
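
To put a number on the drift instead of eyeballing the drawings, a check like the one below might help. It is only a sketch: `polygon_inside_bbox`, `fraction_inside` and the `tol` slack are hypothetical helpers, not part of the library, and it assumes polygons are sequences of (x, y) points and bboxes are (x0, y0, x1, y1) in the same pixel space.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def polygon_inside_bbox(polygon: Iterable[Point], bbox: BBox, tol: float = 2.0) -> bool:
    """Return True if every vertex lies within bbox, allowing tol pixels of slack."""
    x0, y0, x1, y1 = bbox
    return all(
        x0 - tol <= x <= x1 + tol and y0 - tol <= y <= y1 + tol
        for x, y in polygon
    )


def fraction_inside(polygons: Sequence[Iterable[Point]], bbox: BBox, tol: float = 2.0) -> float:
    """Return the share of polygons fully contained in bbox (1.0 for an empty list)."""
    if not polygons:
        return 1.0
    inside = sum(polygon_inside_bbox(p, bbox, tol) for p in polygons)
    return inside / len(polygons)
```

For each line this could be called as `fraction_inside([c.polygon for c in cell.chars], cell.bbox)`. If the character boxes are well aligned the value should stay close to 1.0, so a clearly lower value on the newer version would point to a scaling or offset problem rather than noise.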
