Custom words dictionary #73

@venzen

Thank you for this good work. I have two questions about using this tool. First let me briefly explain my use case:

I am translating Buddhist texts from Thai to English for the Mahachulalangkornraachawitayaalay (MCU). The source material consists of images, so I first run OCR (with tesseract) and then edit the output into Markdown. After that I can translate it to English using Google Translate. During OCR, some characters and annotations are missed or misinterpreted. I hope deepcut can help me correct the words that OCR gets wrong. For example, the correct word is 'ประจําบท', but OCR misses the sara am and returns 'ประจาบท'.

  1. Can deepcut help in this case?
  2. If there are new or unseen words in the text, how can I add these words to deepcut for identification in the future?
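
To make the use case concrete, here is a minimal sketch of the kind of correction step described above. It uses only Python's standard `difflib` and is not deepcut's API. The word list and the `suggest_correction` helper are hypothetical, and the tokens are assumed to come from a word segmenter such as deepcut.

```python
import difflib

# Hypothetical custom word list. In practice this might be loaded
# from a text file with one known-good word per line.
CUSTOM_WORDS = ["ประจําบท", "พระสูตร"]


def suggest_correction(token, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word for `token`.

    If the token is already in the vocabulary, or no vocabulary word
    is at least `cutoff` similar to it, the token is returned unchanged.
    """
    if token in vocabulary:
        return token
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# Assumed segmenter output containing the OCR error from the example above.
ocr_tokens = ["ประจาบท"]
corrected = [suggest_correction(t, CUSTOM_WORDS) for t in ocr_tokens]
```

With this approach, new or unseen words would be handled by appending them to the word list. The `cutoff` value is a guess and would need tuning, since a low threshold can replace a correct word with a similar-looking one.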
