Go SymSpell

Overview

Go SymSpell is a fast spell-checking and correction library for Go. It implements the SymSpell algorithm, which is built on the “symmetric delete” approach. Traditional spell checkers generate many variations of the input word at lookup time. SymSpell instead precomputes the deletions of every dictionary word up to a given edit distance and stores them in an index, so a lookup only has to generate deletions of the input and match them against that index. This keeps lookups fast while keeping correction quality high.
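
To make the symmetric delete idea concrete, here is a minimal, self-contained sketch. It does not use this library; the names `generateDeletes`, `buildIndex`, and `candidates` are hypothetical and chosen only for illustration, and optimizations such as the prefix length are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// generateDeletes returns, in sorted order, every distinct string that can be
// obtained from word by deleting between 1 and maxDistance characters.
func generateDeletes(word string, maxDistance int) []string {
	seen := map[string]struct{}{}
	frontier := []string{word}
	for d := 0; d < maxDistance; d++ {
		var next []string
		for _, w := range frontier {
			r := []rune(w)
			for i := range r {
				// Remove the rune at position i.
				cand := string(r[:i]) + string(r[i+1:])
				if _, ok := seen[cand]; ok {
					continue
				}
				seen[cand] = struct{}{}
				next = append(next, cand)
			}
		}
		frontier = next
	}
	out := make([]string, 0, len(seen))
	for s := range seen {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

// buildIndex maps each dictionary word and each of its deletes back to the
// dictionary words that produce it.
func buildIndex(words []string, maxDistance int) map[string][]string {
	index := map[string][]string{}
	for _, w := range words {
		index[w] = append(index[w], w)
		for _, d := range generateDeletes(w, maxDistance) {
			index[d] = append(index[d], w)
		}
	}
	return index
}

// candidates looks up the input and all of its deletes in the index and
// returns the distinct dictionary words found, sorted alphabetically.
func candidates(index map[string][]string, input string, maxDistance int) []string {
	keys := append([]string{input}, generateDeletes(input, maxDistance)...)
	found := map[string]struct{}{}
	for _, k := range keys {
		for _, w := range index[k] {
			found[w] = struct{}{}
		}
	}
	out := make([]string, 0, len(found))
	for w := range found {
		out = append(out, w)
	}
	sort.Strings(out)
	return out
}

func main() {
	index := buildIndex([]string{"spelling", "spewing"}, 2)
	fmt.Println(generateDeletes("the", 1))       // [he te th]
	fmt.Println(candidates(index, "speling", 2)) // [spelling spewing]
}
```

Sharing a delete does not guarantee that two words are within the edit distance limit, so a full implementation would still verify each candidate with a real edit distance computation before returning it.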

Installation

go mod init your-project
go get github.com/Trendyol/go-symspell

Quick Start

package main

import (
    "fmt"
    "log"

    "github.com/Trendyol/go-symspell/symspell"
    "github.com/Trendyol/go-symspell/verbosity"
)

func main() {
    // Create a new SymSpell instance with default settings
    ss, err := symspell.NewSymSpell()
    if err != nil {
        log.Fatalf("Failed to create spell checker: %v", err)
    }

    // Load dictionary (word frequency_count format)
    _, err = ss.LoadDictionary("dictionary.txt", 0, 1, " ")
    if err != nil {
        log.Fatalf("Failed to load dictionary: %v", err)
    }

    // Get spelling suggestions
    suggestions, err := ss.Lookup("speling", verbosity.Top, 2)
    if err != nil {
        log.Fatalf("Lookup failed: %v", err)
    }

    // Print results
    for _, suggestion := range suggestions {
        fmt.Printf("Suggestion: %s (Distance: %d, Frequency: %d)\n", 
            suggestion.Term, suggestion.Distance, suggestion.Count)
    }
}

Configuration Options

Create a SymSpell instance with custom configuration:

ss, err := symspell.NewSymSpell(
    symspell.WithMaxDictionaryEditDistance(2),  // Maximum edit distance for dictionary
    symspell.WithPrefixLength(7),               // Prefix length for optimization
    symspell.WithIncludeUnknown(true),          // Include unknown words in results
    symspell.WithTransferCasing(true),          // Transfer original casing
    symspell.WithIgnoreNonWords(true),          // Skip non-word tokens
    symspell.WithSplitBySpace(true),            // Enable compound word splitting
)

Available Options

| Option | Default | Description |
| --- | --- | --- |
| MaxDictionaryEditDistance | 2 | Maximum edit distance for precomputed deletes |
| PrefixLength | 7 | Length of word prefixes for optimization |
| InitialCapacity | 16 | Initial dictionary capacity |
| CountThreshold | 1 | Minimum frequency threshold for words |
| DistanceAlgorithm | DamerauOSAFast | Edit distance algorithm |
| IncludeUnknown | false | Include input word even if not in dictionary |
| TransferCasing | false | Apply original casing to suggestions |
| IgnoreNonWords | false | Skip tokens that aren't words |
| IgnoreTermWithDigits | false | Skip words containing digits |
| SplitBySpace | false | Split compound words automatically |
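
For readers unfamiliar with the default distance algorithm, the sketch below computes a Damerau-Levenshtein distance in its optimal string alignment (OSA) form. It assumes that `DamerauOSAFast` refers to this variant; it is a plain textbook version written for illustration, not the library's implementation, and `osaDistance` and `minOf` are hypothetical names.

```go
package main

import "fmt"

// minOf returns the smallest of the given values.
func minOf(first int, rest ...int) int {
	m := first
	for _, v := range rest {
		if v < m {
			m = v
		}
	}
	return m
}

// osaDistance counts the insertions, deletions, substitutions, and
// transpositions of adjacent characters needed to turn a into b, under the
// OSA restriction that no substring is edited more than once.
func osaDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// d[i][j] holds the distance between ra[:i] and rb[:j].
	d := make([][]int, len(ra)+1)
	for i := range d {
		d[i] = make([]int, len(rb)+1)
		d[i][0] = i
	}
	for j := 0; j <= len(rb); j++ {
		d[0][j] = j
	}
	for i := 1; i <= len(ra); i++ {
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			d[i][j] = minOf(
				d[i-1][j]+1,      // deletion
				d[i][j-1]+1,      // insertion
				d[i-1][j-1]+cost, // substitution or match
			)
			// Transposition of two adjacent characters.
			if i > 1 && j > 1 && ra[i-1] == rb[j-2] && ra[i-2] == rb[j-1] {
				d[i][j] = minOf(d[i][j], d[i-2][j-2]+1)
			}
		}
	}
	return d[len(ra)][len(rb)]
}

func main() {
	fmt.Println(osaDistance("speling", "spelling")) // 1
	fmt.Println(osaDistance("teh", "the"))          // 1
}
```

With a maximum edit distance of 2, both misspellings above would fall within range of their intended words.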

Dictionary Format

A dictionary file lists one word per line, followed by its frequency count and separated from it by a space:

the 1061396
of 593677
to 416629
and 411764
a 409757

Verbosity Levels

The verbosity argument controls how many suggestions Lookup returns:

import "github.com/Trendyol/go-symspell/verbosity"

// Top suggestion only (errors ignored here for brevity)
top, _ := ss.Lookup("word", verbosity.Top, 2)

// Closest matches within edit distance
closest, _ := ss.Lookup("word", verbosity.Closest, 2)

// All suggestions within edit distance
all, _ := ss.Lookup("word", verbosity.All, 2)
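
The difference between the three levels can be sketched as a filter over a candidate list. The example below follows the definitions used by the original SymSpell project (Top is the single best match, Closest is every match at the smallest distance found, All is everything within range); whether this library behaves identically is an assumption. The `Suggestion` and `Verbosity` types here are local stand-ins, not the library's types.

```go
package main

import (
	"fmt"
	"sort"
)

// Suggestion is a stand-in for a lookup result.
type Suggestion struct {
	Term     string
	Distance int
	Count    int
}

// Verbosity is a stand-in for the verbosity levels.
type Verbosity int

const (
	Top Verbosity = iota
	Closest
	All
)

// filterByVerbosity orders candidates by ascending distance, then by
// descending frequency, and trims the list according to v.
func filterByVerbosity(cands []Suggestion, v Verbosity) []Suggestion {
	sorted := append([]Suggestion(nil), cands...) // copy; leave input untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Distance != sorted[j].Distance {
			return sorted[i].Distance < sorted[j].Distance
		}
		return sorted[i].Count > sorted[j].Count
	})
	if len(sorted) == 0 || v == All {
		return sorted
	}
	if v == Top {
		return sorted[:1]
	}
	// Closest: keep every suggestion that shares the smallest distance.
	n := 0
	for n < len(sorted) && sorted[n].Distance == sorted[0].Distance {
		n++
	}
	return sorted[:n]
}

func main() {
	// Hypothetical candidates with made-up counts, for illustration only.
	cands := []Suggestion{
		{Term: "world", Distance: 1, Count: 50},
		{Term: "word", Distance: 0, Count: 10},
		{Term: "ward", Distance: 1, Count: 80},
	}
	fmt.Println(len(filterByVerbosity(cands, Top)))     // 1
	fmt.Println(len(filterByVerbosity(cands, Closest))) // 1
	fmt.Println(len(filterByVerbosity(cands, All)))     // 3
}
```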
