Skip to content

vvatanabe/go-sdbm

Repository files navigation

go-sdbm

Go codecov Go Report Card Go Reference

go-sdbm is a pure Go implementation of the classic SDBM key-value store.

About SDBM

SDBM (Substitute DBM) is a public domain clone of the ndbm library, originally implemented in C by Ozan Yigit. It provides a simple and efficient way to store and retrieve key-value pairs using dynamic hashing, based on P.-A. Larson’s 1978 algorithm known as “Dynamic Hashing”. SDBM is recognized for being practical, easy to understand, and compatible with ndbm, while offering improvements such as reduced file sizes and faster database creation.

C Compatibility

This Go implementation produces database files that are binary-compatible with C SDBM under the following platform assumptions:

Assumption Detail
Byte order Little-endian. Page index offsets are stored as little-endian uint16 values. C SDBM uses native short * casts, so databases created on big-endian C hosts are not compatible.
Data model LP64 (long = 64-bit). The hash function uses 64-bit arithmetic. Databases created by C SDBM compiled in ILP32 mode (long = 32-bit) may produce different hash values and page layouts.
char signedness Signed (default on x86-64). The hash function interprets each byte as signed char before promoting to the hash accumulator, matching the default behavior of char on most x86-64 and ARM64 compilers. C toolchains that define char as unsigned (e.g., gcc -funsigned-char, some ARM configurations) will produce different hash values.

In practice, these conditions are met by Linux and macOS on x86-64 and ARM64 with standard toolchains (GCC, Clang), which covers the vast majority of modern environments.

Intentional Differences from C

The following behaviors intentionally differ from the original C implementation for safety or idiomatic Go reasons:

  • ChkPage rejects odd entry counts: Go's ChkPage returns false when the entry count is odd (n%2 != 0). The C version does not perform this check. This only affects corrupted pages and prevents potential out-of-bounds access during iteration.
  • EOF handling in iteration: When NextKey reaches the end of the .pag file, Go returns (nil, nil) (normal termination). C sets the DBM_IOERR flag and returns nullitem, treating EOF and I/O errors identically. This does not affect database file contents.
  • Error reporting: Go returns error values instead of using errno and a global I/O error flag. Write failures are propagated to the caller rather than silently setting an internal flag.

Verified by

Compatibility is verified by a multi-layer test suite:

  1. Hash function: Direct comparison of Go Hash() vs C dbm_hash() via CGo, including signed-char edge cases.
  2. Page operations: Byte-for-byte comparison of Go and C page buffers after PutPair, GetPair, DelPair, SplPage, FitPair, ChkPage, GetNKey, and DupPair.
  3. Golden files: Reading databases generated by a C program (testdata/golden/) and verifying all keys and values.
  4. File hash comparison: SHA-256 comparison of .dir and .pag files produced by Go vs C for identical input sequences.

Motivation

Reimplementing a simple and classic key-value store like SDBM in the Go programming language offers the following objectives and learning opportunities:

  • Data Structures and Algorithms: Gain a deep understanding of the internal workings of classic key-value stores (KVS). By learning concepts such as dynamic hashing methods, page management, and bitmaps, you can master efficient data storage and retrieval techniques.
  • Hash Functions and Hashing Techniques: Learn about selecting and implementing effective hash functions and collision resolution methods. This is crucial for deepening your foundational understanding of databases and cache systems.
  • File I/O and Data Persistence in Go: Acquire knowledge on how to directly manipulate the file system and persist data. This enhances your understanding of Go's os package and binary data reading and writing.

Through this project, you can learn how to implement a simple and efficient KVS in Go, thereby deepening your understanding of the fundamental principles of system programming and databases.

Installation

To install the sdbm package, run:

go get github.com/vvatanabe/go-sdbm

Documentation

Detailed documentation is available on pkg.go.dev.

Usage

Below is an example demonstrating basic usage of the sdbm package:

package main

import (
	"fmt"
	"os"

	"github.com/vvatanabe/go-sdbm"
)

func main() {
	// Open the database file (creates it if it doesn't exist)
	db, err := sdbm.Open("mydatabase", os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Create key and value data
	key := sdbm.Datum("mykey")
	value := sdbm.Datum("myvalue")

	// Store the key-value pair in the database
	success, err := db.Store(key, value, sdbm.StoreREPLACE)
	if err != nil {
		panic(err)
	}
	if success {
		fmt.Println("Key-value pair stored successfully.")
	}

	// Fetch the value associated with the key
	fetchedValue, err := db.Fetch(key)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Fetched Value: %s\n", fetchedValue)

	// Delete the key-value pair from the database
	deleted, err := db.Delete(key)
	if err != nil {
		panic(err)
	}
	if deleted {
		fmt.Println("Key-value pair deleted successfully.")
	}

	// First keys in the database
	key, err = db.FirstKey()
	if err != nil {
		panic(err)
	}
	fmt.Printf("FirstKey: %s\n", key)

	// Next keys in the database
	key, err = db.NextKey()
	if err != nil {
		panic(err)
	}
	fmt.Printf("NextKey: %s\n", key)
}

Acknowledgments

Authors

License

This project is licensed under the CC0 1.0 Universal.

About

go-sdbm is a pure Go implementation of the classic SDBM key-value store.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages