1 change: 1 addition & 0 deletions .gitignore
@@ -56,3 +56,4 @@ lint/
# Config files and environment files (make sure to add only those you do not wish to commit)
.env
config.yaml
TODO.md
112 changes: 34 additions & 78 deletions README.md
@@ -3,16 +3,14 @@
[![Go Report Card](https://goreportcard.com/badge/github.com/stackloklabs/gollm)](https://goreportcard.com/report/github.com/stackloklabs/gollm)
[![License](https://img.shields.io/github/license/stackloklabs/gollm)](LICENSE)

Gollm is a Go library that provides an easy interface to interact with AI backends
like [Ollama](https://ollama.com) and [OpenAI](https://openai.com).
Gollm is a Go library that provides an easy interface to interact with Large
Language Model backends including [Ollama](https://ollama.com) and [OpenAI](https://openai.com), along with an embeddings interface for RAG.

Quickly generate responses and embeddings from these AI models and integrate
them into your Go applications.

## 🌟 Features

- **Interact with Ollama & OpenAI:** Generate responses from multiple AI backends.
- **Embeddings Generation:** Generate text embeddings for your applications.
- **RAG / Embeddings Generation:** Generate text embeddings and store / load them to a vector database for RAG.

---

@@ -40,97 +38,55 @@ Pull and run a model
ollama run qwen2.5
```

Ollama should run on `localhost` port `11435`; if you change this, don't
Ollama should run on `localhost` port `11434`; if you change this, don't
forget to update your config.

# 3. Configuration
## 3. OpenAI

You'll need an OpenAI API key to use the OpenAI backend, which can be
set within the config as shown below.

## 4. Configuration

Gollm uses Viper to manage configuration settings.

Backends are configured for either generation or embeddings, and can be set to either Ollama or OpenAI.

For each backend a model is set. Note that for Ollama you will need to
have the model running, e.g. `ollama run qwen2.5` or `ollama run mxbai-embed-large`.

Finally, in the case of RAG embeddings, a database URL is required.

Currently Postgres is supported, and the database should be created before running the application, with the schema provided in `db/init.sql`.

Should you wish, the provided docker-compose file will automate the setup of the database.

```bash
cp examples/config-example.yaml ./config.yaml
```

```yaml
backend:
  embeddings: "ollama" # or "openai"
  generation: "ollama" # or "openai"
ollama:
  host: "http://localhost:11434"
  model: "your-ollama-model-name"
  gen_model: "qwen2.5"
  emb_model: "mxbai-embed-large"
openai:
  api_key: "your-openai-api-key"
  model: "text-davinci-003"
  api_key: "your-key"
  gen_model: "gpt-3.5-turbo"
  emb_model: "text-embedding-ada-002"
database:
  url: "postgres://user:password@localhost:5432/dbname?sslmode=disable"
log_level: "info"
```

# 🛠️ Usage

Best bet is to see `/examples/main.go` for reference

# 📋 API Reference

Initialise the config (or roll your own):

```go
cfg := config.InitializeViperConfig("config", "yaml", ".")
```
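For illustration, here is a minimal sketch of selecting a backend from the `backend` settings. It assumes the config object exposes Viper-style `GetString` getters and that both backends satisfy a shared `backend.Backend` interface; both are assumptions, so check the actual gollm API:

```go
// Minimal sketch, not gollm's confirmed API: choose a generation
// backend based on the "backend.generation" setting.
var gen backend.Backend // assumed shared interface over both backends
if cfg.GetString("backend.generation") == "openai" {
	gen = backend.NewOpenAIBackend(cfg.GetString("openai.api_key"), cfg.GetString("openai.gen_model"))
} else {
	gen = backend.NewOllamaBackend(cfg.GetString("ollama.host"), cfg.GetString("ollama.gen_model"))
}
// gen can now be used for Generate calls regardless of provider.
```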

## Ollama Integration

Create Ollama Backend Instance:

```go
ollamaBackend := backend.NewOllamaBackend(cfg.Get("ollama.host"), cfg.Get("ollama.model"))
```

Generate Response:

```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

response, err := ollamaBackend.Generate(ctx, "Your prompt here")
if err != nil {
	log.Fatal(err)
}

fmt.Printf("Model: %s\nResponse: %s\n", response.Model, response.Response)
```

Best bet is to see `/examples/main.go` for reference; this explains how to use
the library with full examples for generation, embeddings and implementing RAG.

Embeddings response:

Support is also present for Ollama's embeddings API.

```go
embeddingResponse, err := ollamaBackend.Embed(ctx, "Text to generate embedding for")
```

> **Note**
> 📝 Only certain models provide an embeddings interface; see the [Ollama docs](https://ollama.com/blog/embedding-models) for more details.
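In the RAG setup this PR adds, an embedding like the one above can be stored in the Postgres schema from `db/init.sql`. Below is a minimal sketch, assuming the embedding is available as a `[]float32`; it uses plain `database/sql` with the `lib/pq` driver rather than any gollm-provided helper, and `storeEmbedding` is a hypothetical name:

```go
package rag

import (
	"database/sql"
	"strconv"
	"strings"

	_ "github.com/lib/pq" // Postgres driver
)

// storeEmbedding inserts one embedding into the ollama_embeddings
// table created by db/init.sql.
func storeEmbedding(db *sql.DB, docID string, emb []float32) error {
	// pgvector accepts a text literal of the form "[0.1,0.2,...]",
	// so format the vector as text and cast it server-side.
	parts := make([]string, len(emb))
	for i, v := range emb {
		parts[i] = strconv.FormatFloat(float64(v), 'f', -1, 32)
	}
	vec := "[" + strings.Join(parts, ",") + "]"
	_, err := db.Exec(
		`INSERT INTO ollama_embeddings (doc_id, embedding, metadata)
		 VALUES ($1, $2::vector, '{}')`,
		docID, vec,
	)
	return err
}
```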

## OpenAI Integration

Create OpenAI Backend Instance:

```go
openaiBackend := backend.NewOpenAIBackend(cfg.Get("openai.api_key"), cfg.Get("openai.model"))
```

Generate Response:

```go
response, err := openaiBackend.Generate(ctx, "Your prompt here")
if err != nil {
	log.Fatal(err)
}

if len(response.Choices) > 0 {
	fmt.Printf("Choice Index: %d\n", response.Choices[0].Index)
	fmt.Printf("Message Role: %s\n", response.Choices[0].Message.Role)
	fmt.Printf("Message Content: %s\n", response.Choices[0].Message.Content)
	fmt.Printf("Finish Reason: %s\n", response.Choices[0].FinishReason)
}
```

Embeddings response:

Support is also present for the OpenAI embeddings API

```go
embeddingResponse, err := openaiBackend.GenerateEmbedding(ctx, "Text to generate embedding for")
```
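The same storage pattern sketched in the Ollama section above applies here, writing to the `openai_embeddings` table (1536-dimensional) instead.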

# 📝 Contributing

Expand Down
28 changes: 28 additions & 0 deletions db/init.sql
@@ -0,0 +1,28 @@
-- Enable the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create the table to store embeddings for OpenAI
CREATE TABLE openai_embeddings (
    id SERIAL PRIMARY KEY,
    doc_id TEXT NOT NULL,
    embedding VECTOR(1536), -- Replace 1536 with the actual size of your embeddings
    metadata JSONB
);

-- Create the table to store embeddings for Ollama
CREATE TABLE ollama_embeddings (
    id SERIAL PRIMARY KEY,
    doc_id TEXT NOT NULL,
    embedding VECTOR(1024), -- Replace 1024 with the actual size of your embeddings
    metadata JSONB
);

-- Index for efficient vector similarity search for OpenAI embeddings
CREATE INDEX openai_embeddings_idx ON openai_embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Tune 'lists' for better performance

-- Index for efficient vector similarity search for Ollama embeddings
CREATE INDEX ollama_embeddings_idx ON ollama_embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Tune 'lists' for better performance
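For retrieval, a query embedding is compared against these tables using pgvector's cosine-distance operator `<=>`, which the `vector_cosine_ops` indexes above accelerate. Here is a minimal Go sketch; `nearestDocs` is a hypothetical helper using the same `database/sql` setup as the storage sketch in the README section:

```go
// nearestDocs returns the doc_ids of the k stored embeddings closest
// (by cosine distance, pgvector's <=> operator) to queryVec, a vector
// literal such as "[0.1,0.2,...]".
func nearestDocs(db *sql.DB, queryVec string, k int) ([]string, error) {
	rows, err := db.Query(
		`SELECT doc_id FROM ollama_embeddings
		 ORDER BY embedding <=> $1::vector
		 LIMIT $2`,
		queryVec, k,
	)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []string
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}
```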
18 changes: 18 additions & 0 deletions docker-compose.yaml
@@ -0,0 +1,18 @@
version: '3.8'

services:
  postgres:
    image: ankane/pgvector:latest
    container_name: pgvector-db
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: dbname
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql

volumes:
  pgdata:
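With this file in place, `docker compose up -d` starts Postgres with the pgvector extension preinstalled, and `db/init.sql` runs automatically the first time the `pgdata` volume is initialised.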
16 changes: 12 additions & 4 deletions examples/config-example.yaml
@@ -1,6 +1,14 @@
backend:
  embeddings: "ollama" # or "openai"
  generation: "ollama" # or "openai"
ollama:
  host: "http://localhost:11435"
  model: "qwen2.5"
  host: "http://localhost:11434"
  gen_model: "qwen2.5"
  emb_model: "mxbai-embed-large"
openai:
  api_key: "some-key"
  model: "gpt-4o-mini"
  api_key: "your-key"
  gen_model: "gpt-3.5-turbo"
  emb_model: "text-embedding-ada-002"
database:
  url: "postgres://user:password@localhost:5432/dbname?sslmode=disable"
log_level: "info"