A modular Node.js + Express API that implements semantic caching for efficient LLM workflows.
This project provides:
- A prompt module to handle requests to LLM providers (e.g., OpenAI, DeepSeek, Gemini)
- A semantic cache module to reduce token costs and latency by caching LLM responses based on semantic similarity rather than exact string match
Basic use case: before calling an expensive LLM API, check whether a semantically similar question has already been answered. If so, instantly return the cached response from Redis.
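In code, that flow looks roughly like the sketch below. The `semanticCache` and `promptService` objects are hypothetical stand-ins for the real services under `src/modules/`:

```ts
// Hypothetical stand-ins for the real services under src/modules/.
declare const semanticCache: {
  lookup(question: string): Promise<string | null>;
  store(question: string, response: string): Promise<void>;
};
declare const promptService: {
  send(question: string): Promise<string>;
};

async function answer(question: string): Promise<string> {
  // 1. Vector search in Redis for a semantically similar, already-answered question.
  const cached = await semanticCache.lookup(question);
  if (cached !== null) return cached; // cache hit: no tokens spent

  // 2. Cache miss: pay for the LLM call, then cache the answer for next time.
  const response = await promptService.send(question);
  await semanticCache.store(question, response);
  return response;
}
```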
```
src/
│   server.ts
│
├───modules
│   ├───prompt
│   │   ├───controllers
│   │   │       prompt.controller.ts
│   │   ├───dtos
│   │   │       prompt.dto.ts
│   │   ├───interfaces
│   │   │   ├───abstract
│   │   │   │       base-prompt-service.ts
│   │   │   ├───prompt-response.interface.ts
│   │   │   ├───prompt.service.interface.ts
│   │   │   └───request-body.interface.ts
│   │   ├───routes
│   │   │       prompt.routes.ts
│   │   └───services
│   │           deepseek-prompt.service.ts
│   │
│   └───semantic_cache
│       ├───config
│       │       cache.config.ts
│       ├───interfaces
│       │       cache.config.interface.ts
│       │       document.interface.ts
│       │       embedding.repository.interface.ts
│       │       ft-raw-search-result.interface.ts
│       │       vector-search-result.interface.ts
│       ├───providers
│       │       gemini.provider.ts
│       │       openai.provider.ts
│       ├───services
│       │   ├───core
│       │   │       embedding.service.ts
│       │   │       semantic-cache.service.ts
│       │   ├───search
│       │   │       cached-answer.service.ts
│       │   │       parse-vector-search.service.ts
│       │   │       vector-search.service.ts
│       │   └───storage
│       │           document-storage.service.ts
│       │           index-management.service.ts
│       └───utils
│               buffer-to-float32.ts
│               cosine-similarity.ts
│               float32-to-buffer.ts
│
└───shared
    ├───exceptions
    ├───middlewares
    ├───services
    └───utils
```
Instead of caching by exact string match, semantic caching uses vector embeddings to find semantically similar queries. When a user sends a question:
- We convert each user question into an embedding, a sort of mathematical representation of meaning (basically a very big float vector);
- We store the embedding + original question + LLM response in a Redis vector database;
- When a new query comes in, we perform a vector search in Redis to retrieve the 5 most semantically similar questions;
- We then calculate how close each of these questions is to the current one using cosine similarity, which yields a score between -1 and 1 (for text embeddings, typically between 0 and 1);
- If the similarity exceeds our threshold (0.82), we return the cached response, skipping the LLM call entirely.
This way we reduce token usage and improve user experience by returning answers way faster!
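A hedged sketch of that lookup path, assuming the node-redis client, the official `openai` SDK, and a RediSearch index named `idx:cache` built with the COSINE distance metric (the index name and model are illustrative; the real logic lives in the services under `semantic_cache/`). For a COSINE index, RediSearch reports `dist = 1 - cosine similarity`, so the sketch recovers the score straight from the KNN result; a standalone `cosineSimilarity` helper mirroring `utils/cosine-similarity.ts` is included for reference:

```ts
import OpenAI from "openai";
import { createClient } from "redis";

const SIMILARITY_THRESHOLD = 0.82;

// One way to embed the question; the project supports OpenAI and Gemini
// embeddings, and the model choice here is illustrative.
async function embed(question: string): Promise<number[]> {
  const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  return res.data[0].embedding;
}

// RediSearch expects vectors as raw little-endian float32 bytes
// (cf. utils/float32-to-buffer.ts).
function float32ToBuffer(vector: number[]): Buffer {
  return Buffer.from(new Float32Array(vector).buffer);
}

// Plain cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]
// (cf. utils/cosine-similarity.ts).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function lookupCachedAnswer(question: string): Promise<string | null> {
  const queryEmbedding = await embed(question);

  const client = createClient();
  await client.connect();

  // KNN query: fetch the 5 nearest stored questions by vector distance.
  const results = await client.ft.search(
    "idx:cache", // assumed index name
    "*=>[KNN 5 @embedding $BLOB AS dist]",
    {
      PARAMS: { BLOB: float32ToBuffer(queryEmbedding) },
      RETURN: ["question", "response", "dist"],
      DIALECT: 2,
    }
  );
  await client.quit();

  for (const doc of results.documents) {
    const similarity = 1 - Number(doc.value.dist);
    // Only skip the LLM when the match clears the 0.82 threshold.
    if (similarity >= SIMILARITY_THRESHOLD) {
      return doc.value.response as string;
    }
  }
  return null; // cache miss: the caller falls through to the LLM provider
}
```

The threshold is a trade-off: raise it and you get fewer false cache hits but more LLM calls; lower it and you save more tokens at the risk of returning a cached answer to a subtly different question.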
Provides a consistent entry point for sending queries to different LLM APIs. It handles:
- Input validation with class-validator
- Request formatting
- Response normalization
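As a rough illustration of the validation step (the actual DTO lives in `prompt.dto.ts`; the field name here is hypothetical), class-transformer turns the raw body into a typed DTO and class-validator checks it:

```ts
import "reflect-metadata";
import { plainToInstance } from "class-transformer";
import { IsNotEmpty, IsString, validate } from "class-validator";

// Illustrative DTO: the real fields are defined in
// src/modules/prompt/dtos/prompt.dto.ts.
class PromptDto {
  @IsString()
  @IsNotEmpty()
  prompt!: string;
}

// Transform the raw request body into a typed DTO, then validate it
// before the controller ever touches the prompt.
async function validateBody(body: unknown): Promise<PromptDto> {
  const dto = plainToInstance(PromptDto, body);
  const errors = await validate(dto);
  if (errors.length > 0) {
    throw new Error(`Invalid request body: ${errors.join(", ")}`);
  }
  return dto;
}
```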
- TypeScript
- Node.js + Express
- Redis with RediSearch
- OpenAI / Gemini APIs
- class-validator, class-transformer
- Axios
```bash
git clone https://github.com/garzuze/ts_semantic_cache.git
cd ts_semantic_cache
pnpm install
docker compose up -d
```
The server will run on http://localhost:3000.
Note: You will need to set your environment variables with the API keys from the providers of your choice.
`.env.example` provides a template for your `.env` configuration. The system currently supports the OpenAI and Gemini embeddings APIs, plus DeepSeek and Gemini as prompt providers.
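For illustration only (the variable names below are hypothetical; `.env.example` is the authoritative reference), a config layer such as `cache.config.ts` would typically read them like this:

```ts
// Hypothetical variable names: check .env.example for the real ones.
export const config = {
  openaiApiKey: process.env.OPENAI_API_KEY,
  geminiApiKey: process.env.GEMINI_API_KEY,
  deepseekApiKey: process.env.DEEPSEEK_API_KEY,
  redisUrl: process.env.REDIS_URL ?? "redis://localhost:6379",
};
```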
- Create OpenAI prompt service
- Create Claude prompt service
- Add factory pattern
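The factory item would presumably centralize provider selection so controllers stay provider-agnostic. A hypothetical sketch (the classes are illustrative stubs, not the project's real exports):

```ts
interface PromptService {
  send(prompt: string): Promise<string>;
}

// Stub implementations; the real services live under
// src/modules/prompt/services/.
class DeepseekPromptService implements PromptService {
  async send(prompt: string): Promise<string> {
    return `deepseek answer to: ${prompt}`; // stub
  }
}

class GeminiPromptService implements PromptService {
  async send(prompt: string): Promise<string> {
    return `gemini answer to: ${prompt}`; // stub
  }
}

type ProviderName = "deepseek" | "gemini";

// The factory maps a provider name (e.g., from config or the request)
// to a concrete prompt service, so callers never construct one directly.
function createPromptService(provider: ProviderName): PromptService {
  switch (provider) {
    case "deepseek":
      return new DeepseekPromptService();
    case "gemini":
      return new GeminiPromptService();
  }
}
```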