ONNX Web Kit

Node >=18 · WebGPU ready · WASM fallback · License: ISC

Lightweight browser SDK to run ONNX models with WebGPU/WASM. Includes a minimal sentiment example under examples/sentiment.

Prerequisites

  • Node.js 18+ and npm
  • A modern browser; WebGPU gives the best performance (Chrome/Edge 121+; flags may be needed on some platforms), with automatic fallback to WASM (see the quick check after this list).
  • Local assets under public/models/… (an example sentiment model and tokenizer are provided).
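
A quick availability check in app code, using only standard browser APIs (independent of the SDK):

// Detect whether WebGPU is usable; otherwise plan on the WASM backend.
async function detectBackend() {
  if (navigator.gpu) {
    const adapter = await navigator.gpu.requestAdapter();
    if (adapter) return "webgpu"; // a GPU adapter is available
  }
  return "wasm"; // WebGPU unavailable; the runtime falls back to WASM
}

detectBackend().then((backend) => console.log(`Preferred backend: ${backend}`));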

Model download for the demo

  • npm install will try to fetch public/models/sentiment/v1/model.onnx if you set MODEL_ONNX_URL (or ONNX_MODEL_URL) to a direct .onnx download URL. Example:
    MODEL_ONNX_URL=https://your-hosted-model/model.onnx npm install
  • If the file already exists locally, the download is skipped. To bypass the download on CI, set SKIP_MODEL_DOWNLOAD=1. (A sketch of this fetch step follows this list.)
  • The ONNX file is ignored by git (public/models/**/*.onnx), so each developer can fetch it without committing large binaries.
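
For reference, a minimal sketch of the fetch step described above (the repo's actual postinstall script may differ; the file name scripts/download-model.mjs is only illustrative):

// scripts/download-model.mjs (illustrative): fetch the demo model unless skipped.
import { existsSync, mkdirSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

const url = process.env.MODEL_ONNX_URL || process.env.ONNX_MODEL_URL;
const target = "public/models/sentiment/v1/model.onnx";

if (process.env.SKIP_MODEL_DOWNLOAD === "1" || existsSync(target) || !url) {
  console.log("Skipping model download.");
} else {
  const res = await fetch(url); // Node 18+ ships a global fetch
  if (!res.ok) throw new Error(`Model download failed: ${res.status}`);
  mkdirSync(dirname(target), { recursive: true });
  writeFileSync(target, Buffer.from(await res.arrayBuffer()));
  console.log(`Saved ${target}`);
}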

Install (SDK consumers)

npm install onnx-web-kit

Local dev (this repo)

npm install
npm run dev          # runs the sentiment example at http://localhost:3000

Build example

npm run build        # builds the sentiment example into dist/examples/sentiment

Examples live under examples/; add more there following the same pattern.

  • examples/sentiment/demo: Vanilla JS page wiring the SDK directly with the bundled sentiment model and tokenizer in public/models/sentiment/v1/.
  • examples/react-generic-model: Vite + React demo with a generic useOnnxModel hook and a sentiment wrapper component showing text classification end-to-end.

Using the SDK (app code)

import {
  createRuntime,
  registerModel,
} from "onnx-web-kit";
import { runTextModel } from "onnx-web-kit/core/text-utils.js";

// 1) Create a runtime
const runtime = createRuntime({
  preferredBackend: "webgpu",   // or "wasm"
  modelBasePath: "/models",     // where your ONNX + tokenizer files live
  debug: true,
  onLog: console.log,
});

// 2) Register your model
registerModel("sentiment", {
  version: "v1",
  path: "sentiment/v1/model.onnx",
  tokenizer: "sentiment/v1/tokenizer.json",
  // Optional: add labels to get decoded outputs instead of raw logits
  labels: ["very negative", "negative", "neutral", "positive", "very positive"],
});

// 3) Run inference
const result = await runTextModel(runtime, "sentiment", "I love this!");
// If labels provided: { logits, probs, label, labelIndex, labelProb }
// Otherwise: raw logits array
console.log(result);
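
runTextModel returns one of two shapes depending on registration; a small sketch of handling both (field names follow the comment above):

if (result && result.label !== undefined) {
  // Labels were registered: decoded output with probabilities.
  console.log(`${result.label} (${(result.labelProb * 100).toFixed(1)}%)`);
} else {
  // No labels registered: raw logits.
  console.log("Raw logits:", result);
}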

React hook example

import React from "react";
import useOnnxModel from "./examples/react-generic-model/useOnnxModel.js";

export function SentimentWidget() {
  const { ready, loading, error, analyze, output } = useOnnxModel({
    modelName: "sentiment",
    modelPath: "sentiment/v1/model.onnx",
    tokenizerPath: "sentiment/v1/tokenizer.json",
    modelBasePath: "/models",
    preferredBackend: "webgpu",
  });

  const run = () => analyze("I love how simple this SDK makes browser AI!");

  return (
    <div>
      <button onClick={run} disabled={!ready || loading}>
        {loading ? "Running…" : "Analyze sentiment"}
      </button>
      {error && <p>Error: {error.message}</p>}
      {output && <pre>{JSON.stringify(output, null, 2)}</pre>}
    </div>
  );
}

Adding your own model

  1. Drop your ONNX and tokenizer files under public/models/<name>/<version>/.
    • Example layout:
      public/models/your-model/v1/model.onnx
      public/models/your-model/v1/tokenizer.json
      public/models/your-model/v1/tokenizer_config.json
      public/models/your-model/v1/vocab.txt (if your tokenizer needs it)
      
  2. Register it in your app:
    registerModel("your-model", {
      version: "v1",
      path: "your-model/v1/model.onnx",
      tokenizer: "your-model/v1/tokenizer.json",
      labels: ["labelA", "labelB", "labelC"], // optional
    });
  3. Call the appropriate helper (e.g., runTextModel); an end-to-end sketch follows below.
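
Putting the steps together, an end-to-end sketch for a custom model (the model name, file paths, and labels below are placeholders):

import { createRuntime, registerModel } from "onnx-web-kit";
import { runTextModel } from "onnx-web-kit/core/text-utils.js";

// Placeholder model served from public/models/your-model/v1/.
const runtime = createRuntime({
  preferredBackend: "webgpu",
  modelBasePath: "/models",
});

registerModel("your-model", {
  version: "v1",
  path: "your-model/v1/model.onnx",
  tokenizer: "your-model/v1/tokenizer.json",
  labels: ["labelA", "labelB", "labelC"], // optional
});

const output = await runTextModel(runtime, "your-model", "Some input text");
console.log(output);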

Why use this SDK?

  • Zero server round-trips: Models run in the browser; great for latency and privacy.
  • WebGPU-first, WASM fallback: Takes advantage of modern GPU acceleration without breaking older browsers.
  • Simple DX: One runtime, a small registry, and a high-level helper (runTextModel) hide the ONNX + tokenizer wiring.
  • Pluggable labels: App developers can attach label sets per model to get decoded outputs automatically.
  • Self-hosted assets: Works fully offline once the ONNX + tokenizer files are served locally.
  • Minimal footprint: Plain JS, Vite dev server, no heavy framework lock-in.

When to use it

  • You need client-side inference for text models (classification, simple NLP tasks) without standing up an API.
  • You want to prototype or demo ONNX models quickly in a browser.
  • You care about user data staying on-device (no server calls).
  • You need a portable setup that can drop into any static hosting environment.

Notes

  • Labels are optional and per-model. If provided in registerModel, runTextModel returns decoded labels; otherwise it returns raw logits.
  • The SDK currently exposes a text helper (runTextModel). Extend it for other modalities (image/audio) by following the same loader + feeds pattern; a standalone sketch follows these notes.
  • WebGPU availability varies by browser/OS; the runtime will fall back to WASM automatically.
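
As a starting point for other modalities, here is a standalone sketch of the loader + feeds pattern using onnxruntime-web directly (not this SDK's API; the input name, tensor shape, and WebGPU execution provider below are assumptions for a generic image classifier):

import * as ort from "onnxruntime-web";

// Loader step: create a session, preferring WebGPU with WASM fallback.
const session = await ort.InferenceSession.create(
  "/models/your-image-model/v1/model.onnx",
  { executionProviders: ["webgpu", "wasm"] }
);

// Feeds step: the input name "input" and NCHW shape [1, 3, 224, 224] are assumptions.
const pixels = new Float32Array(1 * 3 * 224 * 224); // fill with preprocessed image data
const feeds = { input: new ort.Tensor("float32", pixels, [1, 3, 224, 224]) };

// Run inference; output names depend on the model.
const results = await session.run(feeds);
console.log(results);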
