Understand your GitHub stars faster: fetch stars from one or more users, filter what you care about, then cluster repos by description similarity.
- Features
- Technology Stack
- How It Works
- Architecture
- Getting Started
- Configuration
- API Reference
- Development
- Author
- How to Cite
- License
Features
- Search GitHub users and pull their starred repositories (supports multiple users).
- Explore in a list view with search, language filtering, topic filtering, minimum stars, and sorting.
- Cluster repositories by description similarity using K-means, hierarchical clustering, or PCA + hierarchical clustering.
- Tune clustering parameters from the UI and switch between algorithms.
- Strong input validation in the backend (clear errors and safe limits).
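The list-view filters above can be pictured as a chain of simple predicates followed by a sort. The following is a minimal Python sketch, not the app's actual frontend code (which is TypeScript); the `Repo` class and `filter_repos` function are hypothetical names, and sorting by star count is only one assumed sort order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Repo:
    """Hypothetical subset of the repository fields used for filtering."""
    full_name: str
    description: Optional[str]
    language: Optional[str]
    stargazers_count: int
    topics: List[str] = field(default_factory=list)


def filter_repos(repos, query="", language=None, topic=None, min_stars=0):
    """Apply search, language, topic, and minimum-star filters, then sort."""
    needle = query.lower()
    matches = []
    for repo in repos:
        # Search matches against the name and the (possibly missing) description.
        text = f"{repo.full_name} {repo.description or ''}".lower()
        if needle and needle not in text:
            continue
        if language and repo.language != language:
            continue
        if topic and topic not in repo.topics:
            continue
        if repo.stargazers_count < min_stars:
            continue
        matches.append(repo)
    # Assumed sort order: most-starred first.
    return sorted(matches, key=lambda r: r.stargazers_count, reverse=True)
```

Each filter is optional, so calling `filter_repos(repos)` with no arguments simply returns every repository sorted by stars.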
Technology Stack
Frontend
- Next.js (App Router) + React + TypeScript
- TanStack Query for data fetching/caching
- Zustand for client state
- shadcn/ui-style components (Radix primitives + Tailwind)
- Tailwind CSS
- Zod for client-side validation
Backend
- FastAPI + Pydantic (Python 3.11+)
- scikit-learn (TF-IDF, K-means, PCA)
- SciPy hierarchical clustering (scikit-learn uses SciPy for hierarchical clustering)
- uv for dependency management
- Ruff + Pyright + pytest for quality gates
How It Works
- The frontend calls the GitHub REST API to fetch starred repositories for selected users.
- The frontend sends repository metadata to the backend clustering API.
- The backend vectorizes repository text (description/name) using TF-IDF and runs one or more clustering algorithms.
- The frontend renders clusters and lets you filter/search within the results.
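To make the vectorization step concrete, here is a small pure-Python sketch of TF-IDF weighting and cosine similarity. It is a stand-in for illustration only: the backend uses scikit-learn, whose tokenization and options differ, and the function names below are hypothetical. The smoothed IDF term follows the formula documented for scikit-learn's default settings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document, L2-normalised."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


def cosine_similarity(a, b):
    """Dot product of two L2-normalised sparse vectors."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())
```

Repositories whose descriptions share distinctive terms end up with a higher cosine similarity, which is the signal the clustering algorithms group on.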
Architecture
This repo is a monorepo with two services:
- frontend/: Next.js app (App Router).
- backend/: FastAPI service exposing the clustering API.
Data flow:
- Browser -> GitHub REST API (/search/users, /users/:user/starred)
- Browser -> Backend (POST /clustering)
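The two hops in the data flow can be sketched as plain HTTP requests. The Python below only builds the requests and never sends them; in the real app the browser issues them. The helper names are hypothetical, the backend address assumes the local default on port 8000, and the `per_page`/`page` query parameters are GitHub's standard pagination parameters rather than something this README specifies.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

GITHUB_API = "https://api.github.com"
BACKEND_URL = "http://localhost:8000"  # assumption: local backend default


def starred_request(username, page=1, per_page=100):
    """Build (but do not send) a GET request for one page of a user's stars."""
    query = urlencode({"per_page": per_page, "page": page})
    url = f"{GITHUB_API}/users/{quote(username)}/starred?{query}"
    return Request(url, headers={"Accept": "application/vnd.github+json"})


def clustering_request(repositories, kmeans_clusters,
                       hierarchical_threshold, pca_components):
    """Build (but do not send) the POST /clustering request."""
    body = json.dumps({
        "repositories": repositories,
        "kmeans_clusters": kmeans_clusters,
        "hierarchical_threshold": hierarchical_threshold,
        "pca_components": pca_components,
    }).encode("utf-8")
    return Request(
        f"{BACKEND_URL}/clustering",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the request shape easy to inspect.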
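Getting Started
Prerequisites:
- Node.js (LTS) + pnpm
- Python 3.11+ + uv
Install dependencies:
pnpm install
(cd backend && uv sync)
Start the app:
pnpm dev
Or run each service separately:
pnpm dev:frontend
pnpm dev:backend
Open the app at http://localhost:3000.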
Configuration
Frontend
Create frontend/.env.local:
NEXT_PUBLIC_API_URL=http://localhost:8000
Backend
- CORS_ORIGINS: comma-separated list of allowed origins. Default: http://localhost:3000.
- UVICORN_HOST: default 127.0.0.1 (only used when running python app/main.py directly).
- UVICORN_PORT: default 8000 (only used when running python app/main.py directly).
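As an illustration of how the backend variables and their defaults might be read, here is a minimal sketch. `load_settings` is a hypothetical helper, not the project's actual settings code, and the trimming of whitespace and empty entries in `CORS_ORIGINS` is an assumption.

```python
import os


def load_settings(env=None):
    """Read the backend settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    # Split the comma-separated list, dropping blanks and stray whitespace.
    origins = [o.strip() for o in raw_origins.split(",") if o.strip()]
    return {
        "cors_origins": origins,
        "host": env.get("UVICORN_HOST", "127.0.0.1"),
        "port": int(env.get("UVICORN_PORT", "8000")),
    }
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.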
API Reference
POST /clustering
Runs clustering algorithms over repository descriptions/names. The backend validates and enforces limits:
- repositories: 2..250 items
- kmeans_clusters: 2..20 (and must be <= number of repos)
- hierarchical_threshold: (0, 10]
- pca_components: 2..50 (and must be <= number of repos and TF-IDF dimensions)
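The limits above translate into a handful of range checks. The sketch below is a plain-Python illustration rather than the backend's Pydantic models; `validate_request` and its error messages are hypothetical. It skips the TF-IDF dimension check for `pca_components`, since that bound is only known after vectorization.

```python
def validate_request(n_repos, kmeans_clusters, hierarchical_threshold, pca_components):
    """Return a list of error messages; an empty list means the request is valid."""
    errors = []
    if not 2 <= n_repos <= 250:
        errors.append("repositories must contain between 2 and 250 items")
    if not 2 <= kmeans_clusters <= 20:
        errors.append("kmeans_clusters must be between 2 and 20")
    elif kmeans_clusters > n_repos:
        errors.append("kmeans_clusters must not exceed the number of repositories")
    # Half-open interval (0, 10]: zero is rejected, ten is accepted.
    if not 0 < hierarchical_threshold <= 10:
        errors.append("hierarchical_threshold must be in (0, 10]")
    if not 2 <= pca_components <= 50:
        errors.append("pca_components must be between 2 and 50")
    elif pca_components > n_repos:
        errors.append("pca_components must not exceed the number of repositories")
    return errors
```

Collecting every violation instead of stopping at the first one lets a client show all problems in a single response.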
Request Body
{
  "repositories": [
    {
      "id": number,
      "name": string,
      "full_name": string,
      "description": string | null,
      "html_url": string,
      "stargazers_count": number,
      "forks_count": number,
      "open_issues_count": number,
      "size": number,
      "watchers_count": number,
      "language": string | null,
      "topics": string[],
      "owner": {
        "login": string,
        "avatar_url": string
      },
      "updated_at": string
    }
  ],
  "kmeans_clusters": number,
  "hierarchical_threshold": number,
  "pca_components": number
}
Response
{
  "status": "success",
  "kmeans_clusters": {
    "algorithm": "kmeans",
    "clusters": { "0": [0, 2, 4], "1": [1, 3, 5] },
    "parameters": { "num_clusters": 2 },
    "processing_time_ms": 150.5
  },
  "hierarchical_clusters": {
    "algorithm": "hierarchical",
    "clusters": { "1": [0, 2], "2": [1, 3], "3": [4, 5] },
    "parameters": { "distance_threshold": 1.5 },
    "processing_time_ms": 200.3
  },
  "pca_hierarchical_clusters": {
    "algorithm": "pca_hierarchical",
    "clusters": { "1": [0, 2, 4], "2": [1, 3, 5] },
    "parameters": { "n_components": 10, "distance_threshold": 1.5 },
    "processing_time_ms": 180.7
  },
  "total_processing_time_ms": 531.5
}
Health check endpoint.
{
  "status": "healthy",
  "timestamp": 1730000000.0,
  "clustering_service": "available"
}
Development
Run from repo root:
pnpm lint
pnpm biome
pnpm typecheck
pnpm test
pnpm build
Backend-only (from backend/):
uv run ruff format
uv run ruff check
uv run pyright
uv run python -m pytest
Notes:
- The frontend uses the public GitHub API from the browser. If you hit rate limits, wait for the reset time and retry.
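One way to find the reset time is to read GitHub's rate-limit response headers: `X-RateLimit-Remaining` gives the requests left in the current window and `X-RateLimit-Reset` gives the reset time in epoch seconds. The sketch below is illustrative only; `seconds_until_reset` is a hypothetical helper and is not part of Stardex.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before retrying, or 0.0 if requests remain."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(lowered.get("x-ratelimit-reset", now))
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)
```

A caller could sleep for the returned number of seconds and then repeat the original request.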
Author
- GitHub: @BjornMelin
- Website: bjornmelin.io
- LinkedIn: @bjorn-melin
How to Cite
If you use Stardex in your research or project, please cite it as follows:
@software{melin2024stardex,
  author = {Melin, Bjorn},
  title = {Stardex: GitHub Stars Explorer},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/BjornMelin/stardex},
  version = {0.1.0},
  description = {Explore and cluster GitHub starred repositories using a Next.js UI and a FastAPI clustering service}
}
License
This project is licensed under the MIT License. See the LICENSE file for details.
Built by [Bjorn Melin](https://bjornmelin.io)