3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- [Experimental][Sourcebot EE] Added permission syncing, which syncs repository Access Control Lists (ACLs) from GitHub to Sourcebot. [#508](https://github.com/sourcebot-dev/sourcebot/pull/508)

### Changed
- Improved repository query performance by adding db indices. [#526](https://github.com/sourcebot-dev/sourcebot/pull/526)
- Improved repository query performance by removing JOIN on `Connection` table. [#527](https://github.com/sourcebot-dev/sourcebot/pull/527)
Expand Down
2 changes: 1 addition & 1 deletion LICENSE.md
Expand Up @@ -2,7 +2,7 @@ Copyright (c) 2025 Taqla Inc.

Portions of this software are licensed as follows:

- All content that resides under the "ee/", "packages/web/src/ee/", and "packages/shared/src/ee/" directories of this repository, if these directories exist, is licensed under the license defined in "ee/LICENSE".
- All content that resides under the "ee/", "packages/web/src/ee/", "packages/backend/src/ee/", and "packages/shared/src/ee/" directories of this repository, if these directories exist, is licensed under the license defined in "ee/LICENSE".
- All third party components incorporated into the Sourcebot Software are licensed under the original license provided by the owner of the applicable component.
- Content outside of the above mentioned directories or restrictions above is available under the "Functional Source License" as defined below.

Expand Down
1 change: 1 addition & 0 deletions docs/docs.json
Expand Up @@ -46,6 +46,7 @@
"docs/features/code-navigation",
"docs/features/analytics",
"docs/features/mcp-server",
"docs/features/permission-syncing",
{
"group": "Agents",
"tag": "experimental",
Expand Down
30 changes: 16 additions & 14 deletions docs/docs/configuration/config-file.mdx
Expand Up @@ -33,17 +33,19 @@ Sourcebot syncs the config file on startup, and automatically whenever a change

The following settings can be provided in your config file to modify Sourcebot's behavior:

| Setting | Type | Default | Minimum | Description / Notes |
|-------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
| `enablePublicAccess` **(deprecated)** | boolean | false | — | Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
| Setting | Type | Default | Minimum | Description / Notes |
|-------------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
| `enablePublicAccess` **(deprecated)** | boolean | false | — | Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the repo permission syncer should run. |
| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the user permission syncer should run. |
1 change: 1 addition & 0 deletions docs/docs/configuration/environment-variables.mdx
Expand Up @@ -59,6 +59,7 @@ The following environment variables allow you to configure your Sourcebot deploy
| `AUTH_EE_OKTA_ISSUER` | `-` | <p>The issuer URL for Okta SSO authentication.</p> |
| `AUTH_EE_GCP_IAP_ENABLED` | `false` | <p>When enabled, allows Sourcebot to automatically register/login from a successful GCP IAP redirect</p> |
| `AUTH_EE_GCP_IAP_AUDIENCE` | - | <p>The GCP IAP audience to use when verifying JWT tokens. Must be set to enable GCP IAP JIT provisioning</p> |
| `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` | `false` | <p>Enables [permission syncing](/docs/features/permission-syncing).</p> |


### Review Agent Environment Variables
Expand Down
6 changes: 5 additions & 1 deletion docs/docs/connections/github.mdx
Expand Up @@ -196,4 +196,8 @@ To connect to a GitHub host other than `github.com`, provide the `url` property

<GitHubSchema />

</Accordion>
</Accordion>

## See also

- [Syncing GitHub access permissions to Sourcebot](/docs/features/permission-syncing#github)
6 changes: 3 additions & 3 deletions docs/docs/features/agents/overview.mdx
Expand Up @@ -3,9 +3,9 @@ title: "Agents Overview"
sidebarTitle: "Overview"
---

<Warning>
Agents are currently a experimental feature. Have an idea for an agent that we haven't built? Submit a [feature request](https://github.com/sourcebot-dev/sourcebot/issues/new?template=feature_request.md) on our GitHub.
</Warning>
import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'

<ExperimentalFeatureWarning />

Agents are automations that leverage the code indexed on Sourcebot to perform a specific task. Once you've set up Sourcebot, check out the
guides below to configure additional agents.
Expand Down
72 changes: 72 additions & 0 deletions docs/docs/features/permission-syncing.mdx
@@ -0,0 +1,72 @@
---
title: "Permission syncing"
sidebarTitle: "Permission syncing"
tag: "experimental"
---

import LicenseKeyRequired from '/snippets/license-key-required.mdx'
import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'

<LicenseKeyRequired />
<ExperimentalFeatureWarning />

# Overview

Permission syncing allows you to sync Access Control Lists (ACLs) from a code host to Sourcebot. When configured, users signed into Sourcebot (via the code host's OAuth provider) will only be able to access repositories that they have access to on the code host. Practically, this means:

- Code Search results will only include repositories that the user has access to.
- Code navigation results will only include repositories that the user has access to.
- Ask Sourcebot (and the underlying LLM) will only have access to repositories that the user has access to.
- File browsing is scoped to the repositories that the user has access to.

Permission syncing can be enabled by setting the `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` environment variable to `true`.

```bash
# Add any other flags your deployment needs before the image name.
docker run \
  -e EXPERIMENT_EE_PERMISSION_SYNC_ENABLED=true \
  ghcr.io/sourcebot-dev/sourcebot:latest
```

## Platform support

We are actively working on supporting more code hosts. If you'd like to see a specific code host supported, please [reach out](https://www.sourcebot.dev/contact).

| Platform | Permission syncing |
|:----------|------------------------------|
| [GitHub (GHEC & GHES)](/docs/features/permission-syncing#github) | ✅ |
| GitLab | 🛑 |
| Bitbucket Cloud | 🛑 |
| Bitbucket Data Center | 🛑 |
| Gitea | 🛑 |
| Gerrit | 🛑 |
| Generic git host | 🛑 |

# Getting started

## GitHub

Prerequisite: [Add GitHub as an OAuth provider](/docs/configuration/auth/providers#github).

Permission syncing works with **GitHub.com**, **GitHub Enterprise Cloud**, and **GitHub Enterprise Server**. For organization-owned repositories, users that have **read-only** access (or above) via the following methods will have their access synced to Sourcebot:
- Outside collaborators
- Organization members that are direct collaborators
- Organization members with access through team memberships
- Organization members with access through default organization permissions
- Organization owners

**Notes:**
- A GitHub OAuth provider must be configured to (1) correlate a Sourcebot user with a GitHub user, and (2) list the repositories that the user has access to during [User driven syncing](/docs/features/permission-syncing#how-it-works).
- OAuth tokens must have the `repo` scope in order to use the [List repositories for the authenticated user API](https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-the-authenticated-user) during [User driven syncing](/docs/features/permission-syncing#how-it-works). Sourcebot **will only** use this token for **reads**.
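
As a rough illustration of the user driven lookup described in the notes above, the sketch below pages through GitHub's `GET /user/repos` endpoint with an OAuth token. It is not Sourcebot's implementation: `listAccessibleRepos`, `GetJson`, and `GitHubRepo` are hypothetical names, and the HTTP transport is injected so the example stays self-contained.

```typescript
// Minimal shape of the fields this sketch reads from GitHub's response.
interface GitHubRepo {
    id: number;
    full_name: string;
}

// Hypothetical transport: takes a URL and headers, resolves to parsed JSON.
type GetJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

// Pages through GET /user/repos until a short (or empty) page comes back.
async function listAccessibleRepos(
    token: string,
    getJson: GetJson,
    apiBase: string = "https://api.github.com",
): Promise<GitHubRepo[]> {
    const perPage = 100; // maximum page size accepted by the endpoint
    const repos: GitHubRepo[] = [];
    for (let page = 1; ; page++) {
        const url = `${apiBase}/user/repos?per_page=${perPage}&page=${page}`;
        const batch = (await getJson(url, {
            Authorization: `Bearer ${token}`,
            Accept: "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        })) as GitHubRepo[];
        repos.push(...batch);
        if (batch.length < perPage) {
            return repos; // last page reached
        }
    }
}
```

For GitHub Enterprise Server, `apiBase` would typically point at the instance's `/api/v3` path instead of `https://api.github.com`.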

# How it works

Permission syncing works by periodically syncing ACLs from the code host(s) to Sourcebot to build an internal mapping between Users and Repositories. This mapping is hydrated in two directions:
- **User driven** : fetches the list of all repositories that a given user has access to.
- **Repo driven** : fetches the list of all users that have access to a given repository.
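
The two directions can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: Sourcebot persists the mapping in its database, and `PermissionMap`, `applyUserSync`, and `applyRepoSync` are hypothetical names that do not appear in the codebase.

```typescript
type UserId = string;
type RepoId = string;

class PermissionMap {
    private reposByUser = new Map<UserId, Set<RepoId>>();

    // User driven: replace the full set of repositories for one user.
    applyUserSync(user: UserId, repos: RepoId[]): void {
        this.reposByUser.set(user, new Set(repos));
    }

    // Repo driven: replace the full set of users for one repository.
    applyRepoSync(repo: RepoId, users: UserId[]): void {
        const allowed = new Set(users);
        // Revoke access from users that are no longer listed for this repo.
        this.reposByUser.forEach((repos, user) => {
            if (!allowed.has(user)) {
                repos.delete(repo);
            }
        });
        // Grant access to every listed user.
        allowed.forEach((user) => {
            let repos = this.reposByUser.get(user);
            if (!repos) {
                repos = new Set<RepoId>();
                this.reposByUser.set(user, repos);
            }
            repos.add(repo);
        });
    }

    canAccess(user: UserId, repo: RepoId): boolean {
        const repos = this.reposByUser.get(user);
        return repos !== undefined && repos.has(repo);
    }
}
```

Either direction alone would eventually converge on the same mapping; running both means a change on the code host can be picked up by whichever sync runs first.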

User driven and repo driven syncing each occur every 24 hours by default. These intervals can be configured using the following settings in the [config file](/docs/configuration/config-file):

| Setting | Type | Default | Minimum |
|-------------------------------------------------|---------|------------|---------|
| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
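
Because both settings in the table above are expressed in milliseconds, it can help to derive them from hours. The helper below is a hypothetical sketch (`permissionSyncSettings` is not part of Sourcebot), and it assumes the two keys sit under a top-level `settings` object in the config file.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Converts hour values into the millisecond settings, enforcing the minimum of 1.
function permissionSyncSettings(repoHours: number, userHours: number) {
    const toMs = (hours: number): number => {
        const ms = Math.round(hours * HOUR_MS);
        if (!(ms >= 1)) {
            throw new RangeError("interval must be at least 1 ms");
        }
        return ms;
    };
    return {
        experiment_repoDrivenPermissionSyncIntervalMs: toMs(repoHours),
        experiment_userDrivenPermissionSyncIntervalMs: toMs(userHours),
    };
}

// Assumption: the keys belong under "settings" in the config file.
console.log(JSON.stringify({ settings: permissionSyncSettings(6, 1) }, null, 2));
```
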
4 changes: 4 additions & 0 deletions docs/snippets/experimental-feature-warning.mdx
@@ -0,0 +1,4 @@

<Warning>
This is an experimental feature. Certain functionality may be incomplete and breaking changes may ship in non-major releases. Have feedback? Submit an [issue](https://github.com/sourcebot-dev/sourcebot/issues) on GitHub.
</Warning>
20 changes: 20 additions & 0 deletions docs/snippets/schemas/v3/index.schema.mdx
Expand Up @@ -69,6 +69,16 @@
"deprecated": true,
"description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
"default": false
},
"experiment_repoDrivenPermissionSyncIntervalMs": {
"type": "number",
"description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
"minimum": 1
},
"experiment_userDrivenPermissionSyncIntervalMs": {
"type": "number",
"description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
"minimum": 1
}
},
"additionalProperties": false
Expand Down Expand Up @@ -195,6 +205,16 @@
"deprecated": true,
"description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
"default": false
},
"experiment_repoDrivenPermissionSyncIntervalMs": {
"type": "number",
"description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
"minimum": 1
},
"experiment_userDrivenPermissionSyncIntervalMs": {
"type": "number",
"description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
"minimum": 1
}
},
"additionalProperties": false
Expand Down
1 change: 1 addition & 0 deletions package.json
Expand Up @@ -14,6 +14,7 @@
"watch:mcp": "yarn workspace @sourcebot/mcp build:watch",
"watch:schemas": "yarn workspace @sourcebot/schemas watch",
"dev:prisma:migrate:dev": "yarn with-env yarn workspace @sourcebot/db prisma:migrate:dev",
"dev:prisma:generate": "yarn with-env yarn workspace @sourcebot/db prisma:generate",
"dev:prisma:studio": "yarn with-env yarn workspace @sourcebot/db prisma:studio",
"dev:prisma:migrate:reset": "yarn with-env yarn workspace @sourcebot/db prisma:migrate:reset",
"dev:prisma:db:push": "yarn with-env yarn workspace @sourcebot/db prisma:db:push",
Expand Down
17 changes: 8 additions & 9 deletions packages/backend/src/connectionManager.ts
Expand Up @@ -11,12 +11,6 @@ import { env } from "./env.js";
import * as Sentry from "@sentry/node";
import { loadConfig, syncSearchContexts } from "@sourcebot/shared";

interface IConnectionManager {
scheduleConnectionSync: (connection: Connection) => Promise<void>;
registerPollingCallback: () => void;
dispose: () => void;
}

const QUEUE_NAME = 'connectionSyncQueue';

type JobPayload = {
Expand All @@ -30,10 +24,11 @@ type JobResult = {
repoCount: number,
}

export class ConnectionManager implements IConnectionManager {
export class ConnectionManager {
private worker: Worker;
private queue: Queue<JobPayload>;
private logger = createLogger('connection-manager');
private interval?: NodeJS.Timeout;

constructor(
private db: PrismaClient,
Expand Down Expand Up @@ -75,8 +70,9 @@ export class ConnectionManager implements IConnectionManager {
});
}

public async registerPollingCallback() {
setInterval(async () => {
public startScheduler() {
this.logger.debug('Starting scheduler');
this.interval = setInterval(async () => {
const thresholdDate = new Date(Date.now() - this.settings.resyncConnectionIntervalMs);
const connections = await this.db.connection.findMany({
where: {
Expand Down Expand Up @@ -369,6 +365,9 @@ export class ConnectionManager implements IConnectionManager {
}

public dispose() {
if (this.interval) {
clearInterval(this.interval);
}
this.worker.close();
this.queue.close();
}
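
The change above stores the timer handle so that `dispose()` can stop polling. A stripped-down sketch of that start/dispose pattern, using a hypothetical `IntervalScheduler` class rather than the real `ConnectionManager`, might look like this:

```typescript
class IntervalScheduler {
    // Handle returned by setInterval; undefined while the scheduler is stopped.
    private interval?: ReturnType<typeof setInterval>;
    private readonly tick: () => void;
    private readonly periodMs: number;

    constructor(tick: () => void, periodMs: number) {
        this.tick = tick;
        this.periodMs = periodMs;
    }

    start(): void {
        if (this.interval !== undefined) {
            return; // already running; avoid stacking a second timer
        }
        this.interval = setInterval(this.tick, this.periodMs);
    }

    dispose(): void {
        if (this.interval !== undefined) {
            clearInterval(this.interval);
            this.interval = undefined;
        }
    }

    get running(): boolean {
        return this.interval !== undefined;
    }
}
```

Without the stored handle, the interval would keep firing after disposal and could hold the process open during shutdown.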
Expand Down
8 changes: 7 additions & 1 deletion packages/backend/src/constants.ts
Expand Up @@ -15,5 +15,11 @@ export const DEFAULT_SETTINGS: Settings = {
maxRepoGarbageCollectionJobConcurrency: 8,
repoGarbageCollectionGracePeriodMs: 10 * 1000, // 10 seconds
repoIndexTimeoutMs: 1000 * 60 * 60 * 2, // 2 hours
enablePublicAccess: false // deprecated, use FORCE_ENABLE_ANONYMOUS_ACCESS instead
enablePublicAccess: false, // deprecated, use FORCE_ENABLE_ANONYMOUS_ACCESS instead
experiment_repoDrivenPermissionSyncIntervalMs: 1000 * 60 * 60 * 24, // 24 hours
experiment_userDrivenPermissionSyncIntervalMs: 1000 * 60 * 60 * 24, // 24 hours
}

export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES = [
'github',
];