Benchmark: MariaDB vs SQLite backend — disk, memory, read & write-concurrency

## Summary

Benchmark comparing a **MariaDB-backed** vs **SQLite-backed** Gameplan site on identical data, to inform when each backend is appropriate. Headline: SQLite is lighter (disk/memory) and faster on reads, and even posts higher *write* throughput on this host — but that write lead is a **durability asymmetry**, not scaling, and SQLite cannot scale writes at all (single-writer).

> Run on a single **macOS / Apple Silicon** machine (otherwise idle). macOS `fsync` is unusually slow, which penalizes MariaDB's durable per-commit writes more than production Linux would. Treat write numbers as directional, not absolute.

## Method

- Two fresh sites, Gameplan installed: `gp-maria.test` (MariaDB 10.6), `gp-sqlite.test` (SQLite/WAL).
- Identical seed on both (verified equal counts):

 | Doctype | Rows |
 |---|---|
 | GP Discussion | 5,000 |
 | GP Comment | 25,000 |
 | GP Unread Record | 167,500 |
 | GP Discussion Visit | 5,000 |
 | GP Project / GP Team / User | 110 / 10 / 22 |

 Each post fans out to ~5 `GP Unread Record` rows (one per private-space member), so the unread table — not comments — is the largest object and the main write-path stressor.
- Server: `gunicorn frappe.app:application`, restarted fresh per site (so each loads only one engine). HTTP load driven via a Python harness (login → sid; target site selected by `Host` header).

## 1. Disk (identical data; SQLite `VACUUM`+checkpoint first)

| | SQLite | MariaDB |
|---|---|---|
| Total | **80 MB** | **91.9 MB** |
| ‑ data | 80 (single file) | 53.3 |
| ‑ index | (inline) | 38.6 |

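SQLite is **~13% smaller**. MariaDB reports 38.6 MB of InnoDB index space on top of 53.3 MB of data, whereas SQLite's 80 MB single file already includes its indexes. No FTS search DB existed on either site (search was never exercised; it is SQLite FTS5 on both engines regardless, so it cancels out). Largest table on both: `GP Unread Record` (~44 MB / ~166k rows on MariaDB; `table_rows` in `information_schema` is an InnoDB estimate, hence the gap from the 167,500 seeded rows).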
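
For readers who want to check the SQLite figure without the bench CLI, here is a minimal sketch using only the standard-library `sqlite3` module on a throwaway database. The helper names (`settled_size_bytes`, `make_demo_db`) and the demo schema are hypothetical and are not part of the benchmark. The sketch runs `VACUUM` before the checkpoint so that anything `VACUUM` leaves in the WAL is folded into the main file before measuring; that ordering is this sketch's choice.

```python
import os
import sqlite3
import tempfile


def settled_size_bytes(path):
    """Compact the DB, fold the WAL back in, and return the main-file size in bytes."""
    # isolation_level=None means autocommit; VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        return page_count * page_size
    finally:
        conn.close()


def make_demo_db(rows=2000):
    """Build a small WAL-mode database with one table and one index (demo data only)."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE unread (id INTEGER PRIMARY KEY, user TEXT, discussion TEXT)")
    conn.execute("CREATE INDEX unread_user ON unread (user)")
    conn.execute("BEGIN")  # one transaction, so the bulk insert is a single commit
    conn.executemany(
        "INSERT INTO unread (user, discussion) VALUES (?, ?)",
        [(f"bench{i % 20}@example.com", f"disc-{i}") for i in range(rows)],
    )
    conn.execute("COMMIT")
    conn.close()
    return path


if __name__ == "__main__":
    db = make_demo_db()
    print(f"settled size: {settled_size_bytes(db) / 1024:.0f} KB")
```

Because indexes live in the same file, the number this returns is comparable to MariaDB's `data_length + index_length`, not to `data_length` alone.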

## 2. Memory (gunicorn: 1 master + 4 workers, warm)

| | SQLite | MariaDB |
|---|---|---|
| Worker-set RSS, warm-idle | 524 MB | 529 MB |
| Worker-set RSS, peak under read load | 568 MB | 565 MB |
| Separate DB daemon RSS | — (in-process) | **76 MB** (`mariadbd`) |

Worker RSS is **effectively identical** — at this size SQLite's in-process page cache is negligible. The real difference is the separate `mariadbd` daemon (~76 MB RSS; `innodb_buffer_pool_size`=128 MB), which is **shared across every site on the machine** — pure overhead for one site, near-free amortized across many. SQLite has no daemon.

## 3. Read latency (conc=8, 600 reqs; 50% feed-list + 50% open-one-discussion)

| | req/s | p50 | p95 | p99 |
|---|---|---|---|---|
| SQLite | **420.7** | **18.2 ms** | 28.4 ms | 41.5 ms |
| MariaDB | 273.4 | 29.6 ms | 47.0 ms | 55.4 ms |

SQLite **~1.5× faster on reads**: the DB is compiled into the worker, so every query skips the socket round-trip + connection handshake MariaDB pays per call. At this size the whole DB fits in OS page cache.

## 4. Write throughput + latency — headline (comment insert = full hook cascade)

Each comment insert re-saves the parent discussion (+ a `Version` row) and inserts ~5 unread records — a fat, multi-statement transaction.

| conc | SQLite req/s | SQLite p50 | SQLite max | MariaDB req/s | MariaDB p50 | MariaDB max |
|---|---|---|---|---|---|---|
| 1 | 18.0 | 50 ms | 152 ms | 7.1 | 138 ms | 254 ms |
| 2 | 20.5 | 94 ms | 183 ms | 8.6 | 227 ms | 333 ms |
| 4 | 20.5 | 92 ms | **5,510 ms** | 8.7 | 446 ms | 557 ms |
| 8 | 20.7 | 93 ms | 4,572 ms | 8.7 | 912 ms | 1,081 ms |
| 16 | 20.1 | 591 ms | 3,898 ms | 8.4 | 1,876 ms | 2,253 ms |

SQLite high-concurrency stress (300 reqs): conc 24/32/48/64 → still **0 failures**, throughput flat ~21 req/s, p50 914 ms → 2,800 ms, max 8,058 ms.

**Neither engine scales write throughput with concurrency** — both are single-machine, write-bound; added concurrency becomes queue latency, not throughput. The difference is the ceiling and failure mode:

- **SQLite throughput is hard-capped at ~20 req/s** from conc 1→64 (textbook single-writer signature). It never raised `database is locked` over HTTP, because `busy_timeout=5000ms` + Frappe's deadlock-retry absorb contention and the worker pool bounds real parallelism. The cost surfaces as **steadily growing tail latency** (max ~8 s at conc 64), not errors.
- **MariaDB throughput plateaus at ~8.7 req/s** here, lower than SQLite, with p50 latency growing roughly linearly with concurrency (138 ms at conc 1 → 1,876 ms at conc 16) and a max that stays within ~1.2× of p50 from conc 4 upward.
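
A quick sanity check on the "concurrency becomes queue latency" reading, as a sketch: for a closed-loop harness that keeps `conc` requests in flight, Little's law implies mean latency ≈ conc / throughput. The helper names (`implied_mean_ms`, `compare`) are illustrative, and the closed-loop assumption is an approximation of a thread pool with `conc` workers. The figures are copied from the table above.

```python
# (conc, req/s, measured p50 in ms), copied from the write table above.
SQLITE = [(1, 18.0, 50), (2, 20.5, 94), (4, 20.5, 92), (8, 20.7, 93), (16, 20.1, 591)]
MARIADB = [(1, 7.1, 138), (2, 8.6, 227), (4, 8.7, 446), (8, 8.7, 912), (16, 8.4, 1876)]


def implied_mean_ms(conc, req_per_s):
    """Little's law for a closed loop: mean latency = in-flight requests / throughput."""
    return 1000.0 * conc / req_per_s


def compare(rows):
    """Return (conc, implied mean ms, measured p50 ms) for each row."""
    return [(c, round(implied_mean_ms(c, rps)), p50) for c, rps, p50 in rows]


if __name__ == "__main__":
    for name, rows in (("SQLite", SQLITE), ("MariaDB", MARIADB)):
        for conc, mean_ms, p50_ms in compare(rows):
            print(f"{name:8s} conc={conc:2d} implied mean={mean_ms:5d} ms  p50={p50_ms:5d} ms")
```

MariaDB's measured p50 tracks the implied mean closely (920 ms implied vs 912 ms measured at conc 8), which is what an evenly shared queue looks like. SQLite's p50 at conc 4 and 8 (92 to 93 ms) sits far below the implied mean (195 and 386 ms), consistent with a skewed distribution in which most requests finish quickly while a few stall for seconds.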

### Why SQLite "wins" writes — durability asymmetry (read before concluding)

| | SQLite | MariaDB |
|---|---|---|
| per-commit fsync | **No** (`synchronous=NORMAL`, WAL — syncs only at checkpoint) | **Yes** (`innodb_flush_log_at_trx_commit=1` + `doublewrite`) |
| crash safety of last txns | can lose recently committed txns on power loss | fully durable |

MariaDB does a durable fsync (+ doublewrite) on **every** commit; SQLite does not. That is essentially the entire write-speed gap, amplified by slow macOS `fsync`. This is a fair *out-of-the-box defaults* comparison, but it pits a less-durable config against a fully-durable one. On production Linux the MariaDB gap is expected to narrow substantially, though this run does not measure it.
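
To put a number on how expensive a per-commit sync is on a given host, a rough micro-benchmark can help. This is a sketch using only the standard library; the function name `sync_latency_ms` is hypothetical. On macOS, plain `fsync()` does not force data through the drive cache while `fcntl(F_FULLFSYNC)` does, so the sketch can time either. Which primitive each engine actually issues on this host is not established by this benchmark, so treat the output as a host characteristic, not an engine measurement.

```python
import os
import tempfile
import time

try:
    import fcntl  # POSIX only
except ImportError:  # e.g. Windows
    fcntl = None


def sync_latency_ms(n=50, payload=b"x" * 4096, full=False):
    """Mean milliseconds per write+sync over n iterations on a temp file.

    full=True uses fcntl(F_FULLFSYNC) where the platform defines it (macOS);
    elsewhere it falls back to os.fsync.
    """
    full_flag = getattr(fcntl, "F_FULLFSYNC", None) if fcntl else None
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n):
            os.write(fd, payload)
            if full and full_flag is not None:
                fcntl.fcntl(fd, full_flag)  # flush through the drive cache
            else:
                os.fsync(fd)
        return (time.perf_counter() - start) * 1000.0 / n
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(f"fsync:      {sync_latency_ms():.2f} ms per write")
    print(f"full fsync: {sync_latency_ms(full=True):.2f} ms per write")
```

Running it on both the macOS test machine and a Linux target gives a first-order feel for how much of the MariaDB write latency is sync cost rather than engine work.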

### Genuine SQLite write risk

`database is locked` is real — it appeared during the initial **parallel** data seed, where SQLite's internal second connection (sequence-number emulation) self-contended under I/O starvation and exceeded `busy_timeout`. Over HTTP it stayed hidden only because the worker pool bounds concurrency and short transactions stay under 5 s. Longer transactions, more writer processes than busy-timeout can absorb, or heavy concurrent I/O **will** surface it. SQLite serializes every writer — no horizontal write headroom.
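
The failure mode is easy to reproduce outside Frappe. The sketch below uses the standard-library `sqlite3` module with two connections to a throwaway file: one holds the write lock, the other gives up after its busy timeout. The function name `demo_locked` and the 0.1 s timeout are illustrative; the benchmark sites use `busy_timeout=5000ms`.

```python
import os
import sqlite3
import tempfile


def demo_locked(busy_timeout_s=0.1):
    """Return (outcome of the contended BEGIN, final row count)."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: autocommit, so transactions are opened explicitly.
    a = sqlite3.connect(path, isolation_level=None)
    b = sqlite3.connect(path, isolation_level=None, timeout=busy_timeout_s)
    a.execute("PRAGMA journal_mode=WAL")
    a.execute("CREATE TABLE t (x INTEGER)")

    a.execute("BEGIN IMMEDIATE")  # writer A takes the single write lock
    a.execute("INSERT INTO t VALUES (1)")
    try:
        b.execute("BEGIN IMMEDIATE")  # writer B waits busy_timeout, then fails
        outcome = "acquired"
    except sqlite3.OperationalError as exc:
        outcome = str(exc)

    a.execute("COMMIT")  # lock released
    b.execute("BEGIN IMMEDIATE")  # now succeeds
    b.execute("INSERT INTO t VALUES (2)")
    b.execute("COMMIT")
    count = b.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    a.close()
    b.close()
    return outcome, count


if __name__ == "__main__":
    print(demo_locked())
```

Any transaction that holds the write lock longer than another writer's busy timeout produces this error, which is why longer transactions or more writer processes surface it even when short HTTP requests do not.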

## Bottom line

- **Disk:** SQLite ~13% smaller.
- **Memory:** worker RSS equal; MariaDB adds a ~76 MB shared daemon (overhead for one site, amortized across many).
- **Reads:** SQLite ~1.5× faster (in-process, no socket hop).
- **Writes:** SQLite shows higher throughput and lower median latency *here*, but only because it skips per-commit fsync (and macOS fsync is slow). SQLite cannot scale writes (single-writer, flat ~20 req/s, multi-second tail latency under load) and will throw `database is locked` under enough writer contention. MariaDB sustains durable, multi-writer writes and is expected to close most of the gap on production Linux (not measured here).

**Fit:** SQLite suits single-user / small-team / read-heavy / low-footprint instances. MariaDB is the safer choice for write-concurrent, multi-user teams and durability-sensitive data — its costs (disk, a shared daemon, per-commit fsync) buy write capacity and crash safety SQLite structurally cannot provide.

### Suggested follow-up

Re-run with **matched durability** (SQLite `synchronous=FULL` *or* MariaDB `innodb_flush_log_at_trx_commit=2`) on **Linux** to isolate engine cost from fsync policy and confirm whether SQLite's write lead survives equal durability.
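
On the SQLite side, the durability level is a per-connection pragma, so it is worth verifying what a connection actually runs with before re-measuring. A minimal sketch with the standard-library `sqlite3` module follows; the helper names (`open_wal`, `durability_profile`) are hypothetical, and how Frappe's SQLite backend sets these pragmas is not shown here.

```python
import os
import sqlite3
import tempfile

# Integer values returned by "PRAGMA synchronous".
SYNC_NAMES = {0: "OFF", 1: "NORMAL", 2: "FULL", 3: "EXTRA"}


def open_wal(path, synchronous="NORMAL"):
    """Open a WAL-mode connection with the requested synchronous level."""
    if synchronous not in SYNC_NAMES.values():
        raise ValueError(f"unknown synchronous level: {synchronous}")
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(f"PRAGMA synchronous={synchronous}")  # applies to this connection only
    return conn


def durability_profile(conn):
    """Return (journal_mode, synchronous level name) for this connection."""
    journal = conn.execute("PRAGMA journal_mode").fetchone()[0]
    sync = conn.execute("PRAGMA synchronous").fetchone()[0]
    return journal, SYNC_NAMES.get(sync, str(sync))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    for level in ("NORMAL", "FULL"):
        conn = open_wal(path, level)
        print(level, "->", durability_profile(conn))
        conn.close()
```

`journal_mode=WAL` persists in the database file, but `synchronous` does not, so every worker connection must set it for a matched-durability run to be valid.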

---
Caveats: macOS fsync penalizes MariaDB writes more than Linux would. `mariadbd` RSS is shared across all sites on the machine (not isolatable per site). Effective write concurrency is bounded by the 16-worker gunicorn pool. Durability settings differ (above), which is the dominant write factor. The benchmark harness and parameterized seed loader are included under "Reproduce" below.

---

## Reproduce — benchmark code

Run from `frappe-bench/`. Create two sites (`--db-type mariadb` / `--db-type sqlite`), install gameplan, run the seed loader on both, start gunicorn, then the harness scripts. Paths in the shell scripts are absolute to the test bench and should be adjusted.
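
The setup order above, as a dry-run sketch that only prints the commands. `new-site`, `install-app` and `execute` are standard bench subcommands and `--db-type` is the flag named above; the `plan` helper is hypothetical. `bench new-site` will also prompt for (or need flags for) database and admin credentials, which are omitted here.

```python
import shlex

# Site name -> db type, as used in this benchmark.
SITES = {"gp-maria.test": "mariadb", "gp-sqlite.test": "sqlite"}


def plan(sites=SITES, app="gameplan"):
    """Return the setup commands in order, as argument lists (nothing is executed)."""
    cmds = []
    for site, db_type in sites.items():
        cmds.append(["bench", "new-site", site, "--db-type", db_type])
        cmds.append(["bench", "--site", site, "install-app", app])
        cmds.append(["bench", "--site", site, "execute", "gameplan.benchmark_seed.run"])
    return cmds


if __name__ == "__main__":
    for cmd in plan():
        print(shlex.join(cmd))
```

After both sites are seeded, the shell scripts in the sections that follow start gunicorn and drive the harness.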

<details>
<summary><code>gameplan/benchmark_seed.py — parameterized data loader (bench execute target)</code></summary>

```python
"""
Synthetic data loader for the MariaDB-vs-SQLite resource comparison.

Creates identical record volumes on whichever site `bench execute` targets so the
two engines can be compared on equal data. Run the SAME counts on both sites.

 bench --site gp-maria.test execute gameplan.benchmark_seed.run
 bench --site gp-sqlite.test execute gameplan.benchmark_seed.run

Override counts:
 bench --site <s> execute gameplan.benchmark_seed.run --kwargs "{'discussions': 5000, 'comments': 25000}"

All lifecycle hooks stay ENABLED on purpose: a real comment insert cascades into a
parent-discussion re-save plus one GP Unread Record per space member. That write
amplification is exactly what the benchmark is meant to exercise, and it is identical
on both engines.
"""

import frappe

# Medium volume per the agreed scope.
DEFAULTS = dict(
	users=20,
	communities=10,
	spaces=100,
	discussions=5000,
	comments=25000,
	members_per_space=5,
)

# Deterministic, index-driven pseudo-randomness so both sites get byte-identical data
# without `random`-module nondeterminism (also keeps reruns reproducible).
LOREM = (
	"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor "
	"incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud "
	"exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."
)


def _insert(doc):
	"""Insert with a short retry on SQLite's transient 'database is locked'.

	Single-writer SQLite can briefly fail to acquire the write lock under I/O
	pressure (busy_timeout expiry); a couple of backed-off retries make bulk
	seeding robust without changing the data produced.
	"""
	import time

	for attempt in range(5):
		try:
			return doc.insert(ignore_permissions=True)
		except frappe.QueryDeadlockError:
			if attempt == 4:
				raise
			frappe.db.rollback()
			time.sleep(0.2 * (attempt + 1))


def _content(seed: int) -> str:
	# A couple of paragraphs of HTML so Text Editor fields hold realistic byte volume.
	n = (seed % 3) + 1
	paras = "".join(f"<p>{LOREM}</p>" for _ in range(n))
	return f"<p>Item {seed}.</p>{paras}"


def run(**kwargs):
	cfg = {**DEFAULTS, **{k: int(v) for k, v in kwargs.items()}}
	frappe.flags.in_import = True # quieten realtime/publish noise; keeps doc hooks
	print(f"[seed] site={frappe.local.site} config={cfg}")

	users = _create_users(cfg["users"])
	communities = _create_communities(cfg["communities"], users)
	spaces = _create_spaces(cfg["spaces"], communities, users, cfg["members_per_space"])
	discussions = _create_discussions(cfg["discussions"], spaces, users)
	_create_comments(cfg["comments"], discussions, users)

	frappe.db.commit()
	_report_counts()


def _create_users(n):
	existing = set(frappe.get_all("User", pluck="name"))
	users = []
	for i in range(n):
		email = f"bench{i}@example.com"
		if email not in existing:
			doc = frappe.get_doc(
				doctype="User",
				email=email,
				first_name=f"Bench{i}",
				send_welcome_email=0,
				user_type="System User",
			).insert(ignore_permissions=True)
			email = doc.name
		users.append(email)
	# GP User Profile defaults to enabled=0 until activation; flip it on so the users
	# count as real members (needed for logged-in API calls and member resolution).
	for u in users:
		frappe.db.set_value("GP User Profile", {"user": u}, "enabled", 1)
	frappe.db.commit()
	print(f"[seed] users ready: {len(users)}")
	return users


def _create_communities(n, users):
	out = []
	for i in range(n):
		team = frappe.get_doc(doctype="GP Team", title=f"Community {i}")
		# Spread membership so spaces under a community have a member pool to draw from.
		for u in users:
			team.append("members", {"user": u, "is_admin": 1 if u == users[i % len(users)] else 0})
		team.insert(ignore_permissions=True)
		out.append(team.name)
	frappe.db.commit()
	print(f"[seed] communities: {len(out)}")
	return out


def _create_spaces(n, communities, users, members_per_space):
	out = []
	for i in range(n):
		team = communities[i % len(communities)]
		# Private spaces resolve membership from the GP Member table, so unread-record
		# volume per post == members_per_space (bounded + deterministic). Public spaces
		# would instead fan out to ALL enabled users, making volume vary with user count.
		space = frappe.get_doc(doctype="GP Project", title=f"Space {i}", team=team, is_private=1)
		# A bounded, deterministic slice of users per space drives unread-record volume.
		for j in range(members_per_space):
			space.append("members", {"user": users[(i + j) % len(users)]})
		space.insert(ignore_permissions=True)
		out.append(space.name)
		if (i + 1) % 25 == 0:
			frappe.db.commit()
			print(f"[seed] spaces: {i + 1}/{n}")
	frappe.db.commit()
	print(f"[seed] spaces: {len(out)}")
	return out


def _create_discussions(n, spaces, users):
	out = []
	for i in range(n):
		space = spaces[i % len(spaces)]
		frappe.set_user(users[i % len(users)]) # vary author so participants/unread differ
		doc = _insert(
			frappe.get_doc(
				doctype="GP Discussion",
				title=f"Discussion {i}: {LOREM[: 30 + (i % 40)]}",
				project=space,
				content=_content(i),
			)
		)
		out.append((doc.name, space))
		if (i + 1) % 250 == 0:
			frappe.db.commit()
			print(f"[seed] discussions: {i + 1}/{n}")
	frappe.set_user("Administrator")
	frappe.db.commit()
	print(f"[seed] discussions: {len(out)}")
	return out


def _create_comments(n, discussions, users):
	for i in range(n):
		disc_name, _ = discussions[i % len(discussions)]
		frappe.set_user(users[(i * 7) % len(users)])
		_insert(
			frappe.get_doc(
				doctype="GP Comment",
				reference_doctype="GP Discussion",
				reference_name=disc_name,
				content=_content(i + 1000),
			)
		)
		if (i + 1) % 500 == 0:
			frappe.db.commit()
			print(f"[seed] comments: {i + 1}/{n}")
	frappe.set_user("Administrator")
	frappe.db.commit()
	print(f"[seed] comments done: {n}")


def _report_counts():
	for dt in (
		"User",
		"GP Team",
		"GP Project",
		"GP Discussion",
		"GP Comment",
		"GP Unread Record",
		"GP Discussion Visit",
		"GP Activity",
	):
		try:
			print(f"[seed] count {dt:22s} = {frappe.db.count(dt)}")
		except Exception as e:
			print(f"[seed] count {dt:22s} = ERR {e}")

```

</details>

<details>
<summary><code>bench_http.py — HTTP load harness (read + write modes)</code></summary>

```python
#!/usr/bin/env python
"""
HTTP benchmark harness for the MariaDB-vs-SQLite gameplan comparison.

Connects to a single `bench serve` on 127.0.0.1:PORT and selects the target site
via an explicit Host header (so no /etc/hosts changes are needed). Auth uses the
sid cookie captured from a real login.

Measures, per site:
 - read latency : get_discussions feed + open one discussion, at concurrency C
 - write throughput/contention : N concurrent GP Comment inserts (full hook cascade)

Usage:
 env/bin/python bench_http.py --site gp-maria.test --mode read --conc 8 --requests 400
 env/bin/python bench_http.py --site gp-sqlite.test --mode write --conc 10 --requests 200
"""
import argparse
import json
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

BASE = "http://127.0.0.1:8000" # overridden by --port


def login(site, user="Administrator", pwd="admin"):
	h = {"Host": site}
	r = requests.post(f"{BASE}/api/method/login", data={"usr": user, "pwd": pwd}, headers=h, timeout=30)
	r.raise_for_status()
	sid = r.cookies.get("sid")
	if not sid:
		raise SystemExit(f"login failed on {site}: {r.text[:200]}")
	return sid


def hdr(site, sid):
	return {"Host": site, "Cookie": f"sid={sid}"}


def get_discussion_names(site, sid, limit=200):
	h = hdr(site, sid)
	r = requests.get(
		f"{BASE}/api/v2/method/gameplan.gameplan.doctype.gp_discussion.api.get_discussions",
		params={"limit": limit, "order_by": "last_post_at desc"},
		headers=h,
		timeout=60,
	)
	r.raise_for_status()
	data = r.json()
	rows = data.get("data") or data.get("message") or data
	return [d["name"] for d in rows]


def timed(fn):
	t0 = time.perf_counter()
	ok = True
	err = None
	try:
		fn()
	except Exception as e: # noqa: BLE001
		ok = False
		err = str(e)[:120]
	return (time.perf_counter() - t0) * 1000.0, ok, err


def run(reqs, conc):
	"""reqs: list of zero-arg callables. Returns (latencies_ms, ok_count, errors, wall_seconds)."""
	lat, oks, errs = [], 0, []
	t0 = time.perf_counter()
	with ThreadPoolExecutor(max_workers=conc) as ex:
		# ex.map yields results in submission order; at most `conc` requests are in flight.
		for ms, ok, err in ex.map(timed, reqs):
			lat.append(ms)
			if ok:
				oks += 1
			elif err:
				errs.append(err)
	wall = time.perf_counter() - t0
	return lat, oks, errs, wall


def pct(xs, p):
	if not xs:
		return 0.0
	xs = sorted(xs)
	i = min(len(xs) - 1, int(round((p / 100.0) * (len(xs) - 1))))
	return xs[i]


def summarize(label, lat, oks, errs, wall, total):
	from collections import Counter

	out = {
		"label": label,
		"total": total,
		"ok": oks,
		"failed": total - oks,
		"wall_s": round(wall, 3),
		"req_per_s": round(total / wall, 1) if wall else 0,
		"p50_ms": round(pct(lat, 50), 1),
		"p95_ms": round(pct(lat, 95), 1),
		"p99_ms": round(pct(lat, 99), 1),
		"max_ms": round(max(lat), 1) if lat else 0,
		"err_sample": dict(Counter(errs).most_common(3)),
	}
	print(json.dumps(out))
	return out


def main():
	ap = argparse.ArgumentParser()
	ap.add_argument("--site", required=True)
	ap.add_argument("--mode", choices=["read", "write"], required=True)
	ap.add_argument("--conc", type=int, default=8)
	ap.add_argument("--requests", type=int, default=400)
	ap.add_argument("--warm", type=int, default=30)
	ap.add_argument("--port", type=int, default=8000)
	args = ap.parse_args()

	global BASE
	BASE = f"http://127.0.0.1:{args.port}"

	sid = login(args.site)
	names = get_discussion_names(args.site, sid, limit=200)
	if not names:
		raise SystemExit("no discussions found; seed first")
	h = hdr(args.site, sid)

	if args.mode == "read":
		# Mix: half feed-list calls, half open-one-discussion calls (realistic home + open).
		def feed():
			r = requests.get(
				f"{BASE}/api/v2/method/gameplan.gameplan.doctype.gp_discussion.api.get_discussions",
				params={"limit": 50, "order_by": "last_post_at desc"},
				headers=h,
				timeout=60,
			)
			r.raise_for_status()

		def open_one(i):
			n = names[i % len(names)]
			r = requests.get(f"{BASE}/api/v2/document/GP Discussion/{n}", headers=h, timeout=60)
			r.raise_for_status()

		reqs = []
		for i in range(args.requests):
			reqs.append(feed if i % 2 == 0 else (lambda i=i: open_one(i)))

		# Warmup (not measured)
		run(reqs[: args.warm], args.conc)
		lat, oks, errs, wall = run(reqs, args.conc)
		summarize(f"{args.site} read conc={args.conc}", lat, oks, errs, wall, len(reqs))

	else: # write
		def post_comment(i):
			n = names[i % len(names)]
			body = {
				"reference_doctype": "GP Discussion",
				"reference_name": n,
				"content": f"bench concurrent comment {i}",
			}
			r = requests.post(
				f"{BASE}/api/v2/document/GP Comment",
				json=body,
				headers={**h, "Content-Type": "application/json"},
				timeout=120,
			)
			if r.status_code >= 400:
				raise RuntimeError(f"{r.status_code}:{r.text[:80]}")

		reqs = [(lambda i=i: post_comment(i)) for i in range(args.requests)]
		lat, oks, errs, wall = run(reqs, args.conc)
		summarize(f"{args.site} write conc={args.conc}", lat, oks, errs, wall, len(reqs))


if __name__ == "__main__":
	main()

```

</details>

<details>
<summary><code>measure_disk.sh — disk footprint (SQLite VACUUM vs information_schema)</code></summary>

```bash
#!/bin/bash
# Disk footprint for both sites. Run from frappe-bench root.
set -uo pipefail
BENCH=/Users/netchampfaris/Projects/benches/frappe-bench
cd "$BENCH"

echo "=== SQLite (gp-sqlite.test) ==="
# Checkpoint + compact so WAL pages fold back and we measure settled size.
bench --site gp-sqlite.test execute frappe.db.sql --kwargs "{'query':'PRAGMA wal_checkpoint(TRUNCATE)'}" >/dev/null 2>&1
bench --site gp-sqlite.test execute frappe.db.sql --kwargs "{'query':'VACUUM'}" >/dev/null 2>&1
echo "-- per-file (KB) --"
du -k sites/gp-sqlite.test/*.db sites/gp-sqlite.test/*.db-* 2>/dev/null
echo "-- main db only (MB) --"
ls -1 sites/gp-sqlite.test/*.db | grep -v "_search" | while read -r f; do
  echo "$f $(du -m "$f" | cut -f1) MB"
done
echo "-- search db (MB) --"
du -m sites/gp-sqlite.test/*_search.db 2>/dev/null
echo "-- all .db* total (MB) --"
du -ch sites/gp-sqlite.test/*.db* 2>/dev/null | tail -1

echo
echo "=== MariaDB (gp-maria.test) ==="
DB=$(python3 -c "import json;print(json.load(open('sites/gp-maria.test/site_config.json'))['db_name'])")
echo "db_name=$DB"
bench --site gp-maria.test execute frappe.db.sql --kwargs "{'query':\"SELECT ROUND(SUM(data_length+index_length)/1024/1024,1) AS mb, ROUND(SUM(data_length)/1024/1024,1) AS data_mb, ROUND(SUM(index_length)/1024/1024,1) AS index_mb, SUM(table_rows) AS approx_rows FROM information_schema.tables WHERE table_schema='$DB'\"}" 2>/dev/null | tail -5
echo "-- top 8 tables by size (MB) --"
bench --site gp-maria.test execute frappe.db.sql --kwargs "{'query':\"SELECT table_name, ROUND((data_length+index_length)/1024/1024,2) mb, table_rows FROM information_schema.tables WHERE table_schema='$DB' ORDER BY (data_length+index_length) DESC LIMIT 8\"}" 2>/dev/null | tail -10

```

</details>

<details>
<summary><code>mem_lat.sh — per-site memory + read latency (fresh gunicorn per site)</code></summary>

```bash
#!/bin/bash
# Per-site memory + read-latency. Restarts gunicorn fresh so only this site loads.
# Usage: mem_lat.sh <site>
set -uo pipefail
SITE=$1
BENCH=/Users/netchampfaris/Projects/benches/frappe-bench
SC=/private/tmp/claude-501/-Users-netchampfaris-Projects-benches-frappe-bench-apps-gameplan/6a9a8f95-4852-43bd-9b8b-a3e04f88161c/scratchpad
PORT=8056
PY=$BENCH/env/bin/python

cd "$BENCH/sites"
pkill -f "gunicorn.*8056" 2>/dev/null; sleep 2
nohup "$BENCH/env/bin/gunicorn" -b 127.0.0.1:$PORT -w 4 --timeout 120 frappe.app:application > "$SC/gunicorn.log" 2>&1 &
sleep 8
WPIDS=$(pgrep -f "gunicorn.*8056" | tr '\n' ',' | sed 's/,$//')
echo "[$SITE] gunicorn pids: $WPIDS"

# Warm: load modules + DB cache in all workers for THIS site only.
"$PY" "$SC/bench_http.py" --site "$SITE" --mode read --conc 4 --requests 60 --warm 0 --port $PORT >/dev/null 2>&1

sum_rss () { ps -o rss= -p "$WPIDS" 2>/dev/null | awk '{s+=$1} END{print int(s/1024)}'; }
IDLE=$(sum_rss)
echo "[$SITE] warm-idle server RSS: ${IDLE} MB"

# Sample peak RSS in background while the read load runs.
PEAKF="$SC/peak_$SITE.txt"; echo "$IDLE" > "$PEAKF"
( for _ in $(seq 1 200); do r=$(sum_rss); p=$(cat "$PEAKF"); [ "${r:-0}" -gt "${p:-0}" ] && echo "$r" > "$PEAKF"; sleep 0.2; done ) &
SAMPLER=$!

"$PY" "$SC/bench_http.py" --site "$SITE" --mode read --conc 8 --requests 600 --warm 20 --port $PORT
kill "$SAMPLER" 2>/dev/null
echo "[$SITE] peak server RSS under read load: $(cat "$PEAKF") MB (idle ${IDLE} MB)"

```

</details>

<details>
<summary><code>write_sweep.sh — write-concurrency sweep</code></summary>

```bash
#!/bin/bash
# Write-concurrency sweep: post GP Comments (full hook cascade) at rising concurrency.
# Restarts gunicorn with many workers so requests truly hit the DB in parallel.
set -uo pipefail
BENCH=/Users/netchampfaris/Projects/benches/frappe-bench
SC=/private/tmp/claude-501/-Users-netchampfaris-Projects-benches-frappe-bench-apps-gameplan/6a9a8f95-4852-43bd-9b8b-a3e04f88161c/scratchpad
PORT=8056
PY=$BENCH/env/bin/python
REQ=200

cd "$BENCH/sites"
pkill -f "gunicorn.*8056" 2>/dev/null; sleep 2
nohup "$BENCH/env/bin/gunicorn" -b 127.0.0.1:$PORT -w 16 --timeout 120 frappe.app:application > "$SC/gunicorn.log" 2>&1 &
sleep 9
echo "workers: $(pgrep -f "gunicorn.*8056" | wc -l | tr -d ' ')"

for SITE in gp-sqlite.test gp-maria.test; do
  echo "===== $SITE ====="
  # warm
  "$PY" "$SC/bench_http.py" --site "$SITE" --mode read --conc 4 --requests 20 --warm 0 --port $PORT >/dev/null 2>&1
  for C in 1 2 4 8 16; do
    "$PY" "$SC/bench_http.py" --site "$SITE" --mode write --conc $C --requests $REQ --port $PORT
  done
done

```

</details>
