SchemaView performance improvements on large iModels #9430

Description

@rschili

As mentioned in iTwin/presentation#1394, SchemaView is a major performance improvement in some areas, but it has one blind spot where it regresses performance: the "Models tree initial load" scenario.

Summary

SchemaView loads every included schema in a single blob on the first getSchemaView() call. That is
the right trade-off for content-heavy scenarios, where it is a decisive win, but it regresses
"Models tree initial load," where the consumer only needs BisCore (plus a few references) yet pays
to hydrate every schema in the iModel. On large-schema iModels this is a structural regression: the
BisCore-only micro-benchmark is up to +577% in the worst case.

Make getSchemaView() cheap: return a husk backed by a tiny reference-graph manifest, and hydrate
schemas on demand (hydrateSchemas([...])) by merging per-closure fragments into the existing global
flat-array index space. The synchronous read path and all flyweights/resolvers stay unchanged.
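
To make the "per-closure" idea concrete, here is a minimal sketch of computing the reference closure of a requested schema from an in-memory manifest. The SchemaManifest shape, the referenceClosure helper, and the schema names other than BisCore are illustrative assumptions, not the proposed API or the actual reference lists.

```typescript
// Hypothetical manifest shape: schema name -> names of directly referenced schemas.
// In the proposal this data would come from meta.ECSchemaDef and
// meta.SchemaHasSchemaReferences; here it is hard-coded for illustration.
type SchemaManifest = ReadonlyMap<string, readonly string[]>;

// Returns the requested schemas plus everything they reference (transitively),
// ordered so that a schema always appears after the schemas it references.
function referenceClosure(manifest: SchemaManifest, roots: readonly string[]): string[] {
  const ordered: string[] = [];
  const visited = new Set<string>();

  const visit = (name: string): void => {
    if (visited.has(name)) return;
    const refs = manifest.get(name);
    if (refs === undefined) throw new Error(`Unknown schema: ${name}`);
    visited.add(name); // mark before recursing so repeated references are visited once
    for (const ref of refs) visit(ref);
    ordered.push(name); // post-order: references first
  };

  for (const root of roots) visit(root);
  return ordered;
}

// Illustrative data only; RefA, RefB, and HugeDomain are made-up names.
const manifest: SchemaManifest = new Map<string, readonly string[]>([
  ["RefA", []],
  ["RefB", ["RefA"]],
  ["BisCore", ["RefA", "RefB"]],
  ["HugeDomain", ["BisCore"]],
]);

const bisCoreClosure = referenceClosure(manifest, ["BisCore"]);
console.log(bisCoreClosure); // [ 'RefA', 'RefB', 'BisCore' ]
```

In this sketch, asking for BisCore never touches HugeDomain, which is the behavior the "Models tree initial load" scenario needs.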

Scope / approach (chosen direction: Option C)

  • Manifest from meta.ECSchemaDef + meta.SchemaHasSchemaReferences (ECSql, no new blob).
  • New native fragment writer + schema_view_fragment(...) pragma emitting cross-schema references as
    external ECInstanceId tokens (no row leakage between schemas). Blob format v1 -> v2 (superset).
  • TS merges each fragment append-only into one global index space (reuses SchemaViewBuilder,
    recovers cross-fragment string dedup in memory). Only derivedClassMap is invalidated per merge.
  • Additive API: hydrateSchemas(names[]), hydrateAll(), getAvailableSchemaNames().
  • Single-flight _hydrating promise gates writers (reuses the IModelDb/IModelConnection pattern);
    reads stay synchronous.
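
The append-only merge described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the SchemaViewBuilder implementation: the SchemaFragment and GlobalIndexSpace types, the token strings, and the DomainA/Widget names are all hypothetical, and the real v2 blob layout is not specified in this issue.

```typescript
// Hypothetical fragment shape (assumption, not the v2 blob format).
interface FragmentClass {
  token: string;      // ECInstanceId-style token identifying this class
  name: number;       // index into the fragment-local string table
  baseToken?: string; // token of the base class; may be owned by another schema
}
interface SchemaFragment {
  schemaName: string;
  strings: string[];
  classes: FragmentClass[];
}

class GlobalIndexSpace {
  readonly strings: string[] = [];
  readonly className: number[] = []; // class index -> global string index
  readonly classBase: number[] = []; // class index -> base class index, or -1
  private readonly stringIndex = new Map<string, number>();
  private readonly classIndexByToken = new Map<string, number>();
  private readonly mergedSchemas = new Set<string>();
  private derivedClassMap: Map<number, number[]> | undefined;

  // Dedups strings across fragments by reusing an existing global index.
  private intern(value: string): number {
    let index = this.stringIndex.get(value);
    if (index === undefined) {
      index = this.strings.length;
      this.strings.push(value);
      this.stringIndex.set(value, index);
    }
    return index;
  }

  /** Appends a fragment; returns false (changing nothing) if already merged. */
  merge(fragment: SchemaFragment): boolean {
    if (this.mergedSchemas.has(fragment.schemaName)) return false;
    // Validate before mutating so a bad fragment cannot leave partial rows behind.
    const localTokens = new Set(fragment.classes.map((cls) => cls.token));
    for (const cls of fragment.classes) {
      if (fragment.strings[cls.name] === undefined) throw new Error(`Bad string index ${cls.name}`);
      const base = cls.baseToken;
      if (base !== undefined && !localTokens.has(base) && !this.classIndexByToken.has(base))
        throw new Error(`Unresolved token ${base}; hydrate referenced schemas first`);
    }
    const firstNew = this.className.length;
    // Pass 1: append rows; existing indices are never moved or rewritten.
    for (const cls of fragment.classes) {
      this.classIndexByToken.set(cls.token, this.className.length);
      this.className.push(this.intern(fragment.strings[cls.name] as string));
      this.classBase.push(-1);
    }
    // Pass 2: resolve local and external tokens to global class indices.
    fragment.classes.forEach((cls, offset) => {
      if (cls.baseToken !== undefined)
        this.classBase[firstNew + offset] = this.classIndexByToken.get(cls.baseToken) ?? -1;
    });
    this.mergedSchemas.add(fragment.schemaName);
    this.derivedClassMap = undefined; // the only derived cache dropped per merge
    return true;
  }

  // Rebuilt lazily after each merge; reads stay synchronous.
  derivedClasses(baseIndex: number): readonly number[] {
    if (this.derivedClassMap === undefined) {
      const rebuilt = new Map<number, number[]>();
      this.classBase.forEach((base, cls) => {
        if (base < 0) return;
        const list = rebuilt.get(base) ?? [];
        list.push(cls);
        rebuilt.set(base, list);
      });
      this.derivedClassMap = rebuilt;
    }
    return this.derivedClassMap.get(baseIndex) ?? [];
  }
}

// Illustrative fragments; class names and tokens are made up.
const bisCoreFragment: SchemaFragment = {
  schemaName: "BisCore",
  strings: ["Element", "GeometricElement"],
  classes: [{ token: "0x1", name: 0 }, { token: "0x2", name: 1, baseToken: "0x1" }],
};
const domainFragment: SchemaFragment = {
  schemaName: "DomainA",
  strings: ["Widget", "Element"],
  classes: [{ token: "0x10", name: 0, baseToken: "0x2" }, { token: "0x11", name: 1 }],
};

const indexSpace = new GlobalIndexSpace();
indexSpace.merge(bisCoreFragment);
const derivedBefore = indexSpace.derivedClasses(1); // no derived classes yet
indexSpace.merge(domainFragment);
const derivedAfter = indexSpace.derivedClasses(1); // now includes Widget (class index 2)
```

The second fragment's "Element" string reuses global string index 0, and the external token "0x2" resolves to a class appended by the earlier BisCore merge, which is why references need to be merged before the schemas that depend on them in this model.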

Acceptance criteria

  • getSchemaView() returns a husk; hydrateSchemas(["BisCore"]) makes only BisCore + its references
    available, synchronously readable afterwards.
  • hydrateAll() reproduces today's full-load semantics; re-hydrating already-loaded schemas is a no-op.
  • Native fragment for schema X contains zero rows owned by any other schema; all cross-schema references
    are external tokens resolved in TS.
  • No regression to the synchronous read API surface (flyweights, resolvers, caches).
  • Overlapping-hydrate concurrency test passes (no duplicate classes/strings, one effective merge per
    schema).
  • "Models tree initial load" / BisCore-only benchmark: large-schema tail is neutral-or-better and
    small-schema wins are preserved.
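
The overlapping-hydrate criterion could be exercised with something like the sketch below. It models the writer gate as a serialized promise chain, which is one possible reading of the "single-flight _hydrating promise"; the HydrationGate class, the FragmentLoader type, and the DomainA schema name are assumptions for illustration, not the IModelDb/IModelConnection pattern itself.

```typescript
// Hypothetical loader: fetches and merges the fragment for one schema.
type FragmentLoader = (schemaName: string) => Promise<void>;

class HydrationGate {
  private hydrating: Promise<void> = Promise.resolve();
  private readonly loaded = new Set<string>();
  private readonly loadAndMerge: FragmentLoader;

  constructor(loadAndMerge: FragmentLoader) {
    this.loadAndMerge = loadAndMerge;
  }

  // Writers queue behind the previous hydrate call, so merges never interleave.
  hydrateSchemas(names: readonly string[]): Promise<void> {
    const run = this.hydrating.then(async () => {
      for (const name of names) {
        if (this.loaded.has(name)) continue; // already hydrated: no-op
        await this.loadAndMerge(name);
        this.loaded.add(name);
      }
    });
    // Keep the chain usable even if this run rejects; the caller still sees the error.
    this.hydrating = run.catch(() => undefined);
    return run;
  }

  // Synchronous read; never waits on the gate.
  isLoaded(name: string): boolean {
    return this.loaded.has(name);
  }
}

// Fires three overlapping hydrate calls and reports how often each schema was merged.
async function overlappingHydrateDemo(): Promise<Map<string, number>> {
  const mergeCounts = new Map<string, number>();
  const gate = new HydrationGate(async (name) => {
    await Promise.resolve(); // simulate an asynchronous fragment fetch
    mergeCounts.set(name, (mergeCounts.get(name) ?? 0) + 1);
  });
  await Promise.all([
    gate.hydrateSchemas(["BisCore"]),
    gate.hydrateSchemas(["BisCore", "DomainA"]),
    gate.hydrateSchemas(["DomainA"]),
  ]);
  return mergeCounts;
}
```

With this gate, the three overlapping calls should produce exactly one merge per schema, which is the property the concurrency test is meant to pin down.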

Notes / risks

  • Highest-risk piece is the native fragment writer + no-leakage audit; de-risk first.
  • Intentional behavioral break: getSchemaView() no longer returns a fully loaded view. Accepted given
    the early @beta rollout stage and the beta deprecation window.
  • Out of scope: the viewer escalation's Platform-infra asks (backend scaling, model+category index).

Labels

ecschema (issues related to the various ecschema packages)