Commit 3c0f51a
klntsky, Ryun1, rphair authored
CIP-0116? | Universal JSON Encoding for Domain Types (#766)
* JSON-spec: add specs for numeric types
* Add Address types and PlutusData
* Re-implement PlutusData via oneOf
* Switch to `tag`/`value`-based encoding of PlutusData for ease of programmatic processing
* Add schema for TransactionOutput subtypes
* Add TransactionMetadatum
* More types
* Rename cardano.json to cardano-babbage.json
* Add NativeScript, Update, TransactionBody
* - Fix: requiredProperties -> required
  - New types: TransactionWitnessSet, MIR, BootstrapWitness, etc
* Use `format` instead of `contentEncoding`
* Fix `pattern` for ByteString - it can be empty
* Add a README, rename the schema file
* Apply suggestions from code review by Ryun1
  Co-authored-by: Ryan Williams <[email protected]>
* Fix header layout
* Remove a dead link
* Expand "Limitations", add info on encoding of binary types
* Fix RewardAddress prefix
  Co-authored-by: Ryan Williams <[email protected]>
* Add note on uniqueness of encoding & move scope of the schema section
* Add TransactionMetadata, AuxiliaryData and Transaction types
* Typo: requiredProperties -> required
* Use uniform names for numeric types
* Complete the definitions + fixes
* Fix PlutusScript, add titles everywhere
* Add a note on AuxiliaryData
* fix bracket mismatch in code sample
* setting candidate CIP number to 116
* Fix suggested by Evgeny: Use BigInt in PlutusData
* Fix suggested by Evgeny
* Apply suggestions + fixes
* Assign a number
* Fix Mint type. Update docs for `Map` type schema.
* Add a link to the repo with tests
* small tidy
* fix header link
* added changelog template
* add notes to rationale
* editorial adjustments
* flesh out rationale
* fix typo
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* adjust scope rationale
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* Update cardano-babbage.json
* Format
* Fix incorrectly specified required field
* Fix incorrectly specified required Mint properties
* Add missing Array wrapper for Mint assets
* Add new formats
* Specify length for BootstrapWitness chain_code
* Move PoolMetadataHash to a separate type
* Specify VRFCert proof length
* Remove legacy HeaderLeaderCert
* Inline AuxiliaryDataSet type
* Update 'Path to active'
* Update CIP-0116/README.md
  Co-authored-by: Vladimir Kalnitsky <[email protected]>
* Add myself to implementors
  Co-authored-by: Ryan <[email protected]>
* Use PoolPubKeyHash instead of Ed25519KeyHash in stake delegation cert

---------

Co-authored-by: Ryan Williams <[email protected]>
Co-authored-by: Robert Phair <[email protected]>
Co-authored-by: Ryan Williams <[email protected]>
1 parent b14311d commit 3c0f51a

3 files changed

+2284
-0
lines changed

CIP-0116/README.md

Lines changed: 302 additions & 0 deletions
@@ -0,0 +1,302 @@
1+
---
2+
CIP: 116
3+
Title: Standard JSON encoding for Domain Types
4+
Category: Tools
5+
Status: Proposed
6+
Authors:
7+
- Vladimir Kalnitsky <[email protected]>
8+
Implementors:
9+
- Vladimir Kalnitsky <[email protected]>
10+
Discussions:
11+
- https://github.com/cardano-foundation/CIPs/pull/742
12+
- https://github.com/cardano-foundation/CIPs/pull/766
13+
Created: 2024-02-22
14+
License: CC-BY-4.0
15+
---
16+
17+
## Abstract
18+
19+
A canonical JSON encoding for Cardano domain types lets the ecosystem converge on a single way of serializing data to JSON, freeing developers from reimplementing nearly identical, yet subtly incompatible, encoding and decoding logic.
20+
21+
## Motivation: why is this CIP necessary?
22+
23+
Cardano domain types have canonical CDDL definitions (for every era), but when it comes to use in web apps, where JSON is the universally accepted format, there is no definite standard. This CIP aims to change that.
24+
25+
The full motivation text is provided in [CPS-11 | Universal JSON Encoding for Domain Types](https://github.com/cardano-foundation/CIPs/tree/master/CPS-0011).
26+
27+
## Specification
28+
29+
This CIP is expected to contain multiple json-schema definitions for Cardano Eras and breaking intra-era hardforks starting from Babbage.
30+
31+
| Ledger era | Hardfork | Ledger Commit | Schema | Changelog Entry |
32+
| --- | --- | --- | --- |--- |
33+
| Babbage | Vasil | [12dc779](https://github.com/IntersectMBO/cardano-ledger/blob/12dc779d7975cbeb69c7c18c1565964a90f50920/eras/babbage/impl/cddl-files/babbage.cddl) | [cardano-babbage.json](./cardano-babbage.json) | N/A |
34+
35+
### Tests & utilities for JSON validation
36+
37+
The [`cip-0116-tests`](https://github.com/mlabs-haskell/cip-0116-tests) repo contains utility functions and a test suite for the schema. In particular, it provides a `mkValidatorForType` function that builds a validator function for any type defined in the schema.
38+
39+
### Scope of the Schema
40+
41+
The schemas should cover the `Block` type and all of its structural components, which corresponds to the scope of the CDDL files located in [the ledger repo](https://github.com/IntersectMBO/cardano-ledger/).
42+
43+
### Schema Design Principles
44+
45+
Below you can find some principles outlining the process of schema creation / modification. They are intended to be applied when there is a need to create a schema for a new Cardano era.
46+
47+
#### Uniqueness of encoding
48+
49+
Every transaction (i.e. CBOR-encoded binary) must have exactly one valid JSON encoding, up to entry ordering in mappings (that are represented as key-value pairs).
50+
51+
For a single JSON fixture, however, there are multiple variants of encoding it as CBOR.
52+
53+
#### Consistency with the previous versions
54+
55+
To simplify transitions of dApps between eras, the scope of changes introduced to the schemas SHOULD be limited to the scope of CDDL changes.
56+
57+
### Schema Conventions
58+
59+
These conventions help to keep the schema uniform in style.
60+
61+
#### Encoding of binary types
62+
63+
Binary data MUST be encoded as lower-case hexadecimal strings. Restricting the character set to lower-case letters (`a-f`) allows for comparisons and equality checks without the need to normalize the values to a uniform case.
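As a minimal sketch of this convention in Python: `bytes.hex()` always produces lower-case digits, and a regular expression can reject upper-case or odd-length input. The pattern below is an assumption made for illustration (zero or more byte pairs, so the empty string is accepted); the authoritative pattern is the one in the schema file.

```python
import re

# Assumed pattern for illustration: zero or more lower-case hex byte pairs.
# The schema file, not this sketch, defines the authoritative pattern.
HEX_PATTERN = re.compile(r"(?:[0-9a-f]{2})*")


def encode_bytes(data: bytes) -> str:
    """Encode raw bytes as a lower-case hex string."""
    return data.hex()  # bytes.hex() always emits lower-case digits


def is_lower_hex(text: str) -> bool:
    """Return True if text is a (possibly empty) lower-case hex byte string."""
    return HEX_PATTERN.fullmatch(text) is not None


print(encode_bytes(b"\xde\xad\xbe\xef"))
print(is_lower_hex("deadbeef"), is_lower_hex("DEADBEEF"))
```

Because only one case is valid, two encoded values can be compared with plain string equality.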
64+
65+
#### Encoding of mapping types
66+
67+
`Map`-like container types MUST be encoded as arrays of key-value pairs.
68+
69+
```json
70+
"Map": {
71+
"type": "array",
72+
"items": {
73+
"type": "object",
74+
"properties": {
75+
"key": ...,
76+
"value": ...
77+
},
78+
"required": [
79+
"key",
80+
"value"
81+
],
82+
"additionalProperties": false
83+
}
84+
}
85+
```
86+
87+
Uniqueness of `"key"` objects in a map MUST be preserved (but this property is not expressible via a schema).
88+
89+
Implementations MUST consider mappings with conflicting keys invalid.
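Since the schema cannot express key uniqueness, implementations need a separate check. The following Python sketch shows one possible approach: it compares keys by their sorted-key JSON serialization, so that object keys that differ only in property order count as equal. The function names are hypothetical. Note one limitation of this sketch: keys that themselves contain maps (arrays of pairs) in a different entry order are not recognized as equal.

```python
import json


def canonical(value) -> str:
    """Serialize a JSON value with sorted object keys (illustrative choice)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def has_unique_keys(entries: list) -> bool:
    """Check that no two {"key": ..., "value": ...} entries share a key."""
    seen = set()
    for entry in entries:
        fingerprint = canonical(entry["key"])
        if fingerprint in seen:
            return False
        seen.add(fingerprint)
    return True


valid = [{"key": "aa", "value": "1"}, {"key": "bb", "value": "2"}]
conflicting = [{"key": "aa", "value": "1"}, {"key": "aa", "value": "2"}]
print(has_unique_keys(valid), has_unique_keys(conflicting))
```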
90+
91+
Some mapping-like types, specifically `Mint`, allow for duplicate keys. Types like these should not be encoded as maps; instead, their `key` and `value` properties should be named differently.
92+
93+
#### Encoding of variant types
94+
95+
Types with variable payloads MUST be encoded using `oneOf` and an explicit discriminator property, `tag`:
96+
97+
```json
98+
{
99+
"Credential": {
100+
"type": "object",
101+
"discriminator": {
102+
"propertyName": "tag"
103+
},
104+
"oneOf": [
105+
{
106+
"type": "object",
107+
"properties": {
108+
"tag": {
109+
"enum": [
110+
"pubkey_hash"
111+
]
112+
},
113+
"value": {
114+
"$ref": "cardano-babbage.json#/definitions/Ed25519KeyHash"
115+
}
116+
},
117+
"required": ["tag", "value"],
118+
"additionalProperties": false
119+
},
120+
{
121+
"type": "object",
122+
"properties": {
123+
"tag": {
124+
"enum": [
125+
"script_hash"
126+
]
127+
},
128+
"value": {
129+
"$ref": "cardano-babbage.json#/definitions/ScriptHash"
130+
}
131+
},
132+
"required": ["tag", "value"],
133+
"additionalProperties": false
134+
}
135+
]
136+
}
137+
}
138+
```
139+
140+
Other properties of a tagged object MUST be specified in lower-case snake-case.
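The explicit `tag` makes decoding a simple dispatch. The Python sketch below decodes the `Credential` example above; the class and function names are hypothetical, and the payload is assumed to be a string (the actual shape of `Ed25519KeyHash` and `ScriptHash` is defined in the schema file).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PubKeyHash:
    value: str


@dataclass(frozen=True)
class ScriptHash:
    value: str


# Maps each `tag` literal to the (hypothetical) class for that variant.
VARIANTS = {"pubkey_hash": PubKeyHash, "script_hash": ScriptHash}


def decode_credential(obj: dict):
    """Decode a tagged Credential object, rejecting anything unexpected."""
    if set(obj) != {"tag", "value"}:
        raise ValueError("expected exactly the properties 'tag' and 'value'")
    tag, value = obj["tag"], obj["value"]
    if not isinstance(tag, str) or tag not in VARIANTS:
        raise ValueError(f"unknown tag: {tag!r}")
    if not isinstance(value, str):  # assumption: hashes are strings
        raise ValueError("payload must be a string")
    return VARIANTS[tag](value)


print(decode_credential({"tag": "script_hash", "value": "00ff"}))
```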
141+
142+
#### Encoding of enum types
143+
144+
Enums are a special kind of variant type that carries no payload. These MUST be encoded as string `enum`s.
145+
146+
Lowercase snake case identifiers MUST be used for the options, e.g.:
147+
148+
```json
149+
{
150+
"Language": {
151+
"title": "Language",
152+
"type": "string",
153+
"enum": [
154+
"plutus_v1",
155+
"plutus_v2"
156+
]
157+
}
158+
}
159+
```
160+
161+
#### Encoding of record types
162+
163+
All record types MUST be encoded as objects with an explicit list of `required` properties, and `additionalProperties` set to `false` (see the "Absence of extensibility" section for the motivation behind this requirement).
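To illustrate what `required` combined with `additionalProperties: false` buys in practice, here is a small hand-rolled check in Python. It is a sketch, not a replacement for a JSON Schema validator, and the record and property names used (`transaction_id`, `index`) are hypothetical examples rather than quotes from the schema.

```python
def check_record(obj: dict, required, optional=frozenset()) -> list:
    """Return a list of problems: missing required or unexpected properties."""
    keys = set(obj)
    required, optional = set(required), set(optional)
    errors = [f"missing required property: {name}"
              for name in sorted(required - keys)]
    errors += [f"unexpected property: {name}"
               for name in sorted(keys - required - optional)]
    return errors


# Hypothetical record with a misspelled property name ("indx").
print(check_record({"transaction_id": "00", "indx": "0"},
                   required={"transaction_id", "index"}))
```

With extensible objects the misspelled property would be silently ignored; with a closed record it is reported.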
164+
165+
#### Encoding of nominal type synonyms
166+
167+
Some of the types have identical representations, differing only by nominal name. For example, `Slot` domain type is expressed as `uint` in CDDL.
168+
169+
For these types, their nominal name SHOULD NOT have a separate definition in the json-schema, and the "representation type" should be used via a `$ref` instead. The domain type name SHOULD be included as `title` string at the point of usage.
170+
171+
### Additional format types
172+
173+
Some non-standard `format` types are used:
174+
175+
- `hex` - lower-case hex-encoded byte string
176+
- `bech32` - [bech32](https://en.bitcoin.it/wiki/Bech32) string
177+
- `base58` - [base58](https://bitcoinwiki.org/wiki/base58) string
178+
- `uint64` - 64-bit unsigned integer
179+
- `int128` - 128-bit signed integer
180+
- `string64` - a unicode string that must not exceed 64 bytes when utf8-encoded.
181+
- `posint64` - a positive (0 excluded) 64-bit integer. `1 .. 2^64-1`
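Standard validators typically ignore unknown `format` values, so the numeric formats need custom checks. The Python sketch below assumes that values are carried as decimal strings without leading zeros (an assumption made here because 64-bit integers exceed the exact range of JavaScript numbers; check the schema for the actual representation) and that `int128` means the two's-complement range.

```python
import re

# Inclusive bounds per format; the int128 range is an assumption (two's complement).
RANGES = {
    "uint64": (0, 2**64 - 1),
    "posint64": (1, 2**64 - 1),
    "int128": (-(2**127), 2**127 - 1),
}

# Canonical decimal: "0", or an optional minus sign followed by no leading zeros.
DECIMAL = re.compile(r"0|-?[1-9][0-9]*")


def check_numeric(fmt: str, text: str) -> bool:
    """Return True if text is a canonical decimal within the format's range."""
    if DECIMAL.fullmatch(text) is None:
        return False
    low, high = RANGES[fmt]
    return low <= int(text) <= high


print(check_numeric("uint64", str(2**64 - 1)), check_numeric("posint64", "0"))
```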
182+
183+
### Limitations
184+
185+
JSON-schema cannot express certain properties of some of the types.
186+
187+
#### Uniqueness of mapping keys
188+
189+
See the chapter on encoding of mapping types.
190+
191+
#### Bech32 and Base58 formats
192+
193+
Validity of values of these types can't be expressed as a regular expression, so the implementations MAY validate them separately.
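For Bech32, the separate validation amounts to verifying the checksum. The Python sketch below follows the checksum algorithm from the Bech32 specification (BIP-173). It deliberately omits that specification's 90-character length limit, which Cardano addresses can exceed, and it does not check Cardano-specific prefixes or payload layout. The function names are hypothetical.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATORS = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def _polymod(values) -> int:
    """Compute the Bech32 BCH checksum over a sequence of 5-bit values."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i, gen in enumerate(GENERATORS):
            if (top >> i) & 1:
                chk ^= gen
    return chk


def _hrp_expand(hrp: str) -> list:
    """Expand the human-readable part for checksum computation."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def is_valid_bech32(text: str) -> bool:
    """Check charset, case consistency and checksum (no length limit applied)."""
    if text != text.lower() and text != text.upper():
        return False  # mixed case is not allowed
    text = text.lower()
    pos = text.rfind("1")  # the last "1" separates prefix and data
    if pos < 1 or pos + 7 > len(text):
        return False  # empty prefix or no room for the 6-character checksum
    hrp, data_part = text[:pos], text[pos + 1:]
    if any(ord(c) < 33 or ord(c) > 126 for c in hrp):
        return False
    if any(c not in CHARSET for c in data_part):
        return False
    data = [CHARSET.index(c) for c in data_part]
    return _polymod(_hrp_expand(hrp) + data) == 1


print(is_valid_bech32("a12uel5l"), is_valid_bech32("a12uel5m"))
```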
194+
195+
#### Address types
196+
197+
Bech32 strings are not always valid addresses: even if the prefixes are correct, the [binary layout of the payload](https://github.com/IntersectMBO/cardano-ledger/blob/f754084675a1decceed4f309814b09605f443dd5/libs/cardano-ledger-core/src/Cardano/Ledger/Address.hs#L603) must also be valid.
198+
199+
The implementations MAY validate it separately.
200+
201+
#### Byte length limits for strings
202+
203+
In CDDL, the length of a `tstr` value gives the number of bytes, but in `json-schema` there is no way to specify restrictions on byte lengths. So, `maxLength` is not the correct way of specifying the limits, but it is still useful, because no string longer than 64 *characters* satisfies the 64-byte limit.
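The gap between character count and byte count is easy to demonstrate. In the Python sketch below (the function name is hypothetical), a 33-character string passes a `maxLength` of 64 but occupies 66 bytes in UTF-8, so implementations that enforce the `string64` limit need a byte-level check.

```python
def fits_string64(text: str) -> bool:
    """Return True if text occupies at most 64 bytes when UTF-8 encoded."""
    return len(text.encode("utf-8")) <= 64


ascii_text = "a" * 64      # 64 characters, 64 bytes
accented = "\u00e9" * 33   # 33 characters, 66 bytes (2 bytes per character)

print(len(accented), len(accented.encode("utf-8")))
print(fits_string64(ascii_text), fits_string64(accented))
```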
204+
205+
#### Auxiliary Data encoding
206+
207+
`auxiliary_data` CDDL type is handled specially.
208+
209+
```cddl
210+
auxiliary_data =
211+
metadata ; Shelley
212+
/ [ transaction_metadata: metadata ; Shelley-ma
213+
, auxiliary_scripts: [ * native_script ]
214+
]
215+
/ #6.259({ ? 0 => metadata ; Alonzo and beyond
216+
, ? 1 => [ * native_script ]
217+
, ? 2 => [ * plutus_v1_script ]
218+
, ? 3 => [ * plutus_v2_script ]
219+
})
220+
```
221+
222+
Instead of providing all three variants of encoding, we base the schema on the one that is the most general (the last one):
223+
224+
```json
225+
{
226+
"AuxiliaryData": {
227+
"properties": {
228+
"metadata": {
229+
"$ref": "cardano-babbage.json#/definitions/TransactionMetadata"
230+
},
231+
"native_scripts": {
232+
"type": "array",
233+
"items": {
234+
"$ref": "cardano-babbage.json#/definitions/NativeScript"
235+
}
236+
},
237+
"plutus_scripts": {
238+
"type": "array",
239+
"items": {
240+
"$ref": "cardano-babbage.json#/definitions/PlutusScript"
241+
}
242+
}
243+
},
244+
}
245+
}
246+
```
247+
248+
It is up to implementors to decide how to serialize the values into CBOR. The property we want to maintain is preserved regardless of the choice: for every block binary there is exactly one JSON encoding.
249+
250+
### Versioning
251+
252+
This CIP should not follow a conventional versioning scheme. Instead, it should be altered via pull request before a hardfork to add a new JSON schema that aligns with the new ledger era. Each schema must be standalone and not reuse definitions between eras. Authors MUST follow the [Schema Scope](#scope-of-the-schema), [Schema Design Principles](#schema-design-principles) and [Schema Conventions](#schema-conventions).
253+
254+
Furthermore, for each subsequent schema, the [changelog](./changelog.md) must be updated. Authors must clearly articulate the deltas between schemas.
255+
256+
## Rationale: how does this CIP achieve its goals?
257+
258+
### Scope
259+
260+
We keep the scope of this standard to the data types within Cardano blocks. The rationale for this is that block data is by far the most useful for the majority of Cardano actors. An added benefit is that the definitions map directly onto the CDDL files provided by the ledger team.
261+
262+
### Strictness
263+
264+
This CIP lays out strong conventions that future schema authors must follow, along with a large set of design principles. The aim is to minimize the potential for unavoidable deltas between schemas.
265+
266+
By fixing conventions, even where the choice is somewhat arbitrary, we hope to leave exactly one possible interpretation from CBOR to JSON, eliminating ambiguity.
267+
268+
### Absence of extensibility
269+
270+
The schemas MUST NOT be extensible with additional properties. This may sound counter-intuitive and against the spirit of json-schema, but there are some motivations behind that:
271+
272+
- More safety from typos: object fields that are optional may be specified with slightly incorrect names in dApps' code, leading to inability of the decoders to pick up the values, which may go unnoticed.
273+
- Clear delineation between Cardano domain types and user dApp domain types: forcing the developers to store their dApp domain data separately from Cardano data, or close to it (as opposed to mixing these together in a single object) will indirectly motivate better structured dApp code.
274+
275+
### JSON
276+
277+
JSON was chosen as there is no viable alternative. The majority of Cardano's web tooling is built with JavaScript, where JSON is the primary object representation format.
278+
279+
Furthermore, even across non-JavaScript stacks, JSON enjoys wide tooling support, which improves the potential for builders to adopt this standard.
280+
281+
### Bech32 for addresses
282+
283+
We choose to use Bech32 as the representation for Cardano addresses. Compared to the alternative of hexadecimal encoding, Bech32 offers an included checksum and a human-readable prefix.
284+
285+
## Path to Active
286+
287+
### Acceptance Criteria
288+
289+
- [ ] One future ledger era schema is added
290+
- [ ] This standard is implemented within three separate tools, libraries, etc.
291+
292+
### Implementation Plan
293+
294+
- [x] Complete the specification for the current Babbage era
295+
- [ ] Provide a test suite validating JSON fixtures for all the types against the schema
296+
- [x] Provide an implementation of validating functions that uses this json-schema
297+
- [mlabs-haskell/cip-0116-tests](https://github.com/mlabs-haskell/cip-0116-tests)
298+
- [ ] Collect a list of implementations of Cardano domain types and negotiate a transition to the specified formats with their maintainers (if it makes sense and is possible)
299+
300+
## Copyright
301+
302+
This CIP is licensed under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode).
