Merged
3 changes: 2 additions & 1 deletion Makefile
@@ -30,7 +30,8 @@ DOC_FILES := \
layer.md \
config.md \
manifest.md \
-	manifest-list.md
+	manifest-list.md \
+	canonicalization.md

FIGURE_FILES := \
img/media-types.png
1 change: 1 addition & 0 deletions README.md
@@ -14,6 +14,7 @@ The OCI Image Format project creates and maintains the software shipping contain
- [Image Configuration](config.md)
- [Image Manifest](manifest.md)
- [Image Manifest List](manifest-list.md)
+- [Canonicalization](canonicalization.md)

## Overview

21 changes: 21 additions & 0 deletions canonicalization.md
@@ -0,0 +1,21 @@
# Canonicalization

OCI Images [are](descriptor.md) [content-addressable](image-layout.md).
One benefit of content-addressable storage is easy deduplication.

Contributor

Slightly more intro context - something like "One of the goals of the OCI Image Specification is to leverage content-addressable storage (CAS), which provides benefits like easy deduplication"

Contributor

Contributor Author

Ah, I stand corrected. I still think the current wording covers that though. Would you still prefer your initial suggestion?

Many images might depend on a particular [layer](layer.md), but there will only be one blob in the [store](image-layout.md).
With a different serialization, that same semantic layer would have a different hash, and if both versions of the layer are referenced there will be two blobs with the same semantic content.
To allow efficient storage, implementations serializing content for blobs SHOULD use a canonical serialization.
This increases the chance that different implementations can push the same semantic content to the store without creating redundant blobs.
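
The effect described above can be sketched in Go. This is an illustration only: the `digest` helper and the two sample JSON objects are hypothetical and not part of the specification. It assumes a SHA-256 digest over the raw blob bytes and shows that two serializations of the same semantic content are treated as different blobs.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digest is a hypothetical helper that returns a "sha256:<hex>" identifier
// computed over a blob's raw bytes.
func digest(blob []byte) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256(blob))
}

func main() {
	// Two made-up serializations of the same JSON object: they differ only
	// in key order and whitespace.
	a := []byte(`{"architecture":"amd64","os":"linux"}`)
	b := []byte(`{ "os": "linux", "architecture": "amd64" }`)

	fmt.Println(digest(a))
	fmt.Println(digest(b))

	// The bytes differ, so the digests differ, and a content-addressable
	// store would keep two blobs for one semantic object.
	fmt.Println("same blob:", digest(a) == digest(b)) // prints "same blob: false"
}
```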

## JSON

[JSON][] content SHOULD be serialized as [canonical JSON][canonical-json].

Contributor

JSON content (for example, ...)

Contributor

JSON content (for example all of the bla bla media types or bla bla manifests) SHOULD be serialized as

Of the [OCI Image Format Specification media types](media-types.md), all the types ending in `+json` contain JSON content.
Implementations:

* [Go][]: [github.com/docker/go][], which claims to implement [canonical JSON][canonical-json] except for Unicode normalization.

[canonical-json]: http://wiki.laptop.org/go/Canonical_JSON
[github.com/docker/go]: https://github.com/docker/go/
[Go]: https://golang.org/
[JSON]: http://json.org/