Status: Draft

Available formats: HTML, Markdown

Canonical AEON Form — Appendix

Appendix to: AEON Specification v1


Overview

Canonical AEON defines a deterministic textual representation suitable for hashing, diffing, signing, and reproducible builds.

Canonical form is optional for parsing but REQUIRED for canonical emitters.


Document Structure

If no header exists, emit the default:

Header fields sorted lexicographically.

Top-Level Order

  1. aeon:header first
  2. All other bindings in lexicographic order

This ordering is not formatter-only advice:

  • a structured header appearing after any body binding is invalid input;
  • canonicalization MUST reject such input rather than reordering it into a valid document.
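
The ordering and rejection rules above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the function name `canonical_top_level_order` is hypothetical, top-level bindings are modelled only by their key names in source order, and "lexicographic" is assumed to mean Python's default code-point string comparison.

```python
HEADER_KEY = "aeon:header"

def canonical_top_level_order(keys: list[str]) -> list[str]:
    """Return top-level keys in canonical order (hypothetical helper).

    `keys` lists the top-level binding keys in the order they appear
    in the input document.
    """
    # A header that appears after any body binding is invalid input:
    # reject it instead of silently moving it to the front.
    if HEADER_KEY in keys and keys.index(HEADER_KEY) != 0:
        raise ValueError("aeon:header must precede all body bindings")

    # Assumption: lexicographic order == code-point order of the key text.
    body = sorted(k for k in keys if k != HEADER_KEY)

    # The header always comes first; when the input has none, the default
    # header is emitted in this slot (its contents are not modelled here).
    return [HEADER_KEY] + body
```

For example, `canonical_top_level_order(["zeta", "alpha"])` yields the header key followed by `alpha` and `zeta`, while `["alpha", "aeon:header"]` raises `ValueError`.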

Keys and Objects

  • Object keys sorted lexicographically
  • Attribute keys sorted lexicographically
  • One key per line in multi-line objects

Canonical layout is structure-sensitive:

  • inline object forms may remain inline when the enclosing canonical layout is inline;
  • when an enclosing object or list is rendered as multiline canonical layout, nested object values also canonically expand to multiline object blocks rather than remaining inline;
  • key sorting still applies at every level regardless of whether the object is rendered inline or multiline.
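
As a rough illustration of "key sorting at every level", the sketch below walks a parsed value recursively. It assumes, purely for illustration, that objects are modelled as Python dicts and lists as Python lists; `sort_keys` is a hypothetical helper, and code-point ordering is again an assumption about what "lexicographic" means.

```python
def sort_keys(value):
    """Recursively sort object keys; list element order is left untouched."""
    if isinstance(value, dict):
        # Rebuild the dict with keys in sorted order (dicts keep insertion order).
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Lists are never reordered, but objects nested inside them are sorted.
        return [sort_keys(item) for item in value]
    return value
```

Whether the result is later rendered inline or multiline does not affect this step; layout is a separate concern.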

Lists

  • Element order preserved (never reordered)
  • Single-line for simple scalars, multi-line for complex values

When a list is rendered multiline, complex object elements canonically expand to multiline object blocks.


Scalars

Numbers

  • Canonicalization of :n values is value-normalizing while preserving the broad representation family chosen by the author:
      • integer family
      • decimal family
      • exponent family
  • Remove all _ separators
  • No leading + sign
  • Leading-dot decimals gain an explicit zero (.5 → 0.5)
  • Decimal-family values trim redundant trailing fractional zeroes, but retain at least one fractional digit (10.00 → 10.0)
  • Exponent-family values use lowercase e
  • Exponent-family values remove redundant exponent sign and leading exponent zeroes (1.0E+03 → 1e3)
  • Zero follows the same family model rather than a special ad hoc rule:
      • integer zero → 0 or -0
      • decimal zero → 0.0 or -0.0
      • exponent zero → 0e0 or -0e0
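
The number rules above can be approximated with a small Python sketch. The function name `canonical_number` is hypothetical, the input is assumed to be an already-validated :n literal, and two behaviours are inferred rather than stated: the exponent-family mantissa drops a fractional part that is all zeroes (suggested by the 1e3 and 0e0 examples), and leading zeroes in the integer part are left alone because the rules do not mention them.

```python
def canonical_number(text: str) -> str:
    """Canonicalize a :n literal while keeping its representation family."""
    s = text.replace("_", "").lower()        # drop separators, lowercase the e
    sign = "-" if s.startswith("-") else ""  # a leading + is never emitted
    s = s.lstrip("+-")

    mantissa, has_exp, exponent = s.partition("e")
    whole, has_dot, frac = mantissa.partition(".")
    whole = whole or "0"                     # leading-dot decimals gain a zero

    if has_exp:
        # Exponent family. Assumption: an all-zero fraction is dropped.
        frac = frac.rstrip("0")
        mantissa = whole + ("." + frac if frac else "")
        exp_sign = "-" if exponent.startswith("-") else ""
        exp_digits = exponent.lstrip("+-").lstrip("0") or "0"
        return f"{sign}{mantissa}e{exp_sign}{exp_digits}"

    if has_dot:
        # Decimal family: trim trailing zeroes but keep one fractional digit.
        frac = frac.rstrip("0") or "0"
        return f"{sign}{whole}.{frac}"

    # Integer family.
    return f"{sign}{whole}"
```

Applied to the examples in the list, `.5` becomes `0.5`, `10.00` becomes `10.0`, and `1.0E+03` becomes `1e3`; a negative zero keeps its sign within its family.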

Radix

  • Canonicalization of :radix[...] values remains representation-preserving rather than value-normalizing
  • _ separators are removed from the canonical payload
  • the remaining digit sequence, decimal point placement, and leading zero width are otherwise preserved
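
In contrast to :n values, a radix payload only loses its separators. A minimal Python sketch, assuming the payload is handed over as a plain string and using the hypothetical name `canonical_radix_payload`:

```python
def canonical_radix_payload(payload: str) -> str:
    """Strip _ separators; digits, point placement and zero width stay as written."""
    return payload.replace("_", "")
```

Note that no digits are trimmed or padded: a payload such as `00_1f.80` would come out as `001f.80`.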

Booleans

Always true or false (lowercase).

Toggle

Preserve original literal (yes, on, etc.).

Base Literals

  • Hex: lowercase (#ff00aa)
  • Remove _ separators
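
The two hex rules combine into a one-line transformation. The sketch below assumes the literal arrives as text with its leading `#`; `canonical_hex` is a hypothetical name.

```python
def canonical_hex(literal: str) -> str:
    """Lowercase a hex literal and remove its _ separators."""
    return literal.replace("_", "").lower()
```

So an input written as `#FF_00_AA` would be emitted as `#ff00aa`.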

Strings

  • Always double quotes (")
  • Minimal escaping
  • Raw line breaks → \n
  • Non-ASCII preserved as UTF-8
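
A possible shape for the quoted-string emitter is sketched below in Python. The exact "minimal" escape set is not enumerated here, so the sketch assumes it covers only the backslash, the double quote, and the line feed; `canonical_string` is a hypothetical name.

```python
def canonical_string(value: str) -> str:
    """Emit a double-quoted string with a minimal (assumed) escape set."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")   # backslash must be escaped to stay unambiguous
        elif ch == '"':
            out.append('\\"')    # the delimiter itself
        elif ch == "\n":
            out.append("\\n")    # raw line breaks become \n
        else:
            out.append(ch)       # everything else, including non-ASCII, as-is
    return '"' + "".join(out) + '"'
```

Non-ASCII characters pass through unchanged; encoding the final text as UTF-8 is left to the output layer.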

Multiline semantic strings canonically emit as spaces-only trimticks:

  • canonical output never uses tabs in the trimtick gutter
  • canonical equality is determined by the resulting trimmed string value
  • single-line strings continue to emit as ordinary quoted strings
  • trimticks may collapse to ordinary quoted strings in inline canonical contexts
  • one-line normalized trimticks in inline containers emit as ordinary quoted strings
  • multiline trimticks rendered inside inline object or attribute forms emit as escaped quoted strings rather than multiline trimtick blocks

Separator Literals

Canonical separator literals:

  • No whitespace between = and ^
  • No raw whitespace outside quoted segments
  • Raw segments are emitted verbatim
  • Quoted segments use canonical quoted-string escaping

Quoted segments preserve their string content:


References

References MUST preserve semantic intent:

  • clone references remain clone references (~...)
  • pointer references remain pointer references (~>...)

Canonical form MUST NOT:

  • Change a clone-intent reference into a pointer-intent reference or vice versa
  • Inline or resolve references
  • Alter the logical value graph

Canonical reference rendering also applies these normalizations:

  • explicit root prefixes are elided when redundant (~$.a → ~a, ~>$.a → ~>a)
  • quoted member or attribute selectors may collapse to bare identifier form when the decoded segment is already a canonical bare identifier (~a@["meta"] → ~a@meta)
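
The root-prefix rule can be illustrated with a short Python sketch. It assumes that a `$.` immediately after the reference sigil is always redundant, which may be narrower or broader than the full rule; the function name `elide_root_prefix` is hypothetical, and the quoted-selector collapse is not covered because it depends on the bare-identifier grammar.

```python
def elide_root_prefix(ref: str) -> str:
    """Drop a redundant '$.' root prefix while keeping the reference kind."""
    # Check the pointer sigil first, since '~>' also starts with '~'.
    for sigil in ("~>", "~"):
        if ref.startswith(sigil):
            path = ref[len(sigil):]
            if path.startswith("$."):
                path = path[2:]
            # The sigil is re-attached unchanged: clone stays clone,
            # pointer stays pointer.
            return sigil + path
    raise ValueError(f"not a reference: {ref!r}")
```

Because the sigil is never rewritten, the normalization cannot change clone intent into pointer intent or the reverse.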

Node Heads

Canonical node-head ordering is:

  • tag@{...}:datatype

Canonical form preserves this order for node heads in the same way it preserves key@{...}:type ordering for bindings.


Whitespace

  • 2-space indentation (no tabs)
  • One space around =
  • LF line endings (\n)
  • Opening brace on same line as binding
  • Closing brace on own line
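
Two of these rules are easy to check mechanically on emitted text. The Python sketch below is a partial lint only: it flags carriage returns and tabs in leading whitespace, and does not verify indentation depth, spacing around =, or brace placement. The name `whitespace_violations` is hypothetical.

```python
def whitespace_violations(text: str) -> list[str]:
    """Report CR characters and tab indentation in canonical output."""
    problems = []
    if "\r" in text:
        problems.append("non-LF line ending")
    for number, line in enumerate(text.split("\n"), start=1):
        # Leading whitespace only; tabs elsewhere in a line are not inspected.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append(f"line {number}: tab in indentation")
    return problems
```

An empty result means no violation of these two rules was found, not that the text is fully canonical.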

Non-Goals

Canonical form does NOT:

  • Alter the logical value graph
  • Inline references
  • Change types
  • Add or remove bindings (except default header)

End of Canonical Form Appendix