Status: Draft informative clarification for consolidated v1

Appendix - Whitespace Boundaries

Canonical topic owner: ../structure-syntax-v1.md

If this appendix conflicts with the canonical v1 spec set, the canonical v1 spec set wins.

1. Purpose

This appendix clarifies the practical reading of the Core v1 newline rules:

  • between structural tokens, a line break behaves the same as ordinary inter-token whitespace;
  • a line break becomes structurally significant only when the grammar consumes it as a separator;
  • a line break is forbidden when it would split a compact token that must remain contiguous.

2. Structural-Boundary Rule

AEON parsers should treat the following two categories differently:

  • inter-token whitespace at structural boundaries;
  • interior characters of a single lexical token.

If a line break appears at a structural boundary, it is accepted on the same basis as an ordinary space.

If a line break appears inside a compact token, the token is broken and the form is invalid unless that token family explicitly permits embedded newlines.
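The distinction can be illustrated with a minimal tokenizer sketch. The token grammar below is a hypothetical toy (identifiers, integers, single-line quoted strings, a few punctuation marks) and is not the AEON Core v1 grammar; the names `TOKEN_RE` and `tokenize` are assumptions made for illustration only. Whitespace between tokens, including line breaks, is skipped uniformly, while a raw line break inside a quoted string leaves no valid token and is rejected.

```python
import re

# Hypothetical toy token grammar for illustration; NOT the AEON Core v1 grammar.
TOKEN_RE = re.compile(
    r"""
    (?P<ws>\s+)                          # inter-token whitespace, line breaks included
  | (?P<string>"[^"\n]*")                # quoted string: contiguous, no raw line break
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)    # bare identifier: contiguous
  | (?P<number>\d+)                      # integer: contiguous
  | (?P<punct>[=<>\[\]{}(),:])           # single-character structural tokens
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return (kind, lexeme) pairs; raise ValueError where no token can start."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"invalid token at offset {pos}")
        if match.lastgroup != "ws":
            # Spaces and line breaks at structural boundaries are dropped alike.
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

In this sketch, `tokenize('name = "x"')` and `tokenize('name\n=\n"x"')` return the same token list, whereas `tokenize('name = "x\ny"')` raises `ValueError` because the line break falls inside the string body. The sketch does not model the case where the grammar consumes a line break as a separator.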

3. Accepted Structural Examples

These examples are valid because the line breaks appear between structural tokens rather than inside a compact token body.

Datatype and generic boundaries:

Separator spec boundaries:

Attribute and node-head boundaries:

Node children:

4. Rejected Compact-Token Splits

These examples are invalid because the line break splits a token that must remain contiguous.

Separator literal payload:

Bare identifier:

The same compact-token rule applies to numbers, bare identifiers, quoted keys, quoted strings, separator specs, and literal families whose lexical body is defined as contiguous in Core v1.

5. Reading Rule

When deciding whether a line break is valid, the first question is not "are these characters on the same line?" but "does this line break fall between tokens, or inside one token?"

If it is between structural tokens, the line break is ordinarily acceptable.

If it is inside one token body, it is ordinarily forbidden unless that token family explicitly defines multiline behavior.
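The reading rule can be expressed as a small decision function. This is a sketch under stated assumptions: tokens are given as half-open character spans, and `line_break_verdict`, `token_spans`, and `multiline_kinds` are hypothetical names, not part of any AEON API. It covers only the "between tokens" versus "inside one token" question and ignores line breaks that the grammar consumes as separators.

```python
def line_break_verdict(token_spans, offset, multiline_kinds=frozenset()):
    """Classify a line break at character index `offset`.

    token_spans: iterable of (kind, start, end) with half-open [start, end).
    multiline_kinds: token kinds assumed to permit embedded line breaks.
    """
    for kind, start, end in token_spans:
        if start <= offset < end:
            # Inside one token body: forbidden unless the family allows it.
            return "accepted" if kind in multiline_kinds else "rejected"
    # Not inside any token body: the break sits at a structural boundary.
    return "accepted"
```

For example, with spans `[("ident", 0, 4), ("punct", 5, 6), ("string", 7, 12)]`, a line break at offset 4 falls between tokens and is accepted, while one at offset 9 falls inside the string span and is rejected unless `"string"` is listed in `multiline_kinds`.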