Specification Reference
ODIN-L 1.0, Open Data Interchange Notation Language
Overview
ODIN is a data interchange format designed for the AI era, combining the token efficiency of CSV, the nesting capability of JSON, the type safety of Protocol Buffers, and the human readability of TOML, all in a single notation that machines and humans can parse, generate, and reason about without ambiguity.
- Self-describing typed values: Prefixes (#, ?, @, ^, ~, #$, ##) eliminate parsing ambiguity
- Inline modifiers: ! required, * confidential, - deprecated; metadata travels with values
- Tabular mode: Eliminates key repetition for lists, cutting payload size 20-50% vs JSON
- Strict syntax: One valid parse, no barewords, no leniency, no edge cases
- Mixed modes: Flat assignments, nested paths, and typed tables in one document
- Token-efficient: 30% fewer tokens and 40% smaller than JSON on average, while carrying more semantic information per byte. Less data, more meaning.
ODIN is not a standard. Standards imply committees, politics, and "implementation flexibility" that defeats interoperability. ODIN is a notation, one way to write it, one way to read it.
Design Principles
- Self-describing: Types are in the data, not just schema
- Line-based: One assignment per line, no wrapping
- Flat: Explicit paths, no implicit tree traversal
- Diffable: Meaningful diffs in version control
- Token-efficient: Minimal structural overhead
- Deterministic: Same data produces same output
- Composable: Documents can chain and reference each other
Why Symbol Density?
ODIN uses prefix symbols (#, ##, #$, ?, @, ^, ~, !, *, -) rather than keywords.
For machines: Symbols are unambiguous single-token markers. A parser sees # and knows "number follows" without lookahead. No reserved words to conflict with data values.
For AI/LLM processing: JSON's structural overhead (braces, brackets, colons, quotes around every key) is token waste. Compare:
{"vehicle": {"year": 2022, "make": "Honda", "model": "Accord"}}

vehicle.year = #2022
vehicle.make = "Honda"
vehicle.model = "Accord"
The ODIN version spends more of its tokens on actual data and fewer on syntax. When processing millions of documents through LLMs, this matters.
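The JSON-to-ODIN comparison above can be reproduced with a short Python sketch. This is only an illustration: the function names odin_value and to_flat_paths are hypothetical, every Python number is rendered with the general # prefix (as in the example above), and only backslash, quote, and newline escapes are handled.

```python
def odin_value(value):
    """Render a Python scalar as an ODIN value (illustrative mapping only)."""
    if value is None:
        return "~"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "?true" if value else "?false"
    if isinstance(value, (int, float)):
        return f"#{value}"
    if isinstance(value, str):
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"'
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def to_flat_paths(data, prefix=""):
    """Yield one 'path = value' line per leaf of a nested dict/list."""
    if isinstance(data, dict):
        for key, child in data.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from to_flat_paths(child, path)
    elif isinstance(data, list):
        for index, child in enumerate(data):
            yield from to_flat_paths(child, f"{prefix}[{index}]")
    else:
        yield f"{prefix} = {odin_value(data)}"


doc = {"vehicle": {"year": 2022, "make": "Honda", "model": "Accord"}}
lines = list(to_flat_paths(doc))
print("\n".join(lines))
```

Each leaf becomes its own line, which is what keeps two independent edits on separate lines of a diff.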
Why Flat Paths?
Nested formats like JSON, YAML, and XML have a fundamental problem: merge conflicts. When two branches modify different fields in the same object, the entire object block conflicts because the closing braces don't match up.
Tab-based and whitespace-based nesting (YAML, Python-style) adds another problem: depth is ambiguous. A misplaced tab shifts an entire subtree. Whether a field is nested two or three levels deep is invisible without counting whitespace characters. Flat paths make depth explicit and unambiguous.
ODIN's flat paths eliminate both problems:
; Branch A adds:
policy.coverage.collision = #500
; Branch B adds:
policy.coverage.comprehensive = #250
; Git merges cleanly, different lines, no conflict
Headers provide visual grouping for humans while maintaining the flat, diffable structure underneath:
{policy.coverage}
collision = #500
comprehensive = #250
liability = #100000
Why Self-Describing Types?
This is the single biggest reason ODIN exists. JSON gives you six types: string, number, boolean, null, array, object. XML gives you essentially nothing. Everything is a string until an XSD says otherwise. Both rely on out-of-band documentation to tell you what the data actually means.
ODIN encodes type information directly on the wire. Every value tells you what it is. No schema lookup, no data dictionary, no guessing.
JSON: Documentation-Dependent
{
"policyNumber": "POL-2024-001",
"premium": 1250.00,
"discount": 0.125,
"deductible": 500,
"effective": "2024-06-15",
"expires": "2025-06-15",
"duration": "P1Y",
"ssn": "123-45-6789",
"drivers": 12,
"active": true,
"lastClaim": null
}
Now answer these questions without a data dictionary:
- premium: dollars? cents? what currency? USD or CAD or EUR?
- discount: 0.125. Is that 12.5% or 0.125%? A ratio? A factor?
- deductible: integer dollars or float dollars?
- effective / expires: dates? Or just strings that happen to look like dates?
- duration: ISO 8601 duration string? Or some other format?
- ssn: just a string. Nothing flags it as PII. Nothing prevents it from leaking to logs.
- drivers: integer 12 or float 12.0? Different parsers will give you different answers.
- lastClaim: explicitly null, or just missing? The schema can't tell you.
Every field needs documentation. Every parser needs a schema validator to round-trip the types back. Every integration project starts with a 50-page data dictionary that goes stale the moment someone adds a field.
ODIN: The Data Tells You
{policy}
number = "POL-2024-001"
premium = #$1250.00:USD
discount = #%12.5
deductible = ##500
effective = 2024-06-15
expires = 2025-06-15
duration = P1Y
ssn = *"123-45-6789"
drivers = ##12
active = ?true
lastClaim = ~
Now read it again:
- premium: #$1250.00:USD. Currency, US dollars, period. No ambiguity.
- discount: #%12.5. Percent. 12.5%, not 0.125 of a ratio.
- deductible: ##500. Integer. Will never be parsed as 500.0.
- effective: 2024-06-15. An actual date, not a string. ISO 8601 syntax baked into ODIN.
- duration: P1Y. An actual duration value. Parsers return a typed Duration object.
- ssn: *"...". The asterisk prefix marks it as PII. SDKs can auto-redact in logs.
- drivers: ##12. Integer, every SDK in every language returns an integer.
- active: ?true. Explicitly boolean, not a string that happens to be "true".
- lastClaim: ~. Explicitly null. Distinct from missing.
Why It Matters
When a Rust parser, a TypeScript parser, a Python parser, and a Java parser all read #$1250.00:USD, they all return a Currency object with amount=1250.00 and code="USD". The type isn't determined by the parser, the schema, the documentation, or the application code. It's determined by the bytes on the wire.
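As a rough illustration of what "a Currency object with amount and code" might look like in one of those languages, here is a minimal Python sketch. The Currency class and parse_currency function are hypothetical names, not the API of any ODIN SDK. The pattern follows the currency rule in the EBNF grammar (optional three-letter code, normalized to uppercase), and Decimal is used so the written precision survives.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# "#$" prefix, optional "-", digits, optional fraction, optional ":CODE".
CURRENCY_RE = re.compile(r"#\$(-?\d+(?:\.\d+)?)(?::([A-Za-z]{3}))?")


@dataclass(frozen=True)
class Currency:
    amount: Decimal          # Decimal keeps "1250.00" as written, unlike float
    code: Optional[str]      # None when the value carries no currency code


def parse_currency(text: str) -> Currency:
    match = CURRENCY_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an ODIN currency value: {text!r}")
    amount, code = match.groups()
    return Currency(Decimal(amount), code.upper() if code else None)


price = parse_currency("#$1250.00:USD")
print(price.amount, price.code)
```

Because the type and denomination are read from the value itself, the same function needs no schema or field name to decide what it is looking at.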
This eliminates an entire class of bugs:
- No more "wait, is the API returning cents or dollars this week?"
- No more JavaScript silently truncating large integer IDs to floats
- No more "2024-13-45" sneaking through as a "valid" string
- No more PII leaking into logs because nothing flagged it as sensitive
- No more "the schema says number, but the data has strings" mismatches
For insurance, healthcare, finance, and legal, industries where ambiguity has real cost, this is foundational. The data documents itself, and every system that reads it gets the same answer.
Value Types
Core Types
| Type | Prefix | Format | Examples |
|---|---|---|---|
| String | (none) | Quoted | "Honda", "Price = $50" |
| Boolean | ? | Prefix required | ?true, ?false |
| Null | ~ | Literal | ~ |
| Reference | @ | Path | @parties[0], @vehicles[0].garaging |
| Binary | ^ | Base64 | ^SGVsbG8=, ^sha256:ABC123... |
| Verb | % | Expression | %upper @name, %concat @first " " @last |
Numeric Types
| Type | Prefix | Description | Examples |
|---|---|---|---|
| Number | # | Any numeric (integer or decimal) | #2022, #-45.50, #1.2e10 |
| Integer | ## | Whole number only | ##42, ##-100, ##0 |
| Currency | #$ | Monetary value with precision preserved | #$100.00, #$-50.25, #$99.99:USD, #$1.00000000:BTC |
| Percent | #% | Percentage as decimal (0-1 range) | #%0.15, #%1.0, #%0.055 |
Numeric Rules
- # accepts any numeric precision; use for general numbers
- ## explicitly marks value as integer; decimal values are invalid
- #% marks value as a percentage (stored as 0-1 decimal, where 0.5 = 50%)
- Exponential notation (1.2e10) valid with # prefix
- Negative values: place - after prefix (#-45, ##-100, #$-50.00)
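The numeric rules above can be sketched as a small Python classifier. The function name parse_numeric and the (kind, value) return shape are assumptions made for illustration. Only the #, ##, and #% forms are covered here, and the patterns mirror the number, integer, and percent rules of the EBNF grammar.

```python
import re
from decimal import Decimal

# fullmatch keeps the three forms mutually exclusive: after a single "#"
# the number pattern needs a digit or "-", so "##42" can never match it.
NUMERIC_FORMS = [
    ("integer", re.compile(r"##(-?\d+)")),
    ("percent", re.compile(r"#%(-?\d+(?:\.\d+)?)")),
    ("number", re.compile(r"#(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")),
]


def parse_numeric(text):
    """Return (kind, value) for an ODIN numeric value, or raise ValueError."""
    for kind, pattern in NUMERIC_FORMS:
        match = pattern.fullmatch(text)
        if match:
            digits = match.group(1)
            # Integers become int; everything else keeps its precision.
            return kind, int(digits) if kind == "integer" else Decimal(digits)
    raise ValueError(f"not a valid ODIN numeric value: {text!r}")


for sample in ["#2022", "#-45.50", "#1.2e10", "##-100", "#%0.15"]:
    print(sample, "->", parse_numeric(sample))
```

A value such as ##1.5 matches none of the patterns and is rejected, which is the "decimal values are invalid" rule for the integer prefix.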
Currency Codes
#$ marks a value as currency. Decimal precision is preserved from the source (minimum 2, up to 18 digits), making it suitable for both traditional finance and crypto.
An optional currency code suffix identifies the denomination. Codes follow ISO 4217 for fiat currencies (USD, EUR, GBP, JPY) and common ticker symbols for digital assets (BTC, ETH, USDC):
price = #$99.99:USD ; US Dollars
cost = #$1234.56:EUR ; Euros
amount = #$50.00:GBP ; British Pounds
crypto = #$1.00000000:BTC ; 8-digit precision for Bitcoin
local = #$100.00 ; No code = local currency assumed
Temporal Types
| Type | Prefix | Format | Examples |
|---|---|---|---|
| Date | (none) | YYYY-MM-DD | 2024-06-15 |
| Timestamp | (none) | ISO 8601 | 2025-12-06T14:30:00Z |
| Time | T | ISO 8601 time | T14:30:00, T09:00:00.500 |
| Duration | P | ISO 8601 duration | P6M, P1Y, PT30M, P1DT12H |
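A minimal Python sketch of how the date and duration forms in this table might be read. The names parse_date and parse_duration and the dict-of-units result are assumptions for illustration; the duration pattern follows the duration rule in the EBNF grammar, and date range checking is delegated to Python's datetime.date.

```python
import re
from datetime import date

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
DURATION_RE = re.compile(
    r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?"
)
UNITS = ("years", "months", "weeks", "days", "hours", "minutes", "seconds")


def parse_date(text):
    """Parse YYYY-MM-DD; impossible dates such as 2024-02-30 raise ValueError."""
    if not DATE_RE.fullmatch(text):
        raise ValueError(f"not a date: {text!r}")
    year, month, day = map(int, text.split("-"))
    return date(year, month, day)


def parse_duration(text):
    """Parse an ISO 8601 style duration into a dict of the units present."""
    match = DURATION_RE.fullmatch(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"not a duration: {text!r}")  # bare "P" or "PT"
    return {unit: int(v) for unit, v in zip(UNITS, match.groups()) if v is not None}


print(parse_date("2024-06-15"))
print(parse_duration("P1DT12H"))
```

Note that M means months before the T separator and minutes after it, which is why the position of each group in the pattern matters.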
Sections & Paths
Headers
Headers set a path prefix for subsequent assignments, providing visual grouping while maintaining flat, diffable structure.
| Syntax | Meaning | Example |
|---|---|---|
| {path} | Set absolute prefix | {vehicles[0]} |
| {.path} | Set relative prefix | {.garaging} |
| {} | Reset to root | {} |
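The three header forms in this table can be sketched in Python as follows. The function name resolve_headers is hypothetical. The sketch assumes the rule stated in the grammar's semantic constraints, that a relative header resolves against the most recent absolute header, and it ignores tabular headers and trailing comments on assignment lines.

```python
def resolve_headers(lines):
    """Expand header-prefixed assignments into (full_path, raw_value) pairs."""
    absolute = ""  # most recent absolute header
    prefix = ""    # prefix applied to the assignments that follow
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment line
        if line.startswith("{"):
            body = line[1:line.index("}")].strip()
            if body.startswith("."):
                # Relative header: appended to the last ABSOLUTE header.
                prefix = absolute + body if absolute else body[1:]
            else:
                # Absolute header; "{}" gives "" and resets to root.
                absolute = prefix = body
            continue
        path, _, value = line.partition("=")
        path = path.strip()
        yield (f"{prefix}.{path}" if prefix else path), value.strip()


source = [
    "{vehicles[0]}",
    'vin = "1HGCM82633A004352"',
    "{.garaging} ; relative to vehicles[0]",
    'city = "Columbus"',
    "{.lienholder}",
    'name = "First National Bank"',
    "{}",
    'note = "root"',
]
resolved = list(resolve_headers(source))
for full_path, value in resolved:
    print(full_path, "=", value)
```

The second relative header lands on vehicles[0].lienholder rather than vehicles[0].garaging.lienholder because relative headers do not stack.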
{vehicles[0]}
vin = "1HGCM82633A004352"
year = #2022
{.garaging} ; resolves to vehicles[0].garaging
line1 = "123 Main Street"
city = "Columbus"
{.lienholder} ; resolves to vehicles[0].lienholder
name = "First National Bank"
{drivers[0]} ; absolute, resets context
name.first = "John"
Path Reference
| Element | Syntax | Example |
|---|---|---|
| Simple path | segment.segment | policy.effective |
| Array access | segment[n] | vehicles[0] |
| Nested array | segment[n].segment[n] | drivers[0].violations[2] |
| Extension | &domain.path | &com.acme.custom_field |
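A short Python sketch of how the simple and array path forms in this table could be split into segments. The function name split_path is an assumption; the identifier pattern follows the grammar (ASCII letters, digits, underscores, hyphens, starting with a letter or underscore). Extension paths beginning with & and the $ metadata root are deliberately left out.

```python
import re

# Either an identifier (at the start, or after a dot) or a bracketed index.
TOKEN_RE = re.compile(r"(?:^|\.)([A-Za-z_][A-Za-z0-9_-]*)|\[(\d+)\]")


def split_path(path):
    """Split 'drivers[0].violations[2]' into ['drivers', 0, 'violations', 2]."""
    parts, pos = [], 0
    for match in TOKEN_RE.finditer(path):
        if match.start() != pos:
            break  # something unparseable sits between two tokens
        name, index = match.groups()
        parts.append(name if name is not None else int(index))
        pos = match.end()
    if pos != len(path) or not parts or not isinstance(parts[0], str):
        raise ValueError(f"invalid path: {path!r}")
    return parts


print(split_path("policy.effective"))
print(split_path("drivers[0].violations[2]"))
```

Names come back as strings and indices as integers, so a consumer can tell an object key from an array position without re-parsing.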
Extension paths use reverse domain notation to namespace custom fields:
&com.acme.priority = ##3
&org.opendata.region = "NA"
&io.example.custom = "value"
Arrays & Tables
Arrays are ordered collections accessed by zero-based index. They are created implicitly by assigning to indexed paths.
items[0].name = "First"
items[0].price = #10.00
items[1].name = "Second"
items[1].price = #20.00
Array Rules
- Zero-based: First element is [0]
- Contiguous: Indices must be sequential with no gaps
- Object elements: Array elements are objects with fields, not primitive values
- Implicit creation: Assigning to field[0] creates the array
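The zero-based and contiguous rules can be checked with a few lines of Python. This is a sketch under assumptions: check_contiguous is a hypothetical name, it receives the already-flattened assignment paths of one document, and it reports only the first missing index per array.

```python
import re
from collections import defaultdict

INDEX_RE = re.compile(r"\[(\d+)\]")


def check_contiguous(paths):
    """Return {array_path: length}; raise ValueError on a gap or missing [0]."""
    seen = defaultdict(set)
    for path in paths:
        for match in INDEX_RE.finditer(path):
            # Everything before the bracket names the array,
            # e.g. "items" or "drivers[0].violations".
            seen[path[:match.start()]].add(int(match.group(1)))
    for array_path, indices in seen.items():
        expected = set(range(len(indices)))
        if indices != expected:
            missing = min(expected - indices)
            raise ValueError(f"{array_path}: missing index [{missing}]")
    return {array_path: len(indices) for array_path, indices in seen.items()}


sizes = check_contiguous([
    "items[0].name",
    "items[0].price",
    "items[1].name",
    "items[1].price",
])
print(sizes)
```

An input such as items[0].name followed by items[2].name is rejected because index 1 was never assigned.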
items[] = ~ ; explicit empty/null array (no elements)
Tabular Mode
For arrays of flat objects (primitives only), tabular syntax provides a compact representation that eliminates key repetition:
{line_items[] : sku, description, qty, price}
"ABC-001", "Widget", ##10, #$5.99
"ABC-002", "Gadget", ##5, #$12.50
"XYZ-100", "Cable, 6ft", ##20, #$3.25
This is equivalent to the expanded form:
line_items[0].sku = "ABC-001"
line_items[0].description = "Widget"
line_items[0].qty = ##10
line_items[0].price = #$5.99
line_items[1].sku = "ABC-002"
line_items[1].description = "Gadget"
line_items[1].qty = ##5
line_items[1].price = #$12.50
; ...
Tabular Rules
- Header declares columns: {path[] : col1, col2, col3} defines array path and column names
- Rows follow header: Each non-blank, non-comment line is a row
- Comma-separated values: Values separated by , with optional whitespace
- Standard type prefixes: Values use #, ##, #$, @, ^, ~, true, false, dates, etc.
- Quote strings with commas: Use "value, with comma" for strings containing commas
- Null cells: Use ~ for null values
- Absent cells: Empty cell (nothing between commas) means field is absent
- Exit tabular mode: Next header {...} or document separator --- ends tabular section
Cell Value Semantics
| Syntax | Meaning | Path Created? |
|---|---|---|
| ~ | Null value | Yes (with null) |
| "" | Empty string | Yes (with "") |
| (empty) | Absent/missing | No |
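The tabular rules and the cell semantics above can be put together in a small Python sketch that expands a table into flat assignments. The names split_cells and expand_table are hypothetical. Values are kept as raw text, and the sketch assumes plain column names and rows without trailing comments.

```python
def split_cells(row):
    """Split a row on commas that sit outside double-quoted strings."""
    cells, current = [], []
    in_string = escaped = False
    for ch in row:
        if in_string:
            current.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == ",":
            cells.append("".join(current).strip())
            current = []
        else:
            in_string = ch == '"'
            current.append(ch)
    cells.append("".join(current).strip())
    return cells


def expand_table(header, rows):
    """Turn '{path[] : a, b}' plus rows into (path, raw_value) assignments."""
    path, _, columns = header.strip()[1:-1].partition(":")
    array = path.strip()[:-2]  # drop the trailing "[]"
    names = [name.strip() for name in columns.split(",")]
    out = []
    for index, row in enumerate(rows):
        for name, cell in zip(names, split_cells(row)):
            if cell == "":
                continue  # absent cell: no path is created
            out.append((f"{array}[{index}].{name}", cell))
    return out


table = expand_table(
    "{line_items[] : sku, description, qty, price}",
    ['"ABC-001", "Widget", ##10, #$5.99', '"XYZ-100", "Cable, 6ft", , ~'],
)
for path, value in table:
    print(path, "=", value)
```

In the second row the empty qty cell produces no assignment at all, while the ~ in the price column produces an assignment whose value is null.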
Relative column names reduce repetition when multiple columns share a common parent path:
; These two headers are equivalent:
{holders[] : name, address.line1, address.city, address.state, address.postal, active}
{holders[] : name, address.line1, .city, .state, .postal, active}
Primitive Arrays
For arrays of primitive values (no object fields), use the special ~ column marker with {path[] : ~} syntax.
Each row is one value. Types can be mixed within the same array:
; Integer array
{txIndexes[] : ~}
##8208220048659020
##2830423323628866
##3871696279527106
; String array
{tags[] : ~}
"urgent"
"important"
"reviewed"
; Mixed-type array
{values[] : ~}
"text"
##42
?true
~
#$9.99
Modifiers
| Modifier | Symbol | Position | Meaning |
|---|---|---|---|
| Critical | ! | After = | Field is required; absence fails validation |
| Confidential | * | After ! or = | Field contains sensitive data; systems should protect accordingly |
| Deprecated | - | After = | Field is obsolete; may be removed in future |
Modifier Order: = [!][-][*][type_prefix]value
field = !"value" ; critical string
field = *"value" ; confidential string
field = -"value" ; deprecated string
field = !*"value" ; critical + confidential string
field = !-"value" ; critical + deprecated (still required but obsolete)
field = -*"value" ; deprecated + confidential
field = !-*"value" ; critical + deprecated + confidential
field = !#100 ; critical number
field = !##42 ; critical integer
field = !#$99.99 ; critical currency
Modifier Semantics
- Critical (!): Field must be present and non-null; validation fails if absent
- Confidential (*): Signals that the field contains sensitive data (SSN, account numbers, etc.). This is a hint to consuming systems: the value itself is transmitted as-is, but systems should mask it in logs, displays, and non-secure outputs.
- Deprecated (-): Field exists for backward compatibility; consumers should migrate to alternatives
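A minimal Python sketch of how a consumer might separate modifiers from the value and honor the confidential hint when logging. The names split_modifiers and for_log, and the "***" mask, are assumptions for illustration. Following the grammar, modifiers are accepted in any order but each at most once.

```python
MODIFIERS = {"!": "critical", "*": "confidential", "-": "deprecated"}


def split_modifiers(raw):
    """Separate leading modifier symbols from the typed value that follows."""
    flags = set()
    pos = 0
    while pos < len(raw) and raw[pos] in MODIFIERS:
        name = MODIFIERS[raw[pos]]
        if name in flags:
            raise ValueError(f"modifier {raw[pos]!r} appears more than once")
        flags.add(name)
        pos += 1
    return flags, raw[pos:]


def for_log(raw):
    """Return a log-safe rendering: confidential values are masked."""
    flags, value = split_modifiers(raw)
    return "***" if "confidential" in flags else value


print(split_modifiers('!*"value"'))
print(for_log('*"123-45-6789"'))
print(for_log("!#100"))
```

A negative sign is not mistaken for the deprecated modifier because it follows the type prefix (#-45), while modifiers come before it.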
Document Chaining
ODIN supports composing multiple documents into a single stream or referencing external documents. This enables layered data models where a base document is modified by subsequent documents.
The --- separator divides multiple ODIN documents within a single stream:
{$}
odin = "1.0.0"
id = "policy_base_001"
role = "base"
{policy}
number = "PAP-2024-001"
effective = 2024-06-15
term = P6M
{vehicles[0]}
vin = "1HGCM82633A004352"
year = #2022
make = "Honda"
model = "Accord"
---
{$}
odin = "1.0.0"
id = "endorsement_001"
role = "endorsement"
parent = @policy_base_001
effective = 2024-09-01
{vehicles[1]}
vin = "5YJSA1E26MF123456"
year = #2023
make = "Tesla"
model = "Model 3"
Document Roles
Roles are free-form strings, any value meaningful to your domain. Common patterns include:
| Role | Purpose | Example Domain |
|---|---|---|
| base | Original document or transaction | Any |
| amendment | Modification to a base document | Contracts, legal |
| revision | Updated version of a prior document | Publishing, regulatory |
| header | Shared entity data that persists across transactions | Insurance, finance |
| endorsement | Mid-term modification | Insurance |
| renewal | New term based on prior agreement | Insurance, leasing |
| cancellation | Terminates an active record | Insurance, subscriptions |
| correction | Fixes errors in a prior document | Finance, healthcare |
| supplement | Adds information to a prior document | Legal, medical records |
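A chained stream like the base-plus-endorsement example in this section could be split and inspected with a sketch along these lines. The function names split_documents and metadata are hypothetical, values are kept as raw text, and a --- line inside a multiline string is not handled.

```python
def split_documents(stream):
    """Split an ODIN stream into per-document line lists at '---' lines."""
    documents, current = [], []
    for line in stream.splitlines():
        if line.strip() == "---":
            documents.append(current)
            current = []
        else:
            current.append(line)
    documents.append(current)
    return documents


def metadata(lines):
    """Collect raw 'key = value' pairs found under the {$} header."""
    meta, in_meta = {}, False
    for line in lines:
        text = line.strip()
        if text.startswith("{"):
            in_meta = text.startswith("{$}")  # any other header ends the block
            continue
        if in_meta and "=" in text:
            key, _, value = text.partition("=")
            meta[key.strip()] = value.strip()
    return meta


stream = """{$}
id = "policy_base_001"
role = "base"
{policy}
number = "PAP-2024-001"
---
{$}
id = "endorsement_001"
role = "endorsement"
parent = @policy_base_001
"""
docs = split_documents(stream)
for doc in docs:
    print(metadata(doc))
```

The role and parent entries are what a consumer would use to decide how the second document applies to the first.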
Encoding & File Metadata
| Property | Value |
|---|---|
| Character set | UTF-8 |
| Line endings | LF (U+000A) preferred, CRLF accepted |
| Line length | Unlimited (no wrapping) |
| Case sensitivity | Case-sensitive throughout |
| Reserved words | None |
| File extension | .odin |
| MIME type | text/odin-l |
File Naming Conventions
| Pattern | Purpose | Example |
|---|---|---|
| *.odin | General ODIN data documents | policy.odin |
| *.schema.odin | ODIN Schema definitions | auto.schema.odin |
| *.transform.odin | ODIN Transform definitions | xml-to-odin.transform.odin |
Document Metadata
The $ path is reserved for document-level metadata:
{$}
odin = "1.0.0"
id = "doc_abc123"
created = 2025-12-06T14:30:00Z
source.format = "al3"
source.version = "2024.1"
hash = ^sha256:e3b0c44298fc1c149afbf4c8996fb924...
signature = ^ed25519:SGVsbG8gV29ybGQhIFRoaXM...
EBNF Grammar
The complete formal grammar for ODIN-L 1.0, written in ISO 14977 EBNF. Every literal terminal is quoted; only { } and [ ] are used for repetition and optional groups. This grammar is the canonical source: it reflects the exact behavior of the Odin parser.
(* ===========================================================================
ODIN-L 1.0 — Core Notation Grammar
---------------------------------------------------------------------------
Canonical EBNF for the Open Data Interchange Notation Language.
Notation: ISO 14977 EBNF.
{ x } means zero or more repetitions of x
[ x ] means x is optional (zero or one)
( x ) groups
a | b alternation
"lit" terminal literal
a , b concatenation (comma optional in this file for readability)
Every terminal is enclosed in double quotes. There are no bare repetition
operators (no `*`, no `+`). The only metacharacters are `{ } [ ] ( ) | "`.
This grammar reflects the exact behavior of the Odin parser. Deviations
from this grammar are parser bugs, not language extensions.
=========================================================================== *)
(* --------------------------------------------------------------------------
1. DOCUMENT STRUCTURE
-------------------------------------------------------------------------- *)
document = [ bom ] { document_element } ;
bom = ? UTF-8 BOM, U+FEFF, stripped if present at offset 0 ? ;
document_element = blank_line
| comment_line
| header
| assignment
| directive
| document_separator ;
blank_line = newline ;
document_separator = "---" newline ;
newline = "\n" | "\r\n" | "\r" ;
(* --------------------------------------------------------------------------
2. COMMENTS
-------------------------------------------------------------------------- *)
(* Comments begin with ";" and consume the rest of the line. They may appear
on their own line or trailing an assignment, header, or tabular row. They
may NOT appear inside quoted strings, header braces, or array indices. *)
comment_line = comment newline ;
comment = ";" { char_except_newline } ;
(* --------------------------------------------------------------------------
3. HEADERS
-------------------------------------------------------------------------- *)
header = "{" header_body "}" [ trailing_comment ] newline ;
header_body = "" (* {} resets context *)
| [ "." ] header_path [ tabular_clause ] ;
header_path = meta_path | regular_path ;
meta_path = "$" [ "." path_segment { ( "." | array_index ) path_segment } ] ;
regular_path = path_segment { ( "." | array_index ) path_segment } ;
path_segment = identifier | "&" identifier { "." identifier } ;
(* Tabular clause turns a header into an array-of-records (or primitive
array) declaration. Subsequent lines until the next header are data rows. *)
tabular_clause = ":" [ primitive_marker ] [ column_list ] ;
primitive_marker = "~" ;
column_list = column_name { "," column_name } [ "," ] ;
column_name = "." identifier
| identifier [ "." identifier ] ;
(* --------------------------------------------------------------------------
4. ARRAY INDICES
-------------------------------------------------------------------------- *)
(* An array index is the bracketed segment that follows a path component. *)
array_index = "[" array_index_body "]" ;
array_index_body = "" (* tabular sentinel *)
| digits (* normal index *)
| jsonpath_filter (* filter expr *)
| key_list ; (* keyed lookup *)
jsonpath_filter = "?" "(" { char_except ( ")" ) } ")" ;
key_list = identifier { "," identifier } ;
(* --------------------------------------------------------------------------
5. ASSIGNMENTS
-------------------------------------------------------------------------- *)
assignment = path "=" [ modifiers ] value [ trailing_directives ]
[ trailing_comment ] newline ;
path = path_start { path_continuation } ;
path_start = identifier | "$" | "&" identifier ;
path_continuation = "." path_element
| array_index
| ".@" identifier ; (* XML attribute reference *)
path_element = identifier | "true" | "false" ;
(* Identifiers permit ASCII letters, digits, underscores, and hyphens. They
must begin with a letter or underscore. *)
identifier = ident_start { ident_cont } ;
ident_start = letter | "_" ;
ident_cont = letter | digit | "_" | "-" ;
(* --------------------------------------------------------------------------
6. MODIFIERS
-------------------------------------------------------------------------- *)
(* Modifiers prefix the value. Each may appear at most once. Order is not
semantically significant; the parser accepts them in any order. *)
modifiers = { modifier } ;
modifier = "!" (* required / critical *)
| "*" (* confidential — masked downstream *)
| "-" (* deprecated *) ;
(* --------------------------------------------------------------------------
7. VALUES
-------------------------------------------------------------------------- *)
(* Every value carries its own type via a one- or two-character prefix, with
the exception of bare booleans (true / false) and quoted strings. Bare
unquoted strings are NOT permitted as values. *)
value = quoted_string
| multiline_string
| currency
| percent
| integer
| number
| boolean
| null
| reference
| binary
| timestamp
| date
| time
| duration
| extension_value ;
(* 7.1 Strings *)
quoted_string = '"' { string_char | escape_seq } '"' ;
multiline_string = '"""' { multiline_char | escape_seq } '"""' ;
string_char = ? any character except '"', '\', '\n', '\r' ? ;
multiline_char = ? any character except the closing '"""' or '\' ? ;
escape_seq = "\" ( "\" | '"' | "n" | "t" | "r" | "0"
| "u" hex_digit hex_digit hex_digit hex_digit
| "U" hex_digit hex_digit hex_digit hex_digit
hex_digit hex_digit hex_digit hex_digit ) ;
(* 7.2 Numeric types *)
number = "#" [ "-" ] digits [ "." digits ] [ exponent ] ;
integer = "##" [ "-" ] digits ;
currency = "#$" [ "-" ] digits [ "." digits ] [ ":" currency_code ] ;
percent = "#%" [ "-" ] digits [ "." digits ] ;
exponent = ( "e" | "E" ) [ "+" | "-" ] digits ;
currency_code = letter letter letter ; (* ISO 4217, parser uppercases *)
(* 7.3 Boolean and null *)
boolean = "?" "true" | "?" "false" | "true" | "false" ;
null = "~" ;
(* 7.4 References *)
(* References point at another path within the document or its metadata.
A leading "." denotes a relative path; "$" denotes the metadata root. *)
reference = "@" reference_target ;
reference_target = relative_ref | absolute_ref | meta_ref ;
relative_ref = "." path_element { ( "." | array_index ) path_element } ;
absolute_ref = path_element { ( "." | array_index ) path_element } ;
meta_ref = "$" "." path_element { "." path_element } ;
(* 7.5 Binary *)
binary = "^" [ algorithm ":" ] base64_content ;
algorithm = identifier ;
base64_content = { base64_char } [ "=" [ "=" ] ] ;
base64_char = letter | digit | "+" | "/" ;
(* 7.6 Temporal types *)
(* Dates use ISO 8601. The parser semantically validates month and day
ranges; values like 2024-13-01 or 2024-02-30 are rejected at parse
time, not just at validation time. *)
date = digit digit digit digit "-" digit digit "-" digit digit ;
timestamp = date "T" digit digit ":" digit digit ":" digit digit
[ "." digits ]
[ tz_offset ] ;
tz_offset = "Z"
| ( "+" | "-" ) digit digit [ ":" digit digit ] ;
time = "T" digit [ digit ]
[ ":" digit [ digit ]
[ ":" digit [ digit ] [ "." digits ] ] ] ;
duration = "P" [ digits "Y" ] [ digits "M" ] [ digits "W" ] [ digits "D" ]
[ "T" [ digits "H" ] [ digits "M" ] [ digits "S" ] ] ;
(* 7.7 Extension values *)
(* The "&" prefix marks an implementation-defined extension value. The
payload after the prefix is parsed as identifier-dotted namespace plus
any of the standard value forms. *)
extension_value = "&" identifier { "." identifier } [ value ] ;
(* --------------------------------------------------------------------------
8. TRAILING DIRECTIVES
-------------------------------------------------------------------------- *)
(* Directives that follow a value attach metadata such as positional info
for fixed-width input, length bounds, or transform flags. *)
trailing_directives = { ":" directive_name [ directive_value ] } ;
directive_name = identifier ;
directive_value = number | integer | quoted_string | identifier ;
(* --------------------------------------------------------------------------
9. TOP-LEVEL DIRECTIVES
-------------------------------------------------------------------------- *)
directive = import_directive
| schema_directive
| conditional_directive ;
import_directive = "@import" whitespace import_path
[ whitespace "as" whitespace identifier ]
[ trailing_comment ] newline ;
import_path = quoted_string | unquoted_path ;
unquoted_path = { char_except ( whitespace | newline | ";" ) } ;
schema_directive = "@schema" whitespace url [ trailing_comment ] newline ;
url = quoted_string | unquoted_path ;
conditional_directive
= "@if" whitespace condition [ trailing_comment ] newline ;
condition = { char_except_newline } ;
(* --------------------------------------------------------------------------
10. LEXICAL PRIMITIVES
-------------------------------------------------------------------------- *)
letter = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I"
| "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R"
| "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
| "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i"
| "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r"
| "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
digits = digit { digit } ;
hex_digit = digit
| "a" | "b" | "c" | "d" | "e" | "f"
| "A" | "B" | "C" | "D" | "E" | "F" ;
whitespace = " " | "\t" ;
char_except_newline = ? any character except "\n" and "\r" ? ;
char_except = ? any character except the listed exclusions ? ;
trailing_comment = whitespace comment ;
(* --------------------------------------------------------------------------
11. SEMANTIC CONSTRAINTS
--------------------------------------------------------------------------
These rules constrain otherwise-valid grammar productions. They are
enforced by the parser even though they cannot be expressed in pure
context-free EBNF.
* No path may be assigned more than once within a document. (P007)
* Array indices for the same path must be contiguous starting at 0. (P016)
* Array indices may not exceed MAX_ARRAY_INDEX. (P015)
* Total path nesting depth may not exceed MAX_NESTING_DEPTH. (P010)
* Quoted strings may not contain unescaped newlines. (P004)
* Numbers may not have an exponent without digits. (P001)
* Currency codes are normalized to uppercase.
* Bare unquoted strings are forbidden as values; quote them. (P002)
* Comments are not recognized inside header braces; ";" inside "{ ... }"
is treated as a literal character.
* Relative headers (leading ".") resolve against the most recent
ABSOLUTE header, not the most recent header of any kind.
* Date and timestamp values are validated for month (01-12) and
day-of-month at parse time.
* Modifiers ("!", "*", "-") may only precede a value.
* The "$" path is reserved for document metadata; assignments under
"$.xxx" are stored on the document metadata map, not the assignment map.
-------------------------------------------------------------------------- *)
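Section 2 of the grammar states that comments may trail an assignment but may not appear inside quoted strings. A minimal Python sketch of that rule follows. The function name strip_comment is hypothetical, and the sketch does not cover the separate constraint that ; inside header braces is a literal character.

```python
def strip_comment(line):
    """Remove a trailing ';' comment, ignoring ';' inside a quoted string."""
    in_string = escaped = False
    for pos, ch in enumerate(line):
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch == ";":
            return line[:pos].rstrip()
    return line.rstrip()


print(strip_comment('desc = "a; b" ; trailing note'))
print(strip_comment("year = #2022 ; model year"))
```

Tracking the escape state matters because a string such as "say \"hi\"; ok" contains quotes that do not end the string.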
Comments & Directives
Comments begin with semicolon (;) and extend to end of line.
Comment Rules
- Comments start at the ; character and extend to end of line only
- Multi-line comments need ; on each line
- ; inside a quoted string is literal: desc = "a; b" is valid
Lines starting with @ are directives (import, schema, conditional):