Break Syntax Chapter
As proposed by Rich Morin.
A lot of nits detected by Rich Morin, Tichy Braoo and Alexis Layton.
Several broken links detected by Tichy Braoo.
Break Syntax Chapter
To several smaller chapters.
from examples and the bool type definition. This should have been
done a while back (when a plain value could no longer be
Changed to 00:00:00Z instead of 12:00:00Z.
Better reflects the intent.
To allow for easier media retargeting. Support HTML and PDF (using XEP).
Fix production bugs
Prevent header/trailer from appearing as content - Thanks to Robin Green for catching this one.
Change “Serialization” back to “Document”
By popular demand.
Allow leading empty lines, and indented content lines to start with a
As per the updated tag URI RFC.
Used the term “(YAML) processor” instead of parser, loader etc. This avoids letting preconceived notions of system architecture affect the semantics.
Rearranged and renamed node/value productions
Hopefully this would increase production readability.
Add a note explaining the production prefixes.
Changed to release candidate.
Allow inline collection form in complex keys
Reusing the productions from the *-in-seq form.
Allow omitting complex and flow keys values
Allows for a nicer set notation.
Allow implicit single pair maps in flow sequences
Allows for a nicer ordered map connection.
Major re-write of model and intro chapters
Add figures, rephrase, make more readable.
Empty plain scalars
Are forbidden as keys and as flow sequence entries.
Are allowed in flow collections and following ... line. Empty lines ending with a specific line break are now treated as comments even if following a block scalar.
Folding and Chomping
Rules have been massively simplified.
Completely new examples set, with highlights associating parts of the characters stream with specific productions.
Re-work almost all productions in the syntax section, together with examples.
Incorporated minor wording issues (thanks to Sam Vilain).
New directive syntax, %TAG directives, patched indentation rules, moved !!str, !!seq and !!map to the repository. Hopefully this is the last "global" change.
Replace all internal links with index entries. It is simply amazing how much work is expressed in this short sentence.
Was changed to “---” instead of “--- %YAML:1.0”.
Now allows all collection styles, not only block collections.
Block sequence indentation
Wording, productions and examples fixed to allow for more human-friendly interpretation of “-” as indentation.
Were renamed with a Hungarian-like prefix notation and reformatted so that cut-and-paste from HTML would yield proper results.
Vocabulary instead of language
The term “vocabulary” better describes the intent of sub-domains of yaml.org (for type families).
Was added (similar to map-in-seq).
Changed from “
# ” to
Changed from “
Anchors and directive names
Can be any non-space flow char.
May start on a following line.
“-” plain value
Is canceled (ambiguous with next-line flow). Therefore the -/+ Boolean implicits are gone as well.
Corrected date indicators.
Integers and floats
Added sexagesimal (base 60) format (using “:”) for time (and degrees).
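The sexagesimal entry above can be illustrated with a minimal Python sketch. It assumes that each colon-separated field is a base-60 digit read from most to least significant; the function name `parse_sexagesimal` is hypothetical and is not taken from the spec.

```python
def parse_sexagesimal(text: str) -> int:
    """Convert a colon-separated base-60 string (e.g. "1:30:00") to an integer."""
    sign = 1
    # An optional leading sign applies to the whole value (an assumption of this sketch).
    if text[:1] in ("+", "-"):
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    value = 0
    for part in text.split(":"):
        # Shift the accumulated value by one base-60 position, then add the next field.
        value = value * 60 + int(part)
    return sign * value


print(parse_sexagesimal("1:30:00"))
```

Read this way, a time such as 1:30:00 becomes a plain count of seconds, and a degrees value such as 12:30 becomes a count of minutes.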
Removed the “
Changed from “
Brian’s wording fixes
Spelling and minor re-wording.
#TAB was removed
Tabs are now completely banned from indentation.
Escaping of type family name
The interaction between “
%” escaping has been clarified
in a better way.
Zero explicit indentation
Is now allowed for top level nodes.
Moved language-independent types out of the spec.
Only !map, !seq and !str are in it now.
Changed implicit types
No longer require (). Implicit empty string is !null. Changed nil to undef. These changes depend on a resolution to the implicit typing issues; hence all the separate type documents are not in “last call” status. Yet.
Changed NULL escape
From \z to \0.
“- entry” indentation
Consider the “-” to be indentation, but not the following spaces.
Changed format separator to “
This makes transfer methods into URI-references (allow fragments). Type families are URIs (without fragments).
Mutability == collection
Make all collections mutable and all scalars immutable. This simplifies generic tools and the native model.
Moved information models to the end.
Graph implicit types
Allow types to remain unknown (implicit) all the way up to and including the graph model; allow implicit collection types.
Was changed to “Last Call”. At long last.
May begin with a “/” to allow paths to be unquoted.
// special keys.
They conflict with the new implicit string format. What to replace them with, if anything, is still TBD.
Minor nits were wrong in some examples (4.3.1, 4.6.4, 4.6.5, 4.6.7).
Renamed “public” type families to “global”
This better conveys the intent.
As of this draft, changes are in a separate document.
The previous pagination was nice but it was very wasteful of pages. In the spirit of keeping the spec small, page breaks were moved, reducing the printed size by 10 pages or so.
Table of contents
Was enhanced to specify page numbers as well as links. Also, entry 3.1.4 was missing.
Section 1.2, paragraph 7
Remove the sentence about indentation being used for multiple documents (it isn’t).
Using “an flow” (leftover from the global substitute) and various other minor fixes.
Renamed “nested” to “block”
In all the wording. This makes the wording match the productions.
Cleaned canonical formats
In some cases, the canonical format was a subset of another format, rather than being an implicit format on its own. This is allowed, and is now explicit in the format definition. Also fixed the string formats to satisfy the requirement that a value would only match one implicit or explicit format.
Should be much improved. Explicit page breaks were added and some reorganization was done so that breaks should reflect logical chunks. At least in A4 and Letter page sizes on IE6 :-)
Productions 94 and 98
Now reflect the correct indentation policy for map-in-seq.
Productions 129 and 130
Were renamed to “folded-paragraph” and “folded-text-line” to better reflect the intent.
Was added as per the discussions in the mailing list. This added two productions.
Some examples were re-worked as part of the re-organization. Hopefully for the better. The XML example was removed.
Add page-break in many locations for better printing, make table of contents two columns.
Change “block scalar” to “literal scalar”, change “in-line” to “flow”. This allows for two sorts of mechanisms, the “block” mechanism and the “flow” mechanism. Within block are literal and folded scalars; within flow are plain, single, and double. This involved a few global substitutions which will probably introduce errors; those are:
Restructure the preview section for general readability.
Minor change to the comment-break production
Use (word) instead of .word for “enums”
This is going to be our generic way to handle implicits that otherwise would be taken to be strings.
Change “word” format names to “english”
More correctly reflects their semantics.
Change canonical Booleans to
+/-. Had to
make a special case to allow “
in unquoted values to do it. This doesn’t apply to keys, but using
the English format there makes more sense anyway. This allows
canonical boolean to be implicit.
In addition to
~, consistent with booleans.
Merge them into timestamp (and rename it to “time”).
Relax string regexp
Anything starting with a “text” character
-”, letter or digit),
except for int/flt/time.
Change space escapes
“\ ” is now a space, “\-” is now NBSP.
To begin throwaway comments.
Renamed the three chomping variants
To “strip”, “clip” (default) and “keep”.
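As a rough illustration of the three chomping variants, the following Python sketch treats chomping as a choice about trailing line breaks. It assumes the meanings these names carry in later YAML versions (strip drops all trailing breaks, clip keeps a single final break, keep preserves them all); the function `chomp` is hypothetical and works on an already-extracted scalar text rather than on YAML source.

```python
def chomp(text: str, mode: str = "clip") -> str:
    """Apply a chomping variant to the trailing line breaks of a scalar's text."""
    body = text.rstrip("\n")  # the content without any trailing line breaks
    if mode == "strip":
        # Drop the final line break and any trailing empty lines.
        return body
    if mode == "clip":
        # Keep exactly one final line break, but only when there is content
        # and the original text actually ended with a break.
        return body + "\n" if body and len(body) < len(text) else body
    if mode == "keep":
        # Preserve the final line break and all trailing empty lines.
        return text
    raise ValueError(f"unknown chomping mode: {mode!r}")


for mode in ("strip", "clip", "keep"):
    print(mode, repr(chomp("text\n\n", mode)))
```

Making “clip” the default argument mirrors the entry above, which names it as the default variant.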
Changed in-line folding
To allow empty lines to represent line feeds and to preserve specific line breaks. The WIKI feature isn’t included as it depends on indentation.
Changed simple style
To reflect the new restrictions (keys can’t span; in-lines can’t have
, ” and “
”; for top-level values, anything goes except
comments. Comments can no longer be interleaved in such scalars.
?” into a space separator
No reason not to. It was always followed by a space anyway.
Renamed scalar styles
Rename “plain” to “block” and “simple” to “plain”. As a result of the above it isn’t that simple any more, and having both “plain” and “simple” was confusing. Reverting to “block” more accurately reflects the semantics, freeing “plain” for the unquoted style.
To make the document more printer-friendly. The main change was to
line-. The latter is particularly painful,
but there was no other good choice.
Added a warning that float values may not round-trip exactly due to the use of native float data types.
Provide mailing list URL
So that its address would be clear even from the printed version.
Line Processing section
Was renamed “Space processing”.
Were cleaned up and now allow comments before the header line.
Much tweaking to ensure printing doesn’t clip the comments of the productions tables.
Extended to allow for infinity and not a number.
\ (without the leading
Whole section was re-worked to accommodate the latest decisions in the mailing list. Leaf and branch are now renamed to collection and scalar; keyed and series are renamed to mapping and sequence.
Are now allowed, including multi-line in-line scalars.
Limit implicit strings
To word and space characters only.
# % ^
No need to reserve them - they can’t be used as first characters in a simple value anyway because there’s no implicit type using them. Unreserving them allows them to be used in implicit types in the future.
To preserve line breaks around more-indented and specific empty lines.
Run the whole document through Word for spell checking.
Changed to use style sheet in a more dependable way, and now reflects the fact the native and generic models “are one” in some respects.
Wording for String
Changed to reflect the fact “-” isn’t a word character.
Was added (“-” for chomp, “+” for keep).
Escaped nested style
Was removed. It is redundant due to “..” spanning lines.
Explicit notation is now required for empty scalars. This reverts the change in the previous draft.
No alternate style for collections.
Map and seq can now only be written in keyed and series style.
Are now allowed to trail lines except for inside multi-line text. Comments (explicit and implicit) may trail multi-line text values as long as they are less indented.
Is now correctly limited to word characters.
Is no longer allowed in in-line collections.
Production 170 fixed to prohibit in-line indicators after a space indicator, or two space indicators followed by a space.
Was listed as “canonical”. Fixed to “explicit”.
Time intro examples
Used the wrong format
Correct minor problems (missing “(n)”, improper handling
of spaces in in-line collections, missing
Integer and Float formats
Now ignores any “
,” inside the
value. This required making “
into a space separator.
Is now a type of its own.
Was removed. Will probably be a Perl-specific type.
Were simplified to only 6 (3 nested and 3 in-line).
Is now restricted to “
is allowed if encoding does not change. Production 49 has been
corrected accordingly, and the reference to the obsolete
non-alias_node was corrected.
This special case (production 93) has been restricted to in-line leaf keys without any properties. This minimizes lookahead in parsers.
Now allows an empty unquoted value (production 163).
Anchors for aliases
Are now forbidden.
Fixed a problem with the regexp. Allow both a valid ISO8601 format
T”) and a space separated
format (for readability).
Break node productions to smaller bits.
#TAB directive is now required to allow tabs in indentation.
Floating point numbers in all examples now start with
In various places.
Empty line indentation
Is now indent(<=n) instead of the
The yaml: scheme has been replaced by an http: one. Mapping of XML
namespaces is now done using “
instead of “
;”. The prefix
indicator is now “
^” and the
format indicator is now “
Is now split to type family and format; type family is now a URI.
Now includes the format.
Throughout the whole spec.
Format in Transfer Method
Is now separated by a “
Removed the Probable changes section; it is now replaced by the YAC list.
YAC 3, YAC 5
Added a uniform URI based namespace scheme.
Unrecognized explicitly typed implicit (simple) leafs are allowed.
C1 Control codes are explicitly forbidden.
In-series in-line syntax was modified according to the flexible indentation scheme (YAC needs to be updated).
Rename implicit leaves to simple leaves.
Throwaways are now allowed everywhere. Blank lines are comments. There are ambiguity problems in chomped leaf values.
Indentation is now generic (flexible).
Nested leaf format is now built out of three orthogonal properties.
Top level nodes can be in-line.
“--- #YAML:1.0” is assumed if there’s no header.
YAML Ain’t Markup Language
Are now implemented as a text implicit transfer format. This slightly changed the definition of an escaped leaf so that the two would be equivalent.
Relative DNS type families
Are now supported using the simplest form only.
Were fixed according to Brian’s inputs.
Is now chomped using “|-” for consistency.
Can now be anything starting with “--”; therefore is required before the first document in a multi-document stream.
Can now accept any printable characters, not just words.
!” now means “force
New Scalar Styles
We now have the following:
| || \ \\ ' "
Treat sequence/map as Collection Styles
In productions and in information model.
Add Structured Keys
Using a key indicator (done - Oren).
Added a detailed examples section to the introduction to better acquaint the user so that the spec can proceed with some basic knowledge.
Are now supported. Empty maps/sequences are a natural special case.
Moved list of changes down... it was cluttering the top of the spec.
Information model and Preview
Completely new rewrite.
Minor wording fixes, added internal links, etc.
Was renamed to “sequence”.
Was changed to “---” instead of “----”.
Was changed to one space instead of one tab.
Is no longer an implicit type. The surrounding “[=...=]” are kept, however, in case we change our mind later (e.g., if we introduce pipelining). The type was renamed to “binary” to stress its class rather than the encoding used.
Was renamed to “real” to decouple it from specific in-memory representation. Mathematicians may object :-)
Was removed from the sequence map.
Type vs. Class
Added some wording to clarify the difference. Most likely this will need to be changed once we settle the pipelining issue.
Were completely overhauled, again, to accommodate the new semantics.
Next Line Scalars
Now have two separate indicators, one for quoted and one for unquoted values.
Are now an error. The parser may ignore the second occurrence with a warning.
Has been changed to use tabs instead of spaces.
Were added. The persistent comment key was changed to
Were changed. “
-” now signifies a
list entry and “
\” signifies a
next-line leaf value. “
%” are no longer necessary (they
may be if we ever support map/list keys). As a result no lookahead is
Are now possible in a single file (again), using “----” as a separator.
Has been changed in numerous locations, hopefully to make it clearer. There was also some shuffling of the text sections to remove redundancy.
Were thoroughly overhauled and therefore undoubtedly contain new bugs. Also, all the shorthand production names were replaced by long ones to improve readability.
Are no longer allowed. Structure keys are used instead, where some have only an in-memory representation.
Are no longer allowed. This may have to be revisited when Perl 6 comes out.
Are still supported but as an explicit type rather than as a hack.
Has been shrunk to only the common types, with a reference to yaml.org for a fuller list of types. The three core types were added as required types.
Type vs. Kind
This distinction was inserted explicitly into the text, with several examples to drive the point home.
Is now defined as simply printable Unicode characters without explicit ranges. This makes the spec resistant to the evolution of the Unicode spec.
The set of such indicators has been minimized. There is now a conflict between reserving them for future use and allowing people to use them as markers for implicit leaf types.
Has been renamed to unquoted leaf.
Has been generalized to allow for types nodes.
Has been added with an assortment of suggested types.
Keys can now be any nodes to allow for Java serialization.
Are now supported for Perl serialization.
Simple Scalar and End Of Lines
Moved eol productions to the end, rather than the start, of most productions. The wording and productions for the simple leaf were fixed to match each other and the intended semantics. The simple leaf example set was enhanced to clarify the proper interpretation.
Both empty top level maps and no top level maps are now allowed, and hence so are empty documents.
Thanks to Joe Lapp for reviewing the 22 Jul 2001 draft and recommending these changes.
Fixed phrasing in the abstract, and sections 1.3, 2.1, 2.3.1, 2.4.3, 2.4.4, 2.4.5, 2.4.6 and 2.5.3.
Fixed productions: added production 47, 59, fixed productions 57, 58, 60 and 64 (production numbers in the 22 Jul 2001 draft are off by one in some cases). Most are bug fixes. Actual changes include allowing for empty lines surrounding a top level map, allowing an optional trailing separator line, and forbidding annotations which have no sensible semantics (anchor to null, anchor to a reference, shorthand for a reference).
Due to the decision to leave all API related issues outside the core spec, the spec has been re-merged into a single file, covering just what used to be the introduction and serialization sections of the previous specs.
The spec now refers only to the Unicode standard. Due to the efforts by the Unicode and ISO/IEC 10646 groups, both standards are in almost complete agreement. The additional features provided by the ISO/IEC standard are rarely used in practice, while Unicode is simpler and is more widely supported by existing languages and systems.
Indentation is now a strict 4 spaces per level. This allows for the new white space policy and the new block notation.
The spec introduces a shorthand notation for attaching special keys to any node kind (converting it to a map if necessary). This will need more work.
Null nodes have finally been added, after somehow eluding all previous versions.
* optional prefix for leaf list entries to a mandatory : and therefore remove the special name “bulleted list entries”.
Multi-line simple keys are now out. The door is open for re-introducing them, however.
Change Whitespace Policy
White space folding has been replaced by line break folding. White space is now always significant, except for indentation and for separation of structure tokens.
Block Scalar Syntax
The syntax for block leaves has been replaced by a more elegant one.
The spec is now separated into several files. This allows different versions of the spec to share the same version of an unchanged section, and makes it easier to refer to a particular version of important pieces of the spec such as serialization and interfaces. All the HTML files use the same shared CSS file. Cross references between the separate parts of the spec are now relative, though references to older versions are absolute and refer to the main site.
Change the wording on the information model to allow for graphs with cycles. The alternative is to define the anchor semantics in such a way that would preclude cycles.
Null Character Escape
The escape sequence \z was added to allow convenient escaping of the ASCII zero (null) character.
Remove Binary Scalars
The information model now contains just one type of leaf. The special syntax for binary leaves has been removed. This functionality will be re-added in the form of a color.
Remove Class Shorthand
The syntax no longer supports the !class syntax. This functionality will be re-added in the form of a color.
Change the optional prefix for leaf list entries to * and rename such entries to “bulleted list entries”.
Make Keys More Scalars-Compatible
Allow for multi-line simple keys and unify the description of leaf keys and values where it makes sense.
All the HTML pages have gone through Tidy. Also, all the HTML files have been run through an HTML validation service and a CSS validation service. Broken links and spelling were checked using another online HTML validator. This needs to be repeated for all future drafts.
Relationship with MIME
Beyond using base64 for binary leaves, no additional special relationship with MIME is expected. Hence references to the MIME and mail RFCs were moved from section 1.1 (“required reading”) to section 1.2 (“background material”).
Indentation is now completely strict for all leaf styles. Also, the productions were changed to use consistent semantics for the indentation level parameter.
List Scalar Prefixes
A list leaf entry may be prefixed by an optional : indicator to improve readability of multi-line simple leaf values.
Leading zeros are now ignored for comparing anchor strings.
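One way to picture this comparison rule is the short Python sketch below. It assumes anchors are compared as strings after removing leading zeros; the helper names `normalize_anchor` and `same_anchor` are hypothetical and do not come from the spec.

```python
def normalize_anchor(anchor: str) -> str:
    """Strip leading zeros so that "007" and "7" compare as equal."""
    # An anchor made only of zeros normalizes to a single "0" rather than "".
    return anchor.lstrip("0") or "0"


def same_anchor(left: str, right: str) -> bool:
    """Compare two anchor strings, ignoring leading zeros."""
    return normalize_anchor(left) == normalize_anchor(right)


print(same_anchor("007", "7"))
```

Only leading zeros are dropped, so anchors such as “10” and “1” remain distinct.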
No Empty Line At Start
The document production was fixed so as not to require an empty line at the start of a document.
The set of character escapes is now maximal (including the rare \e escape for the useful ASCII ESC character). Also, it is now possible to “escape” a line break in a quoted string (the previous drafts were inconsistent at this point).
32 Bit Characters
The current draft allows such characters, and includes a specialized
escaping format (“
The changes section was added for easier comparison of different versions. The final draft will not contain this section.
The indicator was changed from
! to allow for
# to be
used for comments.
No Empty Line At End
The document production was fixed so as not to require an empty line at the end of a document.
Indentation in quoted strings and binary blocks is now strict to ensure readability.
Problems in the productions were fixed, especially where related to white space issues and formatting of the result.
The link to the Unicode FAQ was moved to section 2.2.2.
The information model now distinguishes between text and binary leaves.