From f7bab78839cea5674658a6a0298f88ef5ccca019 Mon Sep 17 00:00:00 2001 From: Ilya Ryzhenkov Date: Thu, 25 Sep 2014 22:20:58 +0400 Subject: Markdown, sections, styles and lots more. --- test/data/markdown/spec.txt | 6150 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 6150 insertions(+) create mode 100644 test/data/markdown/spec.txt (limited to 'test/data') diff --git a/test/data/markdown/spec.txt b/test/data/markdown/spec.txt new file mode 100644 index 00000000..fce87924 --- /dev/null +++ b/test/data/markdown/spec.txt @@ -0,0 +1,6150 @@ +--- +title: CommonMark Spec +author: +- John MacFarlane +version: 2 +date: 2014-09-19 +... + +# Introduction + +## What is Markdown? + +Markdown is a plain text format for writing structured documents, +based on conventions used for indicating formatting in email and +usenet posts. It was developed in 2004 by John Gruber, who wrote +the first Markdown-to-HTML converter in perl, and it soon became +widely used in websites. By 2014 there were dozens of +implementations in many languages. Some of them extended basic +Markdown syntax with conventions for footnotes, definition lists, +tables, and other constructs, and some allowed output not just in +HTML but in LaTeX and many other formats. + +## Why is a spec needed? + +John Gruber's [canonical description of Markdown's +syntax](http://daringfireball.net/projects/markdown/syntax) +does not specify the syntax unambiguously. Here are some examples of +questions it does not answer: + +1. How much indentation is needed for a sublist? The spec says that + continuation paragraphs need to be indented four spaces, but is + not fully explicit about sublists. It is natural to think that + they, too, must be indented four spaces, but `Markdown.pl` does + not require that. This is hardly a "corner case," and divergences + between implementations on this issue often lead to surprises for + users in real documents. 
(See [this comment by John + Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) + +2. Is a blank line needed before a block quote or header? + Most implementations do not require the blank line. However, + this can lead to unexpected results in hard-wrapped text, and + also to ambiguities in parsing (note that some implementations + put the header inside the blockquote, while others do not). + (John Gruber has also spoken [in favor of requiring the blank + lines](http://article.gmane.org/gmane.text.markdown.general/2146).) + +3. Is a blank line needed before an indented code block? + (`Markdown.pl` requires it, but this is not mentioned in the + documentation, and some implementations do not require it.) + + ``` markdown + paragraph + code? + ``` + +4. What is the exact rule for determining when list items get + wrapped in `

` tags? Can a list be partially "loose" and partially + "tight"? What should we do with a list like this? + + ``` markdown + 1. one + + 2. two + 3. three + ``` + + Or this? + + ``` markdown + 1. one + - a + + - b + 2. two + ``` + + (There are some relevant comments by John Gruber + [here](http://article.gmane.org/gmane.text.markdown.general/2554).) + +5. Can list markers be indented? Can ordered list markers be right-aligned? + + ``` markdown + 8. item 1 + 9. item 2 + 10. item 2a + ``` + +6. Is this one list with a horizontal rule in its second item, + or two lists separated by a horizontal rule? + + ``` markdown + * a + * * * * * + * b + ``` + +7. When list markers change from numbers to bullets, do we have + two lists or one? (The Markdown syntax description suggests two, + but the perl scripts and many other implementations produce one.) + + ``` markdown + 1. fee + 2. fie + - foe + - fum + ``` + +8. What are the precedence rules for the markers of inline structure? + For example, is the following a valid link, or does the code span + take precedence ? + + ``` markdown + [a backtick (`)](/url) and [another backtick (`)](/url). + ``` + +9. What are the precedence rules for markers of emphasis and strong + emphasis? For example, how should the following be parsed? + + ``` markdown + *foo *bar* baz* + ``` + +10. What are the precedence rules between block-level and inline-level + structure? For example, how should the following be parsed? + + ``` markdown + - `a long code span can contain a hyphen like this + - and it can screw things up` + ``` + +11. Can list items include headers? (`Markdown.pl` does not allow this, + but headers can occur in blockquotes.) + + ``` markdown + - # Heading + ``` + +12. Can link references be defined inside block quotes or list items? + + ``` markdown + > Blockquote [foo]. + > + > [foo]: /url + ``` + +13. If there are multiple definitions for the same reference, which takes + precedence? 
+ + ``` markdown + [foo]: /url1 + [foo]: /url2 + + [foo][] + ``` + +In the absence of a spec, early implementers consulted `Markdown.pl` +to resolve these ambiguities. But `Markdown.pl` was quite buggy, and +gave manifestly bad results in many cases, so it was not a +satisfactory replacement for a spec. + +Because there is no unambiguous spec, implementations have diverged +considerably. As a result, users are often surprised to find that +a document that renders one way on one system (say, a github wiki) +renders differently on another (say, converting to docbook using +pandoc). To make matters worse, because nothing in Markdown counts +as a "syntax error," the divergence often isn't discovered right away. + +## About this document + +This document attempts to specify Markdown syntax unambiguously. +It contains many examples with side-by-side Markdown and +HTML. These are intended to double as conformance tests. An +accompanying script `runtests.pl` can be used to run the tests +against any Markdown program: + + perl runtests.pl spec.txt PROGRAM + +Since this document describes how Markdown is to be parsed into +an abstract syntax tree, it would have made sense to use an abstract +representation of the syntax tree instead of HTML. But HTML is capable +of representing the structural distinctions we need to make, and the +choice of HTML for the tests makes it possible to run the tests against +an implementation without writing an abstract syntax tree renderer. + +This document is generated from a text file, `spec.txt`, written +in Markdown with a small extension for the side-by-side tests. +The script `spec2md.pl` can be used to turn `spec.txt` into pandoc +Markdown, which can then be converted into other formats. + +In the examples, the `→` character is used to represent tabs. + +# Preprocessing + +A [line](#line) +is a sequence of zero or more characters followed by a line +ending (CR, LF, or CRLF) or by the end of +file. 
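The definition of a line above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the spec or of any reference implementation; the helper name `split_lines` is hypothetical, and the input is assumed to be already decoded into a string of characters.

```python
import re


def split_lines(text):
    """Split text into lines, treating CR, LF, and CRLF as line endings.

    Hypothetical helper for illustration only. CRLF is tried first so
    that it counts as a single line ending rather than two.
    """
    lines = re.split(r"\r\n|\r|\n", text)
    # A final line ending terminates the last line; it does not start
    # a new, empty one. The last line may also end at end of file.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

For example, `split_lines("a\r\nb\rc\n")` yields three lines, and an empty line between two line endings is kept as an empty string.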
+ +This spec does not specify an encoding; it thinks of lines as composed +of characters rather than bytes. A conforming parser may be limited +to a certain encoding. + +Tabs in lines are expanded to spaces, with a tab stop of 4 characters: + +. +→foo→baz→→bim +. +

foo baz     bim
+
+. + +. + a→a + ὐ→a +. +
a   a
+ὐ   a
+
+. + +Line endings are replaced by newline characters (LF). + +A line containing no characters, or a line containing only spaces (after +tab expansion), is called a [blank line](#blank-line). + + +# Blocks and inlines + +We can think of a document as a sequence of [blocks](#block)---structural elements like paragraphs, block quotations, +lists, headers, rules, and code blocks. Blocks can contain other +blocks, or they can contain [inline](#inline) content: +words, spaces, links, emphasized text, images, and inline code. + +## Precedence + +Indicators of block structure always take precedence over indicators +of inline structure. So, for example, the following is a list with +two items, not a list with one item containing a code span: + +. +- `one +- two` +. + +. + +This means that parsing can proceed in two steps: first, the block +structure of the document can be discerned; second, text lines inside +paragraphs, headers, and other block constructs can be parsed for inline +structure. The second step requires information about link reference +definitions that will be available only at the end of the first +step. Note that the first step requires processing lines in sequence, +but the second can be parallelized, since the inline parsing of +one block element does not affect the inline parsing of any other. + +## Container blocks and leaf blocks + +We can divide blocks into two types: +[container blocks](#container-block), +which can contain other blocks, and [leaf blocks](#leaf-block), + which cannot. + +# Leaf blocks + +This section describes the different kinds of leaf block that make up a +Markdown document. + +## Horizontal rules + +A line consisting of 0-3 spaces of indentation, followed by a sequence +of three or more matching `-`, `_`, or `*` characters, each followed +optionally by any number of spaces, forms a [horizontal +rule](#horizontal-rule). + +. +*** +--- +___ +. +
+
+
+. + +Wrong characters: + +. ++++ +. +

+++

+. + +. +=== +. +

===

+. + +Not enough characters: + +. +-- +** +__ +. +

-- +** +__

+. + +One to three spaces indent are allowed: + +. + *** + *** + *** +. +
+
+
+. + +Four spaces is too many: + +. + *** +. +
***
+
+. + +. +Foo + *** +. +

Foo +***

+. + +More than three characters may be used: + +. +_____________________________________ +. +
+. + +Spaces are allowed between the characters: + +. + - - - +. +
+. + +. + ** * ** * ** * ** +. +
+. + +. +- - - - +. +
+. + +Spaces are allowed at the end: + +. +- - - - +. +
+. + +However, no other characters may occur at the end or the +beginning: + +. +_ _ _ _ a + +a------ +. +

_ _ _ _ a

+

a------

+. + +It is required that all of the non-space characters be the same. +So, this is not a horizontal rule: + +. + *-* +. +

-

+. + +Horizontal rules do not need blank lines before or after: + +. +- foo +*** +- bar +. + +
+ +. + +Horizontal rules can interrupt a paragraph: + +. +Foo +*** +bar +. +

Foo

+
+

bar

+. + +Note, however, that this is a setext header, not a paragraph followed +by a horizontal rule: + +. +Foo +--- +bar +. +

Foo

+

bar

+. + +When both a horizontal rule and a list item are possible +interpretations of a line, the horizontal rule is preferred: + +. +* Foo +* * * +* Bar +. + +
+ +. + +If you want a horizontal rule in a list item, use a different bullet: + +. +- Foo +- * * * +. + +. + +## ATX headers + +An [ATX header](#atx-header) +consists of a string of characters, parsed as inline content, between an +opening sequence of 1--6 unescaped `#` characters and an optional +closing sequence of any number of `#` characters. The opening sequence +of `#` characters cannot be followed directly by a nonspace character. +The closing `#` characters may be followed by spaces only. The opening +`#` character may be indented 0-3 spaces. The raw contents of the +header are stripped of leading and trailing spaces before being parsed +as inline content. The header level is equal to the number of `#` +characters in the opening sequence. + +Simple headers: + +. +# foo +## foo +### foo +#### foo +##### foo +###### foo +. +

foo

+

foo

+

foo

+

foo

+
foo
+
foo
+. + +More than six `#` characters is not a header: + +. +####### foo +. +

####### foo

+. + +A space is required between the `#` characters and the header's +contents. Note that many implementations currently do not require +the space. However, the space was required by the [original ATX +implementation](http://www.aaronsw.com/2002/atx/atx.py), and it helps +prevent things like the following from being parsed as headers: + +. +#5 bolt +. +

#5 bolt

+. + +This is not a header, because the first `#` is escaped: + +. +\## foo +. +

## foo

+. + +Contents are parsed as inlines: + +. +# foo *bar* \*baz\* +. +

foo bar *baz*

+. + +Leading and trailing blanks are ignored in parsing inline content: + +. +# foo +. +

foo

+. + +One to three spaces indentation are allowed: + +. + ### foo + ## foo + # foo +. +

foo

+

foo

+

foo

+. + +Four spaces are too much: + +. + # foo +. +
# foo
+
+. + +. +foo + # bar +. +

foo +# bar

+. + +A closing sequence of `#` characters is optional: + +. +## foo ## + ### bar ### +. +

foo

+

bar

+. + +It need not be the same length as the opening sequence: + +. +# foo ################################## +##### foo ## +. +

foo

+
foo
+. + +Spaces are allowed after the closing sequence: + +. +### foo ### +. +

foo

+. + +A sequence of `#` characters with a nonspace character following it +is not a closing sequence, but counts as part of the contents of the +header: + +. +### foo ### b +. +

foo ### b

+. + +Backslash-escaped `#` characters do not count as part +of the closing sequence: + +. +### foo \### +## foo \#\## +# foo \# +. +

foo #

+

foo ##

+

foo #

+. + +ATX headers need not be separated from surrounding content by blank +lines, and they can interrupt paragraphs: + +. +**** +## foo +**** +. +
+

foo

+
+. + +. +Foo bar +# baz +Bar foo +. +

Foo bar

+

baz

+

Bar foo

+. + +ATX headers can be empty: + +. +## +# +### ### +. +

+

+

+. + +## Setext headers + +A [setext header](#setext-header) +consists of a line of text, containing at least one nonspace character, +with no more than 3 spaces indentation, followed by a [setext header +underline](#setext-header-underline). A [setext header +underline](#setext-header-underline) +is a sequence of `=` characters or a sequence of `-` characters, with no +more than 3 spaces indentation and any number of trailing +spaces. The header is a level 1 header if `=` characters are used, and +a level 2 header if `-` characters are used. The contents of the header +are the result of parsing the first line as Markdown inline content. + +In general, a setext header need not be preceded or followed by a +blank line. However, it cannot interrupt a paragraph, so when a +setext header comes after a paragraph, a blank line is needed between +them. + +Simple examples: + +. +Foo *bar* +========= + +Foo *bar* +--------- +. +

Foo bar

+

Foo bar

+. + +The underlining can be any length: + +. +Foo +------------------------- + +Foo += +. +

Foo

+

Foo

+. + +The header content can be indented up to three spaces, and need +not line up with the underlining: + +. + Foo +--- + + Foo +----- + + Foo + === +. +

Foo

+

Foo

+

Foo

+. + +Four spaces indent is too much: + +. + Foo + --- + + Foo +--- +. +
Foo
+---
+
+Foo
+
+
+. + +The setext header underline can be indented up to three spaces, and +may have trailing spaces: + +. +Foo + ---- +. +

Foo

+. + +Four spaces is too much: + +. +Foo + --- +. +

Foo +---

+. + +The setext header underline cannot contain internal spaces: + +. +Foo += = + +Foo +--- - +. +

Foo += =

+

Foo

+
+. + +Trailing spaces in the content line do not cause a line break: + +. +Foo +----- +. +

Foo

+. + +Nor does a backslash at the end: + +. +Foo\ +---- +. +

Foo\

+. + +Since indicators of block structure take precedence over +indicators of inline structure, the following are setext headers: + +. +`Foo +---- +` + + +. +

`Foo

+

`

+

<a title="a lot

+

of dashes"/>

+. + +The setext header underline cannot be a lazy line: + +. +> Foo +--- +. +
+

Foo

+
+
+. + +A setext header cannot interrupt a paragraph: + +. +Foo +Bar +--- + +Foo +Bar +=== +. +

Foo +Bar

+
+

Foo +Bar +===

+. + +But in general a blank line is not required before or after: + +. +--- +Foo +--- +Bar +--- +Baz +. +
+

Foo

+

Bar

+

Baz

+. + +Setext headers cannot be empty: + +. + +==== +. +

====

+. + + +## Indented code blocks + +An [indented code block](#indented-code-block) +
is composed of one or more +[indented chunks](#indented-chunk) separated by blank lines. +An [indented chunk](#indented-chunk) +is a sequence of non-blank lines, each indented four or more +spaces. An indented code block cannot interrupt a paragraph, so +if it occurs before or after a paragraph, there must be an +intervening blank line. The contents of the code block are +the literal contents of the lines, including trailing newlines, +minus four spaces of indentation. An indented code block has no +attributes. + +. + a simple + indented code block +. +
a simple
+  indented code block
+
+. + +The contents are literal text, and do not get parsed as Markdown: + +. + + *hi* + + - one +. +
<a/>
+*hi*
+
+- one
+
+. + +Here we have three chunks separated by blank lines: + +. + chunk1 + + chunk2 + + + + chunk3 +. +
chunk1
+
+chunk2
+
+
+
+chunk3
+
+. + +Any initial spaces beyond four will be included in the content, even +in interior blank lines: + +. + chunk1 + + chunk2 +. +
chunk1
+  
+  chunk2
+
+. + +An indented code block cannot interrupt a paragraph. (This +allows hanging indents and the like.) + +. +Foo + bar + +. +

Foo +bar

+. + +However, any non-blank line with fewer than four leading spaces ends +the code block immediately. So a paragraph may occur immediately +after indented code: + +. + foo +bar +. +
foo
+
+

bar

+. + +And indented code can occur immediately before and after other kinds of +blocks: + +. +# Header + foo +Header +------ + foo +---- +. +

Header

+
foo
+
+

Header

+
foo
+
+
+. + +The first line can be indented more than four spaces: + +. + foo + bar +. +
    foo
+bar
+
+. + +Blank lines preceding or following an indented code block +are not included in it: + +. + + + foo + + +. +
foo
+
+. + +Trailing spaces are included in the code block's content: + +. + foo +. +
foo  
+
+. + + +## Fenced code blocks + +A [code fence](#code-fence)
is a sequence +of at least three consecutive backtick characters (`` ` ``) or +tildes (`~`). (Tildes and backticks cannot be mixed.) +A [fenced code block](#fenced-code-block) +begins with a code fence, indented no more than three spaces. + +The line with the opening code fence may optionally contain some text +following the code fence; this is trimmed of leading and trailing +spaces and called the [info string](#info-string). + The info string may not contain any backtick +characters. (The reason for this restriction is that otherwise +some inline code would be incorrectly interpreted as the +beginning of a fenced code block.) + +The content of the code block consists of all subsequent lines, until +a closing [code fence](#code-fence) of the same type as the code block +began with (backticks or tildes), and with at least as many backticks +or tildes as the opening code fence. If the leading code fence is +indented N spaces, then up to N spaces of indentation are removed from +each line of the content (if present). (If a content line is not +indented, it is preserved unchanged. If it is indented less than N +spaces, all of the indentation is removed.) + +The closing code fence may be indented up to three spaces, and may be +followed only by spaces, which are ignored. If the end of the +containing block (or document) is reached and no closing code fence +has been found, the code block contains all of the lines after the +opening code fence until the end of the containing block (or +document). (An alternative spec would require backtracking in the +event that a closing code fence is not found. But this makes parsing +much less efficient, and there seems to be no real down side to the +behavior described here.) + +A fenced code block may interrupt a paragraph, and does not require +a blank line either before or after. + +The content of a code fence is treated as literal text, not parsed +as inlines. 
The first word of the info string is typically used to +specify the language of the code sample, and rendered in the `class` +attribute of the `code` tag. However, this spec does not mandate any +particular treatment of the info string. + +Here is a simple example with backticks: + +. +``` +< + > +``` +. +
<
+ >
+
+. + +With tildes: + +. +~~~ +< + > +~~~ +. +
<
+ >
+
+. + +The closing code fence must use the same character as the opening +fence: + +. +``` +aaa +~~~ +``` +. +
aaa
+~~~
+
+. + +. +~~~ +aaa +``` +~~~ +. +
aaa
+```
+
+. + +The closing code fence must be at least as long as the opening fence: + +. +```` +aaa +``` +`````` +. +
aaa
+```
+
+. + +. +~~~~ +aaa +~~~ +~~~~ +. +
aaa
+~~~
+
+. + +Unclosed code blocks are closed by the end of the document: + +. +``` +. +
+. + +. +````` + +``` +aaa +. +

+```
+aaa
+
+. + +A code block can have all empty lines as its content: + +. +``` + + +``` +. +

+  
+
+. + +A code block can be empty: + +. +``` +``` +. +
+. + +Fences can be indented. If the opening fence is indented, +content lines will have equivalent opening indentation removed, +if present: + +. + ``` + aaa +aaa +``` +. +
aaa
+aaa
+
+. + +. + ``` +aaa + aaa +aaa + ``` +. +
aaa
+aaa
+aaa
+
+. + +. + ``` + aaa + aaa + aaa + ``` +. +
aaa
+ aaa
+aaa
+
+. + +Four spaces indentation produces an indented code block: + +. + ``` + aaa + ``` +. +
```
+aaa
+```
+
+. + +Code fences (opening and closing) cannot contain internal spaces: + +. +``` ``` +aaa +. +

+aaa

+. + +. +~~~~~~ +aaa +~~~ ~~ +. +
aaa
+~~~ ~~
+
+. + +Fenced code blocks can interrupt paragraphs, and can be followed +directly by paragraphs, without a blank line between: + +. +foo +``` +bar +``` +baz +. +

foo

+
bar
+
+

baz

+. + +Other blocks can also occur before and after fenced code blocks +without an intervening blank line: + +. +foo +--- +~~~ +bar +~~~ +# baz +. +

foo

+
bar
+
+

baz

+. + +An [info string](#info-string) can be provided after the opening code fence. +Opening and closing spaces will be stripped, and the first word, prefixed +with `language-`, is used as the value for the `class` attribute of the +`code` element within the enclosing `pre` element. + +. +```ruby +def foo(x) + return 3 +end +``` +. +
def foo(x)
+  return 3
+end
+
+. + +. +~~~~ ruby startline=3 $%@#$ +def foo(x) + return 3 +end +~~~~~~~ +. +
def foo(x)
+  return 3
+end
+
+. + +. +````; +```` +. +
+. + +Info strings for backtick code blocks cannot contain backticks: + +. +``` aa ``` +foo +. +

aa +foo

+. + +Closing code fences cannot have info strings: + +. +``` +``` aaa +``` +. +
``` aaa
+
+. + + +## HTML blocks + +An [HTML block tag](#html-block-tag) is +an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag +name is one of the following (case-insensitive): +`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, +`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, +`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, +`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, +`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, +`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, +`video`, `script`, `style`. + +An [HTML block](#html-block) begins with an +[HTML block tag](#html-block-tag), [HTML comment](#html-comment), +[processing instruction](#processing-instruction), +[declaration](#declaration), or [CDATA section](#cdata-section). +It ends when a [blank line](#blank-line) or the end of the +input is encountered. The initial line may be indented up to three +spaces, and subsequent lines may have any indentation. The contents +of the HTML block are interpreted as raw HTML, and will not be escaped +in HTML output. + +Some simple examples: + +. + + + + +
+ hi +
+ +okay. +. + + + + +
+ hi +
+

okay.

+. + +. +
+ *hello* + +. +
+ *hello* + +. + +Here we have two HTML blocks with a Markdown paragraph between them: + +. +
+ +*Markdown* + +
+. +
+

Markdown

+
+. + +In the following example, what looks like a Markdown code block +is actually part of the HTML block, which continues until a blank +line or the end of the document is reached: + +. +
+``` c +int x = 33; +``` +. +
+``` c +int x = 33; +``` +. + +A comment: + +. + +. + +. + +A processing instruction: + +. + +. + +. + +CDATA: + +. + +. + +. + +The opening tag can be indented 1-3 spaces, but not 4: + +. + + + +. + +
<!-- foo -->
+
+. + +An HTML block can interrupt a paragraph, and need not be preceded +by a blank line. + +. +Foo +
+bar +
+. +

Foo

+
+bar +
+. + +However, a following blank line is always needed, except at the end of +a document: + +. +
+bar +
+*foo* +. +
+bar +
+*foo* +. + +An incomplete HTML block tag may also start an HTML block: + +. +
The only restrictions are that block-level HTML elements — +> e.g. `
`, ``, `
`, `

`, etc. — must be separated from +> surrounding content by blank lines, and the start and end tags of the +> block should not be indented with tabs or spaces. + +In some ways Gruber's rule is more restrictive than the one given +here: + +- It requires that an HTML block be preceded by a blank line. +- It does not allow the start tag to be indented. +- It requires a matching end tag, which it also does not allow to + be indented. + +Indeed, most Markdown implementations, including some of Gruber's +own perl implementations, do not impose these restrictions. + +There is one respect, however, in which Gruber's rule is more liberal +than the one given here, since it allows blank lines to occur inside +an HTML block. There are two reasons for disallowing them here. +First, it removes the need to parse balanced tags, which is +expensive and can require backtracking from the end of the document +if no matching end tag is found. Second, it provides a very simple +and flexible way of including Markdown content inside HTML tags: +simply separate the Markdown from the HTML using blank lines: + +. +

+ +*Emphasized* text. + +
+. +
+

Emphasized text.

+
+. + +Compare: + +. +
+*Emphasized* text. +
+. +
+*Emphasized* text. +
+. + +Some Markdown implementations have adopted a convention of +interpreting content inside tags as text if the open tag has +the attribute `markdown=1`. The rule given above seems a simpler and +more elegant way of achieving the same expressive power, which is also +much simpler to parse. + +The main potential drawback is that one can no longer paste HTML +blocks into Markdown documents with 100% reliability. However, +*in most cases* this will work fine, because the blank lines in +HTML are usually followed by HTML block tags. For example: + +. +
+ + + + + + + +
+Hi +
+. + + + + +
+Hi +
+. + +Moreover, blank lines are usually not necessary and can be +deleted. The exception is inside `
` tags; here, one can
+replace the blank lines with `
` entities.
+
+So there is no important loss of expressive power with the new rule.
+
+## Link reference definitions
+
+A [link reference definition](#link-reference-definition)
+ consists of a [link
+label](#link-label), indented up to three spaces, followed
+by a colon (`:`), optional blank space (including up to one
+newline), a [link destination](#link-destination), optional
+blank space (including up to one newline), and an optional [link
+title](#link-title), which if it is present must be separated
+from the [link destination](#link-destination) by whitespace.
+No further non-space characters may occur on the line.
+
+A [link reference definition](#link-reference-definition)
+does not correspond to a structural element of a document.  Instead, it
+defines a label which can be used in [reference links](#reference-link)
+and reference-style [images](#image) elsewhere in the document.  [Link
+reference definitions] can come either before or after the links that use
+them.
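
The shape of a link reference definition described above can be approximated with a regular expression. The following is a rough sketch under stated simplifications, not the spec's algorithm: the names `LINK_REF_DEF` and `parse_link_ref_def` are hypothetical, labels are assumed to contain no unescaped closing bracket, destinations are either `<...>` or a run of non-space characters, and titles are not checked for embedded blank lines.

```python
import re

# Hypothetical, simplified pattern for illustration only.
LINK_REF_DEF = re.compile(
    r"""
    [ ]{0,3}                           # indented up to three spaces
    \[(?P<label>(?:\\.|[^\]\\])+)\]    # link label (backslash escapes allowed)
    :                                  # colon
    [ \t]*(?:\n[ \t]*)?                # optional blank space, up to one newline
    (?P<dest><[^<>\n]*>|[^\s<>]\S*)    # link destination
    (?:
        (?:[ \t]+|[ \t]*\n[ \t]*)      # whitespace separating the title
        (?P<title>"[^"]*"|'[^']*'|\([^()]*\))
    )?
    [ \t]*$                            # no further non-space characters
    """,
    re.VERBOSE | re.MULTILINE,
)


def parse_link_ref_def(text):
    """Return (label, destination, title) or None if text does not start
    with something shaped like a link reference definition."""
    m = LINK_REF_DEF.match(text)
    if m is None:
        return None
    dest = m.group("dest")
    if dest.startswith("<"):
        dest = dest[1:-1]  # drop the enclosing angle brackets
    title = m.group("title")
    return m.group("label"), dest, title[1:-1] if title else None
```

A real parser would also have to confirm that the candidate does not interrupt a paragraph and does not sit inside a code block, which a regular expression applied to raw text cannot decide on its own.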
+
+.
+[foo]: /url "title"
+
+[foo]
+.
+

foo

+. + +. + [foo]: + /url + 'the title' + +[foo] +. +

foo

+. + +. +[Foo*bar\]]:my_(url) 'title (with parens)' + +[Foo*bar\]] +. +

Foo*bar]

+. + +. +[Foo bar]: + +'title' + +[Foo bar] +. +

Foo bar

+. + +The title may be omitted: + +. +[foo]: +/url + +[foo] +. +

foo

+. + +The link destination may not be omitted: + +. +[foo]: + +[foo] +. +

[foo]:

+

[foo]

+. + +A link can come before its corresponding definition: + +. +[foo] + +[foo]: url +. +

foo

+. + +If there are several matching definitions, the first one takes +precedence: + +. +[foo] + +[foo]: first +[foo]: second +. +

foo

+. + +As noted in the section on [Links], matching of labels is +case-insensitive (see [matches](#matches)). + +. +[FOO]: /url + +[Foo] +. +

Foo

+. + +. +[ΑΓΩ]: /φου + +[αγω] +. +

αγω

+. + +Here is a link reference definition with no corresponding link. +It contributes nothing to the document. + +. +[foo]: /url +. +. + +This is not a link reference definition, because there are +non-space characters after the title: + +. +[foo]: /url "title" ok +. +

[foo]: /url "title" ok

+. + +This is not a link reference definition, because it is indented +four spaces: + +. + [foo]: /url "title" + +[foo] +. +
[foo]: /url "title"
+
+

[foo]

+. + +This is not a link reference definition, because it occurs inside +a code block: + +. +``` +[foo]: /url +``` + +[foo] +. +
[foo]: /url
+
+

[foo]

+. + +A [link reference definition](#link-reference-definition) cannot +interrupt a paragraph. + +. +Foo +[bar]: /baz + +[bar] +. +

Foo +[bar]: /baz

+

[bar]

+. + +However, it can directly follow other block elements, such as headers +and horizontal rules, and it need not be followed by a blank line. + +. +# [Foo] +[foo]: /url +> bar +. +

Foo

+
+

bar

+
+. + +Several [link reference definitions](#link-reference-definition) can occur one after another, +without intervening blank lines. + +. +[foo]: /foo-url "foo" +[bar]: /bar-url + "bar" +[baz]: /baz-url + +[foo], +[bar], +[baz] +. +

foo, +bar, +baz

+. + +[Link reference definitions](#link-reference-definition) can occur +inside block containers, like lists and block quotations. They +affect the entire document, not just the container in which they +are defined: + +. +[foo] + +> [foo]: /url +. +

foo

+
+
+. + + +## Paragraphs + +A sequence of non-blank lines that cannot be interpreted as other +kinds of blocks forms a [paragraph](#paragraph). +The contents of the paragraph are the result of parsing the +paragraph's raw content as inlines. The paragraph's raw content +is formed by concatenating the lines and removing initial and final +spaces. + +A simple example with two paragraphs: + +. +aaa + +bbb +. +

aaa

+

bbb

+. + +Paragraphs can contain multiple lines, but no blank lines: + +. +aaa +bbb + +ccc +ddd +. +

aaa +bbb

+

ccc +ddd

+. + +Multiple blank lines between paragraphs have no effect: + +. +aaa + + +bbb +. +

aaa

+

bbb

+. + +Leading spaces are skipped: + +. + aaa + bbb +. +

aaa +bbb

+. + +Lines after the first may be indented any amount, since indented +code blocks cannot interrupt paragraphs. + +. +aaa + bbb + ccc +. +

aaa +bbb +ccc

+. + +However, the first line may be indented at most three spaces, +or an indented code block will be triggered: + +. + aaa +bbb +. +

aaa +bbb

+. + +. + aaa +bbb +. +
aaa
+
+

bbb

+. + +Final spaces are stripped before inline parsing, so a paragraph +that ends with two or more spaces will not end with a hard line +break: + +. +aaa +bbb +. +

aaa
+bbb

+. + +## Blank lines + +[Blank lines](#blank-line) between block-level elements are ignored, +except for the role they play in determining whether a [list](#list) +is [tight](#tight) or [loose](#loose). + +Blank lines at the beginning and end of the document are also ignored. + +. + + +aaa + + +# aaa + + +. +

aaa

+

aaa

+. + + +# Container blocks + +A [container block](#container-block) is a block that has other +blocks as its contents. There are two basic kinds of container blocks: +[block quotes](#block-quote) and [list items](#list-item). +[Lists](#list) are meta-containers for [list items](#list-item). + +We define the syntax for container blocks recursively. The general +form of the definition is: + +> If X is a sequence of blocks, then the result of +> transforming X in such-and-such a way is a container of type Y +> with these blocks as its content. + +So, we explain what counts as a block quote or list item by explaining +how these can be *generated* from their contents. This should suffice +to define the syntax, although it does not give a recipe for *parsing* +these constructions. (A recipe is provided below in the section entitled +[A parsing strategy](#appendix-a-a-parsing-strategy).) + +## Block quotes + +A [block quote marker](#block-quote-marker) +consists of 0-3 spaces of initial indent, plus (a) the character `>` together +with a following space, or (b) a single character `>` not followed by a space. + +The following rules define [block quotes](#block-quote): + + +1. **Basic case.** If a string of lines *Ls* constitute a sequence + of blocks *Bs*, then the result of appending a [block quote + marker](#block-quote-marker) to the beginning of each line in *Ls* + is a [block quote](#block-quote) containing *Bs*. + +2. **Laziness.** If a string of lines *Ls* constitute a [block + quote](#block-quote) with contents *Bs*, then the result of deleting + the initial [block quote marker](#block-quote-marker) from one or + more lines in which the next non-space character after the [block + quote marker](#block-quote-marker) is [paragraph continuation + text](#paragraph-continuation-text) is a block quote with *Bs* as + its content. 
+ [Paragraph continuation text](#paragraph-continuation-text) is text + that will be parsed as part of the content of a paragraph, but does + not occur at the beginning of the paragraph. + +3. **Consecutiveness.** A document cannot contain two [block + quotes](#block-quote) in a row unless there is a [blank + line](#blank-line) between them. + +Nothing else counts as a [block quote](#block-quote). + +Here is a simple example: + +. +> # Foo +> bar +> baz +. +
+

Foo

+

bar +baz

+
+. + +The spaces after the `>` characters can be omitted: + +. +># Foo +>bar +> baz +. +
+

Foo

+

bar +baz

+
+. + +The `>` characters can be indented 1-3 spaces: + +. + > # Foo + > bar + > baz +. +
+

Foo

+

bar +baz

+
+. + +Four spaces gives us a code block: + +. + > # Foo + > bar + > baz +. +
> # Foo
+> bar
+> baz
+
+. + +The Laziness clause allows us to omit the `>` before a +paragraph continuation line: + +. +> # Foo +> bar +baz +. +
+

Foo

+

bar +baz

+
+. + +A block quote can contain some lazy and some non-lazy +continuation lines: + +. +> bar +baz +> foo +. +
+

bar +baz +foo

+
+. + +Laziness only applies to lines that are continuations of +paragraphs. Lines containing characters or indentation that indicate +block structure cannot be lazy. + +. +> foo +--- +. +
+

foo

+
+
+. + +. +> - foo +- bar +. +
+
    +
  • foo
  • +
+
+
    +
  • bar
  • +
+. + +. +> foo + bar +. +
+
foo
+
+
+
bar
+
+. + +. +> ``` +foo +``` +. +
+
+
+

foo

+
+. + +A block quote can be empty: + +. +> +. +
+
+. + +. +> +> +> +. +
+
+. + +A block quote can have initial or final blank lines: + +. +> +> foo +> +. +
+

foo

+
+. + +A blank line always separates block quotes: + +. +> foo + +> bar +. +
+

foo

+
+
+

bar

+
+. + +(Most current Markdown implementations, including John Gruber's +original `Markdown.pl`, will parse this example as a single block quote +with two paragraphs. But it seems better to allow the author to decide +whether two block quotes or one are wanted.) + +Consecutiveness means that if we put these block quotes together, +we get a single block quote: + +. +> foo +> bar +. +
+

foo +bar

+
+. + +To get a block quote with two paragraphs, use: + +. +> foo +> +> bar +. +
+

foo

+

bar

+
+. + +Block quotes can interrupt paragraphs: + +. +foo +> bar +. +

foo

+
+

bar

+
+. + +In general, blank lines are not needed before or after block +quotes: + +. +> aaa +*** +> bbb +. +
+

aaa

+
+
+
+

bbb

+
+. + +However, because of laziness, a blank line is needed between +a block quote and a following paragraph: + +. +> bar +baz +. +
+

bar +baz

+
+. + +. +> bar + +baz +. +
+

bar

+
+

baz

+. + +. +> bar +> +baz +. +
+

bar

+
+

baz

+. + +It is a consequence of the Laziness rule that any number +of initial `>`s may be omitted on a continuation line of a +nested block quote: + +. +> > > foo +bar +. +
+
+
+

foo +bar

+
+
+
+. + +. +>>> foo +> bar +>>baz +. +
+
+
+

foo +bar +baz

+
+
+
+. + +When including an indented code block in a block quote, +remember that the [block quote marker](#block-quote-marker) includes +both the `>` and a following space. So *five spaces* are needed after +the `>`: + +. +> code + +> not code +. +
+
code
+
+
+
+

not code

+
+. + + +## List items + +A [list marker](#list-marker) is a +[bullet list marker](#bullet-list-marker) or an [ordered list +marker](#ordered-list-marker). + +A [bullet list marker](#bullet-list-marker) +is a `-`, `+`, or `*` character. + +An [ordered list marker](#ordered-list-marker) +is a sequence of one or more digits (`0-9`), followed by either a +`.` character or a `)` character. + +The following rules define [list items](#list-item): + +1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of + blocks *Bs* starting with a non-space character and not separated + from each other by more than one blank line, and *M* is a list + marker of width *W* followed by 0 < *N* < 5 spaces, then the result + of prepending *M* and the following spaces to the first line of + *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a + list item with *Bs* as its contents. The type of the list item + (bullet or ordered) is determined by the type of its list marker. + If the list item is ordered, then it is also assigned a start + number, based on the ordered list marker. + +For example, let *Ls* be the lines + +. +A paragraph +with two lines. + + indented code + +> A block quote. +. +

A paragraph +with two lines.

+
indented code
+
+
+

A block quote.

+
+. + +And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says +that the following is an ordered list item with start number 1, +and the same contents as *Ls*: + +. +1. A paragraph + with two lines. + + indented code + + > A block quote. +. +
    +
  1. A paragraph +with two lines.

    +
    indented code
    +
    +
    +

    A block quote.

    +
  2. +
+. + +The most important thing to notice is that the position of +the text after the list marker determines how much indentation +is needed in subsequent blocks in the list item. If the list +marker takes up two spaces, and there are three spaces between +the list marker and the next nonspace character, then blocks +must be indented five spaces in order to fall under the list +item. + +Here are some examples showing how far content must be indented to be +put under the list item: + +. +- one + + two +. +
    +
  • one
  • +
+

two

+. + +. +- one + + two +. +
    +
  • one

    +

    two

  • +
+. + +. + - one + + two +. +
    +
  • one
  • +
+
 two
+
+. + +. + - one + + two +. +
    +
  • one

    +

    two

  • +
+. + +It is tempting to think of this in terms of columns: the continuation +blocks must be indented at least to the column of the first nonspace +character after the list marker. However, that is not quite right. +The spaces after the list marker determine how much relative indentation +is needed. Which column this indentation reaches will depend on +how the list item is embedded in other constructions, as shown by +this example: + +. + > > 1. one +>> +>> two +. +
+
+
    +
  1. one

    +

    two

  2. +
+
+
+. + +Here `two` occurs in the same column as the list marker `1.`, +but is actually contained in the list item, because there is +sufficient indentation after the last containing blockquote marker. + +The converse is also possible. In the following example, the word `two` +occurs far to the right of the initial text of the list item, `one`, but +it is not considered part of the list item, because it is not indented +far enough past the blockquote marker: + +. +>>- one +>> + > > two +. +
+
+
    +
  • one
  • +
+

two

+
+
+. + +A list item may not contain blocks that are separated by more than +one blank line. Thus, two blank lines will end a list, unless the +two blanks are contained in a [fenced code block](#fenced-code-block). + +. +- foo + + bar + +- foo + + + bar + +- ``` + foo + + + bar + ``` +. +
    +
  • foo

    +

    bar

  • +
  • foo

  • +
+

bar

+
    +
  • foo
    +
    +
    +bar
    +
  • +
+. + +A list item may contain any kind of block: + +. +1. foo + + ``` + bar + ``` + + baz + + > bam +. +
    +
  1. foo

    +
    bar
    +
    +

    baz

    +
    +

    bam

    +
  2. +
+. + +2. **Item starting with indented code.** If a sequence of lines *Ls* + constitute a sequence of blocks *Bs* starting with an indented code + block and not separated from each other by more than one blank line, + and *M* is a list marker of width *W* followed by + one space, then the result of prepending *M* and the following + space to the first line of *Ls*, and indenting subsequent lines of + *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. + If a line is empty, then it need not be indented. The type of the + list item (bullet or ordered) is determined by the type of its list + marker. If the list item is ordered, then it is also assigned a + start number, based on the ordered list marker. + +An indented code block will have to be indented four spaces beyond +the edge of the region where text will be included in the list item. +In the following case that is 6 spaces: + +. +- foo + + bar +. +
    +
  • foo

    +
    bar
    +
  • +
+. + +And in this case it is 11 spaces: + +. + 10. foo + + bar +. +
    +
  1. foo

    +
    bar
    +
  2. +
+. + +If the *first* block in the list item is an indented code block, +then by rule #2, the contents must be indented *one* space after the +list marker: + +. + indented code + +paragraph + + more code +. +
indented code
+
+

paragraph

+
more code
+
+. + +. +1. indented code + + paragraph + + more code +. +
    +
  1. indented code
    +
    +

    paragraph

    +
    more code
    +
  2. +
+. + +Note that an additional space indent is interpreted as space +inside the code block: + +. +1. indented code + + paragraph + + more code +. +
    +
  1.  indented code
    +
    +

    paragraph

    +
    more code
    +
  2. +
+. + +Note that rules #1 and #2 only apply to two cases: (a) cases +in which the lines to be included in a list item begin with a nonspace +character, and (b) cases in which they begin with an indented code +block. In a case like the following, where the first block begins with +a three-space indent, the rules do not allow us to form a list item by +indenting the whole thing and prepending a list marker: + +. + foo + +bar +. +

foo

+

bar

+. + +. +- foo + + bar +. +
    +
  • foo
  • +
+

bar

+. + +This is not a significant restriction, because when a block begins +with 1-3 spaces indent, the indentation can always be removed without +a change in interpretation, allowing rule #1 to be applied. So, in +the above case: + +. +- foo + + bar +. +
    +
  • foo

    +

    bar

  • +
+. + + +3. **Indentation.** If a sequence of lines *Ls* constitutes a list item + according to rule #1 or #2, then the result of indenting each line + of *Ls* by 1-3 spaces (the same for each line) also constitutes a + list item with the same contents and attributes. If a line is + empty, then it need not be indented. + +Indented one space: + +. + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +
    +
  1. A paragraph +with two lines.

    +
    indented code
    +
    +
    +

    A block quote.

    +
  2. +
+. + +Indented two spaces: + +. + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +
    +
  1. A paragraph +with two lines.

    +
    indented code
    +
    +
    +

    A block quote.

    +
  2. +
+. + +Indented three spaces: + +. + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +
    +
  1. A paragraph +with two lines.

    +
    indented code
    +
    +
    +

    A block quote.

    +
  2. +
+. + +Four spaces indent gives a code block: + +. + 1. A paragraph + with two lines. + + indented code + + > A block quote. +. +
1.  A paragraph
+    with two lines.
+
+        indented code
+
+    > A block quote.
+
+. + + +4. **Laziness.** If a string of lines *Ls* constitute a [list + item](#list-item) with contents *Bs*, then the result of deleting + some or all of the indentation from one or more lines in which the + next non-space character after the indentation is + [paragraph continuation text](#paragraph-continuation-text) is a + list item with the same contents and attributes. + +Here is an example with lazy continuation lines: + +. + 1. A paragraph +with two lines. + + indented code + + > A block quote. +. +
    +
  1. A paragraph +with two lines.

    +
    indented code
    +
    +
    +

    A block quote.

    +
  2. +
+. + +Indentation can be partially deleted: + +. + 1. A paragraph + with two lines. +. +
    +
  1. A paragraph +with two lines.
  2. +
+. + +These examples show how laziness can work in nested structures: + +. +> 1. > Blockquote +continued here. +. +
+
    +
  1. +

    Blockquote +continued here.

    +
  2. +
+
+. + +. +> 1. > Blockquote +> continued here. +. +
+
    +
  1. +

    Blockquote +continued here.

    +
  2. +
+
+. + + +5. **That's all.** Nothing that is not counted as a list item by rules + #1--4 counts as a [list item](#list-item). + +The rules for sublists follow from the general rules above. A sublist +must be indented the same number of spaces a paragraph would need to be +in order to be included in the list item. + +So, in this case we need two spaces indent: + +. +- foo + - bar + - baz +. +
    +
  • foo +
      +
    • bar +
        +
      • baz
      • +
    • +
  • +
+. + +One is not enough: + +. +- foo + - bar + - baz +. +
    +
  • foo
  • +
  • bar
  • +
  • baz
  • +
+. + +Here we need four, because the list marker is wider: + +. +10) foo + - bar +. +
    +
  1. foo +
      +
    • bar
    • +
  2. +
+. + +Three is not enough: + +. +10) foo + - bar +. +
    +
  1. foo
  2. +
+
    +
  • bar
  • +
+. + +A list may be the first block in a list item: + +. +- - foo +. +
    +
    • +
    • foo
    • +
  • +
+. + +. +1. - 2. foo +. +
    +
    • +
      1. +
      2. foo
      3. +
    • +
  1. +
+. + +A list item may be empty: + +. +- foo +- +- bar +. +
    +
  • foo
  • +
  • +
  • bar
  • +
+. + +. +- +. +
    +
  • +
+. + +### Motivation + +John Gruber's Markdown spec says the following about list items: + +1. "List markers typically start at the left margin, but may be indented + by up to three spaces. List markers must be followed by one or more + spaces or a tab." + +2. "To make lists look nice, you can wrap items with hanging indents.... + But if you don't want to, you don't have to." + +3. "List items may consist of multiple paragraphs. Each subsequent + paragraph in a list item must be indented by either 4 spaces or one + tab." + +4. "It looks nice if you indent every line of the subsequent paragraphs, + but here again, Markdown will allow you to be lazy." + +5. "To put a blockquote within a list item, the blockquote's `>` + delimiters need to be indented." + +6. "To put a code block within a list item, the code block needs to be + indented twice — 8 spaces or two tabs." + +These rules specify that a paragraph under a list item must be indented +four spaces (presumably, from the left margin, rather than the start of +the list marker, but this is not said), and that code under a list item +must be indented eight spaces instead of the usual four. They also say +that a block quote must be indented, but not by how much; however, the +example given has four spaces indentation. Although nothing is said +about other kinds of block-level content, it is certainly reasonable to +infer that *all* block elements under a list item, including other +lists, must be indented four spaces. This principle has been called the +*four-space rule*. + +The four-space rule is clear and principled, and if the reference +implementation `Markdown.pl` had followed it, it probably would have +become the standard. However, `Markdown.pl` allowed paragraphs and +sublists to start with only two spaces indentation, at least on the +outer level. Worse, its behavior was inconsistent: a sublist of an +outer-level list needed two spaces indentation, but a sublist of this +sublist needed three spaces. 
It is not surprising, then, that different +implementations of Markdown have developed very different rules for +determining what comes under a list item. (Pandoc and python-Markdown, +for example, stuck with Gruber's syntax description and the four-space +rule, while discount, redcarpet, marked, PHP Markdown, and others +followed `Markdown.pl`'s behavior more closely.) + +Unfortunately, given the divergences between implementations, there +is no way to give a spec for list items that will be guaranteed not +to break any existing documents. However, the spec given here should +correctly handle lists formatted with either the four-space rule or +the more forgiving `Markdown.pl` behavior, provided they are laid out +in a way that is natural for a human to read. + +The strategy here is to let the width and indentation of the list marker +determine the indentation necessary for blocks to fall under the list +item, rather than having a fixed and arbitrary number. The writer can +think of the body of the list item as a unit which gets indented to the +right enough to fit the list marker (and any indentation on the list +marker). (The laziness rule, #4, then allows continuation lines to be +unindented if needed.) + +This rule is superior, we claim, to any rule requiring a fixed level of +indentation from the margin. The four-space rule is clear but +unnatural. It is quite unintuitive that + +``` markdown +- foo + + bar + + - baz +``` + +should be parsed as two lists with an intervening paragraph, + +``` html +
    +
  • foo
  • +
+

bar

+
    +
  • baz
  • +
+``` + +as the four-space rule demands, rather than a single list, + +``` html +
    +
  • foo

    +

    bar

    +
      +
    • baz
    • +
  • +
+``` + +The choice of four spaces is arbitrary. It can be learned, but it is +not likely to be guessed, and it trips up beginners regularly. + +Would it help to adopt a two-space rule? The problem is that such +a rule, together with the rule allowing 1--3 spaces indentation of the +initial list marker, allows text that is indented *less than* the +original list marker to be included in the list item. For example, +`Markdown.pl` parses + +``` markdown + - one + + two +``` + +as a single list item, with `two` a continuation paragraph: + +``` html +
    +
  • one

    +

    two

  • +
+``` + +and similarly + +``` markdown +> - one +> +> two +``` + +as + +``` html +
+
    +
  • one

    +

    two

  • +
+
+``` + +This is extremely unintuitive. + +Rather than requiring a fixed indent from the margin, we could require +a fixed indent (say, two spaces, or even one space) from the list marker (which +may itself be indented). This proposal would remove the last anomaly +discussed. Unlike the spec presented above, it would count the following +as a list item with a subparagraph, even though the paragraph `bar` +is not indented as far as the first paragraph `foo`: + +``` markdown + 10. foo + + bar +``` + +Arguably this text does read like a list item with `bar` as a subparagraph, +which may count in favor of the proposal. However, on this proposal indented +code would have to be indented six spaces after the list marker. And this +would break a lot of existing Markdown, which has the pattern: + +``` markdown +1. foo + + indented code +``` + +where the code is indented eight spaces. The spec above, by contrast, will +parse this text as expected, since the code block's indentation is measured +from the beginning of `foo`. + +The one case that needs special treatment is a list item that *starts* +with indented code. How much indentation is required in that case, since +we don't have a "first paragraph" to measure from? Rule #2 simply stipulates +that in such cases, we require one space indentation from the list marker +(and then the normal four spaces for the indented code). This will match the +four-space rule in cases where the list marker plus its initial indentation +takes four spaces (a common case), but diverge in other cases. + +## Lists + +A [list](#list) is a sequence of one or more +list items [of the same type](#of-the-same-type). The list items +may be separated by single [blank lines](#blank-line), but two +blank lines end all containing lists. + +Two list items are [of the same type](#of-the-same-type) + if they begin with a [list +marker](#list-marker) of the same type. 
Two list markers are of the +same type if (a) they are bullet list markers using the same character +(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same +delimiter (either `.` or `)`). + +A list is an [ordered list](#ordered-list) +if its constituent list items begin with +[ordered list markers](#ordered-list-marker), and a [bullet +list](#bullet-list) if its constituent list +items begin with [bullet list markers](#bullet-list-marker). + +The [start number](#start-number) +of an [ordered list](#ordered-list) is determined by the list number of +its initial list item. The numbers of subsequent list items are +disregarded. + +A list is [loose](#loose) if any of its constituent list items are +separated by blank lines, or if any of its constituent list items +directly contain two block-level elements with a blank line between +them. Otherwise a list is [tight](#tight). (The difference in HTML output +is that paragraphs in a loose list are wrapped in `<p>` tags, while +paragraphs in a tight list are not.) + +Changing the bullet or ordered list delimiter starts a new list: + +. +- foo +- bar ++ baz +. +

    +
  • foo
  • +
  • bar
  • +
+
    +
  • baz
  • +
+. + +. +1. foo +2. bar +3) baz +. +
    +
  1. foo
  2. +
  3. bar
  4. +
+
    +
  1. baz
  2. +
+. + +There can be blank lines between items, but two blank lines end +a list: + +. +- foo + +- bar + + +- baz +. +
    +
  • foo

  • +
  • bar

  • +
+
    +
  • baz
  • +
+. + +As illustrated above in the section on [list items](#list-item), +two blank lines between blocks *within* a list item will also end a +list: + +. +- foo + + + bar +- baz +. +
    +
  • foo
  • +
+

bar

+
    +
  • baz
  • +
+. + +Indeed, two blank lines will end *all* containing lists: + +. +- foo + - bar + - baz + + + bim +. +
    +
  • foo +
      +
    • bar +
        +
      • baz
      • +
    • +
  • +
+
  bim
+
+. + +Thus, two blank lines can be used to separate consecutive lists of +the same type, or to separate a list from an indented code block +that would otherwise be parsed as a subparagraph of the final list +item: + +. +- foo +- bar + + +- baz +- bim +. +
    +
  • foo
  • +
  • bar
  • +
+
    +
  • baz
  • +
  • bim
  • +
+. + +. +- foo + + notcode + +- foo + + + code +. +
    +
  • foo

    +

    notcode

  • +
  • foo

  • +
+
code
+
+. + +List items need not be indented to the same level. The following +list items will be treated as items at the same list level, +since none is indented enough to belong to the previous list +item: + +. +- a + - b + - c + - d + - e + - f +- g +. +
    +
  • a
  • +
  • b
  • +
  • c
  • +
  • d
  • +
  • e
  • +
  • f
  • +
  • g
  • +
+. + +This is a loose list, because there is a blank line between +two of the list items: + +. +- a +- b + +- c +. +
    +
  • a

  • +
  • b

  • +
  • c

  • +
+. + +So is this, with an empty second item: + +. +* a +* + +* c +. +
    +
  • a

  • +
  • +
  • c

  • +
+. + +These are loose lists, even though there is no space between the items, +because one of the items directly contains two block-level elements +with a blank line between them: + +. +- a +- b + + c +- d +. +
    +
  • a

  • +
  • b

    +

    c

  • +
  • d

  • +
+. + +. +- a +- b + + [ref]: /url +- d +. +
    +
  • a

  • +
  • b

  • +
  • d

  • +
+. + +This is a tight list, because the blank lines are in a code block: + +. +- a +- ``` + b + + + ``` +- c +. +
    +
  • a
  • +
  • b
    +
    +
    +
  • +
  • c
  • +
+. + +This is a tight list, because the blank line is between two +paragraphs of a sublist. So the inner list is loose while +the outer list is tight: + +. +- a + - b + + c +- d +. +
    +
  • a +
      +
    • b

      +

      c

    • +
  • +
  • d
  • +
+. + +This is a tight list, because the blank line is inside the +block quote: + +. +* a + > b + > +* c +. +
    +
  • a +
    +

    b

    +
  • +
  • c
  • +
+. + +This list is tight, because the consecutive block elements +are not separated by blank lines: + +. +- a + > b + ``` + c + ``` +- d +. +
    +
  • a +
    +

    b

    +
    +
    c
    +
  • +
  • d
  • +
+. + +A single-paragraph list is tight: + +. +- a +. +
    +
  • a
  • +
+. + +. +- a + - b +. +
    +
  • a +
      +
    • b
    • +
  • +
+. + +Here the outer list is loose, the inner list tight: + +. +* foo + * bar + + baz +. +
    +
  • foo

    +
      +
    • bar
    • +
    +

    baz

  • +
+. + +. +- a + - b + - c + +- d + - e + - f +. +
    +
  • a

    +
      +
    • b
    • +
    • c
    • +
  • +
  • d

    +
      +
    • e
    • +
    • f
    • +
  • +
+. + +# Inlines + +Inlines are parsed sequentially from the beginning of the character +stream to the end (left to right, in left-to-right languages). +Thus, for example, in + +. +`hi`lo` +. +

hilo`

+. + +`hi` is parsed as code, leaving the backtick at the end as a literal +backtick. + +## Backslash escapes + +Any ASCII punctuation character may be backslash-escaped: + +. +\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ +. +

!"#$%&'()*+,-./:;<=>?