The Mk2 Markup Language

This is a draft standard; this notice will disappear once the specification is final.

Mk2 is a human-readable plain-text language for expressing Geneva documents.¹ It is designed with both ergonomics and technical pragmatism in mind.


This formal definition uses the modified BNF syntax of ANSI CL's Notational Conventions.¹ The following axioms are used throughout the definition:

String—A character sequence. The exact grammar depends on the surrounding context. See Escape Rules.

LF—A character sequence denoting a line break. The exact representation is platform dependent.

EOF—The end of input.

SP—A whitespace character. The exact set of characters considered whitespace is platform dependent.

Document and Section

document ::= [ element separator ]* EOF
section ::= "<" title separator [ element separator ]* ">" separator
element ::= section | table | plaintext | media | listing | paragraph
separator ::= double-lf | EOF
double-lf ::= LF [ LF ]+
Table 1. Document and section syntax.
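
The separator rule can be illustrated with a minimal Python sketch. It assumes that LF is the "\n" character (the definition leaves the representation platform dependent), and it deliberately ignores section nesting and escape rules; the function name split_elements is hypothetical and not part of Mk2.

```python
import re

# Assumption: LF is "\n". A double-lf is one LF followed by one or more LFs.
DOUBLE_LF = re.compile(r"\n{2,}")


def split_elements(source):
    """Split flat input into element chunks at double-lf separators.

    Simplification: sections ("<" ... ">") and escaped characters are
    not handled here.
    """
    return [chunk for chunk in DOUBLE_LF.split(source) if chunk.strip()]


chunks = split_elements("A quick brown fox.\n\nSecond paragraph.\n\n\nThird.")
# chunks == ["A quick brown fox.", "Second paragraph.", "Third."]
```

Any run of two or more line breaks counts as a single separator, which is why the three consecutive line breaks before "Third." do not produce an empty element.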

Paragraph and Listing

item ::= "+" rich-text
Table 2. Paragraph and Listing syntax.

Table, Media and Plaintext

table ::= "#table" description "#" LF table-body
table-body ::= row* last-row
row ::= column+ LF
column ::= "|" rich-text
Table 3. Table syntax.
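
As a rough illustration of the row and column rules, the following Python sketch splits one table row into its columns at every unescaped "|". The name split_row is hypothetical, and trimming the whitespace around each cell is an assumption of this sketch rather than something the grammar states.

```python
def split_row(line):
    """Split a table row into column strings at unescaped "|" characters.

    Escape sequences are kept verbatim so that the cells can still be
    parsed as rich text afterwards.
    """
    cells = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Backslash escapes the next character: copy both, never split.
            if cells:
                cells[-1] += line[i:i + 2]
            i += 2
        elif ch == "|":
            cells.append("")  # an unescaped "|" opens a new column
            i += 1
        else:
            if cells:
                cells[-1] += ch
            i += 1
    # Assumption: padding around cells is layout only and can be trimmed.
    return [cell.strip() for cell in cells]


cells = split_row("| Bavaria                | 70,549.44 km² | 12,604,244")
# cells == ["Bavaria", "70,549.44 km²", "12,604,244"]
```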
media ::= "#media" description "#" LF String
Table 4. Media syntax.
plaintext ::= "#code" description "#" LF line+ end
line ::= String LF
end ::= SP* "#"
Table 5. Plaintext syntax.

Rich Text

text-token ::= bold | italic | fixed-width | url | plain
bold ::= "*" String "*"
italic ::= "_" String "_"
fixed-width ::= "{" String "}"
url ::= "[" String "]" [ "(" String ")" ]
Table 6. Rich text syntax.
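
The token rules above can be sketched as a small hand-written scanner in Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the names tokenize and read_until are hypothetical, url tokens are left out for brevity, and an unterminated token is reported as an error, which the definition does not prescribe.

```python
# Opening character -> (token kind, closing character), following Table 6.
MARKUP = {"*": ("bold", "*"), "_": ("italic", "_"), "{": ("fixed-width", "}")}


def read_until(text, start, closer):
    """Read a String up to the unescaped closer; return (content, next index)."""
    out = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            out.append(text[i + 1])  # escaped character is taken literally
            i += 2
        elif ch == closer:
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    # Assumption: this sketch treats a missing closer as an error.
    raise ValueError("unterminated token, expected %r" % closer)


def tokenize(text):
    """Split rich text into (kind, content) pairs; url tokens are not handled."""
    tokens = []
    plain = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            plain.append(text[i + 1])
            i += 2
        elif ch in MARKUP:
            if plain:
                tokens.append(("plain", "".join(plain)))
                plain = []
            kind, closer = MARKUP[ch]
            content, i = read_until(text, i + 1, closer)
            tokens.append((kind, content))
        else:
            plain.append(ch)
            i += 1
    if plain:
        tokens.append(("plain", "".join(plain)))
    return tokens


tokens = tokenize("A *bold* and _italic_ word")
# tokens == [("plain", "A "), ("bold", "bold"), ("plain", " and "),
#            ("italic", "italic"), ("plain", " word")]
```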

Escape Rules

The “\” (backslash) can be used to escape the next character. The grammatical significance of a character following “\” is ignored.

The exact grammar of the String axiom is context dependent. A String may not contain unescaped terminating sequences. The terminating sequences of a String are double-lf and any token that follows the String axiom in a rule. To escape a terminating sequence, its first character must be escaped.

For illustration, consider the grammar in Table 7, which uses the String axiom. In rule, the String axiom is followed by terminator; thus “foo” is a terminating sequence of String in rule. Valid and invalid character sequences for String in rule are shown in Table 8.

rule ::= String terminator
terminator ::= "foo"
Table 7. Exemplary grammar rules to illustrate escape rules for the String axiom.
Valid: quick brown \foo
Invalid: quick brown foo
Table 8. Valid and invalid character sequences for String in rule.
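
The rule for terminating sequences can be made concrete with a short Python sketch. It scans for the first unescaped terminating sequence and splits the input there. The function name read_string is hypothetical, and LF is assumed to be "\n", so double-lf is approximated by "\n\n".

```python
def read_string(text, terminators):
    """Split text at the first unescaped terminating sequence.

    Returns (string, rest). The string keeps its escape sequences; rest
    starts with the terminating sequence, or is empty if none was found.
    """
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if any(text.startswith(t, i) for t in terminators):
            return text[:i], text[i:]
        i += 1
    return text, ""


# Terminating sequences for String in rule: "foo" and double-lf.
terminators = ["foo", "\n\n"]
valid = read_string("quick brown \\foo", terminators)
# valid == ("quick brown \\foo", ""): the escaped "f" hides the terminator
invalid = read_string("quick brown foo", terminators)
# invalid == ("quick brown ", "foo"): the String ends before "foo"
```

Escaping only the first character is enough because the scanner skips that position entirely, so the terminating sequence can no longer match there.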


Document and Section

The Mk2 file in Figure 1 contains a paragraph (A quick brown fox...) and a section titled “On Pangrams” which contains another paragraph (A pangram is...).

A quick brown fox jumps over the lazy dog.

< On Pangrams

 A pangram is a phrase that contains all of the letters of the

Figure 1

Listing and Text Tokens

The listing in Figure 2 contains six items, each being a single text token.

+ Plain text token
+ *Bold text token*
+ _Italic text token_
+ {Fixed-width text token}
+ []
+ [Labeled URL](
Figure 2

Table, Media and Plaintext

The Mk2 file in Figure 3 contains a table, a media object and a plaintext object, each with a description and its respective body.

#table Source: Wikipedia.#
| State                  | Area          | Total Population
| Bavaria                | 70,549.44 km² | 12,604,244
| North Rhine-Westphalia | 34,084.13 km² | 17,571,856

#media Imaginary embedded video.#

#code {SQUARE} function in Common Lisp.#
(defun square (n)
  (expt n 2))
#
Figure 3


Mk2 is designed to avoid the need for escaping control tokens as much as possible. Still, there are some cases where the user has to use the \ (backslash) character to suppress the semantics of a specific token. Below are examples of the most common cases.

Source:   In ECMAScript anonymous functions can be expressed using the {function (...) { ... \}} special form.
Rendered: In ECMAScript anonymous functions can be expressed using the function (...) { ... } special form.
Figure 4. Escaping unintended text token markup.

The Mk2 file in Figure 4 escapes the first } (closing curly bracket) character inside a fixed-width text token in order to avoid terminating the token prematurely. Note that only the closing bracket needs to be escaped, because it is the only terminating token of the String in a fixed-width token.

Source:   On DOS, {\\} (backslash) is used to separate the components of a pathname.
Rendered: On DOS, \ (backslash) is used to separate the components of a pathname.
Figure 5. Including the literal backslash character.

Sometimes the user needs to include the literal backslash character in prose. The \ (backslash) character can be escaped with itself, just like any other character, as Figure 5 shows.
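
The effect of escaping on the resulting text can be sketched in Python as follows. The function name unescape is hypothetical, and keeping a lone trailing backslash unchanged is an assumption of this sketch; the escape rules do not say what should happen in that case.

```python
def unescape(raw):
    """Drop each escaping backslash and keep the character that follows it."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            out.append(raw[i + 1])  # the escaped character, taken literally
            i += 2
        else:
            # Ordinary character, or a lone trailing backslash (assumption).
            out.append(raw[i])
            i += 1
    return "".join(out)


backslash = unescape("\\\\")
# backslash == "\\", a single literal backslash
form = unescape("function (...) { ... \\}")
# form == "function (...) { ... }"
```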