The Mk2 Markup Language
This is a draft standard. This notice will disappear once the specification is final.
Mk2 is a human-readable plain-text language for expressing Geneva documents.¹ It is designed with both ergonomics and technical pragmatism in mind.
Syntax
This formal definition uses the modified BNF syntax of ANSI CL's Notational Conventions.¹ The following axioms are used throughout the definition:
String—A character sequence. The exact grammar depends on the surrounding context. See Escape Rules.
LF—A character sequence denoting a line break. The exact representation is platform dependent.
EOF—The end of input.
SP—A whitespace character. The exact set of characters considered whitespace is platform dependent.
Document and Section
Symbol | Expression |
---|---|
document | [ element separator ] * EOF |
section | "<" title separator [ element separator ] * ">" separator |
title | rich-text |
element | section | table | plaintext | media | listing | paragraph |
separator | double-lf | EOF |
double-lf | LF [ LF ] + |
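As a rough illustration of the document and section rules, the following Python sketch splits input on double-lf and nests everything between a "<" chunk and a ">" chunk into a section. It is not a reference implementation: `parse_blocks` is a hypothetical name, LF is assumed to be "\n", whitespace around the title is assumed to be insignificant, and escape handling and the other element types are left out.

```python
import re

# Assumption: LF is "\n". The grammar leaves the representation platform dependent.
DOUBLE_LF = re.compile(r"\n{2,}")

def parse_blocks(source):
    """Split source on double-lf and nest chunks between "<" and ">" into sections."""
    chunks = [c for c in DOUBLE_LF.split(source.strip("\n")) if c]
    root = {"title": None, "elements": []}
    stack = [root]  # innermost open section is last
    for chunk in chunks:
        if chunk.startswith("<"):
            # section: "<" title separator ...
            section = {"title": chunk[1:].strip(), "elements": []}
            stack[-1]["elements"].append(section)
            stack.append(section)
        elif chunk.strip() == ">":
            # ... ">" separator closes the innermost open section
            if len(stack) == 1:
                raise ValueError("'>' without an open section")
            stack.pop()
        else:
            # Every other chunk is kept as an unparsed element.
            stack[-1]["elements"].append(chunk)
    if len(stack) != 1:
        raise ValueError("unclosed section")
    return root["elements"]

doc = "A quick brown fox...\n\n< On Pangrams\n\nA pangram is...\n\n>\n"
print(parse_blocks(doc))
```

For the hypothetical input above, the result is one paragraph chunk followed by one section whose title is "On Pangrams" and whose only element is the second paragraph.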
Paragraph and Listing
Symbol | Expression |
---|---|
paragraph | text-token+ |
listing | item+ |
item | "+" rich-text |
Table, Media and Plaintext
Symbol | Expression |
---|---|
table | "#table" description "#" LF table-body |
description | rich-text |
table-body | row* last-row |
row | column+ LF |
last-row | column+ |
column | "|" rich-text |
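The table rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `split_unescaped` and `parse_table` are hypothetical names, LF is taken to be "\n", the description is assumed to fit on the header line, and cell contents are returned as raw text (escape sequences intact) rather than parsed as rich-text.

```python
def split_unescaped(text, sep):
    """Split text on the character sep, skipping occurrences escaped with a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair untouched
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts

def parse_table(element):
    """Return (description, rows) for a '#table description # LF table-body' element."""
    header, _, body = element.partition("\n")
    if not header.startswith("#table"):
        raise ValueError("expected '#table'")
    fields = split_unescaped(header[len("#table"):], "#")
    if len(fields) < 2:
        raise ValueError("description is not terminated by '#'")
    description = fields[0].strip()
    rows = []
    for line in body.split("\n"):
        if not line.strip():
            continue
        cells = split_unescaped(line, "|")
        # Each column starts with "|", so anything before the first "|" is dropped.
        rows.append([cell.strip() for cell in cells[1:]])
    return description, rows

table = "#table Example table.#\n| Name | Value\n| pipe | \\|"
print(parse_table(table))
```

In the hypothetical input, the last cell contains an escaped "|", so it stays inside the cell instead of starting a new column.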
Symbol | Expression |
---|---|
media | "#media" description "#" LF String |
description | rich-text |
Symbol | Expression |
---|---|
plaintext | "#code" description "#" LF line+ end |
description | rich-text |
line | String LF |
end | SP* "#" |
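The plaintext rule can be read as "a header line, then raw lines up to a closing '#'". The Python sketch below follows that reading. `parse_plaintext` is a hypothetical name, LF is assumed to be "\n", the end marker is assumed to be a line holding only optional whitespace and "#", and escape sequences in the header are not handled.

```python
def parse_plaintext(element):
    """Return (description, lines) for a '#code description # LF line+ end' element."""
    header, _, rest = element.partition("\n")
    if not header.startswith("#code") or not header.endswith("#"):
        raise ValueError("expected '#code description #'")
    description = header[len("#code"):-1].strip()
    lines = []
    for line in rest.split("\n"):
        # end: SP* "#" (assumed to occupy a line of its own)
        if line.strip() == "#":
            return description, lines
        lines.append(line)  # body lines are kept verbatim, including indentation
    raise ValueError("missing closing '#'")

source = "#code A minimal C program.#\nint main(void) {\n    return 0;\n}\n#"
print(parse_plaintext(source))
```

Keeping the body lines verbatim reflects that plaintext content is not interpreted as rich-text by the grammar.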
Rich Text
Symbol | Expression |
---|---|
rich-text | text-token* |
text-token | bold | italic | fixed-width | url | plain |
bold | "*" String "*" |
italic | "_" String "_" |
fixed-width | "{" String "}" |
url | "[" String "]" [ "(" String ")" ] |
plain | String |
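One way to read the rich-text rules is as a small tokenizer. The Python sketch below is an illustration only: `scan_until` and `tokenize` are hypothetical names, a plain token is assumed to end where another token's opening delimiter begins, and the two Strings of a url token are returned without assigning them a meaning, since the grammar does not name them.

```python
# Opening delimiter -> (token kind, closing delimiter)
DELIMITERS = {"*": ("bold", "*"), "_": ("italic", "_"), "{": ("fixed-width", "}")}

def scan_until(text, start, stops):
    """Read a String from start up to an unescaped character in stops.

    Returns (unescaped value, index of the stop character or len(text)).
    """
    out, i = [], start
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])  # the escaped character loses its significance
            i += 2
        elif text[i] in stops:
            break
        else:
            out.append(text[i])
            i += 1
    return "".join(out), i

def tokenize(text):
    """Turn rich-text into a list of (kind, value...) tuples."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in DELIMITERS:
            kind, close = DELIMITERS[ch]
            value, j = scan_until(text, i + 1, close)
            if j >= len(text):
                raise ValueError(f"unterminated {kind} token")
            tokens.append((kind, value))
            i = j + 1
        elif ch == "[":
            first, j = scan_until(text, i + 1, "]")
            if j >= len(text):
                raise ValueError("unterminated url token")
            i, second = j + 1, None
            if i < len(text) and text[i] == "(":
                second, j = scan_until(text, i + 1, ")")
                if j >= len(text):
                    raise ValueError("unterminated url token")
                i = j + 1
            tokens.append(("url", first, second))
        else:
            value, i = scan_until(text, i, "*_{[")
            tokens.append(("plain", value))
    return tokens

print(tokenize("A *quick* _brown_ {fox} [jumps](over)"))
```

The same scanning helper serves every token kind; only the set of stop characters changes, which mirrors how the String axiom depends on its surrounding rule.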
Escape Rules
The “\” (backslash) can be used to escape the next character. The grammatical significance of a character following “\” is ignored.
The exact grammar of the String axiom is context dependent. A String may not contain unescaped terminating sequences. The terminating sequences of a String are every token that follows the String axiom in a rule, plus double-lf. To escape a terminating sequence, its first character must be escaped.
For illustration, consider the grammar in Table 7, which uses the String axiom. In rule, the String axiom is followed by terminator, so “foo” is a terminating sequence of String in rule. Valid and invalid character sequences for String in rule are shown in Table 8.
Symbol | Expression |
---|---|
rule | String terminator |
terminator | "foo" |
Valid | Invalid |
---|---|
quick brown \foo | quick brown foo |
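The escape rules can be made concrete with a short Python sketch that reads a String up to its first unescaped terminating sequence. `read_string` is a hypothetical helper, and double-lf is assumed to be "\n\n".

```python
def read_string(source, terminators):
    """Read a String from the start of source.

    Returns (value, rest): value is the String with escapes resolved, and rest
    begins at the first unescaped terminating sequence (or is empty).
    """
    out, i = [], 0
    while i < len(source):
        if source[i] == "\\" and i + 1 < len(source):
            # An escaped character is taken literally and cannot start a terminator.
            out.append(source[i + 1])
            i += 2
            continue
        if any(source.startswith(t, i) for t in terminators):
            break
        out.append(source[i])
        i += 1
    return "".join(out), source[i:]

# Terminating sequences for String in the example rule: "foo" and double-lf.
TERMINATORS = ["foo", "\n\n"]

print(read_string("quick brown \\foofoo", TERMINATORS))
print(read_string("quick brown foo", TERMINATORS))
```

In the first call, escaping the "f" keeps the first "foo" inside the String, and the second "foo" acts as the terminator. In the second call, the unescaped "foo" ends the String after "quick brown ", which is why that sequence is listed as invalid for String.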
Examples
Document and Section
The Mk2 file in Figure 1 contains a paragraph (A quick brown fox...) and a section titled “On Pangrams” which contains another paragraph (A pangram is...).
Listing and Text Tokens
The listing in Figure 2 contains six items, each being a single text token.
Table, Media and Plaintext
The Mk2 file in Figure 3 contains a table, a media object and a plaintext object, each with a description and its respective body.
Escaping
Mk2 is designed to avoid the need to escape control tokens as much as possible. Still, there are some cases where the user has to use the \ (backslash) character to avoid the semantics of a specific token. Below are examples of the most common cases.
Mk2 | Result |
---|---|
In ECMAScript anonymous functions can be expressed using the {function (...) { ... \}} special form. | In ECMAScript anonymous functions can be expressed using the function (...) { ... } special form. |
The Mk2 file in Figure 4 escapes the first } (curly bracket) character inside a fixed-width text token in order to avoid terminating the fixed-width token prematurely. Note that only the closing bracket needs to be escaped, because it is the only terminating token of the String in a fixed-width token.
Mk2 | Result |
---|---|
On DOS, {\\} (backslash) is used to separate the components of a pathname. | On DOS, \ (backslash) is used to separate the components of a pathname. |
Sometimes the user needs to include a literal backslash character in their prose. The \ (backslash) character can be escaped using itself, just like any other character, as Figure 5 shows.
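Going the other way, a program that generates Mk2 has to insert these escapes itself. The Python sketch below shows one possible approach for fixed-width tokens. `escape` and `fixed_width` are hypothetical helpers, and only single-character terminators are handled. For a longer terminating sequence, escaping its first character would be enough according to the escape rules.

```python
def escape(text, terminators):
    """Prefix every backslash and every terminator character with a backslash."""
    return "".join(
        "\\" + ch if ch == "\\" or ch in terminators else ch
        for ch in text
    )

def fixed_width(text):
    """Wrap text in a fixed-width token; "}" is the only terminator inside it."""
    return "{" + escape(text, "}") + "}"

print(fixed_width("function (...) { ... }"))  # {function (...) { ... \}}
print(fixed_width("\\"))                      # {\\}
```

The opening "{" is left alone because it does not terminate the String inside a fixed-width token.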