The GDLisp Parser
The GDLisp language is a Lisp dialect, with a few extensions to make object-oriented programming using Godot types more straightforward.
Listed below is a full description of the GDLisp grammar using Extended Backup-Naur form.
GDLisp source code is always written in UTF-8. GDLisp supports the full breadth of Unicode code points, so unless otherwise stated, “character” refers to a valid Unicode character.
prefixed-expr = [ prefix ], expr ;
prefix = "'" | "#'" | "`" | "," | ",." ;
expr = literal | list-expr | array-expr |
dict-expr | vector-expr |
nested-name | nested-node-path |
self-name | self-node-path ;
literal = "#t" | "#f" | integer | float | string | symbol ;
list-expr = "(", ")" | "(", prefixed-expr, {prefixed-expr}, ["." prefixed-expr], ")" ;
array-expr = "[", {prefixed-expr}, "]" ;
dict-expr = "{", {prefixed-expr, prefixed-expr}, "}" ;
vector-expr = "V{", prefixed-expr, prefixed-expr, [prefixed-expr], "}" ;
nested-name = expr, ":", symbol ;
nested-node-path = expr, ":", node-path ;
self-name = "@", symbol ;
self-node-path = node-path ;
integer = ? integer literal ? ;
float = ? floating point literal ? ;
string = ? string literal ? ;
symbol = ? symbol literal ? ;
node-path = ? node path literal ? ;
GDLisp code, like any other Lisp dialect, is built up of zero or more S-expressions, where an S-expression is defined to be an (optionally prefixed) form of any of the following.
A literal expression
A list
An array
A dictionary
A (2D or 3D) vector
A nested name or node path
A self name or node path
Unless otherwise stated, an arbitrary amount of whitespace is allowed
between adjacent tokens in the above EBNF grammar. “Whitespace” is,
here, defined as any Unicode character with the White_Space=yes
property.
Literals
literal = "#t" | "#f" | integer | float | string | symbol ;
integer = ? integer literal ? ;
float = ? floating point literal ? ;
string = ? string literal ? ;
symbol = ? symbol literal ? ;
node-path = ? node path literal ? ;
Literals in GDLisp are integers, floating point values, strings, symbols, or node path literals.
Integer Literals
An integer literal consists of an optional sign (+ or -)
followed by one or more ASCII digits (0 to 9). The following
are valid integer literals: 0, +56, -9, 10000,
00900. Leading zeroes in an integer literal are ignored. Unlike in
C, the presence of a leading zero does not cause the subsequent
number to be interpreted as ASCII.
Floating Point Literals
A floating point literal is an expression which matches the following regular expression
[+-]?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?
and is not a valid integer literal.
String Literals
A string literal is a sequence of zero or more characters enclosed in
quotation marks ". Inside the quotation marks is a sequence of
individual string characters, where each individual character is one
of
Any Unicode character other than a backslash
\or a quotation mark".A valid escape sequence beginning in a backslash
\.
Escape Sequences
An escape sequence in GDLisp begins with a backslash and consists of one or more characters indicating what character the sequence should be translated to in the resulting code. GDLisp supports most of the commonly-used escape sequences found in C-style languages.
\ntranslates to a newline (0x0A)\ttranslates to a horizontal tab (0x09)\rtranslates to a carriage return (0x0D)\atranslates to the ‘alert’ character (0x07)\btranslates to the backspace character (0x08)\ftranslates to a form feed (0x0C)\vtranslates to a vertical tab (0x0B)\"translates to a literal quotation mark (0x22)\'translates to a literal apostrophe (0x27)\\translates to a literal backslash (0x5C)
Finally, a backslash followed by a u is a Unicode literal. Unicode
literals can be represented in two forms: basic and extended.
A basic Unicode literal consists of \u followed by exactly four
valid hexadecimal characters. The four characters are interpreted as a
number in base 16 and must point to a valid Unicode code point. Note
that there are characters outside of the basic multilingual plane
(such as emoji) that cannot be represented in this way. For such
characters, the extended form is provided.
An extended Unicode literal consists of \u followed by a
curly-brace-enclosed list of at least one hexadecimal character. The
characters in the list are interpreted as a number in base 16 and must
point to a valid Unicode code point.
A backslash followed by any other character in a string literal is an error.
Symbol Literals
Symbols are the cornerstone of a Lisp program and are used as variable and function names. These are, generally speaking, the valid identifiers in a Lisp program.
symbol = starting-char, { following-char }, { qualifier } ;
qualifier = ".", following-char, { following-char } ;
A symbol consists of a starting character, followed by zero or more following characters, then subsequently followed by zero or more qualifiers. A qualifier consists of a dot followed by one or more following characters.
The starting character of a symbol literal can be any of the following.
An ASCII letter
Any of the following:
_~+=-\/!%^&*<>?Any non-ASCII character which falls into the Unicode categories L, Mn, Nl, No, S, Pc, Pd, or Po.
A “following” character can be any starting character, a valid ASCII number, or any non-ASCII character in the Unicode category N.
The following are examples of valid identifiers in GDLisp: foo,
bar, satisfies?, set-element, list/map,
com.mercerenies.gdlisp.
Note
Unlike in some Lisp dialects, symbols in GDLisp are case
sensitive. That is, foo and Foo are distinct
symbols that refer to distinct variables or functions.
Node Path Literals
A node path literal is the primary means of accessing nodes in the
scene tree whose names are known at compile-time. A node path literal
consists of a dollar sign $ followed by either a quoted string
literal or a sequence of one or more of the following:
* An ASCII letter or number
* Any of the following: _~+=-\/!$%^&*<>?
Note that only ASCII characters are allowed in the non-quoted node path form. To include Unicode characters in a node path, it is necessary to quote the path.
Lists
list-expr = "(", ")" | "(", prefixed-expr, {prefixed-expr}, ["." prefixed-expr], ")" ;
In GDLisp, the fundamental unit of composition is a cons cell, sometimes called a pair. A cons cell consists of two elements, conventionally referred to as the car and the cdr, separated by a dot and enclosed in parentheses.
(a . b)
By convention, lists are built up as a singly-linked list, using cons cells as the links. The car of each is the first element, or “head”, of the list, and the cdr of each is the rest of the list, or “tail”. The end of the list is denoted with the special “null” atom, indicating by a pair of parentheses with nothing in between.
This convention is so widely used in Lisp programs that the syntax supports it directly. That is, a sequence of of one or more S-expressions, followed by the dotted end of the list, is interpreted as a list whose final cdr is the rightmost term. Concretely, the following are equivalent.
(a b c . d)
(a . (b . (c . d)))
Similarly, if the dotted terminator is left off, it is assumed to be
the special () null object, so the following are equivalent.
(a b c d)
(a . (b . (c . d . ())))
An S-expression can be viewed as a dotted list, consisting of a
leading list of values (in the car portion of cons cells)
terminated by an arbitrary non-cons value as the final cdr. For
instance, the S-expression (a b c . d) (or, equivalently, (a .
(b . (c . d))) would be viewed as a dotted list consisting of the
elements a, b, and c, terminated by d.
We call a dotted list which is terminated by the () null object a
proper list. All lists appearing unquoted in GDLisp source code must
be proper lists.
Arrays
array-expr = "[", {prefixed-expr}, "]" ;
Like in GDScript, a GDLisp array is a general-purpose random-access
data structure. Array literals are written as a proper list whose
first element is the symbol array. For convenience, zero or more
expressions wrapped in square brackets will desugar to this syntax.
Note that, unlike in GDScript, elements in an array literal are not
separated by commas.
Examples:
[] ==> (array)
[1 2 3 4] ==> (array 1 2 3 4)
Dictionaries
dict-expr = "{", {prefixed-expr, prefixed-expr}, "}" ;
A dictionary expression is a collection of an even number of
expressions, enclosed in curly braces. Like array literals, dictionary
literals are not a new form of syntax but are instead mere sugar for
something that can be expressed with only basic S-expressions. The
first element of each pair of expressions is a key and the second is a
value in the resulting dictionary. It is an error to have a
brace-enclosed collection of an odd number of expressions. A
dictionary literal desugars to a proper list whose first element is
the symbol dict.
Examples:
{} ==> (dict)
{1 2 3 4} ==> (dict 1 2 3 4)
Vectors
vector-expr = "V{", prefixed-expr, prefixed-expr, [prefixed-expr], "}" ;
Vectors are used in Godot to represent objects in 2D or 3D space. A
vector in GDLisp is constructed using the vector built-in
function. Since vectors are so ubiquitous, the syntax V{ ... } is
provided, which desugars to a call to the vector function with the
given arguments.
Nested Names
nested-name = expr, ":", symbol ;
nested-node-path = expr, ":", node-path ;
self-name = "@", symbol ;
self-node-path = node-path ;
Godot is built on an single-inheritance, messaging-passing object-oriented paradigm. This means that it’s very common to call a function on an object, not just an independent function that exists in the abstract.
In GDLisp, the equivalent to the GDScript “dot” operator is the
access-slot built-in special form. That is, foo.bar in
GDScript would be translated to the GDLisp (access-slot foo bar).
This is relatively awkward and verbose to write all the time, so
several shortcuts are provided. These are purely syntax sugar and are
not new data structures in the abstract syntax tree of the
language.
The following translations are made.
foo:bar, with a literal colon in the middle, is translated to(access-slot foo bar). The left-hand side can be any (non-prefixed) expression, and the right-hand side must be a symbol literal. Note that a prefix in front of the left-hand side will be parsed at a lower precedence than the:, so'a:bwill be parsed as(quote (access-slot a b)), not(access-slot (quote a) b). If the precedence is not to your liking, then you can always write out the S-expressions by hand rather than using the syntax sugar, and sometimes this is necessary in complex macro expansions.foo:$bartranslates to a function call equivalent to(foo:get-node "bar"). Written in full generality, this is((access-slot foo get-node) "bar"). Internally, the actual expression calls a GDLisp built-in function that allows for better optimization potential, but the effect is the same as long as your custom nodes do not overrideget-node. The right-hand side must be a node path literal, either quoted or unquoted.@bartranslates to(access-slot self bar), or equivalentlyself:bar.A node path literal on its own
$bartranslates to a function call equivalent to((access-slot self get-node) "bar"). As with the nested form,@$bardoes not literally translate to the latter, instead factoring through a GDLisp middleman for optimization purposes.
Prefixes
prefixed-expr = [ prefix ], expr ;
prefix = "'" | "#'" | "`" | "," | ",." ;
Expressions, in general, can have a single prefix applied to them. Each prefix is mere syntax sugar for some slightly more complicated S-expression form. The semantics of these forms are explained in future sections, but the translations are defined here.
'footranslates to(quote foo)#'footranslates to(function foo)`footranslates to(quasiquote foo),footranslates to(unquote foo),.footranslates to(unquote-spliced foo)
Comments
There are two types of comments in GDLisp. Both are ignored entirely by the implementation and will not be present in the compiled GDScript code.
Line comments begin with a semicolon
;and continue until the next carriage return or linefeed character, or until the end of the file. Conventionally, line comments which occupy the entire line will be written with two semicolons, so it’s common to see a pair of semicolons denote a line comment. But this is merely a convention and is, as far as the GDLisp parser is concerned, an irrelevant distinction.Block comments begin with
#|and continue until the next|#. Note that block comments cannot be nested, so additional leading sequences of#|inside a block comment will be ignored, not paired off against matching delimiters.