guidelines. The structure of the grammar informs the structure of the
Racket syntax objects it generates.}
@item{The language uses a few conventions to simplify the expression of
grammars. The first rule in the grammar is assumed to be the
@@ -169,11 +166,7 @@ symbols, or instances constructed with @racket[token]. Tokens can optionally pro
@subsection{Example: a small DSL for ASCII diagrams}
@margin-note{This example is
Suppose we'd like to define a language for
@link["http://stackoverflow.com/questions/12345647/rewrite-this-script-by-designing-an-interpreter-in-racket"]{derived from a question} on Stack Overflow.}
To understand @tt{brag}'s design, let's look
at a toy problem. We'd like to define a language for
drawing simple ASCII diagrams. So if we write something like this:
@nested[#:style 'inset]{
@@ -202,10 +195,7 @@ XXXXXXXXX
}|}
This makes sense in a casual way. But let's be more precise about how the language works.
@subsection{Syntax and semantics}
We're being somewhat casual with what we mean by the program above. Let's try to nail down some meanings.
Each line of the program has a semicolon at the end, and describes the output of several @emph{rows} of the line drawing. Let's look at two of the lines in the example:
@@ -246,7 +236,8 @@ We write a @tt{brag} program as an BNF grammar, where patterns can be:
The result of a @tt{brag} program is a module with a @racket[parse] function
that can parse tokens and produce a syntax object as a result.
Let's exercise this function:
Let's try this function:
@interaction[#:eval my-eval
(require brag/support)
@eval:alts[(require "simple-line-drawing.rkt")
@@ -455,8 +446,8 @@ And now we've got an interpreter!
@subsection{From interpretation to compilation}
@margin-note{For a gentler tutorial on writing @litchar{#lang} extensions, see:
@margin-note{For a gentler tutorial on writing @litchar{#lang}-based languages, see
@link["http://hashcollision.org/brainfudge"]{F*dging up a Racket}.} (Just as a
@link["http://beautifulracket.com"]{Beautiful Racket}.} (Just as a
warning: the following material is slightly more advanced, but shows how
writing a compiler for the line-drawing language reuses the ideas for the
interpreter.)
@@ -557,10 +548,6 @@ utilities for creating a module reader. It uses the lexer and
@tt{brag}-generated parser we defined earlier, and also tells Racket that it should compile the forms in the syntax
object using a module called @filepath{semantics.rkt}.
@margin-note{For a systematic treatment on capturing the semantics of
a language, see @link["http://cs.brown.edu/~sk/Publications/Books/ProgLangs/"]{Programming Languages: Application and
Interpretation}.}
Let's look into @filepath{semantics.rkt} and see what's involved in
@margin-note{By the way, we can just as easily rewrite the semantics so that
@racket[compile-rows] does explicitly call @racket[compile-chunk]. Often,
though, it's easier to write the transformation functions in this piecemeal way
and depend on the Racket macro expansion system to do the rewriting as it
encounters each of the forms.}
@itemize[
@item{@tt{brag}'s native data structure is the syntax object because the
@@ -633,13 +627,7 @@ majority of Racket's language-processing infrastructure knows how to read and
write this structured value.}
@item{
@item{Unlike in interpretation, @racket[compile-rows] doesn't
@margin-note{By the way, we can just as easily rewrite the semantics so that
@racket[compile-rows] does explicitly call @racket[compile-chunk]. Often,
though, it's easier to write the transformation functions in this piecemeal way
and depend on the Racket macro expansion system to do the rewriting as it
encounters each of the forms.}
Unlike in interpretation, @racket[compile-rows] doesn't
compile each chunk by directly calling @racket[compile-chunk]. Rather, it
depends on the Racket macro expander to call each @racket[compile-XXX] function
as it encounters a @racket[drawing], @racket[rows], or @racket[chunk] in the
@@ -654,7 +642,7 @@ the macro expansion system to do this:
]
Altogether, @tt{brag}'s intent is to be a parser generator generator for Racket
Altogether, @tt{brag}'s intent is to be a parser generator for Racket
that's easy and fun to use. It's meant to fit naturally with the other tools
in the Racket language toolchain. Hopefully, it will reduce the friction in
making new languages with alternative concrete syntaxes.
@@ -678,7 +666,7 @@ A @deftech{rule} is a sequence consisting of: a @tech{rule identifier}, a colon
A @deftech{rule identifier} is an @tech{identifier} that is not in upper case.
A @deftech{token identifier} is an @tech{identifier} that is in upper case.
A @deftech{symbolic token identifier} is an @tech{identifier} that is in upper case.
An @deftech{identifier} is a character sequence of letters, numbers, and
characters in @racket["-.!$%&/<=>?^_~@"]. It must not contain
@@ -689,7 +677,12 @@ quantification.
A @deftech{pattern} is one of the following:
@itemize[
@item{an implicit sequence of @tech{pattern}s separated by whitespace}
@item{a terminal: either a literal string or a @tech{token identifier}}
@item{a terminal: either a literal string or a @tech{symbolic token identifier}.
When used in a pattern, both these terminals will match the same set of inputs. A literal string can match the string itself, or a @racket[token] whose type field contains that string (or its symbol form). So @racket["FOO"] would match @racket["FOO"], @racket[(token "FOO" "bar")], or @racket[(token 'FOO "bar")]. A symbolic token identifier can also match the string version of the identifier, or a @racket[token] whose type field is the symbol or string form of the identifier. So @racket[FOO] would also match @racket["FOO"], @racket[(token 'FOO "bar")], or @racket[(token "FOO" "bar")]. (In every case, the value of a token, like @racket["bar"], can be anything, and may or may not be the same as its type.)
Because their underlying meanings are the same, the symbolic token identifier ends up being a notational convenience for readability inside a grammar pattern. Typically, the literal string @racket["FOO"] is used to connote ``match the string @racket["FOO"] exactly'' and the symbolic token identifier @racket[FOO] specially connotes ``match any token of type @racket['FOO]''.}
@item{a @tech{rule identifier}}
@item{a @deftech{choice pattern}: a sequence of @tech{pattern}s delimited with @litchar{|} characters.}
@item{a @deftech[#:key "quantifed pattern"]{quantified pattern}: a @tech{pattern} followed by either @litchar{*} (``zero or more'') or @litchar{+} (``one or more'')}
@@ -715,7 +708,7 @@ object: "world" | WORLD
the elements @tt{sentence}, @tt{verb}, @tt{greeting}, and @tt{object} are rule
identifiers. The first rule, @litchar{sentence: verb optional-adjective
object}, is a rule whose right side is an implicit pattern sequence of three
sub-patterns. The uppercased @tt{WORLD} is a token identifier. The fourth rule in the program associates @tt{greeting} with a @tech{choice pattern}.
sub-patterns. The uppercased @tt{WORLD} is a symbolic token identifier. The fourth rule in the program associates @tt{greeting} with a @tech{choice pattern}.
@@ -837,7 +830,7 @@ the BNF grammar. For each rule @racket[r] and its associated pattern @racket[p],
@item{For implicit and explicit sequences of @tech{pattern}s @racket[p1],
@racket[p2], ..., the corresponding values, spliced into the
structure.}
@item{For terminals, the value associated to the token.}
@item{For terminals, the value of the token.}
@item{For @tech{rule identifier}s: the associated parse value for the rule.}
@item{For @tech{choice pattern}s: the associated parse value for one of the matching subpatterns.}
@item{For @tech[#:key "quantifed pattern"]{quantified pattern}s and @tech{optional pattern}s: the corresponding values, spliced into the structure.}
@@ -848,8 +841,7 @@ pattern that informs the parser to introduces nested structure into the syntax
object.
If the grammar has ambiguity, @tt{brag} will choose and return a parse, though
If the grammar is ambiguous, @tt{brag} will choose one of the possible parse results, though it doesn't guarantee which.
it does not guarantee which one it chooses.
If the parse cannot be performed successfully, or if a token in the