This improves the lexing of escape sequences within strings that appear in a grammar. It relies on Racket’s `read` to interpret these escape sequences rather than a hard-coded hash table. This gives strings in a grammar nearly the same semantics as standard Racket strings, including support for octal and hex escape sequences for Unicode codepoints.

Though this passes all current tests, some corner cases remain. They can be triggered by combining certain escape sequences (backslashes, double quotes, and codepoints). The better solution would be to peek into the input port for a double quote and, if it’s there, use the standard Racket lexer to pull out the string (that lexer already handles these cases). We can’t do this, however, because brag also supports single-quoted strings, which need the same semantics, and the Racket lexer won’t work with those.

So I think we’re stuck with the homegrown solution (for consistency across both kinds of quotes), even at the expense of a few unresolved corner cases. Let’s leave that question for another day, as these cases haven’t surfaced in practical use thus far.

pull/33/head
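As a rough illustration of the approach described in the commit message, here is a minimal sketch of interpreting a quoted lexeme with `read`. It is not brag's actual lexer code: the helper names `single-body->double-body` and `lexeme->string` are hypothetical, and the sketch assumes the lexeme arrives with its delimiters attached and is well formed. A single-quoted lexeme is first rewritten as a double-quoted literal so that both quote styles go through the same reader.

```racket
#lang racket/base

;; Hypothetical helper: rewrite the body of a single-quoted string so that it
;; is valid inside a double-quoted Racket literal.
(define (single-body->double-body body)
  (define out (open-output-string))
  (let loop ([cs (string->list body)])
    (cond
      [(null? cs) (void)]
      ;; Backslash escape: keep the pair intact, except \' which becomes '
      [(and (char=? (car cs) #\\) (pair? (cdr cs)))
       (if (char=? (cadr cs) #\')
           (write-char #\' out)
           (begin (write-char #\\ out)
                  (write-char (cadr cs) out)))
       (loop (cddr cs))]
      ;; A bare double quote must be escaped in a double-quoted literal
      [(char=? (car cs) #\")
       (write-string "\\\"" out)
       (loop (cdr cs))]
      [else
       (write-char (car cs) out)
       (loop (cdr cs))]))
  (get-output-string out))

;; Hypothetical helper: turn a quoted lexeme (delimiters included) into the
;; string it denotes, letting `read` interpret the escape sequences.
(define (lexeme->string lexeme)
  (define body (substring lexeme 1 (sub1 (string-length lexeme))))
  (define literal
    (if (char=? (string-ref lexeme 0) #\')
        (string-append "\"" (single-body->double-body body) "\"")
        lexeme))
  (read (open-input-string literal)))
```

With these definitions, `(lexeme->string "\"\\101\"")` and `(lexeme->string "'\\U0063'")` yield `"A"` and `"c"` respectively. The sketch walks the body character by character rather than using a global string replacement, because a sequence such as an escaped backslash followed by a quote is exactly the kind of combination where naive replacement goes wrong.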
parent 92b7dcc067
commit ba5c6c7ab5
@@ -0,0 +1,6 @@
#lang brag
start: A c def hello-world
A : "\"\101\\" ; A
c : '\'\U0063\\' ; c
def : "*\u64\\\x65f\"" ; de
hello-world : "\150\145\154\154\157\40\167\157\162\154\144"
@@ -0,0 +1,10 @@
#lang racket/base

(require brag/examples/codepoints
         rackunit)

(check-equal? (parse-to-datum '("\"A\\" "'c\\" "*d\\ef\"" "hello world"))
              '(start (A "\"A\\")
                      (c "'c\\")
                      (def "*d\\ef\"")
                      (hello-world "hello world")))
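To see why the `hello-world` rule is expected to match the token "hello world", the octal escapes can be decoded by hand. The following is a small sketch for checking the arithmetic only. The helper name `octal-codes->string` is hypothetical and is not part of brag or of this commit.

```racket
#lang racket/base

;; Hypothetical helper: decode a list of octal codepoint strings into a string.
(define (octal-codes->string codes)
  (list->string
   (for/list ([code (in-list codes)])
     ;; Parse the digits in radix 8, then convert the codepoint to a character
     (integer->char (string->number code 8)))))

;; The codepoints from the hello-world rule, in order
(define hello-world-codes
  '("150" "145" "154" "154" "157" "40" "167" "157" "162" "154" "144"))

(octal-codes->string hello-world-codes) ; "hello world"
```

For instance, octal 150 is decimal 104, the codepoint of `h`, and octal 40 is decimal 32, the space character.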