From ab6d67eb8bbce6b1ec7d0df58656fd4b6a74feb5 Mon Sep 17 00:00:00 2001 From: Danny Yoo Date: Wed, 16 Jan 2013 10:24:34 -0700 Subject: [PATCH] Strip the copy-and-paste for cfg-parser, keeping the delta vs. parser. original commit: 4dfe4097720f52968a60f37e2bd8de99b969107c --- collects/parser-tools/parser-tools.scrbl | 139 ++--------------------- 1 file changed, 12 insertions(+), 127 deletions(-) diff --git a/collects/parser-tools/parser-tools.scrbl b/collects/parser-tools/parser-tools.scrbl index d4c3be0..71ce3df 100644 --- a/collects/parser-tools/parser-tools.scrbl +++ b/collects/parser-tools/parser-tools.scrbl @@ -693,8 +693,6 @@ the right choice when using @racket[lexer] in other situations. @racketmodname[parser-tools/cfg-parser] provides another parser generator as an alternative to @racketmodname[parser-tools/yacc]. -Unlike @racket[parser], @racket[cfg-parser] can consume ambiguous grammars. -Its interface is a subset of @racketmodname[parser-tools/yacc]. @defform/subs[#:literals (grammar tokens start end precs src-pos suppress debug yacc-output prec) @@ -708,135 +706,22 @@ Its interface is a subset of @racketmodname[parser-tools/yacc]. (end token-id ...) (@#,racketidfont{error} expr) (src-pos)])]{ - Creates a parser. The clauses may be in any order, as long as there - are no duplicates and all non-@italic{OPTIONAL} declarations are - present: - + Creates a parser similar to that of @racket[parser]. Unlike @racket[parser], + @racket[cfg-parser] can consume ambiguous grammars. + Its interface is a subset of @racketmodname[parser-tools/yacc]. + The major differences in the interface are: + @itemize[ - @item{@racketblock0[(grammar (non-terminal-id - ((grammar-id ...) maybe-prec expr) - ...) - ...)] - - Declares the grammar to be parsed. Each @racket[grammar-id] can - be a @racket[token-id] from a @racket[group-id] named in a - @racket[tokens] declaration, or it can be a - @racket[non-terminal-id] declared in the @racket[grammar] - declaration. 
The @racket[expr] is a - ``semantic action,'' which is evaluated when the input is found - to match its corresponding production. - - Each action is Racket code that has the same scope as its - parser's definition, except that the variables @racket[$1], ..., - @racketidfont{$}@math{i} are bound, where @math{i} is the number - of @racket[grammar-id]s in the corresponding production. Each - @racketidfont{$}@math{k} is bound to the result of the action - for the @math{k}@superscript{th} grammar symbol on the right of - the production, if that grammar symbol is a non-terminal, or the - value stored in the token if the grammar symbol is a terminal. - If the @racket[src-pos] option is present in the parser, then - variables @racket[$1-start-pos], ..., - @racketidfont{$}@math{i}@racketidfont{-start-pos} and - @racket[$1-end-pos], ..., - @racketidfont{$}@math{i}@racketidfont{-end-pos} and are also - available, and they refer to the position structures - corresponding to the start and end of the corresponding - @racket[grammar-symbol]. Grammar symbols defined as empty-tokens - have no @racketidfont{$}@math{k} associated, but do have - @racketidfont{$}@math{k}@racketidfont{-start-pos} and - @racketidfont{$}@math{k}@racketidfont{-end-pos}. - Also @racketidfont{$n-start-pos} and @racketidfont{$n-end-pos} - are bound to the largest start and end positions, (i.e., - @racketidfont{$}@math{i}@racketidfont{-start-pos} and - @racketidfont{$}@math{i}@racketidfont{-end-pos}). - - An @tech{error production} can be defined by providing - a production of the form @racket[(error α)], where α is a - string of grammar symbols, possibly empty. - - All of the productions for a given non-terminal must be grouped - with it. 
That is, no @racket[non-terminal-id] may appear twice - on the left hand side in a parser.} - - - @item{@racket[(tokens group-id ...)] - - Declares that all of the tokens defined in each - @racket[group-id]---as bound by @racket[define-tokens] or - @racket[define-empty-tokens]---can be used by the parser in the - @racket[grammar] declaration.} - - @item{@racket[(start non-terminal-id)] - - Declares a starting non-terminal for the grammar. - Note: unlike @racket[parser], @racket[cfg-parser] does not - currently support multiple starting non-terminals - for the grammar.} - - - @item{@racket[(end token-id ...)] - - Specifies a set of tokens from which some member must follow any - valid parse. For example, an EOF token would be specified for a - parser that parses entire files and a newline token for a parser - that parses entire lines individually.} - - - @item{@racket[(@#,racketidfont{error} expr)] - - The @racket[expr] should evaluate to a function which will be - executed for its side-effect whenever the parser encounters an - error. - - If the @racket[src-pos] declaration is present, the function - should accept 5 arguments,: - - @racketblock[(lambda (tok-ok? tok-name tok-value _start-pos _end-pos) - ....)] - - Otherwise it should accept 3: - - @racketblock[(lambda (tok-ok? tok-name tok-value) - ....)] - - The first argument will be @racket[#f] if and only if the error - is that an invalid token was received. The second and third - arguments will be the name and the value of the token at which - the error was detected. The fourth and fifth arguments, if - present, provide the source positions of that token.} - - - @item{@racket[(src-pos)] @italic{OPTIONAL} - - Causes the generated parser to expect input in the form - @racket[(make-position-token _token _start-pos _end-pos)] instead - of simply @racket[_token]. 
Include this option when using the - parser with a lexer generated with @racket[lexer-src-pos].} - ] - - The result of a @racket[parser] expression with one @racket[start] - non-terminal is a function, @racket[_parse], that takes one - argument. This argument must be a zero argument function, - @racket[_gen], that produces successive tokens of the input each - time it is called. If desired, the @racket[_gen] may return - symbols instead of tokens, and the parser will treat symbols as - tokens of the corresponding name (with @racket[#f] as a value, so - it is usual to return symbols only in the case of empty tokens). - The @racket[_parse] function returns the value associated with the - parse tree by the semantic actions. If the parser encounters an - error, after invoking the supplied error function, it will try to - use @tech{error production}s to continue parsing. If it cannot, it - raises @racket[exn:fail:read]. - - - Each time the Racket code for a @racket[cfg-parser] is compiled - (e.g. when a @filepath{.rkt} file containing a @racket[cfg-parser] form - is loaded), the parser generator is run. To avoid this overhead - place the parser into a module and compile the module to a - @filepath{.zo} bytecode file. + Unlike @racket[parser], @racket[cfg-parser] only allows for + a single non-terminal-id.} + + @item{@racket[cfg-parser] does not support the @racket[precs], + @racket[suppress], @racket[debug], or @racket[yacc-output] + options of @racket[parser].} + ] }
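Reviewer note: the trimmed documentation now leans on the reader already knowing @racket[parser], so a small usage sketch may help when checking the two listed differences. The following is a minimal sketch, not taken from the patch. The names `value-tokens`, `op-tokens`, `make-token-source`, and `parse` are hypothetical, and the grammar is deliberately ambiguous to exercise the claim that `cfg-parser` can consume ambiguous grammars. It uses a single `start` non-terminal and none of `precs`, `suppress`, `debug`, or `yacc-output`.

```racket
#lang racket/base
;; Sketch only: all names defined here are illustrative assumptions.
(require parser-tools/lex
         parser-tools/cfg-parser)

;; NUM carries a value; PLUS and EOF are empty tokens.
(define-tokens value-tokens (NUM))
(define-empty-tokens op-tokens (PLUS EOF))

;; Hypothetical helper: turn a list of tokens into the zero-argument
;; generator the parser expects, yielding EOF once the list is used up.
(define (make-token-source toks)
  (lambda ()
    (if (null? toks)
        (token-EOF)
        (begin0 (car toks)
                (set! toks (cdr toks))))))

(define parse
  (cfg-parser
   (tokens value-tokens op-tokens)
   ;; Exactly one start non-terminal, per the first listed difference.
   (start expr)
   (end EOF)
   ;; No src-pos clause, so the error function takes three arguments.
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'parse "unexpected token: ~a" tok-name)))
   (grammar
    ;; `expr PLUS expr` is ambiguous for inputs such as 1 + 2 + 3;
    ;; addition is associative, so either parse yields the same sum.
    (expr [(NUM) $1]
          [(expr PLUS expr) (+ $1 $3)]))))

;; Usage: parse the token sequence for "1 + 2".
(parse (make-token-source (list (token-NUM 1) (token-PLUS) (token-NUM 2))))
```

With `parser` from `parser-tools/yacc`, a grammar like this would normally need a `precs` declaration to resolve the conflict. That clause is one of the options the patch documents as unsupported by `cfg-parser`.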