#lang scribble/manual
@; for documentation purposes, use the xexpr? from xml.
@; the one in txexpr is just to patch over an issue with
@; `valid-char?` in Racket 6.
@(require scribble/eval (for-label racket txexpr xml rackunit))
@(define my-eval (make-base-eval))
@(my-eval `(require txexpr xml rackunit))
@title{txexpr: Tagged X-expressions}
@author[(author+email "Matthew Butterick" "mb@mbtype.com")]
@defmodule[#:multi (txexpr (submod txexpr safe))]
A set of small but handy functions for improving the readability and reliability of programs that operate on tagged X-expressions (for short, @italic{txexpr}s).
@section{Installation}
At the command line:
@verbatim{raco pkg install txexpr}
After that, you can update the package from the command line:
@verbatim{raco pkg update txexpr}
@section{Importing the module}
The module can be invoked two ways: fast or safe.
Fast mode is the default, which you get by importing the module in the usual way: @code{(require txexpr)}.
Safe mode enables the function contracts documented below. Use safe mode by importing the module as @code{(require (submod txexpr safe))}.
@section{What's a txexpr?}
It's an X-expression with the following grammar:
@racketgrammar*[
#:literals (cons list symbol? string? xexpr?)
[txexpr (list tag (list attr ...) element ...)
(cons tag (list element ...))]
[tag symbol?]
[attr (list key value)]
[key symbol?]
[value string?]
[element xexpr?]
]
A txexpr is a list with a symbol in the first position (the @italic{tag}), followed by a series of @italic{elements}, which are other X-expressions. Optionally, a txexpr can have a list of @italic{attributes} in the second position.
@examples[#:eval my-eval
(txexpr? '(span "Brennan" "Dale"))
(txexpr? '(span "Brennan" (em "Richard") "Dale"))
(txexpr? '(span [[class "hidden"][id "names"]] "Brennan" "Dale"))
(txexpr? '(span lt gt amp))
(txexpr? '("We really" "should have" "a tag"))
(txexpr? '(span [[class not-quoted]] "Brennan"))
(txexpr? '(span [class "hidden"] "Brennan" "Dale"))
]
The last one is a common mistake. Because the key-value pair is not enclosed in a @racket[list], it's interpreted as a nested txexpr within the first txexpr, as you may not find out until you try to read its attributes:
@margin-note{There's no way of eliminating this ambiguity, short of always requiring an attribute list (empty if necessary) in your txexpr. See also @racket[xexpr-drop-empty-attributes].}
@examples[#:eval my-eval
(get-attrs '(span [class "hidden"] "Brennan" "Dale"))
(get-elements '(span [class "hidden"] "Brennan" "Dale"))
]
Tagged X-expressions are most commonly found in HTML & XML documents. Though the notation is different in Racket, the data structure is identical:
@examples[#:eval my-eval
(xexpr->string '(span [[id "names"]] "Brennan" (em "Richard") "Dale"))
(string->xexpr "<span id=\"names\">Brennan<em>Richard</em>Dale</span>")
]
After converting to and from HTML, we get back the original X-expression. Well, almost. The brackets turned into parentheses, which is no big deal, since they mean the same thing in Racket. Also, per its usual practice, @racket[string->xexpr] added an empty attribute list after @racket[em]. This is also benign.
@section{Why not just use @exec{match}, @exec{quasiquote}, and so on?}
If you prefer those, please do. But I've found two benefits to using module functions:
@bold{Readability.} In code that already has a lot of matching and quasiquoting going on, these functions make it easy to see where & how txexprs are being used.
@bold{Reliability.} Because txexprs come in two close but not quite equal forms, careful coders will always have to take both cases into account.
The programming is trivial, but the annoyance is real.
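
To make that concrete, here is a minimal sketch (not part of the module) that pulls the elements out of a txexpr two ways: with a hand-written @racket[match], which needs one clause per form, and with @racket[get-elements]. The helper name @racket[elements/match] is hypothetical and exists only for this illustration.

@examples[#:eval my-eval
(require racket/match)
; hypothetical helper, not provided by txexpr:
; the first clause handles the form with an attribute list,
; the second handles the form without one
(define (elements/match tx)
  (match tx
    [(list tag (list (list (? symbol?) (? string?)) ...) elements ...) elements]
    [(list tag elements ...) elements]))
(elements/match '(span [[id "names"]] "Brennan" "Dale"))
(elements/match '(span "Brennan" "Dale"))
(get-elements '(span [[id "names"]] "Brennan" "Dale"))
(get-elements '(span "Brennan" "Dale"))
]

Forgetting either clause of the @racket[match] version gives a function that works on only one of the two forms, which is the kind of mistake the accessors are meant to prevent.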
@section{Interface}
@deftogether[(
@defproc[
(txexpr?
[v any/c])
boolean?]
@defproc[
(txexpr-tag?
[v any/c])
boolean?]
@defproc[
(txexpr-attr?
[v any/c])
boolean?]
@defproc[
(txexpr-attr-key?
[v any/c])
boolean?]
@defproc[
(txexpr-attr-value?
[v any/c])
boolean?]
@defproc[
(txexpr-element?
[v any/c])
boolean?]
)]
Predicates for @racket[_txexpr]s that implement this grammar:
@racketgrammar*[
#:literals (cons list symbol? string? xexpr?)
[txexpr (list tag (list attr ...) element ...)
(cons tag (list element ...))]
[tag symbol?]
[attr (list key value)]
[key symbol?]
[value string?]
[element xexpr?]
]
@deftogether[(
@defproc[
(txexpr-tags?
[v any/c])
boolean?]
@defproc[
(txexpr-attrs?
[v any/c])
boolean?]
@defproc[
(txexpr-elements?
[v any/c])
boolean?]
)]
Shorthand for @code{(listof txexpr-tag?)}, @code{(listof txexpr-attr?)}, and @code{(listof txexpr-element?)}.
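
A few quick checks, as a sketch of how these list predicates read in practice. The sample data is made up for illustration.

@examples[#:eval my-eval
; every item must be a valid tag, i.e. a symbol
(txexpr-tags? '(div span em))
(txexpr-tags? '(div "span" em))
; every item must be a (key value) pair with a string value
(txexpr-attrs? '([id "top"] [class "red"]))
(txexpr-attrs? '([id "top"] [class 42]))
; every item must be an X-expression
(txexpr-elements? '("Hello" (p "World") amp))
]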
@defproc[
(validate-txexpr
[possible-txexpr any/c])
txexpr?]
Like @racket[txexpr?], but raises a descriptive error if @racket[_possible-txexpr] is invalid, and otherwise returns @racket[_possible-txexpr] itself.
@examples[#:eval my-eval
(validate-txexpr 'root)
(validate-txexpr '(root))
(validate-txexpr '(root ((id "top")(class 42))))
(validate-txexpr '(root ((id "top")(class "42"))))
(validate-txexpr '(root ((id "top")(class "42")) ("hi")))
(validate-txexpr '(root ((id "top")(class "42")) "hi"))
]
@deftogether[(
@defproc[
(can-be-txexpr-attr-key?
[v any/c])
boolean?]
@defproc[
(can-be-txexpr-attr-value?
[v any/c])
boolean?]
)]
Predicates for input arguments that are trivially converted to an attribute @racket[_key] or @racket[_value] ...
@deftogether[(
@defproc[
(->txexpr-attr-key
[v can-be-txexpr-attr-key?])
txexpr-attr-key?]
@defproc[
(->txexpr-attr-value
[v can-be-txexpr-attr-value?])
txexpr-attr-value?]
)]
... with these conversion functions.
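
A short sketch of the predicate and converter pairs working together. It assumes that strings and symbols are both among the accepted inputs, which the signatures above suggest but do not spell out.

@examples[#:eval my-eval
; assumption: a string is an acceptable stand-in for a key
(can-be-txexpr-attr-key? "id")
(->txexpr-attr-key "id")
; assumption: a symbol is an acceptable stand-in for a value
(can-be-txexpr-attr-value? 'red)
(->txexpr-attr-value 'red)
]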
@defproc[
(txexpr->values
[tx txexpr?])
(values txexpr-tag? txexpr-attrs? txexpr-elements?)]
Dissolves a @racket[_txexpr] into its components and returns all three.
@examples[#:eval my-eval
(txexpr->values '(div))
(txexpr->values '(div "Hello" (p "World")))
(txexpr->values '(div [[id "top"]] "Hello" (p "World")))
]
@defproc[
(txexpr->list
[tx txexpr?])
(list txexpr-tag?
txexpr-attrs?
txexpr-elements?)]
Like @racket[txexpr->values], but returns the three components in a list.
@examples[#:eval my-eval
(txexpr->list '(div))
(txexpr->list '(div "Hello" (p "World")))
(txexpr->list '(div [[id "top"]] "Hello" (p "World")))
]
@defproc[
(xexpr->html
[x xexpr?])
string?]
Convert @racket[_x] to an HTML string. Better than @racket[xexpr->string] because, consistent with the HTML spec, it will not escape text that appears within @code{script} or @code{style} blocks. For convenience, this function will take any X-expression, not just tagged X-expressions.
@examples[#:eval my-eval
(define tx '(root (script "3 > 2") "Why is 3 > 2?"))
(xexpr->string tx)
(xexpr->html tx)
(map xexpr->html (list "string" 'entity 65))
]
@deftogether[(
@defproc[
(get-tag
[tx txexpr?])
txexpr-tag?]
@defproc[
(get-attrs
[tx txexpr?])
txexpr-attrs?]
@defproc[
(get-elements
[tx txexpr?])
(listof txexpr-element?)]
)]
Accessor functions for the individual pieces of a @racket[_txexpr].
@examples[#:eval my-eval
(get-tag '(div [[id "top"]] "Hello" (p "World")))
(get-attrs '(div [[id "top"]] "Hello" (p "World")))
(get-elements '(div [[id "top"]] "Hello" (p "World")))
]
@defproc[
(make-txexpr
[tag txexpr-tag?]
[attrs txexpr-attrs? @empty]
[elements txexpr-elements? @empty])
txexpr?]
Assemble a @racket[_txexpr] from its parts. If you don't have attributes, but you do have elements, you'll need to pass @racket[empty] as the second argument. Note that unlike @racket[xml->xexpr], if the attribute list is empty, it's not included in the resulting expression.
@examples[#:eval my-eval
(make-txexpr 'div)
(make-txexpr 'div '() '("Hello" (p "World")))
(make-txexpr 'div '[[id "top"]])
(make-txexpr 'div '[[id "top"]] '("Hello" (p "World")))
(define tx '(div [[id "top"]] "Hello" (p "World")))
(make-txexpr (get-tag tx)
(get-attrs tx) (get-elements tx))
]
@defproc[
(can-be-txexpr-attrs?
[v any/c])
boolean?]
Predicate for functions that handle @racket[_txexpr-attrs]. Covers values that are easily converted into pairs of @racket[_attr-key] and @racket[_attr-value]. Namely: single @racket[_xexpr-attr]s, lists of @racket[_xexpr-attr]s (i.e., what you get from @racket[get-attrs]), or interleaved symbols and strings (each pair will be concatenated into a single @racket[_xexpr-attr]).
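
As a sketch of the forms described above: the first two calls test a single attribute and a list of attributes, and the last two pass the same kinds of input (plus interleaved symbols and strings) to @racket[merge-attrs], one of the functions that accepts them. The sample attributes are made up for illustration.

@examples[#:eval my-eval
; a single attribute
(can-be-txexpr-attrs? '(id "top"))
; a list of attributes
(can-be-txexpr-attrs? '((id "top") (class "red")))
; interleaved symbols and strings, paired up by the consumer
(merge-attrs 'id "top" 'class "red")
; a single attribute mixed with a list of attributes
(merge-attrs '(id "top") '((class "red")))
]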
@deftogether[(
@defproc[
(attrs->hash [x can-be-txexpr-attrs?] ...)
hash?]
@defproc[
(hash->attrs
[h hash?])
txexpr-attrs?]
)]
Convert @racket[_attrs] to an immutable hash, and back again.
@examples[#:eval my-eval
(define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
(attrs->hash (get-attrs tx))
(hash->attrs '#hash((class . "red") (id . "top")))
]
@defproc[
(attrs-have-key?
[attrs (or/c txexpr-attrs? txexpr?)]
[key can-be-txexpr-attr-key?])
boolean?]
Returns @racket[#t] if the @racket[_attrs] contain a value for the given @racket[_key], @racket[#f] otherwise.
@examples[#:eval my-eval
(define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
(attrs-have-key? tx 'id)
(attrs-have-key? tx 'grackle)
]
@defproc[
(attrs-equal?
[attrs (or/c txexpr-attrs? txexpr?)]
[other-attrs (or/c txexpr-attrs? txexpr?)])
boolean?]
Returns @racket[#t] if @racket[_attrs] and @racket[_other-attrs] contain the same keys and values, @racket[#f] otherwise. The order of attributes is irrelevant.
@examples[#:eval my-eval
(define tx1 '(div [[id "top"][class "red"]] "Hello"))
(define tx2 '(p [[class "red"][id "top"]] "Hello"))
(define tx3 '(p [[id "bottom"][class "red"]] "Hello"))
(attrs-equal? tx1 tx2)
(attrs-equal? tx1 tx3)
]
@defproc[
(attr-ref
[tx txexpr?]
[key can-be-txexpr-attr-key?])
can-be-txexpr-attr-value?]
Given a @racket[_key], look up the corresponding @racket[_value] in the attributes of a @racket[_txexpr]. Asking for a nonexistent key produces an error.
@examples[#:eval my-eval
(attr-ref tx 'class)
(attr-ref tx 'id)
(attr-ref tx 'nonexistent-key)
]
@defproc[
(attr-ref*
[tx txexpr?]
[key can-be-txexpr-attr-key?])
(listof can-be-txexpr-attr-value?)]
Like @racket[attr-ref], but returns a recursively gathered list of all the @racket[_value]s for that key within @racket[_tx]. Asking for a nonexistent key produces @racket[null].
@examples[#:eval my-eval
(define tx '(div [[class "red"]] "Hello" (em ([class "blue"]) "world")))
(attr-ref* tx 'class)
(attr-ref* tx 'nonexistent-key)
]
@defproc[
(attr-set
[tx txexpr?]
[key can-be-txexpr-attr-key?]
[value can-be-txexpr-attr-value?])
txexpr?]
Given a @racket[_txexpr], set the value of attribute @racket[_key] to @racket[_value]. Return the updated @racket[_txexpr].
@examples[#:eval my-eval
(define tx '(div [[class "red"][id "top"]] "Hello" (p "World")))
(attr-set tx 'id "bottom")
(attr-set tx 'class "blue")
(attr-set (attr-set tx 'id "bottom") 'class "blue")
]
@defproc[
(attr-set*
[tx txexpr?]
[key can-be-txexpr-attr-key?]
[value can-be-txexpr-attr-value?] ... ... )
txexpr?]
Like @racket[attr-set], but accepts any number of keys and values.
@examples[#:eval my-eval
(define tx '(div "Hello"))
(attr-set* tx 'id "bottom" 'class "blue")
]
@defproc[
(attr-join
[tx txexpr?]
[key can-be-txexpr-attr-key?]
[value can-be-txexpr-attr-value?])
txexpr?]
Given a @racket[_txexpr], append @racket[_value] to the existing value of attribute @racket[_key]. Return the updated @racket[_txexpr].
@examples[#:eval my-eval
(define tx '(div [[class "red"]] "Hello"))
(attr-join tx 'class "small")
]
@defproc[
(merge-attrs
[attrs (listof can-be-txexpr-attrs?)] ...)
txexpr-attrs?]
Combine a series of attributes into a single @racket[_txexpr-attrs] item. This function addresses three annoyances that surface in working with txexpr attributes.
@itemlist[#:style 'ordered
@item{You can pass the attributes in multiple forms. See @racket[can-be-txexpr-attrs?] for further details.}
@item{Attributes with the same name are merged, with the later value taking precedence (i.e., @racket[hash] behavior). }
@item{Attributes are sorted in alphabetical order.}]
@examples[#:eval my-eval
(define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
(define tx-attrs (get-attrs tx))
tx-attrs
(merge-attrs tx-attrs 'editable "true")
(merge-attrs tx-attrs 'id "override-value")
(define my-attr '(id "another-override"))
(merge-attrs tx-attrs my-attr)
(merge-attrs my-attr tx-attrs)
]
@defproc[
(remove-attrs
[tx txexpr?])
txexpr?]
Recursively remove all attributes.
@examples[#:eval my-eval
(define tx '(div [[id "top"]] "Hello" (p [[id "lower"]] "World")))
(remove-attrs tx)
]
@defproc[
(map-elements
[proc procedure?]
[tx txexpr?])
txexpr?]
Recursively apply @racket[_proc] to all elements, leaving tags and attributes alone. Using plain @racket[map] will only process elements at the top level of the current @racket[_txexpr]. Usually that's not what you want.
@examples[#:eval my-eval
(define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
(define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
(map upcaser tx)
(map-elements upcaser tx)
]
In practice, most @racket[_xexpr-element]s are strings. But woe befalls those who pass string procedures to @racket[map-elements], because an @racket[_xexpr-element] can be any kind of @racket[xexpr?], and an @racket[xexpr?] is not necessarily a string.
@examples[#:eval my-eval
(define tx '(p "Welcome to" (strong "Mars" amp "Sons")))
(map-elements string-upcase tx)
(define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
(map-elements upcaser tx)
]
@defproc[
(map-elements/exclude
[proc procedure?]
[tx txexpr?]
[exclude-test (txexpr? . -> . boolean?)])
txexpr?]
Like @racket[map-elements], but skips any @racket[_txexpr] for which @racket[_exclude-test] returns @racket[#t]. The @racket[_exclude-test] gets a whole txexpr as input, so it can test any of its parts.
@examples[#:eval my-eval
(define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
(define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
(map-elements upcaser tx)
(map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'strong)))
]
Be careful with the wider consequences of exclusion tests. When @racket[_exclude-test] is true, the @racket[_txexpr] is excluded, but so is everything underneath that @racket[_txexpr]. In other words, there is no way to re-include (un-exclude?) elements nested under an excluded element.
@examples[#:eval my-eval
(define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
(define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
(map-elements upcaser tx)
(map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'p)))
(map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'div)))
]
@defproc[
(splitf-txexpr
[tx txexpr?]
[pred procedure?]
[replace-proc procedure? (λ(x) null)])
(values txexpr? (listof txexpr-element?))]
Recursively descend through @racket[_txexpr] and extract all elements that match @racket[_pred]. Returns two values: a @racket[_txexpr] with the matching elements removed, and the list of matching elements. Sort of esoteric, but I've needed it more than once, so here it is.
@examples[#:eval my-eval
(define tx '(div "Wonderful day" (meta "weather" "good") "for a walk"))
(define is-meta? (λ(x) (and (txexpr? x) (equal? 'meta (get-tag x)))))
(splitf-txexpr tx is-meta?)
]
Ordinarily, the result of the split operation is to remove the elements that match @racket[_pred]. But you can change this behavior with the optional @racket[_replace-proc] argument.
@examples[#:eval my-eval
(define tx '(div "Wonderful day" (meta "weather" "good") "for a walk"))
(define is-meta? (λ(x) (and (txexpr? x) (equal? 'meta (get-tag x)))))
(define replace-meta (λ(x) '(em "meta was here")))
(splitf-txexpr tx is-meta? replace-meta)
]
@defproc[
(check-txexprs-equal?
[tx1 txexpr?]
[tx2 txexpr?])
void?]
Designed to be used with @racketmodname[rackunit]. Check whether @racket[_tx1] and @racket[_tx2] are @racket[equal?] except for ordering of attributes (which ordinarily has no semantic significance). Return @racket[void] if so, otherwise raise a check failure.
@examples[#:eval my-eval
(define tx1 '(div ((attr-a "foo")(attr-z "bar"))))
(define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
(parameterize ([current-check-handler (λ _ (display "not "))])
(display "txexprs are ")
(check-txexprs-equal? tx1 tx2)
(displayln "equal"))
]
If ordering of attributes is relevant to your test, then just use @racket[check-equal?] as usual.
@examples[#:eval my-eval
(define tx1 '(div ((attr-a "foo")(attr-z "bar"))))
(define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
(parameterize ([current-check-handler (λ _ (display "not "))])
(display "txexprs are ")
(check-equal? tx1 tx2)
(displayln "equal"))
]
@section{License & source code}
This module is licensed under the LGPL.
Source repository at @link["http://github.com/mbutterick/txexpr"]{http://github.com/mbutterick/txexpr}. Suggestions & corrections welcome.