diff --git a/doc/Backstory.html b/doc/Backstory.html
index 5080b89..71e59e4 100644
--- a/doc/Backstory.html
+++ b/doc/Backstory.html
@@ -1,2 +1,2 @@
-3 Backstory
On this page:
3.1 Web development and its discontents
3.2 The better idea: a programming model
3.3 “Now you have two problems”
3.4 Rethinking the solution for digital books
3.5 Enter Racket
3.6 What is Pollen?
6.1.0.5

3 Backstory

I created Pollen to overcome limitations & frustrations I repeatedly encountered with existing web-publishing tools.

If you agree with my characterization of those problems, then you’ll probably like the solution that Pollen offers. If not, you probably won’t.

3.1 Web development and its discontents

I made my first web page in 1994, shortly after the web was invented. I opened my text editor (at the time, BBEdit), pecked out <html><body>Hello world</body></html>, then loaded it in Mosaic. So did a million other nerds.

If you weren’t around then, you didn’t miss much. Everything about the web was horrible: the web browsers, the computers running the browsers, the dial-up connections feeding the browsers, and of course HTML itself. At that point, the desktop-software experience was already slick and refined. By comparison, using the web felt like banging rocks together.

That’s no longer true. The web is now 20 years old. During that time, most parts of the web have improved dramatically — for instance, the connections are faster, the browsers are more sophisticated, the screens have more pixels.

But one part has not improved: the way we make web pages. Over the years, tools promising to simplify HTML development have come and mostly gone — from PageMill to Dreamweaver to WordPress to Jekyll. Meanwhile, true web jocks have remained loyal to the original HTML power tool: the humble text editor.

In one way, this makes sense. Web pages are mostly made of text-based data — HTML, CSS, JavaScript, and so on — and the simplest way to manipulate this data is with a text editor. While HTML and CSS are not programming languages, they lend themselves to semantic and logical structure that’s most easily expressed by editing them as text. Furthermore, text-based editing makes debugging and performance improvements easier.

But text-based editing is also limited. Though the underlying description of a web page is notionally human-readable, it’s largely optimized to be readable by other software — namely, web browsers. HTML markup in particular is verbose and easily mistyped. And isn’t it fatally dull to manage all the boilerplate, like surrounding every paragraph with <p>...</p>? Yes, it is.

For these reasons, much of web development should lend itself to abstraction & automation. Abstraction means consolidating repetitive, complex patterns into simpler, parameterized forms. Automation means avoiding the manual drudgery of generating the output files. But in practice, tools that enable this abstraction & automation have been slow to arrive, and most have come hobbled with unacceptable deficiencies.

3.2 The better idea: a programming model

Parallel with my HTML education, I also goofed around with various programming languages — C, C++, Perl, Java, PHP, JavaScript, Python. Unlike HTML, programming languages excel at abstraction and automation. This seemed like the obvious direction for web development to go.

What distinguishes the text-editing model from the programming model? It’s a matter of direct vs. indirect manipulation of output. The text-editing model treats HTML as something to be written directly with a text editor. Whereas the programming model treats HTML — or whatever the output is — as the result of compiling a set of source files, which are written in a programming language. The costs of working indirectly via the programming language are offset by the benefits of abstraction & automation.

On the early web, the text-editing model was appealingly precise and quick. On small projects, it worked well enough. But as projects grew, the text-editing model was going to lose steam. I wasn’t the only one to notice. Shortly after those million nerds made their first web page by hand, many of them set about devising ways to apply a programming model to web development.

3.3 “Now you have two problems”

What followed was a steady stream of products, frameworks, tools, and content management systems that claimed to bring a programming model to web development. Some were better than others. But none of them displaced the text editor as the preferred tool of web developers.

Why not? All these tools promised a great leap forward in solving the web-development problem. In practice, they simply redistributed the pain. I needn’t bore you with enumerating the deficiencies of specific tools, because they’ve tended to fail in the same thematic ways:

I’ve tried a lot of these tools over the years. Some I liked. Some I didn’t. Invariably, however, whenever I could still make do with hand-editing an HTML project, I would. After trying to cajole the web framework du jour into doing my bidding, it was relaxing to trade off some efficiency for control.

3.4 Rethinking the solution for digital books

In 2008, I launched a website called Typography for Lawyers. Initially, I’d conceived of it as a book. Then I thought “no one’s going to publish that.” So it became a website that I aimed to make as book-like as possible. But hand-editing wasn’t going to be enough.

So I used WordPress. The major chore became scraping out all the crap that typically lives in blog templates. Largely because of this, people liked the site, because it was simpler & cleaner than the usual WordPress website.

Eventually, a publisher offered to release it as a paperback. Later came the inevitable request to make it into a Kindle book. As a fan of typography, I hate the Kindle. The layout controls are coarse, and so is the reading experience. But I didn’t run and hide. Basically a Kindle book is a little website made with 1995-era HTML. So I coded up some tools in Perl to convert my book to Kindle format while preserving the formatting and images as well as possible.

At that point, I noticed I had converted Typography for Lawyers into web format twice, using two different sets of tools. Before someone asked me to do it a third time, I started thinking about how I might create source code for the book that allowed me to render it into different formats.

This was the beginning of the Pollen project.

I wrote the initial version of Pollen in Python. I devised a simplified markup-notation language for the source files. This language was compiled into XML-ish data structures using ply (Python lex/yacc). These structures were parsed into trees using LXML. The trees were combined with templates made in Chameleon. These templates were rendered and previewed with the Bottle web server.

Did it work? Sort of. Source code went in; web pages came out. But it was also complicated and fragile. Moreover, though the automation was there, there wasn’t yet enough abstraction at the source layer. I started thinking about how I could add a source preprocessor.

3.5 Enter Racket

I had come across Racket while researching languages suitable for HTML/XML processing. I had unexpectedly learned about the secret kinship of XML and Lisp: though XML is not a programming language, it uses a variant of Lisp syntax. Thus Lisp languages are particularly adept at handling XMLish structures. That was interesting.

After comparing some of the Lisp & Scheme variants, Racket stood out because it had a text-based dialect called Scribble. Scribble could be used to embed code within textual content. That was interesting too. Among other things, this meant Scribble could be used as a general-purpose preprocessor. So I thought I’d see if I could add it to Pollen.

It worked. So well, in fact, that I started thinking about whether I could reimplement other parts of Pollen in Racket. Then I started thinking about reimplementing all of it in Racket.

So I did. And here we are.

3.6 What is Pollen?

Pollen is a publishing system built on top of Scribble and Racket. So far I’ve optimized Pollen for digital books, because that’s mainly what I use it for. But it can be used for small projects too.

As a publishing system, Pollen includes:

Pollen addresses the deficiencies I experienced with other tools:

 
\ No newline at end of file
+3 Backstory
On this page:
3.1 Web development and its discontents
3.2 The better idea: a programming model
3.3 “Now you have two problems”
3.4 Rethinking the solution for digital books
3.5 Enter Racket
3.6 What is Pollen?
3.7 Further reading
6.1.0.5

3 Backstory

I created Pollen to overcome limitations & frustrations I repeatedly encountered with existing web-publishing tools.

If you agree with my characterization of those problems, then you’ll probably like the solution that Pollen offers. If not, you probably won’t.

3.1 Web development and its discontents

I made my first web page in 1994, shortly after the web was invented. I opened my text editor (at the time, BBEdit), pecked out <html><body>Hello world</body></html>, then loaded it in Mosaic. So did a million other nerds.

If you weren’t around then, you didn’t miss much. Everything about the web was horrible: the web browsers, the computers running the browsers, the dial-up connections feeding the browsers, and of course HTML itself. At that point, the desktop-software experience was already slick and refined. By comparison, using the web felt like banging rocks together.

That’s no longer true. The web is now more than 20 years old. During that time, most parts of the web have improved dramatically — for instance, the connections are faster, the browsers are more sophisticated, the screens have more pixels.

But one part has not improved: the way we make web pages. Over the years, tools promising to simplify HTML development have come and mostly gone — from PageMill to Dreamweaver to WordPress to Jekyll. Meanwhile, true web jocks have remained loyal to the original HTML power tool: the humble text editor.

In one way, this makes sense. Web pages are mostly made of text-based data — HTML, CSS, JavaScript, and so on — and the simplest way to manipulate this data is with a text editor. While HTML and CSS are not programming languages, they lend themselves to semantic and logical structure that’s most easily expressed by editing them as text. Furthermore, text-based editing makes debugging and performance improvements easier.

But text-based editing is also limited. Though the underlying description of a web page is notionally human-readable, it’s optimized to be readable by other software — namely, web browsers. HTML markup in particular is verbose and easily mistyped. And isn’t it fatally dull to manage all the boilerplate, like surrounding every paragraph with <p>...</p>? Yes, it is.

For these reasons, much of web development should lend itself to abstraction & automation. Abstraction means consolidating repetitive, complex patterns into simpler, parameterized forms. Automation means avoiding the manual drudgery of generating the output files. But in practice, tools that enable this abstraction & automation have been slow to arrive, and most have come hobbled with unacceptable deficiencies.

3.2 The better idea: a programming model

Parallel with my HTML education, I also goofed around with various programming languages — C, C++, Perl, Java, PHP, JavaScript, Python. Unlike HTML, programming languages excel at abstraction and automation. This seemed like the obvious direction for web development to go.

What distinguishes the text-editing model from the programming model? It’s a matter of direct vs. indirect manipulation of output. The text-editing model treats HTML as something to be written directly with a text editor. Whereas the programming model treats HTML — or whatever the output is — as the result of compiling a set of source files, which are written in a programming language. The costs of working indirectly via the programming language are offset by the benefits of abstraction & automation.

On the early web, the text-editing model was appealingly precise and quick. On small projects, it worked well enough. But as projects grew, the text-editing model was going to lose steam. I wasn’t the only one to notice. Shortly after those million nerds made their first web page by hand, many of them set about devising ways to apply a programming model to web development.

3.3 “Now you have two problems”

What followed was a steady stream of products, frameworks, tools, and content management systems that claimed to bring a programming model to web development. Some were better than others. But none of them displaced the text editor as the preferred tool of web developers.

Why not? All these tools promised a great leap forward in solving the web-development problem. In practice, they simply redistributed the pain. I needn’t bore you with enumerating the deficiencies of specific tools, because they’ve tended to fail in the same thematic ways:

I’ve tried a lot of these tools over the years. Some I liked. Some I didn’t. Invariably, however, whenever I could still make do with hand-editing an HTML project, I would. After trying to cajole the web framework du jour into doing my bidding, it was relaxing to trade off some efficiency for control.

3.4 Rethinking the solution for digital books

In 2008, I launched a website called Typography for Lawyers. Initially, I’d conceived of it as a book. Then I thought “no one’s going to publish that.” So it became a website that I aimed to make as book-like as possible. But hand-editing wasn’t going to be enough.

So I used WordPress. The major chore became scraping out all the crap that typically lives in blog templates. Largely because of this, people liked the site, because it was simpler & cleaner than the usual WordPress website.

Eventually, a publisher offered to release it as a paperback, which came out in 2010.

Later came the inevitable request to make it into a Kindle book. As a fan of typography, I hate the Kindle. The layout controls are coarse, and so is the reading experience. But I didn’t run and hide. Basically a Kindle book is a little website made with 1995-era HTML. So I coded up some tools in Perl to convert my book to Kindle format while preserving the formatting and images as well as possible.

At that point, I noticed I had converted Typography for Lawyers into web format twice, using two different sets of tools. Before someone asked me to do it a third time, I started thinking about how I might create source code for the book that allowed me to render it into different formats.

That was the beginning of the Pollen project.

I wrote the initial version of Pollen in Python. I devised a simplified markup-notation language for the source files. This language was compiled into XML-ish data structures using ply (Python lex/yacc). These structures were parsed into trees using LXML. The trees were combined with templates made in Chameleon. These templates were rendered and previewed with the Bottle web server.

Did it work? Sort of. Source code went in; web pages came out. But it was also complicated and fragile. Moreover, though the automation was there, there wasn’t yet enough abstraction at the source layer. I started thinking about how I could add a source preprocessor.

3.5 Enter Racket

I had come across Racket while researching languages suitable for HTML/XML processing. I had unexpectedly learned about the secret kinship of XML and Lisp: though XML is not a programming language, it uses a variant of Lisp syntax. Thus Lisp languages are particularly adept at handling XMLish structures. That was interesting.

After comparing some of the Lisp & Scheme variants, Racket stood out because it had a text-based dialect called Scribble. Scribble could be used to embed code within textual content. That was interesting too. Among other things, this meant Scribble could be used as a general-purpose preprocessor. So I thought I’d see if I could add it to Pollen.

It worked. So well, in fact, that I started thinking about whether I could reimplement other parts of Pollen in Racket. Then I started thinking about reimplementing all of it in Racket.

So I did. And here we are.

3.6 What is Pollen?

Pollen is a publishing system built on top of Scribble and Racket. So far, I’ve optimized Pollen for digital books, because that’s mainly what I use it for. But it can be used for small projects too.

As a publishing system, Pollen includes:

Pollen addresses the deficiencies I experienced with other tools:

3.7 Further reading

In Why Racket? Why Lisp?, I explain why Racket was the right tool for this job.

 
\ No newline at end of file
diff --git a/doc/Decode.html b/doc/Decode.html
index 4c8b407..de176bf 100644
--- a/doc/Decode.html
+++ b/doc/Decode.html
@@ -1,2 +1,2 @@
-11.2 Decode
11 Module reference
11.1 Cache
11.2 Decode
11.3 File
11.4 Pagetree
11.5 Render
11.6 Template
11.7 Tag
11.8 Top
11.9 World
11.2 Decode
On this page:
decode
decode-elements
11.2.1 Block
project-block-tags
register-block-tag
block-txexpr?
11.2.2 Typography
whitespace?
whitespace/nbsp?
smart-quotes
smart-dashes
detect-linebreaks
detect-paragraphs
wrap-hanging-quotes
6.1.0.5

11.2 Decode

 (require pollen/decode) package: pollen

The doc export of a Pollen markup file is a simple X-expression. Decoding refers to any post-processing of this X-expression. The pollen/decode module provides tools for creating decoders.

The decode step can happen separately from the compilation of the file. But you can also attach a decoder to the markup file’s root node, so the decoding happens automatically when the markup is compiled, and thus automatically incorporated into doc. (Following this approach, you could also attach multiple decoders to different tags within doc.)

You can, of course, embed function calls within Pollen markup. But since markup is optimized for authors, decoding is useful for operations that can or should be moved out of the authoring layer.

One example is presentation and layout. For instance, detect-paragraphs is a decoder function that lets authors mark paragraphs in their source simply by using two carriage returns.

Another example is conversion of output into a particular data format. Most Pollen functions are optimized for HTML output, but one could write a decoder that targets another format.

procedure

(decode tagged-xexpr    
  [#:txexpr-tag-proc txexpr-tag-proc    
  #:txexpr-attrs-proc txexpr-attrs-proc    
  #:txexpr-elements-proc txexpr-elements-proc    
  #:block-txexpr-proc block-txexpr-proc    
  #:inline-txexpr-proc inline-txexpr-proc    
  #:string-proc string-proc    
  #:symbol-proc symbol-proc    
  #:valid-char-proc valid-char-proc    
  #:cdata-proc cdata-proc    
  #:exclude-tags tags-to-exclude])  txexpr?
  tagged-xexpr : txexpr?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Recursively process a tagged-xexpr, usually the one exported from a Pollen source file as doc.

This function doesn’t do much on its own. Rather, it provides the hooks upon which harder-working functions can be hung.

Recall from [future link: Pollen mechanics] that any tag can have a function attached to it. By default, the tagged-xexpr from a source file is tagged with root. So the typical way to use decode is to attach your decoding functions to it, and then define root to invoke your decode function. Then it will be automatically applied to every doc during compile.

For instance, here’s how decode is attached to root in Butterick’s Practical Typography. There’s not much to it —

(define (root . items)
  (decode (make-txexpr 'root '() items)
          #:txexpr-elements-proc detect-paragraphs
          #:block-txexpr-proc (compose1 hyphenate wrap-hanging-quotes)
          #:string-proc (compose1 smart-quotes smart-dashes)
          #:exclude-tags '(style script)))

The hyphenate function is not part of Pollen, but rather the hyphenate package, which you can install separately.

This illustrates another important point: even though decode presents an imposing list of arguments, you’re unlikely to use all of them at once. These represent possibilities, not requirements. For instance, let’s see what happens when decode is invoked without any of its optional arguments.

Examples:

> (define tx '(root "I wonder" (em "why") "this works."))
> (decode tx)

'(root "I wonder" (em "why") "this works.")

Right — nothing. That’s because the default value for the decoding arguments is the identity function, (λ (x) x). So all the input gets passed through intact unless another action is specified.
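To make the pass-through behavior concrete, here is a minimal sketch of a decode-like walker in plain Racket. It is not Pollen’s implementation: mini-decode and its reduced set of keyword arguments are hypothetical, and it assumes tagged X-expressions that carry no attribute lists. The point is only that every hook defaults to the identity function, so a call with no optional arguments hands back what it was given.

```racket
#lang racket/base

;; Hypothetical stand-in for decode, NOT Pollen's actual code.
;; Assumes a tagged X-expression of the form (tag element ...),
;; with no attribute list.
(define (mini-decode tx
                     #:tag-proc [tag-proc (λ (tag) tag)]
                     #:string-proc [string-proc (λ (str) str)]
                     #:exclude-tags [tags-to-exclude null])
  (define (walk x)
    (cond
      ;; strings go through the string hook
      [(string? x) (string-proc x)]
      ;; tagged X-expressions are walked recursively,
      ;; unless their tag is on the exclusion list
      [(and (pair? x) (symbol? (car x)))
       (if (memq (car x) tags-to-exclude)
           x
           (cons (tag-proc (car x)) (map walk (cdr x))))]
      ;; everything else (symbols, numbers) passes through intact
      [else x]))
  (walk tx))
```

With this sketch, (mini-decode '(root "I wonder" (em "why") "this works.")) returns its input unchanged, while adding #:string-proc string-upcase changes only the strings.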

The *-proc arguments of decode take procedures that are applied to specific categories of elements within tagged-xexpr.

The txexpr-tag-proc argument is a procedure that handles X-expression tags.

Examples:

> (define tx '(p "I'm from a strange" (strong "namespace")))
; Tags are symbols, so a tag-proc should return a symbol
> (decode tx #:txexpr-tag-proc (λ(t) (string->symbol (format "ns:~a" t))))

'(ns:p "I'm from a strange" (ns:strong "namespace"))

The txexpr-attrs-proc argument is a procedure that handles lists of X-expression attributes. (The txexpr module, included at no extra charge with Pollen, includes useful helper functions for dealing with these attribute lists.)

Examples:

> (define tx '(p [[id "first"]] "If I only had a brain."))
; Attrs is a list, so cons is OK for simple cases
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(p ((class "PhD") (id "first")) "If I only had a brain.")

Note that txexpr-attrs-proc will change the attributes of every tagged X-expression, even those that don’t have attributes. This is useful, because sometimes you want to add attributes where none existed before. But be careful, because the behavior may make your processing function overinclusive.

Examples:

> (define tx '(div (p [[id "first"]] "If I only had a brain.")
  (p "Me too.")))
; This will insert the new attribute everywhere
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(div
  ((class "PhD"))
  (p ((class "PhD") (id "first")) "If I only had a brain.")
  (p ((class "PhD")) "Me too."))

; This will add the new attribute only to non-null attribute lists
> (decode tx #:txexpr-attrs-proc
  (λ(attrs) (if (null? attrs) attrs (cons '[class "PhD"] attrs))))

'(div (p ((class "PhD") (id "first")) "If I only had a brain.") (p "Me too."))

The txexpr-elements-proc argument is a procedure that operates on the list of elements that represents the content of each tagged X-expression. Note that each element of an X-expression is subject to two passes through the decoder: once now, as a member of the list of elements, and also later, through its type-specific decoder (i.e., string-proc, symbol-proc, and so on).

Examples:

> (define tx '(div "Double" "\n" "toil" amp "trouble"))
; Every element gets doubled ...
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es)))

'(div "Double" "Double" "\n" "\n" "toil" "toil" amp amp "trouble" "trouble")

; ... but only strings get capitalized
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es))
  #:string-proc (λ(s) (string-upcase s)))

'(div "DOUBLE" "DOUBLE" "\n" "\n" "TOIL" "TOIL" amp amp "TROUBLE" "TROUBLE")

So why do you need txexpr-elements-proc? Because some types of element decoding depend on context, so the elements have to be handled as a group. For instance, the doubling function above, though useless, requires handling the element list as a whole, because elements are being added.

A more useful example: paragraph detection. The behavior is not merely a map across each element:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
; Context matters. Trailing whitespace is ignored ...
> (paras '(body "The first paragraph." "\n\n"))

'(body "The first paragraph.")

; ... but whitespace between strings is converted to a break.
> (paras '(body "The first paragraph." "\n\n" "And another."))

'(body (p "The first paragraph.") (p "And another."))

; A combination of both types
> (paras '(body "The first paragraph." "\n\n" "And another." "\n\n"))

'(body (p "The first paragraph.") (p "And another."))

The block-txexpr-proc and inline-txexpr-proc arguments are procedures that operate on tagged X-expressions. If the X-expression meets the block-txexpr? test, it is processed by block-txexpr-proc. Otherwise, it is processed by inline-txexpr-proc. Thus every tagged X-expression will be handled by one or the other. Of course, if you want block and inline elements to be handled the same way, you can set block-txexpr-proc and inline-txexpr-proc to be the same procedure.

Examples:

> (define tx '(div "Please" (em "mind the gap") (h1 "Tuesdays only")))
> (define add-ns (λ(tx) (make-txexpr
      (string->symbol (format "ns:~a" (get-tag tx)))
      (get-attrs tx)
      (get-elements tx))))
; div and h1 are block elements, so this will only affect them
> (decode tx #:block-txexpr-proc add-ns)

'(ns:div "Please" (em "mind the gap") (ns:h1 "Tuesdays only"))

; em is an inline element, so this will only affect it
> (decode tx #:inline-txexpr-proc add-ns)

'(div "Please" (ns:em "mind the gap") (h1 "Tuesdays only"))

; this will affect all elements
> (decode tx #:block-txexpr-proc add-ns #:inline-txexpr-proc add-ns)

'(ns:div "Please" (ns:em "mind the gap") (ns:h1 "Tuesdays only"))

The string-proc, symbol-proc, valid-char-proc, and cdata-proc arguments are procedures that operate on X-expressions that are strings, symbols, valid-chars, and CDATA, respectively. Deliberately, the output contracts for these procedures accept any kind of X-expression (meaning, the procedure can change the X-expression type).

Examples:

; A div with string, entity, character, and cdata elements
> (define tx `(div "Moe" amp 62 ,(cdata #f #f "3 > 2;")))
> (define rulify (λ(x) '(hr)))
; The rulify function is selectively applied to each
> (print (decode tx #:string-proc rulify))

'(div (hr) amp 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:symbol-proc rulify))

'(div "Moe" (hr) 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:valid-char-proc rulify))

'(div "Moe" amp (hr) #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:cdata-proc rulify))

'(div "Moe" amp 62 (hr))

Finally, the tags-to-exclude argument is a list of tags that will be exempted from decoding. Though you could get the same result by testing the input within the individual decoding functions, that’s tedious and potentially slower.

Examples:

> (define tx '(p "I really think" (em "italics") "should be lowercase."))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(p "I REALLY THINK" (em "ITALICS") "SHOULD BE LOWERCASE.")

> (decode tx #:string-proc (λ(s) (string-upcase s)) #:exclude-tags '(em))

'(p "I REALLY THINK" (em "italics") "SHOULD BE LOWERCASE.")

The tags-to-exclude argument is useful if you’re decoding source that’s destined to become HTML. According to the HTML spec, material within a <style> or <script> block needs to be preserved literally. In this example, if the CSS and JavaScript blocks are capitalized, they won’t work. So exclude '(style script), and problem solved.

Examples:

> (define tx '(body (h1 [[class "Red"]] "Let's visit Planet Telex.")
  (style [[type "text/css"]] ".Red {color: green;}")
  (script [[type "text/javascript"]] "var area = h * w;")))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(body
  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")
  (style ((type "text/css")) ".RED {COLOR: GREEN;}")
  (script ((type "text/javascript")) "VAR AREA = H * W;"))

> (decode tx #:string-proc (λ(s) (string-upcase s))
  #:exclude-tags '(style script))

'(body
  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")
  (style ((type "text/css")) ".Red {color: green;}")
  (script ((type "text/javascript")) "var area = h * w;"))

procedure

(decode-elements elements 
  [#:txexpr-tag-proc txexpr-tag-proc 
  #:txexpr-attrs-proc txexpr-attrs-proc 
  #:txexpr-elements-proc txexpr-elements-proc 
  #:block-txexpr-proc block-txexpr-proc 
  #:inline-txexpr-proc inline-txexpr-proc 
  #:string-proc string-proc 
  #:symbol-proc symbol-proc 
  #:valid-char-proc valid-char-proc 
  #:cdata-proc cdata-proc 
  #:exclude-tags tags-to-exclude]) 
  txexpr-elements?
  elements : txexpr-elements?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Identical to decode, but takes txexpr-elements? as input rather than a whole tagged X-expression, and likewise returns txexpr-elements? rather than a tagged X-expression. A convenience variant for use inside tag functions.
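Why would a tag function want this variant? Because a tag function receives its content as a bare list of elements, not as a tagged X-expression. The sketch below shows the idea in plain Racket. All of the names are hypothetical stand-ins (mini-decode, mini-decode-elements, shout), the walker only handles strings, and attribute lists are assumed absent. Wrapping the list in a temporary tag is just one way to get the effect; I’m not claiming it is how Pollen does it.

```racket
#lang racket/base

;; Simplified stand-in for decode: applies string-proc to every string.
;; Assumes tagged X-expressions of the form (tag element ...).
(define (mini-decode tx #:string-proc [string-proc (λ (str) str)])
  (let walk ([x tx])
    (cond
      [(string? x) (string-proc x)]
      [(pair? x) (cons (car x) (map walk (cdr x)))]
      [else x])))

;; Hypothetical elements variant: wrap the list in a throwaway tag,
;; decode the result, then discard the tag again.
(define (mini-decode-elements elements #:string-proc [string-proc (λ (str) str)])
  (cdr (mini-decode (cons 'temp-tag elements) #:string-proc string-proc)))

;; A tag function gets its elements as a rest argument,
;; so it can decode them without building a tagged X-expression first.
(define (shout . elements)
  (cons 'strong (mini-decode-elements elements #:string-proc string-upcase)))
```

Here (shout "hello " '(em "world")) produces '(strong "HELLO " (em "WORLD")).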

11.2.1 Block

Because it’s convenient, Pollen puts tagged X-expressions into two categories: block and inline. Why is it convenient? When using decode, you often want to treat the two categories differently. Not that you have to. But this is how you can.

parameter

(project-block-tags)  (listof txexpr-tag?)

(project-block-tags block-tags)  void?
  block-tags : (listof txexpr-tag?)
A parameter that defines the set of tags that decode will treat as blocks. This parameter is initialized with the HTML block tags, namely:

(address article aside audio blockquote body canvas dd div dl fieldset figcaption figure footer form h1 h2 h3 h4 h5 h6 header hgroup noscript ol output p pre section table tfoot ul video)

procedure

(register-block-tag tag)  void?

  tag : txexpr-tag?
Adds a tag to project-block-tags so that block-txexpr? will report it as a block, and decode will process it with block-txexpr-proc rather than inline-txexpr-proc.

Pollen tries to do the right thing without being told. But this is the rare case where you have to be explicit. If you introduce a tag into your markup that you want treated as a block, you must use this function to identify it, or you will get spooky behavior later on.

For instance, detect-paragraphs knows that block elements in the markup shouldn’t be wrapped in a p tag. So if you introduce a new block element called bloq without registering it as a block, misbehavior will follow:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (p (bloq "But not me.")))

; Wrong: bloq should not be wrapped

But once you register bloq as a block, order is restored:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (register-block-tag 'bloq)
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (bloq "But not me."))

; Right: bloq is treated as a block

If you find the idea of registering block tags unbearable, good news. The project-block-tags include the standard HTML block tags by default. So if you just want to use things like div and p and h1–h6, you’ll get the right behavior for free.

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (div "But not me.")))

'(body (p "I want to be a paragraph.") (div "But not me."))

procedure

(block-txexpr? v)  boolean?

  v : any/c
Predicate that tests whether v is a tagged X-expression, and if so, whether the tag is among the project-block-tags. If not, it is treated as inline. To adjust how this test works, use register-block-tag.

11.2.2 Typography

An assortment of typography & layout functions, designed to be used with decode. These aren’t hard to write. So if you like these, use them. If not, make your own.

procedure

(whitespace? v)  boolean?

  v : any/c
A predicate that returns #t for any stringlike v (such as a string or symbol) that’s entirely whitespace, for the empty string, and for lists and vectors made only of whitespace? members. Following the regexp-match convention, whitespace? does not return #t for a nonbreaking space. If you want nonbreaking spaces to count as whitespace, use whitespace/nbsp?.

Examples:

> (whitespace? "\n\n   ")

#t

> (whitespace? (string->symbol "\n\n   "))

#t

> (whitespace? "")

#t

> (whitespace? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace? nonbreaking-space)

#f

procedure

(whitespace/nbsp? v)  boolean?

  v : any/c
Like whitespace?, but also returns #t for nonbreaking spaces.

Examples:

> (whitespace/nbsp? "\n\n   ")

#t

> (whitespace/nbsp? (string->symbol "\n\n   "))

#t

> (whitespace/nbsp? "")

#t

> (whitespace/nbsp? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace/nbsp? nonbreaking-space)

#t

procedure

(smart-quotes str)  string?

  str : string?
Convert straight quotes in str to curly according to American English conventions.

Examples:

> (define tricky-string
  "\"Why,\" she could've asked, \"are we in O‘ahu watching 'Mame'?\"")
> (display tricky-string)

"Why," she could've asked, "are we in O‘ahu watching 'Mame'?"

> (display (smart-quotes tricky-string))

“Why,” she could’ve asked, “are we in O‘ahu watching ‘Mame’?”

procedure

(smart-dashes str)  string?

  str : string?
In str, convert three hyphens to an em dash, and two hyphens to an en dash, and remove surrounding spaces.

Examples:

> (define tricky-string "I had a few --- OK, like 6--8 --- thin mints.")
> (display tricky-string)

I had a few --- OK, like 6--8 --- thin mints.

> (display (smart-dashes tricky-string))

I had a few—OK, like 6–8—thin mints.

; Monospaced font not great for showing dashes, but you get the idea

procedure

(detect-linebreaks tagged-xexpr-elements 
  [#:separator linebreak-sep 
  #:insert linebreak]) 
  txexpr-elements?
  tagged-xexpr-elements : txexpr-elements?
  linebreak-sep : string? = world:linebreak-separator
  linebreak : xexpr? = '(br)
Within tagged-xexpr-elements, convert occurrences of linebreak-sep ("\n" by default) to linebreak, but only if linebreak-sep does not occur between blocks (see block-txexpr?). Why? Because block-level elements automatically display on a new line, so adding linebreak would be superfluous. In that case, linebreak-sep just disappears.

Examples:

> (detect-linebreaks '(div "Two items:" "\n" (em "Eggs") "\n" (em "Bacon")))

'(div "Two items:" (br) (em "Eggs") (br) (em "Bacon"))

> (detect-linebreaks '(div "Two items:" "\n" (div "Eggs") "\n" (div "Bacon")))

'(div "Two items:" (div "Eggs") (div "Bacon"))

procedure

(detect-paragraphs elements 
  [#:separator paragraph-sep 
  #:tag paragraph-tag 
  #:linebreak-proc linebreak-proc]) 
  txexpr-elements?
  elements : txexpr-elements?
  paragraph-sep : string? = world:paragraph-separator
  paragraph-tag : symbol? = 'p
  linebreak-proc : (txexpr-elements? . -> . txexpr-elements?)
   = detect-linebreaks
Find paragraphs within elements (as denoted by paragraph-sep) and wrap them with paragraph-tag. Also handle linebreaks using detect-linebreaks.

If an element is already a block-txexpr?, it will not be wrapped as a paragraph (because in that case, the wrapping would be superfluous). As a consequence, if paragraph-sep occurs between two blocks, it will be ignored (as in the example below using two sequential 'div blocks).

The paragraph-tag argument sets the tag used to wrap paragraphs.

The linebreak-proc argument allows you to use a linebreaking procedure other than the usual detect-linebreaks.

Examples:

> (detect-paragraphs '("First para" "\n\n" "Second para"))

'((p "First para") (p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line"))

'((p "First para") (p "Second para" (br) "Second line"))

> (detect-paragraphs '("First para" "\n\n" (div "Second block")))

'((p "First para") (div "Second block"))

> (detect-paragraphs '((div "First block") "\n\n" (div "Second block")))

'((div "First block") (div "Second block"))

> (detect-paragraphs '("First para" "\n\n" "Second para") #:tag 'ns:p)

'((ns:p "First para") (ns:p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line")
  #:linebreak-proc (λ(x) (detect-linebreaks x #:insert '(newline))))

'((p "First para") (p "Second para" (newline) "Second line"))

procedure

(wrap-hanging-quotes tx 
  [#:single-preprend single-preprender 
  #:double-preprend double-preprender]) 
  txexpr?
  tx : txexpr?
  single-preprender : txexpr-tag? = 'squo
  double-preprender : txexpr-tag? = 'dquo
Find single or double quote marks at the beginning of tx and wrap them in an X-expression with the tag single-preprender or double-preprender, respectively. The default values are 'squo and 'dquo.

Examples:

> (wrap-hanging-quotes '(p "No quote to hang."))

'(p "No quote to hang.")

> (wrap-hanging-quotes '(p "“What? We need to hang quotes?”"))

'(p (dquo "“" "What? We need to hang quotes?”"))

In pro typography, quotation marks at the beginning of a line or paragraph are often shifted into the margin slightly to make them appear more optically aligned with the left edge of the text. With a reflowable layout model like HTML, you don’t know where your line breaks will be.

This function will simply insert the 'squo and 'dquo tags, which provide hooks that let you do the actual hanging via CSS, like so (actual measurement can be refined to taste):

squo {margin-left: -0.25em;}

dquo {margin-left: -0.50em;}

Be warned: there are many edge cases this function does not handle well.

Examples:

; Argh: this edge case is not handled properly
> (wrap-hanging-quotes '(p "“" (em "What?") "We need to hang quotes?”"))

'(p "“" (em "What?") "We need to hang quotes?”")

 
\ No newline at end of file +11.2 Decode
On this page:
decode
decode-elements
11.2.1 Block
project-block-tags
register-block-tag
block-txexpr?
11.2.2 Typography
whitespace?
whitespace/nbsp?
smart-quotes
smart-dashes
detect-linebreaks
detect-paragraphs
wrap-hanging-quotes
6.1.0.5

11.2 Decode

 (require pollen/decode) package: pollen

The doc export of a Pollen markup file is a simple X-expression. Decoding refers to any post-processing of this X-expression. The pollen/decode module provides tools for creating decoders.

The decode step can happen separately from the compilation of the file. But you can also attach a decoder to the markup file’s root node, so the decoding happens automatically when the markup is compiled, and its result is automatically incorporated into doc. (Following this approach, you could also attach multiple decoders to different tags within doc.)

You can, of course, embed function calls within Pollen markup. But since markup is optimized for authors, decoding is useful for operations that can or should be moved out of the authoring layer.

One example is presentation and layout. For instance, detect-paragraphs is a decoder function that lets authors mark paragraphs in their source simply by separating them with a blank line (that is, two consecutive newlines).

Another example is conversion of output into a particular data format. Most Pollen functions are optimized for HTML output, but one could write a decoder that targets another format.

procedure

(decode tagged-xexpr    
  [#:txexpr-tag-proc txexpr-tag-proc    
  #:txexpr-attrs-proc txexpr-attrs-proc    
  #:txexpr-elements-proc txexpr-elements-proc    
  #:block-txexpr-proc block-txexpr-proc    
  #:inline-txexpr-proc inline-txexpr-proc    
  #:string-proc string-proc    
  #:symbol-proc symbol-proc    
  #:valid-char-proc valid-char-proc    
  #:cdata-proc cdata-proc    
  #:exclude-tags tags-to-exclude])  txexpr?
  tagged-xexpr : txexpr?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Recursively process a tagged-xexpr, usually the one exported from a Pollen source file as doc.

This function doesn’t do much on its own. Rather, it provides the hooks upon which harder-working functions can be hung.

Recall from [future link: Pollen mechanics] that any tag can have a function attached to it. By default, the tagged-xexpr from a source file is tagged with root. So the typical way to use decode is to attach your decoding functions to it, and then define root to invoke your decode function. Then it will be automatically applied to every doc during compile.

For instance, here’s how decode is attached to root in Butterick’s Practical Typography. There’s not much to it —

(define (root . items)
  (decode (make-txexpr 'root '() items)
          #:txexpr-elements-proc detect-paragraphs
          #:block-txexpr-proc (compose1 hyphenate wrap-hanging-quotes)
          #:string-proc (compose1 smart-quotes smart-dashes)
          #:exclude-tags '(style script)))

The hyphenate function is not part of Pollen, but rather the hyphenate package, which you can install separately.

This illustrates another important point: even though decode presents an imposing list of arguments, you’re unlikely to use all of them at once. These represent possibilities, not requirements. For instance, let’s see what happens when decode is invoked without any of its optional arguments.

Examples:

> (define tx '(root "I wonder" (em "why") "this works."))
> (decode tx)

'(root "I wonder" (em "why") "this works.")

Right — nothing. That’s because the default value for the decoding arguments is the identity function, (λ (x) x). So all the input gets passed through intact unless another action is specified.

The *-proc arguments of decode take procedures that are applied to specific categories of elements within tagged-xexpr.

The txexpr-tag-proc argument is a procedure that handles X-expression tags.

Examples:

> (define tx '(p "I'm from a strange" (strong "namespace")))
; Tags are symbols, so a tag-proc should return a symbol
> (decode tx #:txexpr-tag-proc (λ(t) (string->symbol (format "ns:~a" t))))

'(ns:p "I'm from a strange" (ns:strong "namespace"))

The txexpr-attrs-proc argument is a procedure that handles lists of X-expression attributes. (The txexpr module, included at no extra charge with Pollen, includes useful helper functions for dealing with these attribute lists.)

Examples:

> (define tx '(p [[id "first"]] "If I only had a brain."))
; Attrs is a list, so cons is OK for simple cases
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(p ((class "PhD") (id "first")) "If I only had a brain.")

Note that txexpr-attrs-proc will change the attributes of every tagged X-expression, even those that don’t have attributes. This is useful, because sometimes you want to add attributes where none existed before. But be careful, because the behavior may make your processing function overinclusive.

Examples:

> (define tx '(div (p [[id "first"]] "If I only had a brain.")
  (p "Me too.")))
; This will insert the new attribute everywhere
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(div

  ((class "PhD"))

  (p ((class "PhD") (id "first")) "If I only had a brain.")

  (p ((class "PhD")) "Me too."))

; This will add the new attribute only to non-null attribute lists
> (decode tx #:txexpr-attrs-proc
  (λ(attrs) (if (null? attrs) attrs (cons '[class "PhD"] attrs))))

'(div (p ((class "PhD") (id "first")) "If I only had a brain.") (p "Me too."))

The txexpr-elements-proc argument is a procedure that operates on the list of elements that represents the content of each tagged X-expression. Note that each element of an X-expression is subject to two passes through the decoder: once now, as a member of the list of elements, and also later, through its type-specific decoder (i.e., string-proc, symbol-proc, and so on).

Examples:

> (define tx '(div "Double" "\n" "toil" amp "trouble"))
; Every element gets doubled ...
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es)))

'(div "Double" "Double" "\n" "\n" "toil" "toil" amp amp "trouble" "trouble")

; ... but only strings get capitalized
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es))
  #:string-proc (λ(s) (string-upcase s)))

'(div "DOUBLE" "DOUBLE" "\n" "\n" "TOIL" "TOIL" amp amp "TROUBLE" "TROUBLE")

So why do you need txexpr-elements-proc? Because some types of element decoding depend on context, so it’s necessary to handle the elements as a group. For instance, the doubling function above, though useless, requires handling the element list as a whole, because elements are being added.

A more useful example: paragraph detection. The behavior is not merely a map across each element:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
; Context matters. Trailing whitespace is ignored ...
> (paras '(body "The first paragraph." "\n\n"))

'(body "The first paragraph.")

; ... but whitespace between strings is converted to a break.
> (paras '(body "The first paragraph." "\n\n" "And another."))

'(body (p "The first paragraph.") (p "And another."))

; A combination of both types
> (paras '(body "The first paragraph." "\n\n" "And another." "\n\n"))

'(body (p "The first paragraph.") (p "And another."))

The block-txexpr-proc and inline-txexpr-proc arguments are procedures that operate on tagged X-expressions. If the X-expression meets the block-txexpr? test, it’s processed by block-txexpr-proc. Otherwise, it’s inline, so it’s processed by inline-txexpr-proc. (Careful, however: these aren’t mutually exclusive, because block-txexpr-proc operates on all the elements of a block, including other tagged X-expressions within.)

Of course, if you want block and inline elements to be handled the same way, you can set block-txexpr-proc and inline-txexpr-proc to be the same procedure.

Examples:

> (define tx '(div "Please" (em "mind the gap") (h1 "Tuesdays only")))
> (define add-ns (λ(tx) (make-txexpr
      (string->symbol (format "ns:~a" (get-tag tx)))
      (get-attrs tx)
      (get-elements tx))))
; div and h1 are block elements, so this will only affect them
> (decode tx #:block-txexpr-proc add-ns)

'(ns:div "Please" (em "mind the gap") (ns:h1 "Tuesdays only"))

; em is an inline element, so this will only affect it
> (decode tx #:inline-txexpr-proc add-ns)

'(div "Please" (ns:em "mind the gap") (h1 "Tuesdays only"))

; this will affect all elements
> (decode tx #:block-txexpr-proc add-ns #:inline-txexpr-proc add-ns)

'(ns:div "Please" (ns:em "mind the gap") (ns:h1 "Tuesdays only"))

The string-proc, symbol-proc, valid-char-proc, and cdata-proc arguments are procedures that operate on X-expressions that are strings, symbols, valid-chars, and CDATA, respectively. Deliberately, the output contracts for these procedures accept any kind of X-expression (meaning, the procedure can change the X-expression type).

Examples:

; A div with string, entity, character, and cdata elements
> (define tx `(div "Moe" amp 62 ,(cdata #f #f "3 > 2;")))
> (define rulify (λ(x) '(hr)))
; The rulify function is selectively applied to each
> (print (decode tx #:string-proc rulify))

'(div (hr) amp 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:symbol-proc rulify))

'(div "Moe" (hr) 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:valid-char-proc rulify))

'(div "Moe" amp (hr) #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:cdata-proc rulify))

'(div "Moe" amp 62 (hr))

Finally, the tags-to-exclude argument is a list of tags that will be exempted from decoding. Though you could get the same result by testing the input within the individual decoding functions, that’s tedious and potentially slower.

Examples:

> (define tx '(p "I really think" (em "italics") "should be lowercase."))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(p "I REALLY THINK" (em "ITALICS") "SHOULD BE LOWERCASE.")

> (decode tx #:string-proc (λ(s) (string-upcase s)) #:exclude-tags '(em))

'(p "I REALLY THINK" (em "italics") "SHOULD BE LOWERCASE.")

The tags-to-exclude argument is useful if you’re decoding source that’s destined to become HTML. According to the HTML spec, material within a <style> or <script> block needs to be preserved literally. In this example, if the CSS and JavaScript blocks are capitalized, they won’t work. So exclude '(style script), and problem solved.

Examples:

> (define tx '(body (h1 [[class "Red"]] "Let's visit Planet Telex.")
  (style [[type "text/css"]] ".Red {color: green;}")
  (script [[type "text/javascript"]] "var area = h * w;")))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".RED {COLOR: GREEN;}")

  (script ((type "text/javascript")) "VAR AREA = H * W;"))

> (decode tx #:string-proc (λ(s) (string-upcase s))
  #:exclude-tags '(style script))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".Red {color: green;}")

  (script ((type "text/javascript")) "var area = h * w;"))
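
To make the recursion and the effect of tags-to-exclude concrete, here is a toy decoder in plain Racket. It is a sketch for illustration only, not Pollen’s implementation: the name mini-decode is hypothetical, it supports just a string procedure and a list of excluded tags, and it assumes X-expressions that carry no attribute lists.

```racket
#lang racket/base

;; mini-decode is a hypothetical name; this is NOT Pollen's decode.
;; Assumption: every tagged X-expression is (tag element ...) with no attributes.
(define (mini-decode tx
                     #:string-proc [string-proc (λ (s) s)]
                     #:exclude-tags [excluded '()])
  (define tag (car tx))
  (if (memq tag excluded)
      tx ; excluded tags are returned untouched, contents included
      (cons tag
            (for/list ([e (in-list (cdr tx))])
              (cond
                [(string? e) (string-proc e)]
                [(pair? e) (mini-decode e
                                        #:string-proc string-proc
                                        #:exclude-tags excluded)]
                [else e]))))) ; symbols, numbers, etc. pass through

(mini-decode '(p "I really think" (em "italics") "should be lowercase.")
             #:string-proc string-upcase
             #:exclude-tags '(em))
```

The real decode does considerably more (attributes, block versus inline handling, the other type-specific procedures), but the overall shape is the same: walk the tree, dispatch on the kind of element, and skip anything under an excluded tag.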

procedure

(decode-elements elements 
  [#:txexpr-tag-proc txexpr-tag-proc 
  #:txexpr-attrs-proc txexpr-attrs-proc 
  #:txexpr-elements-proc txexpr-elements-proc 
  #:block-txexpr-proc block-txexpr-proc 
  #:inline-txexpr-proc inline-txexpr-proc 
  #:string-proc string-proc 
  #:symbol-proc symbol-proc 
  #:valid-char-proc valid-char-proc 
  #:cdata-proc cdata-proc 
  #:exclude-tags tags-to-exclude]) 
  txexpr-elements?
  elements : txexpr-elements?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Identical to decode, but takes txexpr-elements? as input rather than a whole tagged X-expression, and likewise returns txexpr-elements? rather than a tagged X-expression. A convenience variant for use inside tag functions.

11.2.1 Block

Because it’s convenient, Pollen puts tagged X-expressions into two categories: block and inline. Why is it convenient? When using decode, you often want to treat the two categories differently. Not that you have to. But this is how you can.

parameter

(project-block-tags)  (listof txexpr-tag?)

(project-block-tags block-tags)  void?
  block-tags : (listof txexpr-tag?)
A parameter that defines the set of tags that decode will treat as blocks. This parameter is initialized with the HTML block tags, namely:

(address article aside audio blockquote body canvas dd div dl fieldset figcaption figure footer form h1 h2 h3 h4 h5 h6 header hgroup noscript ol output p pre section table tfoot ul video)

procedure

(register-block-tag tag)  void?

  tag : txexpr-tag?
Adds a tag to project-block-tags so that block-txexpr? will report it as a block, and decode will process it with block-txexpr-proc rather than inline-txexpr-proc.

Pollen tries to do the right thing without being told. But this is the rare case where you have to be explicit. If you introduce a tag into your markup that you want treated as a block, you must use this function to identify it, or you will get spooky behavior later on.

For instance, detect-paragraphs knows that block elements in the markup shouldn’t be wrapped in a p tag. So if you introduce a new block element called bloq without registering it as a block, misbehavior will follow:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (p (bloq "But not me.")))

; Wrong: bloq should not be wrapped

But once you register bloq as a block, order is restored:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (register-block-tag 'bloq)
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (bloq "But not me."))

; Right: bloq is treated as a block

If you find the idea of registering block tags unbearable, good news. The project-block-tags include the standard HTML block tags by default. So if you just want to use things like div and p and h1–h6, you’ll get the right behavior for free.

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (div "But not me.")))

'(body (p "I want to be a paragraph.") (div "But not me."))

procedure

(block-txexpr? v)  boolean?

  v : any/c
Predicate that tests whether v is a tagged X-expression, and if so, whether the tag is among the project-block-tags. If not, it is treated as inline. To adjust how this test works, use register-block-tag.

11.2.2 Typography

An assortment of typography & layout functions, designed to be used with decode. These aren’t hard to write. So if you like these, use them. If not, make your own.

procedure

(whitespace? v)  boolean?

  v : any/c
A predicate that returns #t for any stringlike v (such as a string or symbol) that’s entirely whitespace, for the empty string, and for lists and vectors made only of whitespace? members. Following the regexp-match convention, whitespace? does not return #t for a nonbreaking space. If you want nonbreaking spaces to count as whitespace, use whitespace/nbsp?.

Examples:

> (whitespace? "\n\n   ")

#t

> (whitespace? (string->symbol "\n\n   "))

#t

> (whitespace? "")

#t

> (whitespace? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace? nonbreaking-space)

#f
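
If you want to write a predicate like this yourself, a minimal sketch in plain Racket might look like the following. The name sketch-whitespace? is hypothetical, and the code only mirrors the behavior described above; it is not claimed to match Pollen’s own implementation. It relies on the fact that \s in a Racket pregexp matches ASCII whitespace only, which is why the nonbreaking space is rejected.

```racket
#lang racket/base

;; sketch-whitespace? is a hypothetical name, written for illustration.
(define (sketch-whitespace? v)
  (cond
    ;; \s matches only ASCII whitespace, so U+00A0 does not qualify;
    ;; the * quantifier lets the empty string pass.
    [(string? v) (regexp-match? #px"^\\s*$" v)]
    [(symbol? v) (sketch-whitespace? (symbol->string v))]
    [(pair? v) (andmap sketch-whitespace? v)]
    [(vector? v) (sketch-whitespace? (vector->list v))]
    [else #f]))

(sketch-whitespace? '("" "  " "\n\n\n" " \n"))
(sketch-whitespace? (string #\u00A0))
```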

procedure

(whitespace/nbsp? v)  boolean?

  v : any/c
Like whitespace?, but also returns #t for nonbreaking spaces.

Examples:

> (whitespace/nbsp? "\n\n   ")

#t

> (whitespace/nbsp? (string->symbol "\n\n   "))

#t

> (whitespace/nbsp? "")

#t

> (whitespace/nbsp? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace/nbsp? nonbreaking-space)

#t

procedure

(smart-quotes str)  string?

  str : string?
Convert straight quotes in str to curly according to American English conventions.

Examples:

> (define tricky-string
  "\"Why,\" she could've asked, \"are we in O‘ahu watching 'Mame'?\"")
> (display tricky-string)

"Why," she could've asked, "are we in O‘ahu watching 'Mame'?"

> (display (smart-quotes tricky-string))

“Why,” she could’ve asked, “are we in O‘ahu watching ‘Mame’?”

procedure

(smart-dashes str)  string?

  str : string?
In str, convert three hyphens to an em dash, and two hyphens to an en dash, and remove surrounding spaces.

Examples:

> (define tricky-string "I had a few --- OK, like 6--8 --- thin mints.")
> (display tricky-string)

I had a few --- OK, like 6--8 --- thin mints.

> (display (smart-dashes tricky-string))

I had a few—OK, like 6–8—thin mints.

; Monospaced font not great for showing dashes, but you get the idea
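
As noted above, functions like this aren’t hard to write. Here is one way the dash conversion could be sketched with two regular-expression passes in plain Racket. The name sketch-smart-dashes is hypothetical, and Pollen’s own smart-dashes may differ in its details; the only point is that the three-hyphen case has to be handled before the two-hyphen case.

```racket
#lang racket/base

;; sketch-smart-dashes is a hypothetical name, written for illustration.
(define (sketch-smart-dashes str)
  ;; First pass: three hyphens (plus any surrounding whitespace) become an em dash.
  ;; Doing this first keeps "---" from being read as "--" followed by "-".
  (define with-em (regexp-replace* #px"\\s*---\\s*" str "\u2014"))
  ;; Second pass: remaining double hyphens become an en dash.
  (regexp-replace* #px"\\s*--\\s*" with-em "\u2013"))

(display (sketch-smart-dashes "I had a few --- OK, like 6--8 --- thin mints."))
```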

procedure

(detect-linebreaks tagged-xexpr-elements 
  [#:separator linebreak-sep 
  #:insert linebreak]) 
  txexpr-elements?
  tagged-xexpr-elements : txexpr-elements?
  linebreak-sep : string? = world:linebreak-separator
  linebreak : xexpr? = '(br)
Within tagged-xexpr-elements, convert occurrences of linebreak-sep ("\n" by default) to linebreak, but only if linebreak-sep does not occur between blocks (see block-txexpr?). Why? Because block-level elements automatically display on a new line, so adding linebreak would be superfluous. In that case, linebreak-sep just disappears.

Examples:

> (detect-linebreaks '(div "Two items:" "\n" (em "Eggs") "\n" (em "Bacon")))

'(div "Two items:" (br) (em "Eggs") (br) (em "Bacon"))

> (detect-linebreaks '(div "Two items:" "\n" (div "Eggs") "\n" (div "Bacon")))

'(div "Two items:" (div "Eggs") (div "Bacon"))

procedure

(detect-paragraphs elements 
  [#:separator paragraph-sep 
  #:tag paragraph-tag 
  #:linebreak-proc linebreak-proc]) 
  txexpr-elements?
  elements : txexpr-elements?
  paragraph-sep : string? = world:paragraph-separator
  paragraph-tag : symbol? = 'p
  linebreak-proc : (txexpr-elements? . -> . txexpr-elements?)
   = detect-linebreaks
Find paragraphs within elements (as denoted by paragraph-sep) and wrap them with paragraph-tag. Also handle linebreaks using detect-linebreaks.

If an element is already a block-txexpr?, it will not be wrapped as a paragraph (because in that case, the wrapping would be superfluous). As a consequence, if paragraph-sep occurs between two blocks, it will be ignored (as in the example below using two sequential 'div blocks).

The paragraph-tag argument sets the tag used to wrap paragraphs.

The linebreak-proc argument allows you to use a linebreaking procedure other than the usual detect-linebreaks.

Examples:

> (detect-paragraphs '("First para" "\n\n" "Second para"))

'((p "First para") (p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line"))

'((p "First para") (p "Second para" (br) "Second line"))

> (detect-paragraphs '("First para" "\n\n" (div "Second block")))

'((p "First para") (div "Second block"))

> (detect-paragraphs '((div "First block") "\n\n" (div "Second block")))

'((div "First block") (div "Second block"))

> (detect-paragraphs '("First para" "\n\n" "Second para") #:tag 'ns:p)

'((ns:p "First para") (ns:p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line")
  #:linebreak-proc (λ(x) (detect-linebreaks x #:insert '(newline))))

'((p "First para") (p "Second para" (newline) "Second line"))

procedure

(wrap-hanging-quotes tx 
  [#:single-preprend single-preprender 
  #:double-preprend double-preprender]) 
  txexpr?
  tx : txexpr?
  single-preprender : txexpr-tag? = 'squo
  double-preprender : txexpr-tag? = 'dquo
Find single or double quote marks at the beginning of tx and wrap them in an X-expression with the tag single-preprender or double-preprender, respectively. The default values are 'squo and 'dquo.

Examples:

> (wrap-hanging-quotes '(p "No quote to hang."))

'(p "No quote to hang.")

> (wrap-hanging-quotes '(p "“What? We need to hang quotes?”"))

'(p (dquo "“" "What? We need to hang quotes?”"))

In pro typography, quotation marks at the beginning of a line or paragraph are often shifted into the margin slightly to make them appear more optically aligned with the left edge of the text. With a reflowable layout model like HTML, you don’t know where your line breaks will be.

This function will simply insert the 'squo and 'dquo tags, which provide hooks that let you do the actual hanging via CSS, like so (actual measurement can be refined to taste):

squo {margin-left: -0.25em;}

dquo {margin-left: -0.50em;}

Be warned: there are many edge cases this function does not handle well.

Examples:

; Argh: this edge case is not handled properly
> (wrap-hanging-quotes '(p "“" (em "What?") "We need to hang quotes?”"))

'(p "“" (em "What?") "We need to hang quotes?”")

 
\ No newline at end of file diff --git a/doc/Pagetree.html b/doc/Pagetree.html index 89f4383..9ba8a98 100644 --- a/doc/Pagetree.html +++ b/doc/Pagetree.html @@ -1,2 +1,2 @@ -11.4 Pagetree
On this page:
11.4.1 Making pagetrees with a source file
11.4.2 Making pagetrees by hand
11.4.3 Using pagetrees for navigation
11.4.4 Using index.ptree in the dashboard
11.4.5 Using pagetrees with raco pollen render
11.4.6 Functions
11.4.6.1 Predicates & validation
pagetree?
validate-pagetree
pagenode?
pagenodeish?
->pagenode
11.4.6.2 Navigation
current-pagetree
parent
children
siblings
previous
previous*
next
next*
11.4.6.3 Utilities
pagetree->list
in-pagetree?
path->pagenode
6.1.0.5

11.4 Pagetree

 (require pollen/pagetree) package: pollen

Books and other long documents are usually organized in a structured way — at minimum they have a sequence of pages, but more often they have sections with subsequences within. Individual pages in a Pollen project don’t know anything about how they’re connected to other pages. In theory, you could maintain this information within the source files. But this would be a poor use of human energy.

Instead, use a pagetree. A pagetree is a simple abstraction for defining & working with sequences of pagenodes. Typically these pagenodes will be the names of output files in your project.

“So it’s a list of web-page filenames?” Sort of. When I think of a web page, I think of an actual file on a disk. Keeping with Pollen’s orientation toward dynamic rendering, pagenodes may — and often do — refer to files that don’t yet exist. Moreover, by referring to output names rather than source names, you retain the flexibility to change the kind of source associated with a particular pagenode (e.g., from preprocessor source to Pollen markup).

Pagetrees can be flat or hierarchical. A flat pagetree is just a list of pagenodes. A hierarchical pagetree can also contain recursively nested lists of pagenodes. But you needn’t pay attention to this distinction, as the pagetree functions don’t care which kind you use. Neither do I.

Pagetrees surface throughout the Pollen system. They’re primarily used for navigation — for instance, calculating “previous,” “next,” or “up” links for a given page. A special pagetree, index.ptree, is used by the project server to order the files in a dashboard. Pagetrees can also be used to define batches of files for certain operations, for instance raco pollen render. You might find other uses for them too.

11.4.1 Making pagetrees with a source file

A pagetree source file either starts with #lang pollen and uses the .ptree extension, or starts with #lang pollen/ptree and then can have any file extension.

Because pagetree source is not rendered into an output format, the rest of the filename is up to you, unlike with other Pollen source files.

Here’s a flat pagetree. Each line is considered a single pagenode (blank lines are ignored). Notice that neither Pollen command syntax nor quoting is needed within the pagetree source:

"flat.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html

And here’s the output in DrRacket:

'(pagetree-root index.html introduction.html main_argument.html conclusion.html)

Keeping with usual Pollen policy, this is an X-expression. The pagetree-root is just an arbitrary tag that contains the pagetree.

Upgrading to a hierarchical pagetree is simple. The same basic rule applies — one pagenode per line. But this time, you add Pollen command syntax: a lozenge in front of a pagenode marks it as the top of a nested list, and the sub-pagenodes of that list go between { curly braces }, like so:

"hierarchical.ptree"
#lang pollen
 
toc.html
◊first-chapter.html{
    foreword.html
    introduction.html}
◊second-chapter.html{
    ◊main-argument.html{
        facts.html
        analysis.html}
    conclusion.html}
bibliography.html

The output of our hierarchical pagetree:

'(pagetree-root toc.html (first-chapter.html foreword.html introduction.html) (second-chapter.html (main-argument.html facts.html analysis.html) conclusion.html) bibliography.html)

One advantage of using a source file is that when you run it in DrRacket, it will automatically be checked using validate-pagetree, which ensures that every element in the pagetree meets pagenode?, and that all the pagenodes are unique.

This pagetree has a duplicate pagenode, so it won’t run:

"duplicate-pagenode.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html
index.html

Instead, you’ll get an error:

validate-pagetree: members-unique? failed because item isn’t unique: (index.html)

11.4.2 Making pagetrees by hand

Experienced programmers may want to know that because a pagetree is just an X-expression, you can synthesize a pagetree using any Pollen or Racket tools for making X-expressions. For example, here’s some Racket code that generates the same pagetree as the flat.ptree source file above:

"make-flat-ptree.rkt"
#lang racket
(require pollen/pagetree)
(define node-names '(index introduction main_argument conclusion))
(define pt `(pagetree-root
  ,@(map (λ(n) (string->symbol (format "~a.html" n))) node-names)))
(if (pagetree? pt) pt "Oops, not a pagetree")

Note that you need to take more care when building a pagetree by hand. Pagenodes are symbols, not strings, thus the use of string->symbol is mandatory. One benefit of using a pagetree source file is that it takes care of this housekeeping for you.

11.4.3 Using pagetrees for navigation

Typically you’ll call the pagetree-navigation functions from inside templates, using the special variable here as the starting point. For more on this technique, see Pagetree navigation.

11.4.4 Using index.ptree in the dashboard

When you’re using the project server to view the files in a directory, the server will first look for a file called index.ptree. If it finds this pagetree file, it will use it to build the dashboard. If not, then it will synthesize a pagetree using a directory listing. For more on this technique, see Using the dashboard.

11.4.5 Using pagetrees with raco pollen render

The raco pollen render command is used to regenerate an output file from its source. If you pass a pagetree to raco pollen render, it will automatically render each file listed in the pagetree.

For instance, many projects have auxiliary pages that don’t really belong in the main navigational flow. You can collect these pages in a separate pagetree:

"utility.ptree"
#lang pollen
 
404-error.html
terms-of-service.html
webmaster.html
[... and so on]

Thus, when you’re using pagetree-navigation functions within a template, you can use your main pagetree, and restrict the navigation to the main editorial content. But when you render the project, you can pass both pagetrees to raco pollen render.

For more on this technique, see raco pollen render.

11.4.6 Functions
11.4.6.1 Predicates & validation

procedure

(pagetree? possible-pagetree)  boolean?

  possible-pagetree : any/c
Test whether possible-pagetree is a valid pagetree. It must be a txexpr? where all elements are pagenode?, and each is unique within possible-pagetree (not counting the root node).

Examples:

> (pagetree? '(root index.html))

#t

> (pagetree? '(root duplicate.html duplicate.html))

#f

> (pagetree? '(root index.html "string.html"))

#f

> (define nested-ptree '(root 1.html 2.html (3.html 3a.html 3b.html)))
> (pagetree? nested-ptree)

#t

> (pagetree? `(root index.html ,nested-ptree (subsection.html more.html)))

#t

; Nesting a subtree twice creates duplication
> (pagetree? `(root index.html ,nested-ptree (subsection.html ,nested-ptree)))

#f

procedure

(validate-pagetree possible-pagetree)  pagetree?

  possible-pagetree : any/c
Like pagetree?, but raises a descriptive error if possible-pagetree is invalid, and otherwise returns possible-pagetree itself.

Examples:

> (validate-pagetree '(root (mama.html son.html daughter.html) uncle.html))

'(root (mama.html son.html daughter.html) uncle.html)

> (validate-pagetree `(root (,+ son.html daughter.html) uncle.html))

#f

> (validate-pagetree '(root (mama.html son.html son.html) mama.html))

validate-pagetree: members-unique? failed because items aren’t unique: (son.html mama.html)

procedure

(pagenode? possible-pagenode)  boolean?

  possible-pagenode : any/c
Test whether possible-pagenode is a valid pagenode. A pagenode can be any symbol? that is not whitespace/nbsp?. Every leaf of a pagetree is a pagenode. In practice, your pagenodes will likely be names of output files.

Pagenodes are symbols (rather than strings) so that pagetrees will be valid tagged X-expressions, which is a more convenient format for validation & processing.

Examples:

; Three symbols, the third one annoying but valid
> (map pagenode? '(symbol index.html |   silly   |))

'(#t #t #t)

; A number, a string, a txexpr, and a whitespace symbol
> (map pagenode? '(9.999 "index.html" (p "Hello") |    |))

'(#f #f #f #f)

procedure

(pagenodeish? v)  boolean?

  v : any/c
Return #t if v can be converted with ->pagenode.

Example:

> (map pagenodeish? '(9.999 "index.html" |    |))

'(#t #t #f)

procedure

(->pagenode v)  pagenode?

  v : pagenodeish?
Convert v to a pagenode.

Examples:

> (map pagenodeish? '(symbol 9.999 "index.html" |  silly  |))

'(#t #t #t #t)

> (map ->pagenode '(symbol 9.999 "index.html" |  silly  |))

'(symbol |9.999| index.html |  silly  |)

11.4.6.2 Navigation

parameter

(current-pagetree)  pagetree?

(current-pagetree pagetree)  void?
  pagetree : pagetree?
A parameter that defines the default pagetree used by pagetree navigation functions (e.g., parent, children, et al.) if another is not explicitly specified. Initialized to #f.

procedure

(parent p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the parent pagenode of p within pagetree. Return #f if there isn’t one.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (parent 'son.html)

'mama.html

> (parent "mama.html")

'root

> (parent (parent 'son.html))

'root

> (parent (parent (parent 'son.html)))

#f

procedure

(children p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the child pagenodes of p within pagetree. Return #f if there aren’t any.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (children 'mama.html)

'(son.html daughter.html)

> (children 'uncle.html)

#f

> (children 'root)

'(mama.html uncle.html)

> (map children (children 'root))

'((son.html daughter.html) #f)

procedure

(siblings p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the sibling pagenodes of p within pagetree. The list will include p itself. But the function will still return #f if pagetree is #f.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (siblings 'son.html)

'(son.html daughter.html)

> (siblings 'daughter.html)

'(son.html daughter.html)

> (siblings 'mama.html)

'(mama.html uncle.html)

procedure

(previous p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(previous* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately before p. For previous*, return all the pagenodes before p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (previous 'daughter.html)

'son.html

> (previous 'son.html)

'mama.html

> (previous (previous 'daughter.html))

'mama.html

> (previous 'mama.html)

#f

> (previous* 'daughter.html)

'(mama.html son.html)

> (previous* 'uncle.html)

'(mama.html son.html daughter.html)

procedure

(next p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(next* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately after p. For next*, return all the pagenodes after p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (next 'son.html)

'daughter.html

> (next 'daughter.html)

'uncle.html

> (next (next 'son.html))

'uncle.html

> (next 'uncle.html)

#f

> (next* 'mama.html)

'(son.html daughter.html uncle.html)

> (next* 'daughter.html)

'(uncle.html)

11.4.6.3 Utilities

procedure

(pagetree->list pagetree)  list?

  pagetree : pagetree?
Convert pagetree to a simple list. Equivalent to a pre-order depth-first traversal of pagetree.

procedure

(in-pagetree? pagenode [pagetree])  boolean?

  pagenode : pagenode?
  pagetree : pagetree? = (current-pagetree)
Report whether pagenode is in pagetree.

procedure

(path->pagenode p)  pagenode?

  p : pathish?
Convert path p to a pagenode — meaning, make it relative to current-project-root, run it through ->output-path, and convert it to a symbol. Does not tell you whether the resultant pagenode actually exists in the current pagetree (for that, use in-pagetree?).

 
\ No newline at end of file +11.4 Pagetree
11 Module reference
11.1 Cache
11.2 Decode
11.3 File
11.4 Pagetree
11.5 Render
11.6 Template
11.7 Tag
11.8 Top
11.9 World
On this page:
11.4.1 Making pagetrees with a source file
11.4.2 Making pagetrees by hand
11.4.3 Using pagetrees for navigation
11.4.4 Using index.ptree in the dashboard
11.4.5 Using pagetrees with raco pollen render
11.4.6 Functions
11.4.6.1 Predicates & validation
pagetree?
validate-pagetree
pagenode?
pagenodeish?
->pagenode
11.4.6.2 Navigation
current-pagetree
parent
children
siblings
previous
previous*
next
next*
11.4.6.3 Utilities
pagetree->list
in-pagetree?
path->pagenode
6.1.0.5

11.4 Pagetree

 (require pollen/pagetree) package: pollen

Books and other long documents are usually organized in a structured way — at minimum they have a sequence of pages, but more often they have sections with subsequences within. Individual pages in a Pollen project don’t know anything about how they’re connected to other pages. In theory, you could maintain this information within the source files. But this would be a poor use of human energy.

Instead, use a pagetree. A pagetree is a simple abstraction for defining & working with sequences of pagenodes. Typically these pagenodes will be the names of output files in your project.

“So it’s a list of web-page filenames?” Sort of. When I think of a web page, I think of an actual file on a disk. Keeping with Pollen’s orientation toward dynamic rendering, pagenodes may — and often do — refer to files that don’t yet exist. Moreover, by referring to output names rather than source names, you retain the flexibility to change the kind of source associated with a particular pagenode (e.g., from preprocessor source to Pollen markup).

Pagetrees can be flat or hierarchical. A flat pagetree is just a list of pagenodes. A hierarchical pagetree can also contain recursively nested lists of pagenodes. But you needn’t pay attention to this distinction, as the pagetree functions don’t care which kind you use. Neither do I.

Pagetrees surface throughout the Pollen system. They’re primarily used for navigation — for instance, calculating “previous,” “next,” or “up” links for a given page. A special pagetree, index.ptree, is used by the project server to order the files in a dashboard. Pagetrees can also be used to define batches of files for certain operations, for instance raco pollen render. You might find other uses for them too.

11.4.1 Making pagetrees with a source file

A pagetree source file either starts with #lang pollen and uses the .ptree extension, or starts with #lang pollen/ptree and then can have any file extension.

Unlike other Pollen source files, a pagetree source is not rendered into an output format, so the rest of the filename is up to you.

Here’s a flat pagetree. Each line is considered a single pagenode (blank lines are ignored). Notice that neither Pollen command syntax nor quoting is needed within the pagetree source:

"flat.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html

And here’s the output in DrRacket:

'(pagetree-root index.html introduction.html main_argument.html conclusion.html)

Keeping with usual Pollen policy, this is an X-expression. The pagetree-root is just an arbitrary tag that contains the pagetree.

Upgrading to a hierarchical pagetree is simple. The same basic rule applies — one pagenode per line. But this time, you add Pollen command syntax: a lozenge in front of a pagenode marks it as the top of a nested list, and the sub-pagenodes of that list go between { curly braces }, like so:

"hierarchical.ptree"
#lang pollen
 
toc.html
◊first-chapter.html{
    foreword.html
    introduction.html}
◊second-chapter.html{
    ◊main-argument.html{
        facts.html
        analysis.html}
    conclusion.html}
bibliography.html

The output of our hierarchical pagetree:

'(pagetree-root toc.html (first-chapter.html foreword.html introduction.html) (second-chapter.html (main-argument.html facts.html analysis.html) conclusion.html) bibliography.html)

One advantage of using a source file is that when you run it in DrRacket, it will automatically be checked using validate-pagetree, which ensures that every element in the pagetree meets pagenode?, and that all the pagenodes are unique.

This pagetree has a duplicate pagenode, so it won’t run:

"duplicate-pagenode.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html
index.html

Instead, you’ll get an error:

validate-pagetree: members-unique? failed because item isn’t unique: (index.html)

Pagenodes can refer to files in subdirectories. Just write the pagenode as a path relative to the directory where the pagetree lives:

"flat.ptree"
#lang pollen
 
foreword.html
◊facts-intro.html{
    facts/brennan.html
    facts/dale.html
}
◊analysis/intro.html{
    analysis/fancy-sauce/part-1.html
    analysis/fancy-sauce/part-2.html
}
conclusion.html

11.4.2 Making pagetrees by hand

Experienced programmers may want to know that because a pagetree is just an X-expression, you can synthesize a pagetree using any Pollen or Racket tools for making X-expressions. For example, here’s some Racket code that generates the same pagetree as the flat.ptree source file above:

"make-flat-ptree.rkt"
#lang racket
(require pollen/pagetree)
(define node-names '(index introduction main_argument conclusion))
(define pt `(pagetree-root
  ,@(map (λ(n) (string->symbol (format "~a.html" n))) node-names)))
(if (pagetree? pt) pt "Oops, not a pagetree")

Note that you need to take more care when building a pagetree by hand. Pagenodes are symbols, not strings, thus the use of string->symbol is mandatory. One benefit of using a pagetree source file is that it takes care of this housekeeping for you.
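
The same idea extends to hierarchical pagetrees. Here’s a minimal sketch that uses only plain Racket list operations. The names html-node, outline, and outline->nodes are invented for this illustration; they are not part of Pollen.

```racket
#lang racket
;; Hypothetical helper: make an .html pagenode (a symbol) from a bare name.
(define (html-node name)
  (string->symbol (format "~a.html" name)))

;; Hypothetical outline. A symbol is a single page; a list is a parent
;; page followed by its children.
(define outline
  '(toc (first-chapter foreword introduction) bibliography))

;; Walk the outline, converting every name and keeping the nesting.
(define (outline->nodes x)
  (if (pair? x)
      (map outline->nodes x)
      (html-node x)))

(define pt (cons 'pagetree-root (map outline->nodes outline)))
pt
```

If Pollen is installed, you could hand pt to pagetree? (as in make-flat-ptree.rkt above) to confirm that the result is acceptable.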

11.4.3 Using pagetrees for navigation

Typically you’ll call the pagetree-navigation functions from inside templates, using the special variable here as the starting point. For more on this technique, see Pagetree navigation.

11.4.4 Using index.ptree in the dashboard

When you’re using the project server to view the files in a directory, the server will first look for a file called index.ptree. If it finds this pagetree file, it will use it to build the dashboard. If not, then it will synthesize a pagetree using a directory listing. For more on this technique, see Using the dashboard.

11.4.5 Using pagetrees with raco pollen render

The raco pollen render command is used to regenerate an output file from its source. If you pass a pagetree to raco pollen render, it will automatically render each file listed in the pagetree.

For instance, many projects have auxiliary pages that don’t really belong in the main navigational flow. You can collect these pages in a separate pagetree:

"utility.ptree"
#lang pollen
 
404-error.html
terms-of-service.html
webmaster.html
[... and so on]

Thus, when you’re using pagetree-navigation functions within a template, you can use your main pagetree, and restrict the navigation to the main editorial content. But when you render the project, you can pass both pagetrees to raco pollen render.

For more on this technique, see raco pollen render.

11.4.6 Functions
11.4.6.1 Predicates & validation

procedure

(pagetree? possible-pagetree)  boolean?

  possible-pagetree : any/c
Test whether possible-pagetree is a valid pagetree. It must be a txexpr? where all elements are pagenode?, and each is unique within possible-pagetree (not counting the root node).

Examples:

> (pagetree? '(root index.html))

#t

> (pagetree? '(root duplicate.html duplicate.html))

#f

> (pagetree? '(root index.html "string.html"))

#f

> (define nested-ptree '(root 1.html 2.html (3.html 3a.html 3b.html)))
> (pagetree? nested-ptree)

#t

> (pagetree? `(root index.html ,nested-ptree (subsection.html more.html)))

#t

; Nesting a subtree twice creates duplication
> (pagetree? `(root index.html ,nested-ptree (subsection.html ,nested-ptree)))

#f
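
To make the two conditions concrete, here’s a rough sketch of a check in the same spirit, written in plain Racket. It is not Pollen’s implementation: pagetree-sketch? is a made-up name, it treats any symbol as a pagenode (skipping the whitespace test), and it flattens the tree before looking for duplicates.

```racket
#lang racket
;; Simplified check: the value is a list headed by a symbol, every
;; element below the root is a symbol, and no symbol appears twice.
(define (pagetree-sketch? v)
  (and (pair? v)
       (symbol? (car v))
       (let ([nodes (flatten (cdr v))])
         (and (andmap symbol? nodes)
              (not (check-duplicates nodes))))))

(define nested-ptree '(root 1.html 2.html (3.html 3a.html 3b.html)))
(pagetree-sketch? nested-ptree)
(pagetree-sketch? `(root index.html ,nested-ptree (subsection.html ,nested-ptree)))
```

The second call reports #f for the same reason as the last example above: splicing in the subtree twice repeats every one of its pagenodes.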

procedure

(validate-pagetree possible-pagetree)  pagetree?

  possible-pagetree : any/c
Like pagetree?, but raises a descriptive error if possible-pagetree is invalid, and otherwise returns possible-pagetree itself.

Examples:

> (validate-pagetree '(root (mama.html son.html daughter.html) uncle.html))

'(root (mama.html son.html daughter.html) uncle.html)

> (validate-pagetree `(root (,+ son.html daughter.html) uncle.html))

#f

> (validate-pagetree '(root (mama.html son.html son.html) mama.html))

validate-pagetree: members-unique? failed because items aren’t unique: (son.html mama.html)

procedure

(pagenode? possible-pagenode)  boolean?

  possible-pagenode : any/c
Test whether possible-pagenode is a valid pagenode. A pagenode can be any symbol? that is not whitespace/nbsp?. Every leaf of a pagetree is a pagenode. In practice, your pagenodes will likely be names of output files.

Pagenodes are symbols (rather than strings) so that pagetrees will be valid tagged X-expressions, which is a more convenient format for validation & processing.

Examples:

; Three symbols, the third one annoying but valid
> (map pagenode? '(symbol index.html |   silly   |))

'(#t #t #t)

; A number, a string, a txexpr, and a whitespace symbol
> (map pagenode? '(9.999 "index.html" (p "Hello") |    |))

'(#f #f #f #f)
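
If you’re curious what that rule amounts to, here’s a sketch in plain Racket. Both function names are hypothetical, and blank-string? is only my simplified stand-in for whitespace/nbsp?, not the real predicate.

```racket
#lang racket
;; Simplified stand-in for whitespace/nbsp?: true when every character
;; is whitespace or a nonbreaking space (so the empty string counts too).
(define (blank-string? s)
  (for/and ([c (in-string s)])
    (or (char-whitespace? c) (char=? c #\u00A0))))

;; A pagenode is a symbol whose name is not blank.
(define (pagenode-sketch? v)
  (and (symbol? v)
       (not (blank-string? (symbol->string v)))))

(map pagenode-sketch? '(symbol index.html |   silly   |))
(map pagenode-sketch? '(9.999 "index.html" (p "Hello") |    |))
```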

procedure

(pagenodeish? v)  boolean?

  v : any/c
Return #t if v can be converted with ->pagenode.

Example:

> (map pagenodeish? '(9.999 "index.html" |    |))

'(#t #t #f)

procedure

(->pagenode v)  pagenode?

  v : pagenodeish?
Convert v to a pagenode.

Examples:

> (map pagenodeish? '(symbol 9.999 "index.html" |  silly  |))

'(#t #t #t #t)

> (map ->pagenode '(symbol 9.999 "index.html" |  silly  |))

'(symbol |9.999| index.html |  silly  |)
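
One way to picture the conversion, as a sketch only: leave symbols alone, and turn anything else into its displayed text and then into a symbol. The name ->pagenode-sketch is invented, and I’m assuming the display form is what matters; the sketch also skips the pagenodeish? check.

```racket
#lang racket
;; Symbols pass through; other values are converted via their
;; displayed text (an assumption made for this illustration).
(define (->pagenode-sketch v)
  (if (symbol? v)
      v
      (string->symbol (format "~a" v))))

(map ->pagenode-sketch '(symbol 9.999 "index.html" |  silly  |))
```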

11.4.6.2 Navigation

parameter

(current-pagetree)  pagetree?

(current-pagetree pagetree)  void?
  pagetree : pagetree?
A parameter that defines the default pagetree used by pagetree navigation functions (e.g., parent, children, et al.) if another is not explicitly specified. Initialized to #f.

procedure

(parent p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the parent pagenode of p within pagetree. Return #f if there isn’t one.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (parent 'son.html)

'mama.html

> (parent "mama.html")

'root

> (parent (parent 'son.html))

'root

> (parent (parent (parent 'son.html)))

#f
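
Because a pagetree is just a nested list, a parent lookup can be sketched as a short recursive search. This is an illustration in plain Racket, not Pollen’s code: node-name and parent-sketch are hypothetical names, the tree is passed explicitly rather than read from current-pagetree, and only symbols are accepted.

```racket
#lang racket
;; Hypothetical helper: the pagenode at the top of a subtree.
(define (node-name x) (if (pair? x) (car x) x))

;; If p is a direct child of this tree's top node, that node is the
;; parent. Otherwise search each nested subtree in turn.
(define (parent-sketch p tree)
  (define kids (cdr tree))
  (cond
    [(memq p (map node-name kids)) (car tree)]
    [else (for/or ([k (in-list kids)] #:when (pair? k))
            (parent-sketch p k))]))

(define family '(root (mama.html son.html daughter.html) uncle.html))
(parent-sketch 'son.html family)
(parent-sketch 'root family)
```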

procedure

(children p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the child pagenodes of p within pagetree. Return #f if there aren’t any.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (children 'mama.html)

'(son.html daughter.html)

> (children 'uncle.html)

#f

> (children 'root)

'(mama.html uncle.html)

> (map children (children 'root))

'((son.html daughter.html) #f)
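
The lookup in the other direction can be sketched the same way: find the subtree whose top node is p, then collect the names of whatever sits directly beneath it. Again, node-name and children-sketch are made-up names, and this is an approximation in plain Racket rather than Pollen’s implementation.

```racket
#lang racket
;; Hypothetical helper: the pagenode at the top of a subtree.
(define (node-name x) (if (pair? x) (car x) x))

;; A leaf is a bare symbol, so it never matches as the top of a
;; subtree and the search falls through to #f.
(define (children-sketch p tree)
  (cond
    [(eq? p (car tree))
     (let ([kids (map node-name (cdr tree))])
       (and (pair? kids) kids))]
    [else (for/or ([k (in-list (cdr tree))] #:when (pair? k))
            (children-sketch p k))]))

(define family '(root (mama.html son.html daughter.html) uncle.html))
(children-sketch 'mama.html family)
(children-sketch 'uncle.html family)
```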

procedure

(siblings p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the sibling pagenodes of p within pagetree. The list will include p itself. But the function will still return #f if pagetree is #f.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (siblings 'son.html)

'(son.html daughter.html)

> (siblings 'daughter.html)

'(son.html daughter.html)

> (siblings 'mama.html)

'(mama.html uncle.html)

procedure

(previous p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(previous* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately before p. For previous*, return all the pagenodes before p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (previous 'daughter.html)

'son.html

> (previous 'son.html)

'mama.html

> (previous (previous 'daughter.html))

'mama.html

> (previous 'mama.html)

#f

> (previous* 'daughter.html)

'(mama.html son.html)

> (previous* 'uncle.html)

'(mama.html son.html daughter.html)

procedure

(next p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(next* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately after p. For next*, return all the pagenodes after p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (next 'son.html)

'daughter.html

> (next 'daughter.html)

'uncle.html

> (next (next 'son.html))

'uncle.html

> (next 'uncle.html)

#f

> (next* 'mama.html)

'(son.html daughter.html uncle.html)

> (next* 'daughter.html)

'(uncle.html)
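
All four of these functions behave as if the pagetree were first laid out as one flat sequence with the root removed. Here’s a sketch along those lines in plain Racket. The -sketch names and linear are hypothetical, the tree is passed explicitly, and I can’t promise that Pollen does it this way internally.

```racket
#lang racket
;; Flatten everything below the root into one sequence.
(define (linear tree) (flatten (cdr tree)))

;; Everything before p, or #f if p is first or missing.
(define (previous*-sketch p tree)
  (define nodes (linear tree))
  (define idx (index-of nodes p))
  (and idx (> idx 0) (take nodes idx)))

;; Everything after p, or #f if p is last or missing.
(define (next*-sketch p tree)
  (define nodes (linear tree))
  (define idx (index-of nodes p))
  (and idx
       (let ([after (drop nodes (add1 idx))])
         (and (pair? after) after))))

;; The single-step versions take the nearest element of each list.
(define (previous-sketch p tree)
  (let ([before (previous*-sketch p tree)])
    (and before (last before))))

(define (next-sketch p tree)
  (let ([after (next*-sketch p tree)])
    (and after (first after))))

(define family '(root (mama.html son.html daughter.html) uncle.html))
(previous-sketch 'son.html family)
(next*-sketch 'mama.html family)
```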

11.4.6.3 Utilities

procedure

(pagetree->list pagetree)  list?

  pagetree : pagetree?
Convert pagetree to a simple list. Equivalent to a pre-order depth-first traversal of pagetree.
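
In case “pre-order depth-first” is unfamiliar: visit each node before its children, and finish one branch before starting the next. A hand-rolled sketch in plain Racket follows. The names preorder and pagetree->list-sketch are hypothetical, and leaving the root tag out of the result is my assumption for this illustration.

```racket
#lang racket
;; Visit a node first, then each of its children from left to right.
(define (preorder x)
  (if (pair? x)
      (cons (car x) (append-map preorder (cdr x)))
      (list x)))

;; Assumption: the root tag itself is left out of the result.
(define (pagetree->list-sketch tree)
  (append-map preorder (cdr tree)))

(define book
  '(pagetree-root toc.html
                  (first-chapter.html foreword.html introduction.html)
                  (second-chapter.html
                   (main-argument.html facts.html analysis.html)
                   conclusion.html)
                  bibliography.html))
(pagetree->list-sketch book)
```

For the hierarchical pagetree shown earlier, this yields the pages in reading order: toc.html, first-chapter.html, foreword.html, and so on through bibliography.html.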

procedure

(in-pagetree? pagenode [pagetree])  boolean?

  pagenode : pagenode?
  pagetree : pagetree? = (current-pagetree)
Report whether pagenode is in pagetree.

procedure

(path->pagenode p)  pagenode?

  p : pathish?
Convert path p to a pagenode — meaning, make it relative to current-project-root, run it through ->output-path, and convert it to a symbol. Does not tell you whether the resultant pagenode actually exists in the current pagetree (for that, use in-pagetree?).

 
\ No newline at end of file diff --git a/doc/big-picture.html b/doc/big-picture.html index 0260957..367dcac 100644 --- a/doc/big-picture.html +++ b/doc/big-picture.html @@ -1,3 +1,3 @@ -4 The big picture
On this page:
4.1 The book is a program
4.2 One language, multiple dialects
4.3 Development environment
4.4 A special data structure for HTML
4.5 Pollen command syntax
4.6 The preprocessor
4.7 Templated source files
4.8 Pagetrees
6.1.0.5

4 The big picture

A summary of the key components & concepts of the Pollen publishing system and how they fit together. If you’ve completed the Quick tour, this will lend some context to what you saw. The next tutorials will make more sense if you read this first.

4.1 The book is a program

This is the core design principle of Pollen. Consistent with this principle, Pollen adopts the habits of software development in its functionality, workflow, and project management.

4.2 One language, multiple dialects

4.3 Development environment

The Pollen development environment has three main pieces: the DrRacket code editor, the project server, and the command-line tool.

4.4 A special data structure for HTML

Unlike other programming languages, Pollen (and Racket) internally represent HTML with something called an X-expression. An X-expression is simply a list that represents what in HTML is called an element, meaning a thing with an opening tag, a closing tag, and content in between. Like HTML elements, X-expressions can be nested. Unlike HTML elements, X-expressions have no closing tag, they use parentheses to denote the start and end, and text elements are put inside quotes.

For example, consider this HTML element:

<body><h1>Hello world</h1><p>Nice to <i>see</i> you.</p></body>

As a Racket X-expression, this would be written:

(body (h1 "Hello world") (p "Nice to " (i "see") " you."))

More will be said about X-expressions. But a couple advantages should be evident already. First, without the redundant angle brackets, the X-expression is more readable than the equivalent HTML. Second, an X-expression is preferable to representing HTML as a simple string, because it preserves the internal structure of the element.
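
To see the correspondence for yourself, here’s a small sketch that uses Racket’s standard xml library (not Pollen itself) to convert the X-expression above back into an HTML string. The variable names x and html are just placeholders.

```racket
#lang racket
(require xml)

;; The X-expression from the example above.
(define x '(body (h1 "Hello world") (p "Nice to " (i "see") " you.")))

;; Serialize it to markup.
(define html (xexpr->string x))
html

;; Because the structure is preserved, ordinary list functions work on it.
(first x)
```

The string produced is the same HTML element shown earlier, and (first x) gives the tag, 'body, without any string parsing.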

4.5 Pollen command syntax

As mentioned above, a Pollen source file is not code with text embedded in it, but rather text with code embedded. (See ◊ command overview for more.)