From b9613b57f0f83df8300bbdaac8bba41d622fc2e1 Mon Sep 17 00:00:00 2001
From: Matthew Butterick
Date: Thu, 11 Sep 2014 18:42:47 -0700
Subject: [PATCH] render docs

---
 doc/Backstory.html       | 2 +-
 doc/Decode.html          | 2 +-
 doc/Pagetree.html        | 2 +-
 doc/big-picture.html     | 2 +-
 doc/dashboard.png        | Bin 41812 -> 22829 bytes
 doc/doc-index.html       | 2 +-
 doc/first-tutorial.html  | 2 +-
 doc/index.html           | 2 +-
 doc/mb.css               | 1 +
 doc/quick-tour.html      | 4 ++--
 doc/reader.html          | 2 +-
 doc/second-tutorial.html | 4 ++--
 doc/third-tutorial.html  | 2 +-
 13 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/doc/Backstory.html b/doc/Backstory.html
index 5080b89..71e59e4 100644
--- a/doc/Backstory.html
+++ b/doc/Backstory.html
@@ -1,2 +1,2 @@
-3 Backstory
6.1.0.5

3 Backstory

I created Pollen to overcome limitations & frustrations I repeatedly encountered with existing web-publishing tools.

If you agree with my characterization of those problems, then you’ll probably like the solution that Pollen offers. If not, you probably won’t.

3.1 Web development and its discontents

I made my first web page in 1994, shortly after the web was invented. I opened my text editor (at the time, BBEdit), pecked out <html><body>Hello world</body></html>, then loaded it in Mosaic. So did a million other nerds.

If you weren’t around then, you didn’t miss much. Everything about the web was horrible: the web browsers, the computers running the browsers, the dial-up connections feeding the browsers, and of course HTML itself. At that point, the desktop-software experience was already slick and refined. By comparison, using the web felt like banging rocks together.

That’s no longer true. The web is now 20 years old. During that time, most parts of the web have improved dramatically — for instance, the connections are faster, the browsers are more sophisticated, the screens have more pixels.

But one part has not improved: the way we make web pages. Over the years, tools promising to simplify HTML development have come and mostly gone — from PageMill to Dreamweaver to WordPress to Jekyll. Meanwhile, true web jocks have remained loyal to the original HTML power tool: the humble text editor.

In one way, this makes sense. Web pages are mostly made of text-based data (HTML, CSS, JavaScript, and so on), and the simplest way to manipulate this data is with a text editor. While HTML and CSS are not programming languages, they lend themselves to semantic and logical structure that’s most easily expressed by editing them as text. Furthermore, text-based editing makes debugging and performance improvements easier.

But text-based editing is also limited. Though the underlying description of a web page is notionally human-readable, it’s largely optimized to be readable by other software — namely, web browsers. HTML markup in particular is verbose and easily mistyped. And isn’t it fatally dull to manage all the boilerplate, like surrounding every paragraph with <p>...</p>? Yes, it is.

For these reasons, much of web development should lend itself to abstraction & automation. Abstraction means consolidating repetitive, complex patterns into simpler, parameterized forms. Automation means avoiding the manual drudgery of generating the output files. But in practice, tools that enable this abstraction & automation have been slow to arrive, and most have come hobbled with unacceptable deficiencies.

3.2 The better idea: a programming model

Parallel with my HTML education, I also goofed around with various programming languages — C, C++, Perl, Java, PHP, JavaScript, Python. Unlike HTML, programming languages excel at abstraction and automation. This seemed like the obvious direction for web development to go.

What distinguishes the text-editing model from the programming model? It’s a matter of direct vs. indirect manipulation of output. The text-editing model treats HTML as something to be written directly with a text editor. Whereas the programming model treats HTML — or whatever the output is — as the result of compiling a set of source files, which are written in a programming language. The costs of working indirectly via the programming language are offset by the benefits of abstraction & automation.

On the early web, the text-editing model was appealingly precise and quick. On small projects, it worked well enough. But as projects grew, the text-editing model was going to lose steam. I wasn’t the only one to notice. Shortly after those million nerds made their first web page by hand, many of them set about devising ways to apply a programming model to web development.

3.3 “Now you have two problems”

What followed was a steady stream of products, frameworks, tools, and content management systems that claimed to bring a programming model to web development. Some were better than others. But none of them displaced the text editor as the preferred tool of web developers.

Why not? All these tools promised a great leap forward in solving the web-development problem. In practice, they simply redistributed the pain. I needn’t bore you with enumerating the deficiencies of specific tools, because they’ve tended to fail in the same thematic ways:

  • No native data structure for HTML. Core to any programming model is data structures. Good data structures make processing easy; bad ones make it hard. Even though HTML has a well documented format, rarely has it been handled within a programming system with a native, intuitive data structure. Instead, it’s either been treated as a string (wrong), a tree (also wrong), or some magical parsed object. This has made working with HTML in programming environments needlessly difficult.

  • Mandatory separation of code, presentation, and content. This principle has often been held out as an ideal in web development. But it’s also counterintuitive, because an HTML page naturally contains all three. If you want to separate them, your tools should let you. But if you don’t, your tools shouldn’t make you.

  • Compromised template languages. Seems like every programming language has at least 10 templating systems for HTML, all of which require you to learn a new “template language” that offers the worst of both worlds: fewer features and different syntax than the underlying language.

  • Steep learning curves. Web programmers have often chided designers for not knowing how to code. But programming-based web-development tools have often had a high initial learning curve that requires you to throw out your existing workflow. Programmers built these tools — no surprise that programmers have been more comfortable with them.

I’ve tried a lot of these tools over the years. Some I liked. Some I didn’t. Invariably, however, whenever I could still make do with hand-editing an HTML project, I would. After trying to cajole the web framework du jour into doing my bidding, it was relaxing to trade off some efficiency for control.

3.4 Rethinking the solution for digital books

In 2008, I launched a website called Typography for Lawyers. Initially, I’d conceived of it as a book. Then I thought “no one’s going to publish that.” So it became a website that I aimed to make as book-like as possible. But hand-editing wasn’t going to be enough.

So I used WordPress. The major chore became scraping out all the crap that typically lives in blog templates. Largely because of this, people liked the site, because it was simpler & cleaner than the usual WordPress website.

Eventually, a publisher offered to release it as a paperback. Later came the inevitable request to make it into a Kindle book. As a fan of typography, I hate the Kindle. The layout controls are coarse, and so is the reading experience. But I didn’t run and hide. Basically a Kindle book is a little website made with 1995-era HTML. So I coded up some tools in Perl to convert my book to Kindle format while preserving the formatting and images as well as possible.

At that point, I noticed I had converted Typography for Lawyers into web format twice, using two different sets of tools. Before someone asked me to do it a third time, I started thinking about how I might create source code for the book that allowed me to render it into different formats.

This was the beginning of the Pollen project.

I wrote the initial version of Pollen in Python. I devised a simplified markup-notation language for the source files. This language was compiled into XML-ish data structures using ply (Python lex/yacc). These structures were parsed into trees using LXML. The trees were combined with templates made in Chameleon. These templates were rendered and previewed with the Bottle web server.

Did it work? Sort of. Source code went in; web pages came out. But it was also complicated and fragile. Moreover, though the automation was there, there wasn’t yet enough abstraction at the source layer. I started thinking about how I could add a source preprocessor.

3.5 Enter Racket

I had come across Racket while researching languages suitable for HTML/XML processing. I had unexpectedly learned about the secret kinship of XML and Lisp: though XML is not a programming language, it uses a variant of Lisp syntax. Thus Lisp languages are particularly adept at handling XMLish structures. That was interesting.

After comparing some of the Lisp & Scheme variants, Racket stood out because it had a text-based dialect called Scribble. Scribble could be used to embed code within textual content. That was interesting too. Among other things, this meant Scribble could be used as a general-purpose preprocessor. So I thought I’d see if I could add it to Pollen.

It worked. So well, in fact, that I started thinking about whether I could reimplement other parts of Pollen in Racket. Then I started thinking about reimplementing all of it in Racket.

So I did. And here we are.

3.6 What is Pollen?

Pollen is a publishing system built on top of Scribble and Racket. So far I’ve optimized Pollen for digital books, because that’s mainly what I use it for. But it can be used for small projects too.

As a publishing system, Pollen includes:

  • A programming language. The Pollen language is a variant of Scribble, with specific “dialects” tailored to different kinds of source files. You don’t need to use the programming features to do useful work, but they’re available when you need them.

  • A set of tools & libraries. Pollen targets HTML output. So it includes a variety of tools that cure common HTML annoyances, including a CSS preprocessor.

  • A development environment. Pollen works with the DrRacket IDE. It also includes a project web server so you can dynamically preview and revise your publication.

Pollen addresses the deficiencies I experienced with other tools:

  • Yes, we have a native data structure for HTML. Racket represents HTML structures as X-expressions, which are a variant of the standard Racket data structure, called S-expressions. In other words, not only is there a native representation for HTML, everything in the language is represented this way.

  • Flexible blending of code, presentation, and content. Pollen is a text-based language. So a Pollen source file might have no code at all. But as a dialect of Scribble & Racket, if you want to mix code with content, you can.

  • No template language. It’s not necessary, because you can use the whole Racket language, and all the usual Racket syntax, in every Pollen file.

  • Shallow learning curve. You don’t need to do a lot of setup and configuration to start doing useful work with Pollen. Programmers and non-programmers can easily collaborate. Yes, I concede that if you plan to get serious, you’ll need to learn some Racket. I don’t think you’ll regret it.

 
\ No newline at end of file
+3 Backstory
6.1.0.5

3 Backstory

I created Pollen to overcome limitations & frustrations I repeatedly encountered with existing web-publishing tools.

If you agree with my characterization of those problems, then you’ll probably like the solution that Pollen offers. If not, you probably won’t.

3.1 Web development and its discontents

I made my first web page in 1994, shortly after the web was invented. I opened my text editor (at the time, BBEdit), pecked out <html><body>Hello world</body></html>, then loaded it in Mosaic. So did a million other nerds.

If you weren’t around then, you didn’t miss much. Everything about the web was horrible: the web browsers, the computers running the browsers, the dial-up connections feeding the browsers, and of course HTML itself. At that point, the desktop-software experience was already slick and refined. By comparison, using the web felt like banging rocks together.

That’s no longer true. The web is now more than 20 years old. During that time, most parts of the web have improved dramatically — for instance, the connections are faster, the browsers are more sophisticated, the screens have more pixels.

But one part has not improved: the way we make web pages. Over the years, tools promising to simplify HTML development have come and mostly gone — from PageMill to Dreamweaver to WordPress to Jekyll. Meanwhile, true web jocks have remained loyal to the original HTML power tool: the humble text editor.

In one way, this makes sense. Web pages are mostly made of text-based data (HTML, CSS, JavaScript, and so on), and the simplest way to manipulate this data is with a text editor. While HTML and CSS are not programming languages, they lend themselves to semantic and logical structure that’s most easily expressed by editing them as text. Furthermore, text-based editing makes debugging and performance improvements easier.

But text-based editing is also limited. Though the underlying description of a web page is notionally human-readable, it’s optimized to be readable by other software — namely, web browsers. HTML markup in particular is verbose and easily mistyped. And isn’t it fatally dull to manage all the boilerplate, like surrounding every paragraph with <p>...</p>? Yes, it is.

For these reasons, much of web development should lend itself to abstraction & automation. Abstraction means consolidating repetitive, complex patterns into simpler, parameterized forms. Automation means avoiding the manual drudgery of generating the output files. But in practice, tools that enable this abstraction & automation have been slow to arrive, and most have come hobbled with unacceptable deficiencies.

3.2 The better idea: a programming model

Parallel with my HTML education, I also goofed around with various programming languages — C, C++, Perl, Java, PHP, JavaScript, Python. Unlike HTML, programming languages excel at abstraction and automation. This seemed like the obvious direction for web development to go.

What distinguishes the text-editing model from the programming model? It’s a matter of direct vs. indirect manipulation of output. The text-editing model treats HTML as something to be written directly with a text editor. Whereas the programming model treats HTML — or whatever the output is — as the result of compiling a set of source files, which are written in a programming language. The costs of working indirectly via the programming language are offset by the benefits of abstraction & automation.

On the early web, the text-editing model was appealingly precise and quick. On small projects, it worked well enough. But as projects grew, the text-editing model was going to lose steam. I wasn’t the only one to notice. Shortly after those million nerds made their first web page by hand, many of them set about devising ways to apply a programming model to web development.

3.3 “Now you have two problems”

What followed was a steady stream of products, frameworks, tools, and content management systems that claimed to bring a programming model to web development. Some were better than others. But none of them displaced the text editor as the preferred tool of web developers.

Why not? All these tools promised a great leap forward in solving the web-development problem. In practice, they simply redistributed the pain. I needn’t bore you with enumerating the deficiencies of specific tools, because they’ve tended to fail in the same thematic ways:

  • No native data structure for HTML. Core to any programming model is data structures. Good data structures make processing easy; bad ones make it hard. Even though HTML has a well documented format, rarely has it been handled within a programming system with a native, intuitive data structure. Instead, it’s either been treated as a string (wrong), a tree (also wrong), or some magical parsed object. This has made working with HTML in programming environments needlessly difficult.

  • Mandatory separation of code, presentation, and content. This principle has often been held out as an ideal in web development. But it’s also counterintuitive, because an HTML page naturally contains all three. If you want to separate them, your tools should let you. But if you don’t, your tools shouldn’t make you.

  • Compromised template languages. It seems like every programming language has at least 10 templating systems for HTML, all of which require you to learn a new “template language” that offers the worst of both worlds: fewer features and different syntax than the underlying language.

  • Steep learning curves. Web programmers have often chided designers for not knowing how to code. But programming-based web-development tools have often had a high initial learning curve that requires you to throw out your existing workflow. Programmers built these tools — no surprise that programmers have been more comfortable with them.

I’ve tried a lot of these tools over the years. Some I liked. Some I didn’t. Invariably, however, whenever I could still make do with hand-editing an HTML project, I would. After trying to cajole the web framework du jour into doing my bidding, it was relaxing to trade off some efficiency for control.

3.4 Rethinking the solution for digital books

In 2008, I launched a website called Typography for Lawyers. Initially, I’d conceived of it as a book. Then I thought “no one’s going to publish that.” So it became a website that I aimed to make as book-like as possible. But hand-editing wasn’t going to be enough.

So I used WordPress. The major chore became scraping out all the crap that typically lives in blog templates. Largely because of this, people liked the site, because it was simpler & cleaner than the usual WordPress website.

Eventually, a publisher offered to release it as a paperback, which came out in 2010.

Later came the inevitable request to make it into a Kindle book. As a fan of typography, I hate the Kindle. The layout controls are coarse, and so is the reading experience. But I didn’t run and hide. Basically a Kindle book is a little website made with 1995-era HTML. So I coded up some tools in Perl to convert my book to Kindle format while preserving the formatting and images as well as possible.

At that point, I noticed I had converted Typography for Lawyers into web format twice, using two different sets of tools. Before someone asked me to do it a third time, I started thinking about how I might create source code for the book that allowed me to render it into different formats.

That was the beginning of the Pollen project.

I wrote the initial version of Pollen in Python. I devised a simplified markup-notation language for the source files. This language was compiled into XML-ish data structures using ply (Python lex/yacc). These structures were parsed into trees using LXML. The trees were combined with templates made in Chameleon. These templates were rendered and previewed with the Bottle web server.

Did it work? Sort of. Source code went in; web pages came out. But it was also complicated and fragile. Moreover, though the automation was there, there wasn’t yet enough abstraction at the source layer. I started thinking about how I could add a source preprocessor.

3.5 Enter Racket

I had come across Racket while researching languages suitable for HTML/XML processing. I had unexpectedly learned about the secret kinship of XML and Lisp: though XML is not a programming language, it uses a variant of Lisp syntax. Thus Lisp languages are particularly adept at handling XMLish structures. That was interesting.

After comparing some of the Lisp & Scheme variants, Racket stood out because it had a text-based dialect called Scribble. Scribble could be used to embed code within textual content. That was interesting too. Among other things, this meant Scribble could be used as a general-purpose preprocessor. So I thought I’d see if I could add it to Pollen.

It worked. So well, in fact, that I started thinking about whether I could reimplement other parts of Pollen in Racket. Then I started thinking about reimplementing all of it in Racket.

So I did. And here we are.

3.6 What is Pollen?

Pollen is a publishing system built on top of Scribble and Racket. So far, I’ve optimized Pollen for digital books, because that’s mainly what I use it for. But it can be used for small projects too.

As a publishing system, Pollen includes:

  • A programming language. The Pollen language is a variant of Scribble, with specific dialects tailored to different kinds of source files. You don’t need to use the programming features to do useful work, but they’re available when you need them.

  • A set of tools & libraries. Pollen targets HTML output. So it includes a variety of tools that cure common HTML annoyances, including a CSS preprocessor.

  • A development environment. Pollen works with the DrRacket IDE. It also includes a project web server so you can dynamically preview and revise your publication.

Pollen addresses the deficiencies I experienced with other tools:

  • Yes, we have a native data structure for HTML. Racket represents HTML structures as X-expressions, which are a variant of the standard Racket data structure, called S-expressions. In other words, not only is there a native representation for HTML, everything in the language is represented this way.

  • Flexible blending of code, presentation, and content. Pollen is a text-based language. So a Pollen source file might have no code at all. But as a dialect of Scribble & Racket, if you want to mix code with content, you can.

  • No template language. It’s not necessary, because you can use the whole Racket language, and all the usual Racket syntax, in every Pollen file.

  • Shallow learning curve. You don’t need to do a lot of setup and configuration to start doing useful work with Pollen. Programmers and non-programmers can easily collaborate. Yes, I concede that if you plan to get serious, you’ll need to learn some Racket. I don’t think you’ll regret it.
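
The claim about a native data structure for HTML can be tried out in plain Racket, without Pollen. The following is a minimal sketch that uses only the xml library shipped with Racket; the sample element and its class attribute are invented for illustration and do not come from any Pollen project.

```racket
#lang racket
(require xml) ; standard Racket library; provides xexpr? and xexpr->string

;; An X-expression is an ordinary quoted list: a tag, an optional
;; attribute list, then the elements.
;; (Hypothetical sample data, made up for this sketch.)
(define x '(p ((class "intro")) "Hello " (em "world")))

(xexpr? x)         ; checks that the list is a valid X-expression
(first x)          ; plain list functions work on it: this returns the tag
(xexpr->string x)  ; serializes the list as markup text
```

Because the structure is just a list, the usual list tools (first, rest, map, quasiquote) apply to HTML content directly, with no separate parsed-object API in between.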

3.7 Further reading

In Why Racket? Why Lisp?, I explain why Racket was the right tool for this job.

 
\ No newline at end of file
diff --git a/doc/Decode.html b/doc/Decode.html
index 4c8b407..de176bf 100644
--- a/doc/Decode.html
+++ b/doc/Decode.html
@@ -1,2 +1,2 @@
-11.2 Decode
6.1.0.5

11.2 Decode

 (require pollen/decode) package: pollen

The doc export of a Pollen markup file is a simple X-expression. Decoding refers to any post-processing of this X-expression. The pollen/decode module provides tools for creating decoders.

The decode step can happen separately from the compilation of the file. But you can also attach a decoder to the markup file’s root node, so the decoding happens automatically when the markup is compiled, and thus automatically incorporated into doc. (Following this approach, you could also attach multiple decoders to different tags within doc.)

You can, of course, embed function calls within Pollen markup. But since markup is optimized for authors, decoding is useful for operations that can or should be moved out of the authoring layer.

One example is presentation and layout. For instance, detect-paragraphs is a decoder function that lets authors mark paragraphs in their source simply by separating them with two consecutive newlines (that is, a blank line).

Another example is conversion of output into a particular data format. Most Pollen functions are optimized for HTML output, but one could write a decoder that targets another format.

procedure

(decode tagged-xexpr    
  [#:txexpr-tag-proc txexpr-tag-proc    
  #:txexpr-attrs-proc txexpr-attrs-proc    
  #:txexpr-elements-proc txexpr-elements-proc    
  #:block-txexpr-proc block-txexpr-proc    
  #:inline-txexpr-proc inline-txexpr-proc    
  #:string-proc string-proc    
  #:symbol-proc symbol-proc    
  #:valid-char-proc valid-char-proc    
  #:cdata-proc cdata-proc    
  #:exclude-tags tags-to-exclude])  txexpr?
  tagged-xexpr : txexpr?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Recursively process a tagged-xexpr, usually the one exported from a Pollen source file as doc.

This function doesn’t do much on its own. Rather, it provides the hooks upon which harder-working functions can be hung.

Recall from [future link: Pollen mechanics] that any tag can have a function attached to it. By default, the tagged-xexpr from a source file is tagged with root. So the typical way to use decode is to pass your decoding functions to it as keyword arguments, and then define root to invoke decode. That way, decoding is applied automatically to every doc during compilation.

For instance, here’s how decode is attached to root in Butterick’s Practical Typography. There’s not much to it:

(define (root . items)
  (decode (make-txexpr 'root '() items)
          #:txexpr-elements-proc detect-paragraphs
          #:block-txexpr-proc (compose1 hyphenate wrap-hanging-quotes)
          #:string-proc (compose1 smart-quotes smart-dashes)
          #:exclude-tags '(style script)))

The hyphenate function is not part of Pollen, but rather the hyphenate package, which you can install separately.

This illustrates another important point: even though decode presents an imposing list of arguments, you’re unlikely to use all of them at once. These represent possibilities, not requirements. For instance, let’s see what happens when decode is invoked without any of its optional arguments.

Examples:

> (define tx '(root "I wonder" (em "why") "this works."))
> (decode tx)

'(root "I wonder" (em "why") "this works.")

Right — nothing. That’s because the default value for the decoding arguments is the identity function, (λ (x) x). So all the input gets passed through intact unless another action is specified.

The *-proc arguments of decode take procedures that are applied to specific categories of elements within txexpr.
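
To see how per-category procedures with identity defaults fit together, here is a deliberately simplified model written in plain Racket. It is a sketch only, not Pollen's implementation: toy-decode and its #:tag-proc keyword are hypothetical names, and the model ignores attributes, characters, CDATA, and the block/inline distinction.

```racket
#lang racket

;; A simplified, hypothetical model of a decoder. NOT Pollen's decode.
;; It handles only strings, symbols, and tagged lists without attributes.
(define (toy-decode x
                    #:string-proc [string-proc (λ (s) s)]  ; identity default
                    #:symbol-proc [symbol-proc (λ (s) s)]  ; identity default
                    #:tag-proc [tag-proc (λ (t) t)])       ; identity default
  (define (walk x)
    (cond
      [(string? x) (string-proc x)]   ; string elements
      [(symbol? x) (symbol-proc x)]   ; symbol elements, such as entities
      [(pair? x)                      ; tagged list: new tag, then recur
       (cons (tag-proc (car x)) (map walk (cdr x)))]
      [else x]))                      ; anything else passes through
  (walk x))

(define tx '(root "I wonder" (em "why") "this works."))
(toy-decode tx)                              ; all defaults: input comes back intact
(toy-decode tx #:string-proc string-upcase)  ; only the strings change
```

Each category of element is routed to its own procedure, so supplying one procedure leaves every other category untouched.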

The txexpr-tag-proc argument is a procedure that handles X-expression tags.

Examples:

> (define tx '(p "I'm from a strange" (strong "namespace")))
; Tags are symbols, so a tag-proc should return a symbol
> (decode tx #:txexpr-tag-proc (λ(t) (string->symbol (format "ns:~a" t))))

'(ns:p "I'm from a strange" (ns:strong "namespace"))

The txexpr-attrs-proc argument is a procedure that handles lists of X-expression attributes. (The txexpr module, included at no extra charge with Pollen, includes useful helper functions for dealing with these attribute lists.)

Examples:

> (define tx '(p [[id "first"]] "If I only had a brain."))
; Attrs is a list, so cons is OK for simple cases
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(p ((class "PhD") (id "first")) "If I only had a brain.")

Note that txexpr-attrs-proc will change the attributes of every tagged X-expression, even those that don’t have attributes. This is useful, because sometimes you want to add attributes where none existed before. But be careful, because the behavior may make your processing function overinclusive.

Examples:

> (define tx '(div (p [[id "first"]] "If I only had a brain.")
  (p "Me too.")))
; This will insert the new attribute everywhere
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(div
  ((class "PhD"))
  (p ((class "PhD") (id "first")) "If I only had a brain.")
  (p ((class "PhD")) "Me too."))

; This will add the new attribute only to non-null attribute lists
> (decode tx #:txexpr-attrs-proc
  (λ(attrs) (if (null? attrs) attrs (cons '[class "PhD"] attrs))))

'(div (p ((class "PhD") (id "first")) "If I only had a brain.") (p "Me too."))

The txexpr-elements-proc argument is a procedure that operates on the list of elements that represents the content of each tagged X-expression. Note that each element of an X-expression is subject to two passes through the decoder: once now, as a member of the list of elements, and also later, through its type-specific decoder (i.e., string-proc, symbol-proc, and so on).

Examples:

> (define tx '(div "Double" "\n" "toil" amp "trouble"))
; Every element gets doubled ...
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es)))

'(div "Double" "Double" "\n" "\n" "toil" "toil" amp amp "trouble" "trouble")

; ... but only strings get capitalized
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es))
  #:string-proc (λ(s) (string-upcase s)))

'(div "DOUBLE" "DOUBLE" "\n" "\n" "TOIL" "TOIL" amp amp "TROUBLE" "TROUBLE")

So why do you need txexpr-elements-proc? Because some types of element decoding depend on context, so the elements must be handled as a group. For instance, the doubling function above, though useless, requires handling the element list as a whole, because elements are being added.
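
Another context-dependent operation, sketched here in plain Racket, is merging runs of adjacent strings. The helper merge-adjacent-strings is a hypothetical name introduced for this illustration, not part of Pollen. It cannot be written as a map over single elements, because what happens to each string depends on its neighbor.

```racket
#lang racket

;; Hypothetical helper: merge runs of adjacent strings in an element list.
;; The accumulator holds the result in reverse order, so (car acc) is
;; always the most recently emitted element.
(define (merge-adjacent-strings elements)
  (reverse
   (for/fold ([acc '()])
             ([e (in-list elements)])
     (if (and (string? e) (pair? acc) (string? (car acc)))
         (cons (string-append (car acc) e) (cdr acc)) ; extend previous string
         (cons e acc)))))                             ; start a new element

(merge-adjacent-strings '("Double" ", " "toil" amp "trouble"))
```

Since the function takes a list of elements and returns a list of elements, it has the shape that the txexpr-elements-proc contract above asks for.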

A more useful example: paragraph detection. The behavior is not merely a map across each element:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
; Context matters. Trailing whitespace is ignored ...
> (paras '(body "The first paragraph." "\n\n"))

'(body "The first paragraph.")

; ... but whitespace between strings is converted to a break.
> (paras '(body "The first paragraph." "\n\n" "And another."))

'(body (p "The first paragraph.") (p "And another."))

; A combination of both types
> (paras '(body "The first paragraph." "\n\n" "And another." "\n\n"))

'(body (p "The first paragraph.") (p "And another."))

The block-txexpr-proc and inline-txexpr-proc arguments are procedures that operate on tagged X-expressions. If the X-expression meets the block-txexpr? test, it is processed by block-txexpr-proc. Otherwise, it is processed by inline-txexpr-proc. Thus every tagged X-expression will be handled by one or the other. Of course, if you want block and inline elements to be handled the same way, you can set block-txexpr-proc and inline-txexpr-proc to be the same procedure.

Examples:

> (define tx '(div "Please" (em "mind the gap") (h1 "Tuesdays only")))
> (define add-ns (λ(tx) (make-txexpr
      (string->symbol (format "ns:~a" (get-tag tx)))
      (get-attrs tx)
      (get-elements tx))))
; div and h1 are block elements, so this will only affect them
> (decode tx #:block-txexpr-proc add-ns)

'(ns:div "Please" (em "mind the gap") (ns:h1 "Tuesdays only"))

; em is an inline element, so this will only affect it
> (decode tx #:inline-txexpr-proc add-ns)

'(div "Please" (ns:em "mind the gap") (h1 "Tuesdays only"))

; this will affect all elements
> (decode tx #:block-txexpr-proc add-ns #:inline-txexpr-proc add-ns)

'(ns:div "Please" (ns:em "mind the gap") (ns:h1 "Tuesdays only"))

The string-proc, symbol-proc, valid-char-proc, and cdata-proc arguments are procedures that operate on X-expressions that are strings, symbols, valid-chars, and CDATA, respectively. Deliberately, the output contracts for these procedures accept any kind of X-expression (meaning, the procedure can change the X-expression type).

Examples:

; A div with string, entity, character, and cdata elements
> (define tx `(div "Moe" amp 62 ,(cdata #f #f "3 > 2;")))
> (define rulify (λ(x) '(hr)))
; The rulify function is selectively applied to each
> (print (decode tx #:string-proc rulify))

'(div (hr) amp 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:symbol-proc rulify))

'(div "Moe" (hr) 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:valid-char-proc rulify))

'(div "Moe" amp (hr) #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:cdata-proc rulify))

'(div "Moe" amp 62 (hr))

Finally, the tags-to-exclude argument is a list of tags that will be exempted from decoding. Though you could get the same result by testing the input within the individual decoding functions, that’s tedious and potentially slower.

Examples:

> (define tx '(p "I really think" (em "italics") "should be lowercase."))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(p "I REALLY THINK" (em "ITALICS") "SHOULD BE LOWERCASE.")

> (decode tx #:string-proc (λ(s) (string-upcase s)) #:exclude-tags '(em))

'(p "I REALLY THINK" (em "italics") "SHOULD BE LOWERCASE.")

The tags-to-exclude argument is useful if you’re decoding source that’s destined to become HTML. According to the HTML spec, material within a <style> or <script> block needs to be preserved literally. In this example, if the CSS and JavaScript blocks are capitalized, they won’t work. So exclude '(style script), and problem solved.

Examples:

> (define tx '(body (h1 [[class "Red"]] "Let's visit Planet Telex.")
  (style [[type "text/css"]] ".Red {color: green;}")
  (script [[type "text/javascript"]] "var area = h * w;")))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".RED {COLOR: GREEN;}")

  (script ((type "text/javascript")) "VAR AREA = H * W;"))

> (decode tx #:string-proc (λ(s) (string-upcase s))
  #:exclude-tags '(style script))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".Red {color: green;}")

  (script ((type "text/javascript")) "var area = h * w;"))

procedure

(decode-elements elements 
  [#:txexpr-tag-proc txexpr-tag-proc 
  #:txexpr-attrs-proc txexpr-attrs-proc 
  #:txexpr-elements-proc txexpr-elements-proc 
  #:block-txexpr-proc block-txexpr-proc 
  #:inline-txexpr-proc inline-txexpr-proc 
  #:string-proc string-proc 
  #:symbol-proc symbol-proc 
  #:valid-char-proc valid-char-proc 
  #:cdata-proc cdata-proc 
  #:exclude-tags tags-to-exclude]) 
  txexpr-elements?
  elements : txexpr-elements?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Identical to decode, but takes txexpr-elements? as input rather than a whole tagged X-expression, and likewise returns txexpr-elements? rather than a tagged X-expression. A convenience variant for use inside tag functions.

11.2.1 Block

Because it’s convenient, Pollen puts tagged X-expressions into two categories: block and inline. Why is it convenient? When using decode, you often want to treat the two categories differently. Not that you have to. But this is how you can.

parameter

(project-block-tags)  (listof txexpr-tag?)

(project-block-tags block-tags)  void?
  block-tags : (listof txexpr-tag?)
A parameter that defines the set of tags that decode will treat as blocks. This parameter is initialized with the HTML block tags, namely:

(address article aside audio blockquote body canvas dd div dl fieldset figcaption figure footer form h1 h2 h3 h4 h5 h6 header hgroup noscript ol output p pre section table tfoot ul video)

procedure

(register-block-tag tag)  void?

  tag : txexpr-tag?
Adds a tag to project-block-tags so that block-txexpr? will report it as a block, and decode will process it with block-txexpr-proc rather than inline-txexpr-proc.

Pollen tries to do the right thing without being told. But this is the rare case where you have to be explicit. If you introduce a tag into your markup that you want treated as a block, you must use this function to identify it, or you will get spooky behavior later on.

For instance, detect-paragraphs knows that block elements in the markup shouldn’t be wrapped in a p tag. So if you introduce a new block element called bloq without registering it as a block, misbehavior will follow:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (p (bloq "But not me.")))

; Wrong: bloq should not be wrapped

But once you register bloq as a block, order is restored:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (register-block-tag 'bloq)
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (bloq "But not me."))

; Right: bloq is treated as a block

If you find the idea of registering block tags unbearable, good news. The project-block-tags include the standard HTML block tags by default. So if you just want to use things like div and p and h1–h6, you’ll get the right behavior for free.

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (div "But not me.")))

'(body (p "I want to be a paragraph.") (div "But not me."))

procedure

(block-txexpr? v)  boolean?

  v : any/c
Predicate that tests whether v is a tagged X-expression, and if so, whether the tag is among the project-block-tags. If not, it is treated as inline. To adjust how this test works, use register-block-tag.

11.2.2 Typography

An assortment of typography & layout functions, designed to be used with decode. These aren’t hard to write. So if you like these, use them. If not, make your own.

procedure

(whitespace? v)  boolean?

  v : any/c
A predicate that returns #t for any stringlike v that’s entirely whitespace, but also the empty string, as well as lists and vectors that are made only of whitespace? members. Following the regexp-match convention, whitespace? does not return #t for a nonbreaking space. If you prefer that behavior, use whitespace/nbsp?.

Examples:

> (whitespace? "\n\n   ")

#t

> (whitespace? (string->symbol "\n\n   "))

#t

> (whitespace? "")

#t

> (whitespace? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace? nonbreaking-space)

#f

procedure

(whitespace/nbsp? v)  boolean?

  v : any/c
Like whitespace?, but also returns #t for nonbreaking spaces.

Examples:

> (whitespace/nbsp? "\n\n   ")

#t

> (whitespace/nbsp? (string->symbol "\n\n   "))

#t

> (whitespace/nbsp? "")

#t

> (whitespace/nbsp? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace/nbsp? nonbreaking-space)

#t

procedure

(smart-quotes str)  string?

  str : string?
Convert straight quotes in str to curly according to American English conventions.

Examples:

> (define tricky-string
  "\"Why,\" she could've asked, \"are we in O‘ahu watching 'Mame'?\"")
> (display tricky-string)

"Why," she could've asked, "are we in O‘ahu watching 'Mame'?"

> (display (smart-quotes tricky-string))

“Why,” she could’ve asked, “are we in O‘ahu watching ‘Mame’?”

procedure

(smart-dashes str)  string?

  str : string?
In str, convert three hyphens to an em dash, and two hyphens to an en dash, and remove surrounding spaces.

Examples:

> (define tricky-string "I had a few --- OK, like 6--8 --- thin mints.")
> (display tricky-string)

I had a few --- OK, like 6--8 --- thin mints.

> (display (smart-dashes tricky-string))

I had a few—OK, like 6–8—thin mints.

; Monospaced font not great for showing dashes, but you get the idea

procedure

(detect-linebreaks tagged-xexpr-elements 
  [#:separator linebreak-sep 
  #:insert linebreak]) 
  txexpr-elements?
  tagged-xexpr-elements : txexpr-elements?
  linebreak-sep : string? = world:linebreak-separator
  linebreak : xexpr? = '(br)
Within tagged-xexpr-elements, convert occurrences of linebreak-sep ("\n" by default) to linebreak, but only if linebreak-sep does not occur between blocks (see block-txexpr?). Why? Because block-level elements automatically display on a new line, so adding linebreak would be superfluous. In that case, linebreak-sep just disappears.

Examples:

> (detect-linebreaks '(div "Two items:" "\n" (em "Eggs") "\n" (em "Bacon")))

'(div "Two items:" (br) (em "Eggs") (br) (em "Bacon"))

> (detect-linebreaks '(div "Two items:" "\n" (div "Eggs") "\n" (div "Bacon")))

'(div "Two items:" (div "Eggs") (div "Bacon"))

procedure

(detect-paragraphs elements 
  [#:separator paragraph-sep 
  #:tag paragraph-tag 
  #:linebreak-proc linebreak-proc]) 
  txexpr-elements?
  elements : txexpr-elements?
  paragraph-sep : string? = world:paragraph-separator
  paragraph-tag : symbol? = 'p
  linebreak-proc : (txexpr-elements? . -> . txexpr-elements?)
   = detect-linebreaks
Find paragraphs within elements (as denoted by paragraph-sep) and wrap them with paragraph-tag. Also handle linebreaks using detect-linebreaks.

If element is already a block-txexpr?, it will not be wrapped as a paragraph (because in that case, the wrapping would be superfluous). Thus, as a consequence, if paragraph-sep occurs between two blocks, it will be ignored (as in the example below using two sequential 'div blocks.)

The paragraph-tag argument sets the tag used to wrap paragraphs.

The linebreak-proc argument allows you to use a different linebreaking procedure other than the usual detect-linebreaks.

Examples:

> (detect-paragraphs '("First para" "\n\n" "Second para"))

'((p "First para") (p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line"))

'((p "First para") (p "Second para" (br) "Second line"))

> (detect-paragraphs '("First para" "\n\n" (div "Second block")))

'((p "First para") (div "Second block"))

> (detect-paragraphs '((div "First block") "\n\n" (div "Second block")))

'((div "First block") (div "Second block"))

> (detect-paragraphs '("First para" "\n\n" "Second para") #:tag 'ns:p)

'((ns:p "First para") (ns:p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line")
  #:linebreak-proc (λ(x) (detect-linebreaks x #:insert '(newline))))

'((p "First para") (p "Second para" (newline) "Second line"))

procedure

(wrap-hanging-quotes tx 
  [#:single-preprend single-preprender 
  #:double-preprend double-preprender]) 
  txexpr?
  tx : txexpr?
  single-preprender : txexpr-tag? = 'squo
  double-preprender : txexpr-tag? = 'dquo
Find single or double quote marks at the beginning of tx and wrap them in an X-expression with the tag single-preprender or double-preprender, respectively. The default values are 'squo and 'dquo.

Examples:

> (wrap-hanging-quotes '(p "No quote to hang."))

'(p "No quote to hang.")

> (wrap-hanging-quotes '(p "“What? We need to hang quotes?”"))

'(p (dquo "“" "What? We need to hang quotes?”"))

In pro typography, quotation marks at the beginning of a line or paragraph are often shifted into the margin slightly to make them appear more optically aligned with the left edge of the text. With a reflowable layout model like HTML, you don’t know where your line breaks will be.

This function will simply insert the 'squo and 'dquo tags, which provide hooks that let you do the actual hanging via CSS, like so (actual measurement can be refined to taste):

squo {margin-left: -0.25em;}

dquo {margin-left: -0.50em;}

Be warned: there are many edge cases this function does not handle well.

Examples:

; Argh: this edge case is not handled properly
> (wrap-hanging-quotes '(p "“" (em "What?") "We need to hang quotes?”"))

'(p "“" (em "What?") "We need to hang quotes?”")

 
\ No newline at end of file +11.2 Decode
6.1.0.5

11.2 Decode

 (require pollen/decode) package: pollen

The doc export of a Pollen markup file is a simple X-expression. Decoding refers to any post-processing of this X-expression. The pollen/decode module provides tools for creating decoders.

The decode step can happen separately from the compilation of the file. But you can also attach a decoder to the markup file’s root node, so the decoding happens automatically when the markup is compiled, and thus automatically incorporated into doc. (Following this approach, you could also attach multiple decoders to different tags within doc.)

You can, of course, embed function calls within Pollen markup. But since markup is optimized for authors, decoding is useful for operations that can or should be moved out of the authoring layer.

One example is presentation and layout. For instance, detect-paragraphs is a decoder function that lets authors mark paragraphs in their source simply by using two consecutive newlines.

Another example is conversion of output into a particular data format. Most Pollen functions are optimized for HTML output, but one could write a decoder that targets another format.

procedure

(decode tagged-xexpr    
  [#:txexpr-tag-proc txexpr-tag-proc    
  #:txexpr-attrs-proc txexpr-attrs-proc    
  #:txexpr-elements-proc txexpr-elements-proc    
  #:block-txexpr-proc block-txexpr-proc    
  #:inline-txexpr-proc inline-txexpr-proc    
  #:string-proc string-proc    
  #:symbol-proc symbol-proc    
  #:valid-char-proc valid-char-proc    
  #:cdata-proc cdata-proc    
  #:exclude-tags tags-to-exclude])  txexpr?
  tagged-xexpr : txexpr?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Recursively process a tagged-xexpr, usually the one exported from a Pollen source file as doc.

This function doesn’t do much on its own. Rather, it provides the hooks upon which harder-working functions can be hung.

Recall from [future link: Pollen mechanics] that any tag can have a function attached to it. By default, the tagged-xexpr from a source file is tagged with root. So the typical way to use decode is to define a root function that calls decode with your decoding functions as arguments. It will then be applied automatically to every doc during compile.

For instance, here’s how decode is attached to root in Butterick’s Practical Typography. There’s not much to it —

(define (root . items)
  (decode (make-txexpr 'root '() items)
          #:txexpr-elements-proc detect-paragraphs
          #:block-txexpr-proc (compose1 hyphenate wrap-hanging-quotes)
          #:string-proc (compose1 smart-quotes smart-dashes)
          #:exclude-tags '(style script)))

The hyphenate function is not part of Pollen, but rather the hyphenate package, which you can install separately.

This illustrates another important point: even though decode presents an imposing list of arguments, you’re unlikely to use all of them at once. These represent possibilities, not requirements. For instance, let’s see what happens when decode is invoked without any of its optional arguments.

Examples:

> (define tx '(root "I wonder" (em "why") "this works."))
> (decode tx)

'(root "I wonder" (em "why") "this works.")

Right — nothing. That’s because the default value for the decoding arguments is the identity function, (λ (x) x). So all the input gets passed through intact unless another action is specified.

The *-proc arguments of decode take procedures that are applied to specific categories of elements within tagged-xexpr.

The txexpr-tag-proc argument is a procedure that handles X-expression tags.

Examples:

> (define tx '(p "I'm from a strange" (strong "namespace")))
; Tags are symbols, so a tag-proc should return a symbol
> (decode tx #:txexpr-tag-proc (λ(t) (string->symbol (format "ns:~a" t))))

'(ns:p "I'm from a strange" (ns:strong "namespace"))

The txexpr-attrs-proc argument is a procedure that handles lists of X-expression attributes. (The txexpr module, included at no extra charge with Pollen, includes useful helper functions for dealing with these attribute lists.)

Examples:

> (define tx '(p [[id "first"]] "If I only had a brain."))
; Attrs is a list, so cons is OK for simple cases
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(p ((class "PhD") (id "first")) "If I only had a brain.")

Note that txexpr-attrs-proc will change the attributes of every tagged X-expression, even those that don’t have attributes. This is useful, because sometimes you want to add attributes where none existed before. But be careful, because the behavior may make your processing function overinclusive.

Examples:

> (define tx '(div (p [[id "first"]] "If I only had a brain.")
  (p "Me too.")))
; This will insert the new attribute everywhere
> (decode tx #:txexpr-attrs-proc (λ(attrs) (cons '[class "PhD"] attrs)))

'(div

  ((class "PhD"))

  (p ((class "PhD") (id "first")) "If I only had a brain.")

  (p ((class "PhD")) "Me too."))

; This will add the new attribute only to non-null attribute lists
> (decode tx #:txexpr-attrs-proc
  (λ(attrs) (if (null? attrs) attrs (cons '[class "PhD"] attrs))))

'(div (p ((class "PhD") (id "first")) "If I only had a brain.") (p "Me too."))

The txexpr-elements-proc argument is a procedure that operates on the list of elements that represents the content of each tagged X-expression. Note that each element of an X-expression is subject to two passes through the decoder: once now, as a member of the list of elements, and also later, through its type-specific decoder (i.e., string-proc, symbol-proc, and so on).

Examples:

> (define tx '(div "Double" "\n" "toil" amp "trouble"))
; Every element gets doubled ...
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es)))

'(div "Double" "Double" "\n" "\n" "toil" "toil" amp amp "trouble" "trouble")

; ... but only strings get capitalized
> (decode tx #:txexpr-elements-proc (λ(es) (append-map (λ(e) `(,e ,e)) es))
  #:string-proc (λ(s) (string-upcase s)))

'(div "DOUBLE" "DOUBLE" "\n" "\n" "TOIL" "TOIL" amp amp "TROUBLE" "TROUBLE")

So why do you need txexpr-elements-proc? Because some types of element decoding depend on context, so it’s necessary to handle the elements as a group. For instance, the doubling function above, though useless, requires handling the element list as a whole, because elements are being added.

A more useful example: paragraph detection. The behavior is not merely a map across each element:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
; Context matters. Trailing whitespace is ignored ...
> (paras '(body "The first paragraph." "\n\n"))

'(body "The first paragraph.")

; ... but whitespace between strings is converted to a break.
> (paras '(body "The first paragraph." "\n\n" "And another."))

'(body (p "The first paragraph.") (p "And another."))

; A combination of both types
> (paras '(body "The first paragraph." "\n\n" "And another." "\n\n"))

'(body (p "The first paragraph.") (p "And another."))

The block-txexpr-proc and inline-txexpr-proc arguments are procedures that operate on tagged X-expressions. If the X-expression meets the block-txexpr? test, it’s processed by block-txexpr-proc. Otherwise, it’s inline, so it’s processed by inline-txexpr-proc. (Careful, however: these aren’t mutually exclusive, because block-txexpr-proc operates on all the elements of a block, including other tagged X-expressions within.)

Of course, if you want block and inline elements to be handled the same way, you can set block-txexpr-proc and inline-txexpr-proc to be the same procedure.

Examples:

> (define tx '(div "Please" (em "mind the gap") (h1 "Tuesdays only")))
> (define add-ns (λ(tx) (make-txexpr
      (string->symbol (format "ns:~a" (get-tag tx)))
      (get-attrs tx)
      (get-elements tx))))
; div and h1 are block elements, so this will only affect them
> (decode tx #:block-txexpr-proc add-ns)

'(ns:div "Please" (em "mind the gap") (ns:h1 "Tuesdays only"))

; em is an inline element, so this will only affect it
> (decode tx #:inline-txexpr-proc add-ns)

'(div "Please" (ns:em "mind the gap") (h1 "Tuesdays only"))

; this will affect all elements
> (decode tx #:block-txexpr-proc add-ns #:inline-txexpr-proc add-ns)

'(ns:div "Please" (ns:em "mind the gap") (ns:h1 "Tuesdays only"))

The string-proc, symbol-proc, valid-char-proc, and cdata-proc arguments are procedures that operate on X-expressions that are strings, symbols, valid-chars, and CDATA, respectively. Deliberately, the output contracts for these procedures accept any kind of X-expression (meaning, the procedure can change the X-expression type).

Examples:

; A div with string, entity, character, and cdata elements
> (define tx `(div "Moe" amp 62 ,(cdata #f #f "3 > 2;")))
> (define rulify (λ(x) '(hr)))
; The rulify function is selectively applied to each
> (print (decode tx #:string-proc rulify))

'(div (hr) amp 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:symbol-proc rulify))

'(div "Moe" (hr) 62 #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:valid-char-proc rulify))

'(div "Moe" amp (hr) #(struct:cdata #f #f "3 > 2;"))

> (print (decode tx #:cdata-proc rulify))

'(div "Moe" amp 62 (hr))

Finally, the tags-to-exclude argument is a list of tags that will be exempted from decoding. Though you could get the same result by testing the input within the individual decoding functions, that’s tedious and potentially slower.

Examples:

> (define tx '(p "I really think" (em "italics") "should be lowercase."))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(p "I REALLY THINK" (em "ITALICS") "SHOULD BE LOWERCASE.")

> (decode tx #:string-proc (λ(s) (string-upcase s)) #:exclude-tags '(em))

'(p "I REALLY THINK" (em "italics") "SHOULD BE LOWERCASE.")

The tags-to-exclude argument is useful if you’re decoding source that’s destined to become HTML. According to the HTML spec, material within a <style> or <script> block needs to be preserved literally. In this example, if the CSS and JavaScript blocks are capitalized, they won’t work. So exclude '(style script), and problem solved.

Examples:

> (define tx '(body (h1 [[class "Red"]] "Let's visit Planet Telex.")
  (style [[type "text/css"]] ".Red {color: green;}")
  (script [[type "text/javascript"]] "var area = h * w;")))
> (decode tx #:string-proc (λ(s) (string-upcase s)))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".RED {COLOR: GREEN;}")

  (script ((type "text/javascript")) "VAR AREA = H * W;"))

> (decode tx #:string-proc (λ(s) (string-upcase s))
  #:exclude-tags '(style script))

'(body

  (h1 ((class "Red")) "LET'S VISIT PLANET TELEX.")

  (style ((type "text/css")) ".Red {color: green;}")

  (script ((type "text/javascript")) "var area = h * w;"))

procedure

(decode-elements elements 
  [#:txexpr-tag-proc txexpr-tag-proc 
  #:txexpr-attrs-proc txexpr-attrs-proc 
  #:txexpr-elements-proc txexpr-elements-proc 
  #:block-txexpr-proc block-txexpr-proc 
  #:inline-txexpr-proc inline-txexpr-proc 
  #:string-proc string-proc 
  #:symbol-proc symbol-proc 
  #:valid-char-proc valid-char-proc 
  #:cdata-proc cdata-proc 
  #:exclude-tags tags-to-exclude]) 
  txexpr-elements?
  elements : txexpr-elements?
  txexpr-tag-proc : (txexpr-tag? . -> . txexpr-tag?)
   = (λ(tag) tag)
  txexpr-attrs-proc : (txexpr-attrs? . -> . txexpr-attrs?)
   = (λ(attrs) attrs)
  txexpr-elements-proc : (txexpr-elements? . -> . txexpr-elements?)
   = (λ(elements) elements)
  block-txexpr-proc : (block-txexpr? . -> . xexpr?) = (λ(tx) tx)
  inline-txexpr-proc : (txexpr? . -> . xexpr?) = (λ(tx) tx)
  string-proc : (string? . -> . xexpr?) = (λ(str) str)
  symbol-proc : (symbol? . -> . xexpr?) = (λ(sym) sym)
  valid-char-proc : (valid-char? . -> . xexpr?) = (λ(vc) vc)
  cdata-proc : (cdata? . -> . xexpr?) = (λ(cdata) cdata)
  tags-to-exclude : (listof symbol?) = null
Identical to decode, but takes txexpr-elements? as input rather than a whole tagged X-expression, and likewise returns txexpr-elements? rather than a tagged X-expression. A convenience variant for use inside tag functions.
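
As a rough sketch of that last point, a tag function can hand the elements it receives straight to decode-elements. The shout tag below is hypothetical, invented only for illustration, and the result noted in the comment is what the string-proc examples above suggest, not output copied from a session:

```racket
#lang racket/base
(require pollen/decode)

;; Hypothetical tag function (not part of Pollen): upcase every string
;; among the elements it receives, then wrap the result in a strong tag.
(define (shout . elements)
  `(strong ,@(decode-elements elements #:string-proc string-upcase)))

;; By analogy with the string-proc examples for decode, this should
;; produce '(strong "HELLO " (em "WORLD")).
(shout "hello " '(em "world"))
```

Because a tag function already receives its contents as a list of elements, decode-elements saves the step of wrapping them in a temporary tag just to call decode.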

11.2.1 Block

Because it’s convenient, Pollen puts tagged X-expressions into two categories: block and inline. Why is it convenient? When using decode, you often want to treat the two categories differently. Not that you have to. But this is how you can.

parameter

(project-block-tags)  (listof txexpr-tag?)

(project-block-tags block-tags)  void?
  block-tags : (listof txexpr-tag?)
A parameter that defines the set of tags that decode will treat as blocks. This parameter is initialized with the HTML block tags, namely:

(address article aside audio blockquote body canvas dd div dl fieldset figcaption figure footer form h1 h2 h3 h4 h5 h6 header hgroup noscript ol output p pre section table tfoot ul video)
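
A minimal sketch of reading and temporarily overriding this value. It assumes project-block-tags behaves like an ordinary Racket parameter, as its signature suggests, so that parameterize applies. The tag bloq is hypothetical:

```racket
#lang racket/base
(require pollen/decode)

;; Read the current list: div is among the default block tags.
(and (memq 'div (project-block-tags)) #t)

;; Temporarily treat the hypothetical tag bloq as a block.
(parameterize ([project-block-tags (cons 'bloq (project-block-tags))])
  (block-txexpr? '(bloq "Inside the parameterize form.")))

;; Outside the parameterize form, the previous list is back in effect,
;; so bloq is no longer reported as a block.
(block-txexpr? '(bloq "Outside."))
```

Unlike register-block-tag, which changes the list for everything that follows, this approach confines the change to the body of the parameterize form.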

procedure

(register-block-tag tag)  void?

  tag : txexpr-tag?
Adds a tag to project-block-tags so that block-txexpr? will report it as a block, and decode will process it with block-txexpr-proc rather than inline-txexpr-proc.

Pollen tries to do the right thing without being told. But this is the rare case where you have to be explicit. If you introduce a tag into your markup that you want treated as a block, you must use this function to identify it, or you will get spooky behavior later on.

For instance, detect-paragraphs knows that block elements in the markup shouldn’t be wrapped in a p tag. So if you introduce a new block element called bloq without registering it as a block, misbehavior will follow:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (p (bloq "But not me.")))

; Wrong: bloq should not be wrapped

But once you register bloq as a block, order is restored:

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (register-block-tag 'bloq)
> (paras '(body "I want to be a paragraph." "\n\n" (bloq "But not me.")))

'(body (p "I want to be a paragraph.") (bloq "But not me."))

; Right: bloq is treated as a block

If you find the idea of registering block tags unbearable, good news. The project-block-tags include the standard HTML block tags by default. So if you just want to use things like div and p and h1–h6, you’ll get the right behavior for free.

Examples:

> (define (paras tx) (decode tx #:txexpr-elements-proc detect-paragraphs))
> (paras '(body "I want to be a paragraph." "\n\n" (div "But not me.")))

'(body (p "I want to be a paragraph.") (div "But not me."))

procedure

(block-txexpr? v)  boolean?

  v : any/c
Predicate that tests whether v is a tagged X-expression, and if so, whether the tag is among the project-block-tags. If not, it is treated as inline. To adjust how this test works, use register-block-tag.
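
A short sketch of the three cases just described, assuming the default project-block-tags and no earlier calls to register-block-tag. Going by the description above, the first expression should be true and the other two false:

```racket
#lang racket/base
(require pollen/decode)

(block-txexpr? '(div "A block"))   ; div is in the default block list
(block-txexpr? '(em "Inline"))     ; em is not in the list, so it counts as inline
(block-txexpr? "Just a string")    ; not a tagged X-expression at all
```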

11.2.2 Typography

An assortment of typography & layout functions, designed to be used with decode. These aren’t hard to write. So if you like these, use them. If not, make your own.

procedure

(whitespace? v)  boolean?

  v : any/c
A predicate that returns #t for any stringlike v that’s entirely whitespace, for the empty string, and for lists and vectors made only of whitespace? members. Following the regexp-match convention, whitespace? does not return #t for a nonbreaking space. If you prefer that behavior, use whitespace/nbsp?.

Examples:

> (whitespace? "\n\n   ")

#t

> (whitespace? (string->symbol "\n\n   "))

#t

> (whitespace? "")

#t

> (whitespace? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace? nonbreaking-space)

#f

procedure

(whitespace/nbsp? v)  boolean?

  v : any/c
Like whitespace?, but also returns #t for nonbreaking spaces.

Examples:

> (whitespace/nbsp? "\n\n   ")

#t

> (whitespace/nbsp? (string->symbol "\n\n   "))

#t

> (whitespace/nbsp? "")

#t

> (whitespace/nbsp? '("" "  " "\n\n\n" " \n"))

#t

> (define nonbreaking-space (format "~a" #\u00A0))
> (whitespace/nbsp? nonbreaking-space)

#t

procedure

(smart-quotes str)  string?

  str : string?
Convert straight quotes in str to curly according to American English conventions.

Examples:

> (define tricky-string
  "\"Why,\" she could've asked, \"are we in O‘ahu watching 'Mame'?\"")
> (display tricky-string)

"Why," she could've asked, "are we in O‘ahu watching 'Mame'?"

> (display (smart-quotes tricky-string))

“Why,” she could’ve asked, “are we in O‘ahu watching ‘Mame’?”

procedure

(smart-dashes str)  string?

  str : string?
In str, convert three hyphens to an em dash, and two hyphens to an en dash, and remove surrounding spaces.

Examples:

> (define tricky-string "I had a few --- OK, like 6--8 --- thin mints.")
> (display tricky-string)

I had a few --- OK, like 6--8 --- thin mints.

> (display (smart-dashes tricky-string))

I had a few—OK, like 6–8—thin mints.

; Monospaced font not great for showing dashes, but you get the idea

procedure

(detect-linebreaks tagged-xexpr-elements 
  [#:separator linebreak-sep 
  #:insert linebreak]) 
  txexpr-elements?
  tagged-xexpr-elements : txexpr-elements?
  linebreak-sep : string? = world:linebreak-separator
  linebreak : xexpr? = '(br)
Within tagged-xexpr-elements, convert occurrences of linebreak-sep ("\n" by default) to linebreak, but only if linebreak-sep does not occur between blocks (see block-txexpr?). Why? Because block-level elements automatically display on a new line, so adding linebreak would be superfluous. In that case, linebreak-sep just disappears.

Examples:

> (detect-linebreaks '(div "Two items:" "\n" (em "Eggs") "\n" (em "Bacon")))

'(div "Two items:" (br) (em "Eggs") (br) (em "Bacon"))

> (detect-linebreaks '(div "Two items:" "\n" (div "Eggs") "\n" (div "Bacon")))

'(div "Two items:" (div "Eggs") (div "Bacon"))

procedure

(detect-paragraphs elements 
  [#:separator paragraph-sep 
  #:tag paragraph-tag 
  #:linebreak-proc linebreak-proc]) 
  txexpr-elements?
  elements : txexpr-elements?
  paragraph-sep : string? = world:paragraph-separator
  paragraph-tag : symbol? = 'p
  linebreak-proc : (txexpr-elements? . -> . txexpr-elements?)
   = detect-linebreaks
Find paragraphs within elements (as denoted by paragraph-sep) and wrap them with paragraph-tag. Also handle linebreaks using linebreak-proc (by default, detect-linebreaks).

If an element is already a block-txexpr?, it will not be wrapped as a paragraph (because in that case, the wrapping would be superfluous). As a consequence, if paragraph-sep occurs between two blocks, it will be ignored (as in the example below using two sequential 'div blocks).

The paragraph-tag argument sets the tag used to wrap paragraphs.

The linebreak-proc argument allows you to use a linebreaking procedure other than the usual detect-linebreaks.

Examples:

> (detect-paragraphs '("First para" "\n\n" "Second para"))

'((p "First para") (p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line"))

'((p "First para") (p "Second para" (br) "Second line"))

> (detect-paragraphs '("First para" "\n\n" (div "Second block")))

'((p "First para") (div "Second block"))

> (detect-paragraphs '((div "First block") "\n\n" (div "Second block")))

'((div "First block") (div "Second block"))

> (detect-paragraphs '("First para" "\n\n" "Second para") #:tag 'ns:p)

'((ns:p "First para") (ns:p "Second para"))

> (detect-paragraphs '("First para" "\n\n" "Second para" "\n" "Second line")
  #:linebreak-proc (λ(x) (detect-linebreaks x #:insert '(newline))))

'((p "First para") (p "Second para" (newline) "Second line"))
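The paragraph-wrapping step can be sketched in plain Racket. This is a simplified illustration, not Pollen's implementation: the name wrap-paragraphs is hypothetical, and the sketch ignores the special treatment of block elements and the linebreak handling described above. It only splits a list of elements on the separator string and wraps each group in the tag.

```racket
#lang racket
;; Simplified sketch (hypothetical helper, not Pollen's detect-paragraphs):
;; split ELEMENTS on the separator string and wrap each non-empty group in TAG.
;; Block elements and linebreaks get no special treatment here.
(define (wrap-paragraphs elements #:separator [sep "\n\n"] #:tag [tag 'p])
  ;; Walk the list once, collecting the groups finished so far and the group in progress.
  (define-values (groups last-group)
    (for/fold ([groups '()] [current '()])
              ([e (in-list elements)])
      (if (equal? e sep)
          (values (if (null? current) groups (cons (reverse current) groups)) '())
          (values groups (cons e current)))))
  ;; Add the trailing group (if any) and restore the original order.
  (define all-groups
    (reverse (if (null? last-group) groups (cons (reverse last-group) groups))))
  (for/list ([g (in-list all-groups)])
    (cons tag g)))

(wrap-paragraphs '("First para" "\n\n" "Second para"))
;; => '((p "First para") (p "Second para"))
```

Passing #:tag 'ns:p to the sketch changes the wrapper tag in the same way the documented #:tag argument does.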

procedure

(wrap-hanging-quotes tx 
  [#:single-preprend single-preprender 
  #:double-preprend double-preprender]) 
  txexpr?
  tx : txexpr?
  single-preprender : txexpr-tag? = 'squo
  double-preprender : txexpr-tag? = 'dquo
Find single or double quote marks at the beginning of tx and wrap them in an X-expression with the tag single-preprender or double-preprender, respectively. The default values are 'squo and 'dquo.

Examples:

> (wrap-hanging-quotes '(p "No quote to hang."))

'(p "No quote to hang.")

> (wrap-hanging-quotes '(p "“What? We need to hang quotes?”"))

'(p (dquo "“" "What? We need to hang quotes?”"))

In pro typography, quotation marks at the beginning of a line or paragraph are often shifted into the margin slightly to make them appear more optically aligned with the left edge of the text. With a reflowable layout model like HTML, you don’t know where your line breaks will be.

This function will simply insert the 'squo and 'dquo tags, which provide hooks that let you do the actual hanging via CSS, like so (actual measurement can be refined to taste):

squo {margin-left: -0.25em;}

dquo {margin-left: -0.50em;}

Be warned: there are many edge cases this function does not handle well.

Examples:

; Argh: this edge case is not handled properly
> (wrap-hanging-quotes '(p "“" (em "What?") "We need to hang quotes?”"))

'(p "“" (em "What?") "We need to hang quotes?”")

 
\ No newline at end of file diff --git a/doc/Pagetree.html b/doc/Pagetree.html index 89f4383..9ba8a98 100644 --- a/doc/Pagetree.html +++ b/doc/Pagetree.html @@ -1,2 +1,2 @@ -11.4 Pagetree
6.1.0.5

11.4 Pagetree

 (require pollen/pagetree) package: pollen

Books and other long documents are usually organized in a structured way — at minimum they have a sequence of pages, but more often they have sections with subsequences within. Individual pages in a Pollen project don’t know anything about how they’re connected to other pages. In theory, you could maintain this information within the source files. But this would be a poor use of human energy.

Instead, use a pagetree. A pagetree is a simple abstraction for defining & working with sequences of pagenodes. Typically these pagenodes will be the names of output files in your project.

“So it’s a list of web-page filenames?” Sort of. When I think of a web page, I think of an actual file on a disk. Keeping with Pollen’s orientation toward dynamic rendering, pagenodes may — and often do — refer to files that don’t yet exist. Moreover, by referring to output names rather than source names, you retain the flexibility to change the kind of source associated with a particular pagenode (e.g., from preprocessor source to Pollen markup).

Pagetrees can be flat or hierarchical. A flat pagetree is just a list of pagenodes. A hierarchical pagetree can also contain recursively nested lists of pagenodes. But you needn’t pay attention to this distinction, as the pagetree functions don’t care which kind you use. Neither do I.

Pagetrees surface throughout the Pollen system. They’re primarily used for navigation — for instance, calculating “previous,” “next,” or “up” links for a given page. A special pagetree, index.ptree, is used by the project server to order the files in a dashboard. Pagetrees can also be used to define batches of files for certain operations, for instance raco pollen render. You might find other uses for them too.

11.4.1 Making pagetrees with a source file

A pagetree source file either starts with #lang pollen and uses the .ptree extension, or starts with #lang pollen/ptree and then can have any file extension.

Unlike other Pollen source files, since the pagetree source is not rendered into an output format, the rest of the filename is up to you.

Here’s a flat pagetree. Each line is considered a single pagenode (blank lines are ignored). Notice that no Pollen command syntax nor quoting is needed within the pagetree source:

"flat.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html

And here’s the output in DrRacket:

'(pagetree-root index.html introduction.html main_argument.html conclusion.html)

Keeping with usual Pollen policy, this is an X-expression. The pagetree-root is just an arbitrary tag that contains the pagetree.

Upgrading to a hierarchical pagetree is simple. The same basic rule applies — one pagenode per line. But this time, you add Pollen command syntax: a lozenge in front of a pagenode marks it as the top of a nested list, and the sub-pagenodes of that list go between { curly braces }, like so:

"hierarchical.ptree"
#lang pollen
 
toc.html
◊first-chapter.html{
    foreword.html
    introduction.html}
◊second-chapter.html{
    ◊main-argument.html{
        facts.html
        analysis.html}
    conclusion.html}
bibliography.html

The output of our hierarchical pagetree:

'(pagetree-root toc.html (first-chapter.html foreword.html introduction.html) (second-chapter.html (main-argument.html facts.html analysis.html) conclusion.html) bibliography.html)

One advantage of using a source file is that when you run it in DrRacket, it will automatically be checked using validate-pagetree, which ensures that every element in the pagetree satisfies pagenode?, and that all the pagenodes are unique.

This pagetree has a duplicate pagenode, so it won’t run:

"duplicate-pagenode.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html
index.html

Instead, you’ll get an error:

validate-pagetree: members-unique? failed because item isn’t unique: (index.html)

11.4.2 Making pagetrees by hand

Experienced programmers may want to know that because a pagetree is just an X-expression, you can synthesize a pagetree using any Pollen or Racket tools for making X-expressions. For example, here’s some Racket code that generates the same pagetree as the flat.ptree source file above:

"make-flat-ptree.rkt"
#lang racket
(require pollen/pagetree)
(define node-names '(index introduction main_argument conclusion))
(define pt `(pagetree-root
  ,@(map (λ(n) (string->symbol (format "~a.html" n))) node-names)))
(if (pagetree? pt) pt "Oops, not a pagetree")

Note that you need to take more care when building a pagetree by hand. Pagenodes are symbols, not strings, thus the use of string->symbol is mandatory. One benefit of using a pagetree source file is that it takes care of this housekeeping for you.

11.4.3 Using pagetrees for navigation

Typically you’ll call the pagetree-navigation functions from inside templates, using the special variable here as the starting point. For more on this technique, see Pagetree navigation.

11.4.4 Using index.ptree in the dashboard

When you’re using the project server to view the files in a directory, the server will first look for a file called index.ptree. If it finds this pagetree file, it will use it to build the dashboard. If not, then it will synthesize a pagetree using a directory listing. For more on this technique, see Using the dashboard.

11.4.5 Using pagetrees with raco pollen render

The raco pollen render command is used to regenerate an output file from its source. If you pass a pagetree to raco pollen render, it will automatically render each file listed in the pagetree.

For instance, many projects have auxiliary pages that don’t really belong in the main navigational flow. You can collect these pages in a separate pagetree:

"utility.ptree"
#lang pollen
 
404-error.html
terms-of-service.html
webmaster.html
[... and so on]

Thus, when you’re using pagetree-navigation functions within a template, you can use your main pagetree, and restrict the navigation to the main editorial content. But when you render the project, you can pass both pagetrees to raco pollen render.

For more on this technique, see raco pollen render.

11.4.6 Functions
11.4.6.1 Predicates & validation

procedure

(pagetree? possible-pagetree)  boolean?

  possible-pagetree : any/c
Test whether possible-pagetree is a valid pagetree. It must be a txexpr? where all elements are pagenode?, and each is unique within possible-pagetree (not counting the root node).

Examples:

> (pagetree? '(root index.html))

#t

> (pagetree? '(root duplicate.html duplicate.html))

#f

> (pagetree? '(root index.html "string.html"))

#f

> (define nested-ptree '(root 1.html 2.html (3.html 3a.html 3b.html)))
> (pagetree? nested-ptree)

#t

> (pagetree? `(root index.html ,nested-ptree (subsection.html more.html)))

#t

; Nesting a subtree twice creates duplication
> (pagetree? `(root index.html ,nested-ptree (subsection.html ,nested-ptree)))

#f
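The uniqueness requirement can be illustrated with a short plain-Racket sketch. The helper name duplicate-pagenodes is hypothetical and this is not how Pollen implements pagetree? or validate-pagetree. It assumes the first item of the list is the root tag, drops it, flattens the rest, and reports any pagenode that appears more than once.

```racket
#lang racket
;; Hypothetical helper, not part of pollen/pagetree:
;; list every pagenode that occurs more than once below the root tag.
(define (duplicate-pagenodes ptree)
  (define nodes (flatten (cdr ptree)))   ; drop the root tag, then flatten nested lists
  (remove-duplicates
   (filter (λ (n) (> (count (λ (m) (eq? m n)) nodes) 1)) nodes)))

(duplicate-pagenodes '(root index.html))
;; => '()
(duplicate-pagenodes '(root duplicate.html duplicate.html))
;; => '(duplicate.html)
```

An empty result corresponds to a pagetree that passes the uniqueness test; a non-empty result names the offending pagenodes.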

procedure

(validate-pagetree possible-pagetree)  pagetree?

  possible-pagetree : any/c
Like pagetree?, but raises a descriptive error if possible-pagetree is invalid, and otherwise returns possible-pagetree itself.

Examples:

> (validate-pagetree '(root (mama.html son.html daughter.html) uncle.html))

'(root (mama.html son.html daughter.html) uncle.html)

> (validate-pagetree `(root (,+ son.html daughter.html) uncle.html))

#f

> (validate-pagetree '(root (mama.html son.html son.html) mama.html))

validate-pagetree: members-unique? failed because items

aren’t unique: (son.html mama.html)

procedure

(pagenode? possible-pagenode)  boolean?

  possible-pagenode : any/c
Test whether possible-pagenode is a valid pagenode. A pagenode can be any symbol? that is not whitespace/nbsp?. Every leaf of a pagetree is a pagenode. In practice, your pagenodes will likely be names of output files.

Pagenodes are symbols (rather than strings) so that pagetrees will be valid tagged X-expressions, which is a more convenient format for validation & processing.

Examples:

; Three symbols, the third one annoying but valid
> (map pagenode? '(symbol index.html |   silly   |))

'(#t #t #t)

; A number, a string, a txexpr, and a whitespace symbol
> (map pagenode? '(9.999 "index.html" (p "Hello") |    |))

'(#f #f #f #f)

procedure

(pagenodeish? v)  boolean?

  v : any/c
Return #t if v can be converted with ->pagenode.

Example:

> (map pagenodeish? '(9.999 "index.html" |    |))

'(#t #t #f)

procedure

(->pagenode v)  pagenode?

  v : pagenodeish?
Convert v to a pagenode.

Examples:

> (map pagenodeish? '(symbol 9.999 "index.html" |  silly  |))

'(#t #t #t #t)

> (map ->pagenode '(symbol 9.999 "index.html" |  silly  |))

'(symbol |9.999| index.html |  silly  |)
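For intuition, the conversion shown above can be approximated in plain Racket. This is a sketch under assumptions, not Pollen's ->pagenode: the name to-pagenode is hypothetical, and it only covers symbols, numbers, and strings by formatting the value as text and interning it as a symbol.

```racket
#lang racket
;; Hypothetical stand-in for ->pagenode: symbols pass through unchanged,
;; anything else is formatted with ~a and converted to a symbol.
(define (to-pagenode v)
  (if (symbol? v)
      v
      (string->symbol (format "~a" v))))

(map to-pagenode '(symbol 9.999 "index.html"))
;; => '(symbol |9.999| index.html)
```

The vertical bars around 9.999 are how Racket prints a symbol whose name would otherwise read as a number.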

11.4.6.2 Navigation

parameter

(current-pagetree)  pagetree?

(current-pagetree pagetree)  void?
  pagetree : pagetree?
A parameter that defines the default pagetree used by pagetree navigation functions (e.g., parent, children, et al.) if another is not explicitly specified. Initialized to #f.

procedure

(parent p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the parent pagenode of p within pagetree. Return #f if there isn’t one.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (parent 'son.html)

'mama.html

> (parent "mama.html")

'root

> (parent (parent 'son.html))

'root

> (parent (parent (parent 'son.html)))

#f

procedure

(children p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the child pagenodes of p within pagetree. Return #f if there aren’t any.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (children 'mama.html)

'(son.html daughter.html)

> (children 'uncle.html)

#f

> (children 'root)

'(mama.html uncle.html)

> (map children (children 'root))

'((son.html daughter.html) #f)

procedure

(siblings p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the sibling pagenodes of p within pagetree. The list will include p itself. But the function will still return #f if pagetree is #f.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (siblings 'son.html)

'(son.html daughter.html)

> (siblings 'daughter.html)

'(son.html daughter.html)

> (siblings 'mama.html)

'(mama.html uncle.html)

procedure

(previous p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(previous* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately before p. For previous*, return all the pagenodes before p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (previous 'daughter.html)

'son.html

> (previous 'son.html)

'mama.html

> (previous (previous 'daughter.html))

'mama.html

> (previous 'mama.html)

#f

> (previous* 'daughter.html)

'(mama.html son.html)

> (previous* 'uncle.html)

'(mama.html son.html daughter.html)

procedure

(next p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(next* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately after p. For next*, return all the pagenodes after p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (next 'son.html)

'daughter.html

> (next 'daughter.html)

'uncle.html

> (next (next 'son.html))

'uncle.html

> (next 'uncle.html)

#f

> (next* 'mama.html)

'(son.html daughter.html uncle.html)

> (next* 'daughter.html)

'(uncle.html)

11.4.6.3 Utilities

procedure

(pagetree->list pagetree)  list?

  pagetree : pagetree?
Convert pagetree to a simple list. Equivalent to a pre-order depth-first traversal of pagetree.

procedure

(in-pagetree? pagenode [pagetree])  boolean?

  pagenode : pagenode?
  pagetree : pagetree? = (current-pagetree)
Report whether pagenode is in pagetree.

procedure

(path->pagenode p)  pagenode?

  p : pathish?
Convert path p to a pagenode — meaning, make it relative to current-project-root, run it through ->output-path, and convert it to a symbol. Does not tell you whether the resultant pagenode actually exists in the current pagetree (for that, use in-pagetree?).

 
\ No newline at end of file +11.4 Pagetree
6.1.0.5

11.4 Pagetree

 (require pollen/pagetree) package: pollen

Books and other long documents are usually organized in a structured way — at minimum they have a sequence of pages, but more often they have sections with subsequences within. Individual pages in a Pollen project don’t know anything about how they’re connected to other pages. In theory, you could maintain this information within the source files. But this would be a poor use of human energy.

Instead, use a pagetree. A pagetree is a simple abstraction for defining & working with sequences of pagenodes. Typically these pagenodes will be the names of output files in your project.

“So it’s a list of web-page filenames?” Sort of. When I think of a web page, I think of an actual file on a disk. Keeping with Pollen’s orientation toward dynamic rendering, pagenodes may — and often do — refer to files that don’t yet exist. Moreover, by referring to output names rather than source names, you retain the flexibility to change the kind of source associated with a particular pagenode (e.g., from preprocessor source to Pollen markup).

Pagetrees can be flat or hierarchical. A flat pagetree is just a list of pagenodes. A hierarchical pagetree can also contain recursively nested lists of pagenodes. But you needn’t pay attention to this distinction, as the pagetree functions don’t care which kind you use. Neither do I.

Pagetrees surface throughout the Pollen system. They’re primarily used for navigation — for instance, calculating “previous,” “next,” or “up” links for a given page. A special pagetree, index.ptree, is used by the project server to order the files in a dashboard. Pagetrees can also be used to define batches of files for certain operations, for instance raco pollen render. You might find other uses for them too.

11.4.1 Making pagetrees with a source file

A pagetree source file either starts with #lang pollen and uses the .ptree extension, or starts with #lang pollen/ptree and then can have any file extension.

Unlike other Pollen source files, since the pagetree source is not rendered into an output format, the rest of the filename is up to you.

Here’s a flat pagetree. Each line is considered a single pagenode (blank lines are ignored). Notice that no Pollen command syntax nor quoting is needed within the pagetree source:

"flat.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html

And here’s the output in DrRacket:

'(pagetree-root index.html introduction.html main_argument.html conclusion.html)

Keeping with usual Pollen policy, this is an X-expression. The pagetree-root is just an arbitrary tag that contains the pagetree.

Upgrading to a hierarchical pagetree is simple. The same basic rule applies — one pagenode per line. But this time, you add Pollen command syntax: a lozenge in front of a pagenode marks it as the top of a nested list, and the sub-pagenodes of that list go between { curly braces }, like so:

"hierarchical.ptree"
#lang pollen
 
toc.html
◊first-chapter.html{
    foreword.html
    introduction.html}
◊second-chapter.html{
    ◊main-argument.html{
        facts.html
        analysis.html}
    conclusion.html}
bibliography.html

The output of our hierarchical pagetree:

'(pagetree-root toc.html (first-chapter.html foreword.html introduction.html) (second-chapter.html (main-argument.html facts.html analysis.html) conclusion.html) bibliography.html)

One advantage of using a source file is that when you run it in DrRacket, it will automatically be checked using validate-pagetree, which ensures that every element in the pagetree satisfies pagenode?, and that all the pagenodes are unique.

This pagetree has a duplicate pagenode, so it won’t run:

"duplicate-pagenode.ptree"
#lang pollen
 
index.html
introduction.html
main_argument.html
conclusion.html
index.html

Instead, you’ll get an error:

validate-pagetree: members-unique? failed because item isn’t unique: (index.html)

Pagenodes can refer to files in subdirectories. Just write the pagenode as a path relative to the directory where the pagetree lives:

"flat.ptree"
#lang pollen
 
foreword.html
◊facts-intro.html{
    facts/brennan.html
    facts/dale.html
}
◊analysis/intro.html{
    analysis/fancy-sauce/part-1.html
    analysis/fancy-sauce/part-2.html
}
conclusion.html

11.4.2 Making pagetrees by hand

Experienced programmers may want to know that because a pagetree is just an X-expression, you can synthesize a pagetree using any Pollen or Racket tools for making X-expressions. For example, here’s some Racket code that generates the same pagetree as the flat.ptree source file above:

"make-flat-ptree.rkt"
#lang racket
(require pollen/pagetree)
(define node-names '(index introduction main_argument conclusion))
(define pt `(pagetree-root
  ,@(map (λ(n) (string->symbol (format "~a.html" n))) node-names)))
(if (pagetree? pt) pt "Oops, not a pagetree")

Note that you need to take more care when building a pagetree by hand. Pagenodes are symbols, not strings, thus the use of string->symbol is mandatory. One benefit of using a pagetree source file is that it takes care of this housekeeping for you.

11.4.3 Using pagetrees for navigation

Typically you’ll call the pagetree-navigation functions from inside templates, using the special variable here as the starting point. For more on this technique, see Pagetree navigation.

11.4.4 Using index.ptree in the dashboard

When you’re using the project server to view the files in a directory, the server will first look for a file called index.ptree. If it finds this pagetree file, it will use it to build the dashboard. If not, then it will synthesize a pagetree using a directory listing. For more on this technique, see Using the dashboard.

11.4.5 Using pagetrees with raco pollen render

The raco pollen render command is used to regenerate an output file from its source. If you pass a pagetree to raco pollen render, it will automatically render each file listed in the pagetree.

For instance, many projects have auxiliary pages that don’t really belong in the main navigational flow. You can collect these pages in a separate pagetree:

"utility.ptree"
#lang pollen
 
404-error.html
terms-of-service.html
webmaster.html
[... and so on]

Thus, when you’re using pagetree-navigation functions within a template, you can use your main pagetree, and restrict the navigation to the main editorial content. But when you render the project, you can pass both pagetrees to raco pollen render.

For more on this technique, see raco pollen render.

11.4.6 Functions
11.4.6.1 Predicates & validation

procedure

(pagetree? possible-pagetree)  boolean?

  possible-pagetree : any/c
Test whether possible-pagetree is a valid pagetree. It must be a txexpr? where all elements are pagenode?, and each is unique within possible-pagetree (not counting the root node).

Examples:

> (pagetree? '(root index.html))

#t

> (pagetree? '(root duplicate.html duplicate.html))

#f

> (pagetree? '(root index.html "string.html"))

#f

> (define nested-ptree '(root 1.html 2.html (3.html 3a.html 3b.html)))
> (pagetree? nested-ptree)

#t

> (pagetree? `(root index.html ,nested-ptree (subsection.html more.html)))

#t

; Nesting a subtree twice creates duplication
> (pagetree? `(root index.html ,nested-ptree (subsection.html ,nested-ptree)))

#f

procedure

(validate-pagetree possible-pagetree)  pagetree?

  possible-pagetree : any/c
Like pagetree?, but raises a descriptive error if possible-pagetree is invalid, and otherwise returns possible-pagetree itself.

Examples:

> (validate-pagetree '(root (mama.html son.html daughter.html) uncle.html))

'(root (mama.html son.html daughter.html) uncle.html)

> (validate-pagetree `(root (,+ son.html daughter.html) uncle.html))

#f

> (validate-pagetree '(root (mama.html son.html son.html) mama.html))

validate-pagetree: members-unique? failed because items

aren’t unique: (son.html mama.html)

procedure

(pagenode? possible-pagenode)  boolean?

  possible-pagenode : any/c
Test whether possible-pagenode is a valid pagenode. A pagenode can be any symbol? that is not whitespace/nbsp?. Every leaf of a pagetree is a pagenode. In practice, your pagenodes will likely be names of output files.

Pagenodes are symbols (rather than strings) so that pagetrees will be valid tagged X-expressions, which is a more convenient format for validation & processing.

Examples:

; Three symbols, the third one annoying but valid
> (map pagenode? '(symbol index.html |   silly   |))

'(#t #t #t)

; A number, a string, a txexpr, and a whitespace symbol
> (map pagenode? '(9.999 "index.html" (p "Hello") |    |))

'(#f #f #f #f)

procedure

(pagenodeish? v)  boolean?

  v : any/c
Return #t if v can be converted with ->pagenode.

Example:

> (map pagenodeish? '(9.999 "index.html" |    |))

'(#t #t #f)

procedure

(->pagenode v)  pagenode?

  v : pagenodeish?
Convert v to a pagenode.

Examples:

> (map pagenodeish? '(symbol 9.999 "index.html" |  silly  |))

'(#t #t #t #t)

> (map ->pagenode '(symbol 9.999 "index.html" |  silly  |))

'(symbol |9.999| index.html |  silly  |)

11.4.6.2 Navigation

parameter

(current-pagetree)  pagetree?

(current-pagetree pagetree)  void?
  pagetree : pagetree?
A parameter that defines the default pagetree used by pagetree navigation functions (e.g., parent, children, et al.) if another is not explicitly specified. Initialized to #f.

procedure

(parent p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the parent pagenode of p within pagetree. Return #f if there isn’t one.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (parent 'son.html)

'mama.html

> (parent "mama.html")

'root

> (parent (parent 'son.html))

'root

> (parent (parent (parent 'son.html)))

#f

procedure

(children p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the child pagenodes of p within pagetree. Return #f if there aren’t any.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (children 'mama.html)

'(son.html daughter.html)

> (children 'uncle.html)

#f

> (children 'root)

'(mama.html uncle.html)

> (map children (children 'root))

'((son.html daughter.html) #f)
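The parent and children relationships follow directly from the nested-list shape of a pagetree: the first item of each list is the parent, and the remaining items are its children. Here is a rough plain-Racket sketch of that idea. The names node-name, children-of, and parent-of are hypothetical, and the code is not Pollen's implementation (for instance, it takes only symbols and requires the pagetree as an explicit argument).

```racket
#lang racket
;; Hypothetical helpers illustrating the list structure of a pagetree.

;; The name of an item: a bare symbol, or the head of a nested list.
(define (node-name item)
  (if (pair? item) (car item) item))

;; Children of P: the names of the items that follow P at the head of a list.
(define (children-of p ptree)
  (cond
    [(eq? (car ptree) p)
     (let ([kids (map node-name (cdr ptree))])
       (and (pair? kids) kids))]
    [else
     (for/or ([item (in-list (cdr ptree))] #:when (pair? item))
       (children-of p item))]))

;; Parent of P: the head of the list in which P appears as a direct member.
(define (parent-of p ptree)
  (cond
    [(memq p (map node-name (cdr ptree))) (car ptree)]
    [else
     (for/or ([item (in-list (cdr ptree))] #:when (pair? item))
       (parent-of p item))]))

(define family '(root (mama.html son.html daughter.html) uncle.html))
(children-of 'mama.html family) ; => '(son.html daughter.html)
(parent-of 'son.html family)    ; => 'mama.html
```

Both helpers return #f when nothing is found, mirroring the convention of the documented functions.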

procedure

(siblings p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Find the sibling pagenodes of p within pagetree. The list will include p itself. But the function will still return #f if pagetree is #f.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (siblings 'son.html)

'(son.html daughter.html)

> (siblings 'daughter.html)

'(son.html daughter.html)

> (siblings 'mama.html)

'(mama.html uncle.html)

procedure

(previous p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(previous* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately before p. For previous*, return all the pagenodes before p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (previous 'daughter.html)

'son.html

> (previous 'son.html)

'mama.html

> (previous (previous 'daughter.html))

'mama.html

> (previous 'mama.html)

#f

> (previous* 'daughter.html)

'(mama.html son.html)

> (previous* 'uncle.html)

'(mama.html son.html daughter.html)

procedure

(next p [pagetree])  (or/c #f pagenode?)

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)

procedure

(next* p [pagetree])  (or/c #f (listof pagenode?))

  p : (or/c #f pagenodeish?)
  pagetree : pagetree? = (current-pagetree)
Return the pagenode immediately after p. For next*, return all the pagenodes after p, in sequence. In both cases, return #f if there aren’t any pagenodes. The root pagenode is ignored.

Examples:

> (current-pagetree '(root (mama.html son.html daughter.html) uncle.html))
> (next 'son.html)

'daughter.html

> (next 'daughter.html)

'uncle.html

> (next (next 'son.html))

'uncle.html

> (next 'uncle.html)

#f

> (next* 'mama.html)

'(son.html daughter.html uncle.html)

> (next* 'daughter.html)

'(uncle.html)
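One way to think about previous and next is as a step backward or forward through the flattened pagetree. The sketch below shows that idea in plain Racket. The names flat-nodes, previous-of, and next-of are hypothetical, the root tag is assumed to be the first item of the list, and none of this is taken from Pollen's source.

```racket
#lang racket
;; Hypothetical helpers: treat navigation as movement through a flat sequence.

;; Drop the root tag and flatten the nested lists into one sequence of pagenodes.
(define (flat-nodes ptree)
  (flatten (cdr ptree)))

;; The pagenode right after P, or #f if P is last or absent.
(define (next-of p ptree)
  (define tail (memq p (flat-nodes ptree)))
  (and tail (pair? (cdr tail)) (cadr tail)))

;; The pagenode right before P, or #f if P is first or absent.
(define (previous-of p ptree)
  (define tail (memq p (reverse (flat-nodes ptree))))
  (and tail (pair? (cdr tail)) (cadr tail)))

(define family '(root (mama.html son.html daughter.html) uncle.html))
(next-of 'daughter.html family)  ; => 'uncle.html
(previous-of 'son.html family)   ; => 'mama.html
```

Because the root tag is dropped before flattening, it never appears as a result, which matches the note above that the root pagenode is ignored.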

11.4.6.3 Utilities

procedure

(pagetree->list pagetree)  list?

  pagetree : pagetree?
Convert pagetree to a simple list. Equivalent to a pre-order depth-first traversal of pagetree.

procedure

(in-pagetree? pagenode [pagetree])  boolean?

  pagenode : pagenode?
  pagetree : pagetree? = (current-pagetree)
Report whether pagenode is in pagetree.

procedure

(path->pagenode p)  pagenode?

  p : pathish?
Convert path p to a pagenode — meaning, make it relative to current-project-root, run it through ->output-path, and convert it to a symbol. Does not tell you whether the resultant pagenode actually exists in the current pagetree (for that, use in-pagetree?).

 
\ No newline at end of file diff --git a/doc/big-picture.html b/doc/big-picture.html index 0260957..367dcac 100644 --- a/doc/big-picture.html +++ b/doc/big-picture.html @@ -1,3 +1,3 @@ -4 The big picture
6.1.0.5

4 The big picture

A summary of the key components & concepts of the Pollen publishing system and how they fit together. If you’ve completed the Quick tour, this will lend some context to what you saw. The next tutorials will make more sense if you read this first.

4.1 The book is a program

This is the core design principle of Pollen. Consistent with this principle, Pollen adopts the habits of software development in its functionality, workflow, and project management.

  • You are a programmer. Don’t panic. But let’s just admit it — if your book is a program, then you are, in part, programming it. You don’t have to know any programming to start using Pollen. But you’ll have to be willing to learn a few programming ideas. (Those who have programmed other template-based HTML generators may have to forget a few things.)

  • A Pollen project consists of source files + static files. A source file is a file that can be compiled to produce certain output. A static file is usable as it stands (e.g., an SVG file or webfont). Generally, the textual content of your book will live in source files, and other elements will be static files.

  • Source control is a good idea. Because Pollen projects are software projects, they can be easily managed with systems for source control and collaboration, like GitHub. If you’re a writer at heart, don’t fear these systems — the learning curve is repaid by revision & edit tracking that’s much easier than it is with Word or PDF files.

4.2 One language, multiple dialects

  • Everything is Racket. The Pollen system is built entirely in the Racket programming language. Some of your source files will be in Racket. Others will be in one of the Pollen language dialects. But under the hood, everything becomes Racket code. So if you plan to do any serious work in Pollen, you’ll want to learn some basics about Racket too (for instance Quick: An Introduction to Racket with Pictures).

  • The Pollen language is based on Scribble. Scribble is a variant of the Racket language that flips the usual programming syntax: instead of code with embedded textual content, a Scribble source file is text with embedded code (an idea borrowed from TeX). The Pollen language is adapted from Scribble. So most things that are true about Scribble are also true about Pollen (see Scribble: The Racket Documentation Tool).

  • The Pollen language is divided into dialects. The Pollen dialects share a common syntax and structure. But they’re different in details that make them better adapted to certain types of source files (for instance, one dialect of Pollen understands Markdown; the others don’t). Use whichever suits the task at hand.

4.3 Development environment

The Pollen development environment has three main pieces: the DrRacket code editor, the project server, and the command-line tool.

  • Edit source files with DrRacket. DrRacket is Racket’s GUI code editor. Sure, you can also use a generic text editor. But DrRacket lets you immediately run your source and see if it works.

  • Preview & test web pages with the Pollen project server. Pollen has a built-in development web server called the project server. After you start the project server, you can preview your web pages within any web browser, allowing you to test them with maximum accuracy.

  • Write the docs. The project server can recognize and render Scribble files, so you can use it as a previewing tool while you’re writing your documentation.

  • Render & deploy from the command line. Your Pollen project ultimately gets rendered to a set of static files (usually HTML and related assets). This can be controlled from the command line, so you can integrate it into other scripts.

4.4 A special data structure for HTML

Unlike other programming languages, Pollen (and Racket) internally represent HTML with something called an X-expression. An X-expression is simply a list that represents what in HTML is called an element, meaning a thing with an opening tag, a closing tag, and content in between. Like HTML elements, X-expressions can be nested. Unlike HTML elements, X-expressions have no closing tag: they use parentheses to denote the start and end, and text elements are put inside quotes.

For example, consider this HTML element:

<body><h1>Hello world</h1><p>Nice to <i>see</i> you.</p></body>

As a Racket X-expression, this would be written:

(body (h1 "Hello world") (p "Nice to " (i "see") " you."))

More will be said about X-expressions. But a couple of advantages should be evident already. First, without the redundant angle brackets, the X-expression is more readable than the equivalent HTML. Second, an X-expression is preferable to representing HTML as a simple string, because it preserves the internal structure of the element.
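The correspondence between the X-expression and the HTML element shown above can be checked in Racket itself. This small example uses xexpr->string from Racket's standard xml library (not from Pollen) to turn the X-expression back into an HTML string; the variable name greeting is just for illustration.

```racket
#lang racket
(require xml) ; Racket's standard XML library, which provides xexpr->string

;; The X-expression from the example above, quoted so it is treated as data.
(define greeting
  '(body (h1 "Hello world") (p "Nice to " (i "see") " you.")))

(xexpr->string greeting)
;; => "<body><h1>Hello world</h1><p>Nice to <i>see</i> you.</p></body>"
```

Because the X-expression is an ordinary list, it can also be taken apart with list functions: (car greeting) is the tag 'body, and (cdr greeting) holds the nested elements.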

4.5 Pollen command syntax

As mentioned above, a Pollen source file is not code with text embedded in it, but rather text with code embedded. (See ◊ command overview for more.)