main
Matthew Butterick 11 years ago
parent bc483a027e
commit 32a4b428cf

@ -1,4 +1,5 @@
#lang racket/base
(require (for-syntax racket/base))
(require racket/string racket/list racket/contract racket/vector)
(require "patterns.rkt" "exceptions.rkt" tagged-xexpr xml)
@ -14,15 +15,15 @@
;;; (also in the public domain)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(provide (contract-out
[hyphenate
((xexpr?) ((or/c char? string?) #:exceptions (listof exception-word?) #:min-length (or/c integer? #f)) . ->* . string?)])
(contract-out
[hyphenatef
((xexpr? procedure?) ((or/c char? string?) #:exceptions (listof exception-word?) #:min-length (or/c integer? #f)) . ->* . string?)])
(contract-out
[unhyphenate
((xexpr?) ((or/c char? string?)) . ->* . string?)]))
(define-syntax (define+provide/contract stx)
(syntax-case stx ()
[(_ (proc arg ... . rest-arg) contract body ...)
#'(define+provide/contract proc contract
(λ(arg ... . rest-arg) body ...))]
[(_ name contract body ...)
#'(begin
(provide (contract-out [name contract]))
(define name body ...))]))
;; global data, define now but set! them later (because they're potentially big & slow)
(define exceptions #f)
@ -157,10 +158,12 @@
(if (char? joiner) (format "~a" joiner) joiner))
;; Hyphenate using a filter procedure.
;; Theoretically possible to do this externally,
;; but it would just mean doing the regexp-replace twice.
(define (hyphenatef x proc [joiner default-joiner] #:exceptions [extra-exceptions '()] #:min-length [min-length default-min-length])
(define+provide/contract (hyphenatef x proc [joiner default-joiner]
#:exceptions [extra-exceptions '()]
#:min-length [min-length default-min-length])
((xexpr? procedure?) ((or/c char? string?)
#:exceptions (listof exception-word?)
#:min-length (or/c integer? #f)) . ->* . string?)
;; set up module data
;; todo?: change set! to parameterize
@ -181,10 +184,17 @@
;; Default hyphenate function.
(define (hyphenate x [joiner default-joiner] #:exceptions [extra-exceptions '()] #:min-length [min-length default-min-length])
(define+provide/contract (hyphenate x [joiner default-joiner]
#:exceptions [extra-exceptions '()]
#:min-length [min-length default-min-length])
((xexpr/c) ((or/c char? string?)
#:exceptions (listof exception-word?)
#:min-length (or/c integer? #f)) . ->* . string?)
(hyphenatef x (λ(x) #t) joiner #:exceptions extra-exceptions #:min-length min-length))
(define (unhyphenate x [joiner default-joiner])
(define+provide/contract (unhyphenate x [joiner default-joiner])
((xexpr/c) ((or/c char? string?)) . ->* . string?)
(define (remove-hyphens text)
(string-replace text (joiner->string joiner) ""))

@ -14,12 +14,12 @@ A simple hyphenation engine that uses the Knuth–Liang hyphenation algorithm or
I originally put together this module to handle hyphenation for my web-based book @link["http://practicaltypography.com"]{Butterick's Practical Typography} (which I made with Racket & Scribble). Though support for CSS-based hyphenation in web browsers is @link["http://caniuse.com/#search=hyphen"]{still iffy}, soft hyphens work reliably well. But putting them into the text manually is a drag. Thus a module was born.
@section{Installation & updates}
@section{Installation}
At the command line:
@verbatim{raco pkg install hyphenate}
After that, you can update the package from the command line:
After that, you can update the package like so:
@verbatim{raco pkg update hyphenate}
@ -27,15 +27,14 @@ After that, you can update the package from the command line:
@defmodule[hyphenate]
@defproc[
(hyphenate
[text string?]
[xexpr xexpr/c]
[joiner (or/c char? string?) (integer->char #x00AD)]
[#:exceptions exceptions (listof string?) empty]
[#:min-length length (or/c integer? false?) 5])
string?]
Hyphenate @racket[_text] by calculating hyphenation points and inserting @racket[_joiner] at those points. By default, @racket[_joiner] is the soft hyphen (Unicode 00AD = decimal 173). Words shorter than @racket[#:min-length] @racket[_length] will not be hyphenated. To hyphenate words of any length, use @racket[#:min-length] @racket[#f].
Hyphenate @racket[_xexpr] by calculating hyphenation points and inserting @racket[_joiner] at those points. By default, @racket[_joiner] is the soft hyphen (Unicode 00AD = decimal 173). Words shorter than @racket[#:min-length] @racket[_length] will not be hyphenated. To hyphenate words of any length, use @racket[#:min-length] @racket[#f].
@margin-note{The REPL displays a soft hyphen as \u00AD. But in ordinary use, you'll only see a soft hyphen when it appears at the end of a line or page as part of a hyphenated word. Otherwise it's not displayed. In most of the examples here, I use a standard hyphen for clarity.}
@ -80,18 +79,24 @@ For this reason, certain words can't be hyphenated algorithmically, because the
This is the right result. If you used @italic{adder} to mean the machine, it would be hyphenated @italic{add-er}; if you meant the snake, it would be @italic{ad-der}. Better to avoid hyphenation than to hyphenate incorrectly.
You can send HTML-style X-expressions through @racket[hyphenate]. It will recursively hyphenate the text strings, while leaving the tags and attributes alone.
@examples[#:eval my-eval
(hyphenate '(p "polymorphically" (em "formatted" (strong "snowmen"))))
]
Don't send raw HTML through @racket[hyphenate]. It can't distinguish HTML tags and attributes from textual content, so it will hyphenate everything, which will goof up your file.
Don't send raw HTML through @racket[hyphenate]. It can't distinguish HTML tags and attributes from textual content, so everything will be hyphenated, thus goofing up your file. But you can easily convert HTML to an X-expression, hyphenate it, and then convert back.
@examples[#:eval my-eval
(hyphenate "<body style=\"background: yellow\">Hello world</body>")
(define html "<body style=\"background: yellow\">Hello snowman</body>")
(hyphenate html)
(xexpr->string (hyphenate (string->xexpr html)))
]
Instead, send your textual content through @racket[hyphenate] @italic{before} you put it into your HTML template. Or convert your HTML to an X-expression and process it selectively (e.g., with @racket[match]).
@defproc[
(hyphenatef
[text string?]
[xexpr xexpr/c]
[pred procedure?]
[joiner (or/c char? string?) (integer->char \#x00AD)]
[#:exceptions exceptions (listof string?) empty]
@ -121,10 +126,10 @@ It's possible to do fancier kinds of hyphenation restrictions that take account
@defproc[
(unhyphenate
[text string?]
[xexpr xexpr/c]
[joiner (or/c char? string?) @(integer->char #x00AD)])
string?]
Remove @racket[_joiner] from @racket[_text] using @racket[string-replace].
Remove @racket[_joiner] from @racket[_xexpr].
A side effect of using @racket[hyphenate] is that soft hyphens (or whatever the @racket[_joiner] is) will be embedded in the output text. If you need to support copying of text, for instance in a GUI application, you'll probably want to strip out the hyphenation before the copied text is moved to the clipboard.
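
For readers following the first hunk, here is a minimal sketch of how the @racket[define+provide/contract] macro from this commit can be used on its own. The macro body is copied from the diff; the @racket[shout] function and its contract are hypothetical and exist only for illustration. Note that @racket[contract-out] attaches the contract at the module boundary, so calls made inside the defining module are not checked; the @racket[main] submodule below imports the enclosing module so that its calls go through the contract.

```racket
#lang racket/base
(require (for-syntax racket/base) racket/contract)

;; Macro as it appears in this commit: the first clause rewrites the
;; function-style header into a λ, the second emits provide + define.
(define-syntax (define+provide/contract stx)
  (syntax-case stx ()
    [(_ (proc arg ... . rest-arg) contract body ...)
     #'(define+provide/contract proc contract
         (λ (arg ... . rest-arg) body ...))]
    [(_ name contract body ...)
     #'(begin
         (provide (contract-out [name contract]))
         (define name body ...))]))

;; Hypothetical example function (not part of the hyphenate module):
;; one required string, one optional string, returns a string.
(define+provide/contract (shout text [suffix "!"])
  ((string?) (string?) . ->* . string?)
  (string-append text suffix))

;; The submodule requires the enclosing module, so these calls
;; cross the module boundary and are subject to the contract.
(module* main racket/base
  (require (submod ".."))
  (displayln (shout "hi"))
  (displayln (shout "hi" "?")))
```

The design choice here is to keep each contract next to the definition it guards, instead of maintaining a separate @racket[provide] block at the top of the file that can drift out of sync with the function signatures.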
