<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"/><title>Hyphenate</title><link rel="stylesheet" type="text/css" href="scribble.css" title="default"/><link rel="stylesheet" type="text/css" href="racket.css" title="default"/><link rel="stylesheet" type="text/css" href="manual-style.css" title="default"/><link rel="stylesheet" type="text/css" href="manual-racket.css" title="default"/><script type="text/javascript" src="scribble-common.js"></script><!--[if IE 6]><style type="text/css">.SIEHidden { overflow: hidden; }</style><![endif]--></head><body id="scribble-racket-lang-org"><div class="tocset"><div class="tocview"><div class="tocviewlist tocviewlisttopspace"><div class="tocviewtitle"><table cellspacing="0" cellpadding="0"><tr><td style="width: 1em;"><a href="javascript:void(0);" title="Expand/Collapse" class="tocviewtoggle" onclick="TocviewToggle(this,&quot;tocview_0&quot;);">&#9658;</a></td><td></td><td><a href="hyphenate.html" class="tocviewselflink" data-pltdoc="x">Hyphenate</a></td></tr></table></div><div class="tocviewsublistonly" style="display: none;" id="tocview_0"><table cellspacing="0" cellpadding="0"><tr><td align="right">1&nbsp;</td><td><a href="hyphenate.html#%28part._.How_to_use_it%29" class="tocviewlink" data-pltdoc="x">How to use it</a></td></tr><tr><td align="right">2&nbsp;</td><td><a href="hyphenate.html#%28part._.Interface%29" class="tocviewlink" data-pltdoc="x">Interface</a></td></tr></table></div></div></div><div class="tocsub"><table class="tocsublist" cellspacing="0"><tr><td><span class="tocsublinknumber"></span><a href="#%28part._.Hyphenate%29" class="tocsubseclink" data-pltdoc="x">Hyphenate</a></td></tr><tr><td><span class="tocsublinknumber">1<tt>&nbsp;</tt></span><a href="#%28part._.How_to_use_it%29" class="tocsubseclink" data-pltdoc="x">How to use it</a></td></tr><tr><td><span 
class="tocsublinknumber">2<tt>&nbsp;</tt></span><a href="#%28part._.Interface%29" class="tocsubseclink" data-pltdoc="x">Interface</a></td></tr><tr><td><a href="#%28def._%28%28lib._hyphenate%2Fmain..rkt%29._hyphenate%29%29" class="tocsubnonseclink" data-pltdoc="x"><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span></a></td></tr><tr><td><a href="#%28def._%28%28lib._hyphenate%2Fmain..rkt%29._hyphenatef%29%29" class="tocsubnonseclink" data-pltdoc="x"><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenatef</span></span></span></a></td></tr><tr><td><a href="#%28def._%28%28lib._hyphenate%2Fmain..rkt%29._unhyphenate%29%29" class="tocsubnonseclink" data-pltdoc="x"><span class="RktSym"><span class="badlink"><span class="RktValLink">unhyphenate</span></span></span></a></td></tr></table></div></div><div class="maincolumn"><div class="main"><div class="versionbox"><span class="versionNoNav">6.0.0.1</span></div><h2><a name="(part._.Hyphenate)"></a>Hyphenate</h2><div class="SAuthorListBox"><span class="SAuthorList"><p class="author">Matthew Butterick (mb@mbtype.com)</p></span></div><p><table cellspacing="0" class="defmodule"><tr><td align="left"><span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">require</span></span></span><span class="stt"> </span><font class="badlink"><span class="RktModLink"><span class="RktSym">hyphenate</span></span></font><span class="RktPn">)</span></td><td align="right"><span class="RpackageSpec"><span class="Smaller">&nbsp;package:</span> <span class="stt">hyphenate</span></span></td></tr></table></p><p>A simple hyphenation module that uses the Knuth&#8211;Liang hyphenation algorithm and patterns originally developed for TeX. 
This implementation was ported from Ned Batchelder&rsquo;s <a href="http://nedbatchelder.com/code/modules/hyphenate.html">Python version</a>.</p><p>I originally developed this module to handle hyphenation for my web-based book <a href="http://practicaltypography.com">Butterick&rsquo;s Practical Typography</a>. Even though support for CSS-based hyphenation is still iffy among web browsers, soft hyphens work reliably.</p><h3>1<tt>&nbsp;</tt><a name="(part._.How_to_use_it)"></a>How to use it</h3><h3>2<tt>&nbsp;</tt><a name="(part._.Interface)"></a>Interface</h3><p><div class="SIntrapara"><blockquote class="SVInsetFlow"><table cellspacing="0" class="boxed RBoxed"><tr><td><blockquote class="SubFlow"><div class="RBackgroundLabel SIEHidden"><div class="RBackgroundLabelInner"><p>procedure</p></div></div><table cellspacing="0" class="prototype RForeground"><tr><td><span class="RktPn">(</span><a name="(def._((lib._hyphenate/main..rkt)._hyphenate))"></a><span title="Provided from: hyphenate | Package: hyphenate"><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span></span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktVar">text</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span>[</td><td><span class="RktVar">joiner</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktPn">#:exceptions</span><span class="hspace">&nbsp;</span><span class="RktVar">exceptions</span></td><td><span class="hspace">&nbsp;</span></td><td><span 
class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktPn">#:min-length</span><span class="hspace">&nbsp;</span><span class="RktVar">length</span>]<span class="RktPn">)</span></td><td><span class="hspace">&nbsp;</span></td><td>&rarr;</td><td><span class="hspace">&nbsp;</span></td><td><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></td></tr></table></blockquote></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">text</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">joiner</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">or/c</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">char?</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">integer-&gt;char</span></span></span><span class="hspace">&nbsp;</span><span class="RktVal">173</span><span class="RktPn">)</span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">exceptions</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span 
class="RktValLink">listof</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">empty</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">length</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">or/c</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">integer?</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">false?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktVal">5</span></td></tr></table></blockquote></div><div class="SIntrapara">Hyphenate <span class="RktVar">text</span> by calculating hyphenation points and inserting <span class="RktVar">joiner</span> at those points. By default, <span class="RktVar">joiner</span> is the soft hyphen. Words shorter than <span class="RktVar">length</span> will not be hyphenated. To hyphenate words of any length, use <span class="RktPn">#:min-length</span> <span class="RktVal">#f</span>.</div></p><blockquote class="refpara"><blockquote class="refcolumn"><blockquote class="refcontent"><p>The REPL will display a soft hyphen as #\u00AD. But in ordinary use, you only see a soft hyphen when it appears at the end of a line or page as part of a hyphenated word. 
Otherwise it&rsquo;s invisible.</p></blockquote></blockquote></blockquote><p>Using the <span class="RktPn">#:exceptions</span> keyword, you can pass hyphenation exceptions as a list of words with regular hyphen characters (<span class="RktVal">"-"</span>) marking the permissible hyphenation points. If an exception word contains no hyphens, that word will never be hyphenated.</p><p><table cellspacing="0" class="RktBlk"><tr><td><p>Examples:</p></td></tr><tr><td><blockquote class="SCodeFlow"><table cellspacing="0" class="RktBlk"><tr><td><span class="stt">&gt; </span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span><span class="hspace">&nbsp;</span><span class="RktVal">"polymorphism"</span><span class="hspace">&nbsp;</span><span class="RktVal">#\-</span><span class="RktPn">)</span></td></tr><tr><td><p><span class="RktRes">"poly-mor-phism"</span></p></td></tr><tr><td><span class="stt">&gt; </span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span><span class="hspace">&nbsp;</span><span class="RktVal">"polymorphism"</span><span class="hspace">&nbsp;</span><span class="RktVal">#\-</span><span class="hspace">&nbsp;</span><span class="RktPn">#:exceptions</span><span class="hspace">&nbsp;</span><span class="RktVal">'</span><span class="RktVal">(</span><span class="RktVal">"polymo-rphism"</span><span class="RktVal">)</span><span class="RktPn">)</span></td></tr><tr><td><p><span class="RktRes">"polymo-rphism"</span></p></td></tr><tr><td><span class="stt">&gt; </span><span class="RktPn">(</span><span class="RktSym"><span 
class="badlink"><span class="RktValLink">hyphenate</span></span></span><span class="hspace">&nbsp;</span><span class="RktVal">"polymorphism"</span><span class="hspace">&nbsp;</span><span class="RktVal">#\-</span><span class="hspace">&nbsp;</span><span class="RktPn">#:exceptions</span><span class="hspace">&nbsp;</span><span class="RktVal">'</span><span class="RktVal">(</span><span class="RktVal">"polymorphism"</span><span class="RktVal">)</span><span class="RktPn">)</span></td></tr><tr><td><p><span class="RktRes">"polymorphism"</span></p></td></tr></table></blockquote></td></tr></table></p><p>Knuth &amp; Liang were sufficiently confident about their algorithm that they originally released it with only 14 exceptions: <span style="font-style: italic">associate[s], declination, obligatory, philanthropic, present[s], project[s], reciprocity, recognizance, reformation, retribution</span>, and <span style="font-style: italic">table</span>. While their bravado is admirable, it&rsquo;s easy to discover words they missed.</p><p>Don&rsquo;t send raw HTML through <span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span>. It can&rsquo;t distinguish HTML tags and attributes from textual content, so it will hyphenate them too, which will break the markup. Run your textual content through <span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span> before you put it into your page template. 
Or convert your HTML to an X-expression and process it selectively.</p><p><div class="SIntrapara"><blockquote class="SVInsetFlow"><table cellspacing="0" class="boxed RBoxed"><tr><td><blockquote class="SubFlow"><div class="RBackgroundLabel SIEHidden"><div class="RBackgroundLabelInner"><p>procedure</p></div></div><table cellspacing="0" class="prototype RForeground"><tr><td><span class="RktPn">(</span><a name="(def._((lib._hyphenate/main..rkt)._hyphenatef))"></a><span title="Provided from: hyphenate | Package: hyphenate"><span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenatef</span></span></span></span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktVar">text</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktVar">pred</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span>[</td><td><span class="RktVar">joiner</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="RktPn">#:exceptions</span><span class="hspace">&nbsp;</span><span class="RktVar">exceptions</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td><td><span class="hspace">&nbsp;</span></td></tr><tr><td><span class="hspace">&nbsp;</span></td><td><span 
class="hspace">&nbsp;</span></td><td><span class="RktPn">#:min-length</span><span class="hspace">&nbsp;</span><span class="RktVar">length</span>]<span class="RktPn">)</span></td><td><span class="hspace">&nbsp;</span></td><td>&rarr;</td><td><span class="hspace">&nbsp;</span></td><td><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></td></tr></table></blockquote></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">text</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">pred</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">procedure?</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">joiner</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">or/c</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">char?</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">integer-&gt;char</span></span></span><span class="hspace">&nbsp;</span><span class="RktVal">173</span><span class="RktPn">)</span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">exceptions</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span 
class="RktSym"><span class="badlink"><span class="RktValLink">listof</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">empty</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">length</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">or/c</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">integer?</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">false?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktVal">5</span></td></tr></table></blockquote></div><div class="SIntrapara">Like <span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span>, but only words matching <span class="RktVar">pred</span> are hyphenated. 
Convenient if you want to filter out, say, capitalized words.</div></p><p><div class="SIntrapara"><blockquote class="SVInsetFlow"><table cellspacing="0" class="boxed RBoxed"><tr><td><blockquote class="SubFlow"><div class="RBackgroundLabel SIEHidden"><div class="RBackgroundLabelInner"><p>procedure</p></div></div><p class="RForeground"><span class="RktPn">(</span><a name="(def._((lib._hyphenate/main..rkt)._unhyphenate))"></a><span title="Provided from: hyphenate | Package: hyphenate"><span class="RktSym"><span class="badlink"><span class="RktValLink">unhyphenate</span></span></span></span><span class="hspace">&nbsp;</span><span class="RktVar">text</span><span class="hspace">&nbsp;</span>[<span class="RktVar">joiner</span>]<span class="RktPn">)</span><span class="hspace">&nbsp;</span>&rarr;<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></p></blockquote></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">text</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span></td></tr><tr><td><span class="hspace">&nbsp;&nbsp;</span><span class="RktVar">joiner</span><span class="hspace">&nbsp;</span>:<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">or/c</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">char?</span></span></span><span class="hspace">&nbsp;</span><span class="RktSym"><span class="badlink"><span class="RktValLink">string?</span></span></span><span class="RktPn">)</span><span class="hspace">&nbsp;</span>=<span class="hspace">&nbsp;</span><span class="RktPn">(</span><span class="RktSym"><span class="badlink"><span class="RktValLink">integer-&gt;char</span></span></span><span 
class="hspace">&nbsp;</span><span class="RktVal">173</span><span class="RktPn">)</span></td></tr></table></blockquote></div><div class="SIntrapara">Remove every occurrence of <span class="RktVar">joiner</span> from <span class="RktVar">text</span>. Essentially equivalent to (<span class="RktSym"><span class="badlink"><span class="RktValLink">string-replace</span></span></span> <span class="RktVar">text</span> <span class="RktVar">joiner</span> "").</div></p><p>A side effect of using <span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span> is that soft hyphens (or whatever the <span class="RktVar">joiner</span> is) are embedded in the output <span class="RktVar">text</span>. If you&rsquo;re building an application that needs to support, for instance, copying of text in a graphical interface, you probably want to strip out the hyphenation before the copied text is moved to the clipboard.</p><p>Keep in mind, however, that <span class="RktSym"><span class="badlink"><span class="RktValLink">unhyphenate</span></span></span> won&rsquo;t produce the input originally passed to <span class="RktSym"><span class="badlink"><span class="RktValLink">hyphenate</span></span></span> if the <span class="RktVar">joiner</span> was already part of the original input <span class="RktVar">text</span>, because those occurrences are removed too.</p></div></div><div id="contextindicator">&nbsp;</div></body></html>