Approaches for rendering math statically #84

I'd like to render math entries statically during the build process, such that reading an article with math equations doesn't depend on the user enabling Javascript and MathJax (or KaTeX).

KaTeX already has facilities for this workflow, but it is written in JS. Unfortunately, I have not been able to get Racket and JS to interoperate properly (with the help of subprocess). Calling KaTeX once per formula would be the most direct way to go about it, but starting an entire node runtime for every single snippet of math would quickly degrade performance. Hence, I am trying to find alternatives to calling out to node for each formula.

The, to me, most promising one seems to be to collect all math formulas and then call out to node/KaTeX once to get the corresponding HTML elements, thus building up a hash map. I think it should be possible with pollen, but I am unsure of how to go about this. Are there any examples that do the same or something similar? I think something like bibliography management could be close enough.

I'm also curious about this. I assume that you'd have to parse the HTML output from Node/KaTeX back to X-expressions as well, since (as far as I know) it's not possible to specify raw HTML inside an X-expr.
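For reference, here is a minimal sketch of that parsing step using string->xexpr from Racket's xml library. The markup below is a made-up stand-in, not real KaTeX output, and string->xexpr expects well-formed XML with a single root element:

```racket
#lang racket
(require xml)

;; Made-up stand-in for the HTML string that node/KaTeX would print
(define html "<span class=\"katex\"><span class=\"mord\">x</span></span>")

;; Parse the markup into an X-expression that Pollen can splice into the doc
(define math-xexpr (string->xexpr html))
;; math-xexpr is '(span ((class "katex")) (span ((class "mord")) "x"))

;; Converting back to a string reproduces the original markup
(define round-trip (xexpr->string math-xexpr))
```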

> The, to me, most promising one seems to be to collect all math formulas and then call out to node/KaTeX once to get the corresponding HTML elements, thus building up a hash map. I think it should be possible with pollen, but I am unsure of how to go about this.

This seems like a logical approach. I’d go about it like this (all inside your root tag function):

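1. Fetch out all the formula tags using [splitf-txexpr](https://docs.racket-lang.org/txexpr/index.html#%28def._%28%28lib._txexpr%2Fmain..rkt%29._splitf-txexpr%29%29).
2. Call out to KaTeX to get your converted formulas.
3. Replace the formulas using [decode](https://docs.racket-lang.org/pollen/Decode.html#%28def._%28%28lib._pollen%2Fdecode..rkt%29._decode%29%29), passing a replacement function as #:txexpr-proc.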

The function provided for step 3 will only get one txexpr at a time, so it’s going to have to keep track of the formulas somehow. The simplest way to do this is to cheat and mutate some outer-scoped variable within that function. Here’s one way to do it:

#lang racket

(require xml txexpr pollen/decode) ; decode is provided by pollen/decode, not pollen/core

(define (formula? tx) (and (txexpr? tx) (equal? 'formula (get-tag tx))))

(define (root . elems)
  (define body `(body ,@elems))
  (define-values (_ pollen-formulas) (splitf-txexpr body formula?)) ; step 1

  ; step 2 (you’ll need to define get-katex-formulas; I’ll assume it returns a list of xexprs)
  (define converted-formulas (get-katex-formulas pollen-formulas)) 

  ; Step 3
  ; This function will “pop” the first value from converted-formulas
  (define (replace-formula tx)
    (cond [(equal? 'formula (get-tag tx))
           (begin0
             (first converted-formulas)
             (set! converted-formulas (rest converted-formulas)))]
          [else tx]))
  (decode body #:txexpr-proc replace-formula))

Keeping everything inside the root function has the advantage of being “safe” for multiple renders, since everything at the module level is shared across multiple Pollen docs if you are rendering more than one at a time.

That's a good way to approach it. I was thinking about creating a hash map from each formula to its transformed KaTeX xexpr. That way, the function passed to decode would only do a lookup in this dict, and you'd get an (insignificant) amount of caching.
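A rough sketch of that hash-map variant, under some assumptions: formula-key and fake-katex are hypothetical names, and fake-katex is only a placeholder for a real converter such as get-katex-formulas. Keying on the display flag plus the source text means identical formulas are converted once:

```racket
#lang racket
(require txexpr pollen/decode)

(define (formula? tx) (and (txexpr? tx) (eq? 'formula (get-tag tx))))

;; Hypothetical key: display flag plus the formula's source text
(define (formula-key tx)
  (list (attrs-have-key? tx 'display)
        (string-append* (filter string? (get-elements tx)))))

;; Placeholder for the real node/KaTeX call; returns one xexpr per formula
(define (fake-katex formulas)
  (for/list ([f (in-list formulas)])
    `(span ((class ,(if (attrs-have-key? f 'display) "math-display" "math-inline")))
           ,@(get-elements f))))

(define (root . elems)
  (define body `(body ,@elems))
  (define-values (_ formulas) (splitf-txexpr body formula?))
  ;; convert each distinct formula only once
  (define unique (remove-duplicates formulas #:key formula-key))
  (define table
    (for/hash ([f (in-list unique)]
               [converted (in-list (fake-katex unique))])
      (values (formula-key f) converted)))
  ;; the replacement step is now a pure lookup, with no mutation
  (define (replace-formula tx)
    (if (formula? tx) (hash-ref table (formula-key tx)) tx))
  (decode body #:txexpr-proc replace-formula))
```

Because the table is built inside root, it is still not shared between documents rendered in the same session.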

To fill in the last piece of the puzzle, I have implemented what Joel outlined; the code below basically completes step 2 above. You need node installed and the file katex.js.

The txexpr can carry an optional display attribute. If the attribute is present (whatever its value), the snippet is rendered in display mode; otherwise it is rendered in inline mode (default).

To circumvent the issues with pipes in subprocess (for some reason I was not able to get that to work), I have opted to just call out to system and use regular files for the JS instructions and for writing out the result.

(require xml txexpr)
;; katex math rendering: one node call for the whole list of formula txexprs
(define (get-katex-formulas math-list)
  ;; writes one console.log call per formula, so output line N belongs to formula N
  (define (format-katex tx out)
    ;; KaTeX's option for display mode is named displayMode
    (fprintf out "console.log(katex.renderToString(~s, {displayMode: ~a, throwOnError: false}));"
             ;; remove* drops every "\n" element (remove would drop only the first)
             (string-join (remove* '("\n") (get-elements tx)))
             (if (attrs-have-key? tx 'display)
                 "true"
                 "false")))
  ;; js file needs to be in the same dir as katex.js, because require
  ;; resolves './katex.js' relative to the script's own location
  (let ([jspath (make-temporary-file ".rkttmp.~a.js" #f (current-directory))]
        [outpath (make-temporary-file)])
    (call-with-output-file
      jspath
      #:exists 'truncate
      (lambda (out)
        ;; leading dot is necessary so it's interpreted as a local file
        (displayln "var katex = require('./katex.js');" out)
        (for-each (λ (s) (format-katex s out)) math-list)
        (displayln "process.exit();" out)))

    (system (string-join (list "node" (path->string jspath) ">" (path->string outpath))))
    ;; file->lines closes the file again, so no port is left open when it is deleted
    (define math-html (file->lines outpath))
    (begin0
        (map string->xexpr math-html)
      (delete-file jspath)
      (delete-file outpath))))
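
As a possible alternative to the shell redirect and the second temporary file, the program's output can be captured in a string port. This is only a sketch: run->lines and node->lines are hypothetical helper names, and it relies on system* sending the subprocess's stdout to the current output port. On Windows the executable name would probably have to be "node.exe".

```racket
#lang racket

;; Run exe with args and return what it printed to stdout, as a list of lines.
;; system* starts the program directly (no shell), so paths containing spaces
;; need no quoting.
(define (run->lines exe . args)
  (define ok? #f)
  (define output
    (with-output-to-string
      (λ () (set! ok? (apply system* exe args)))))
  (unless ok?
    (error 'run->lines "command failed: ~a" exe))
  (port->lines (open-input-string output)))

;; Hypothetical helper: run a generated script with node, one line per formula
(define (node->lines jspath)
  (define node (find-executable-path "node"))
  (unless node
    (error 'node->lines "node not found on PATH"))
  (run->lines node jspath))
```

With this in place, the system call and the file read could be replaced by something like (map string->xexpr (node->lines jspath)), leaving only the JS file to clean up.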

Thanks again for outlining the procedure, Joel!
