Better method for cross-page interactions? #69

Open
opened 4 years ago by miakramer · 3 comments
miakramer commented 4 years ago (Migrated from github.com)

I'm working on a book that's heavily cross-referenced, and I'm using LaTeX-style references: you put a ◊label{some-name} and can refer to it with ◊ref{some-name}, which will put some human-understandable text like "Subsection 3.2.1" or "Table 6.2" as the link text. I have a working system, but it requires a separate compilation step (which I called assemble.rkt). Essentially, it has two modes: a quick iteration mode using the built-in server that doesn't touch anything cross-page, and a 'full' mode that compiles all pages at once.

I did it this way because it doesn't seem like metas can be modified from within pollen.rkt; I assume that's because they're an immutable hash table that's worked out before the rest of the execution. If I only used pollen.rkt, each page would have to load the other pages, which quickly becomes an O(n^2) problem, since searching for labels requires traversing the document tree. So I made a file that loads all of the pages first and then resolves the references. It works pretty well, but I was wondering if there's something obvious I've missed that would make this work better? I'm also using it for citations from a BibTeX database (which auto-generates a bibliography page) and for generating a global table of contents. assemble.rkt works pretty well, actually, but it feels a little inelegant, since I essentially leave 'marker' elements in pollen.rkt that are only actually processed later.
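For readers following along, the "load everything first, then resolve" approach can be sketched roughly as below. This is not the actual assemble.rkt; it is a minimal illustration that assumes pages are plain X-expressions without attribute lists, and the tag names `label` and `ref`, the sample pages, and the helper names `collect-labels!` and `resolve-refs` are all made up for the example. The point is that one pass builds a hash table of labels, so each reference afterwards costs a single lookup instead of a walk over every other page.

```racket
#lang racket/base

;; Hypothetical sample data: (page-name . x-expression) pairs.
(define pages
  (list (cons "intro.html"
              '(root (label "sec-intro" "Section 1") (p "Hello")))
        (cons "tables.html"
              '(root (p "See " (ref "sec-intro"))
                     (label "tab-6-2" "Table 6.2")))))

;; Pass 1: walk each page once, recording label-name -> (page . link-text).
(define (collect-labels! x page table)
  (when (pair? x)
    (cond
      [(and (eq? (car x) 'label) (= (length x) 3))
       (hash-set! table (cadr x) (cons page (caddr x)))]
      [else
       (for ([child (in-list (cdr x))])
         (collect-labels! child page table))])))

(define label-table (make-hash))
(for ([p (in-list pages)])
  (collect-labels! (cdr p) (car p) label-table))

;; Pass 2: replace each (ref name) with a link, using one hash lookup per ref.
(define (resolve-refs x)
  (cond
    [(and (pair? x) (eq? (car x) 'ref))
     (define target (hash-ref label-table (cadr x) #f))
     (if target
         `(a ((href ,(string-append (car target) "#" (cadr x))))
             ,(cdr target))
         `(span ((class "unresolved-ref")) ,(cadr x)))]
    [(pair? x) (cons (car x) (map resolve-refs (cdr x)))]
    [else x]))

(define resolved
  (for/list ([p (in-list pages)])
    (cons (car p) (resolve-refs (cdr p)))))
```

With the table built up front, the total work is one traversal per page for collection plus one per page for resolution, rather than every page traversing every other page.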

otherjoel commented 4 years ago (Migrated from github.com)

You can update the metas from within pollen.rkt. You can't use define-meta but you can write your own function to do the same thing:

(define (set-meta key val)
  (define metas (current-metas))
  ;; current-metas can be #f, so only update when metas exist
  (when metas
    (current-metas (hash-set metas key val))))

current-metas is a parameter (https://beautifulracket.com/explainer/parameters.html), so you can either retrieve its value using the zero-argument form, or change it using the one-argument form.

Metas added in this way will be included in the cached metas for each source.

My understanding is that this didn't use to be possible, but now is.
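To make the zero-argument and one-argument forms concrete, here is a small sketch that runs without Pollen. It assumes a stand-in parameter created with make-parameter in place of Pollen's real current-metas, and the keys `labels` and `title` are invented for the example; in an actual project you would use the parameter Pollen provides instead of defining your own.

```racket
#lang racket/base

;; Stand-in for Pollen's current-metas, so this runs on its own (assumption).
(define current-metas (make-parameter (hash)))

(define (set-meta key val)
  (define metas (current-metas))               ; zero-argument form: read
  (when metas
    (current-metas (hash-set metas key val)))) ; one-argument form: replace

;; Hypothetical keys, e.g. recording the labels defined on this page.
(set-meta 'labels '("sec-intro" "tab-6-2"))
(set-meta 'title "Introduction")

(define title (hash-ref (current-metas) 'title))
(define labels (hash-ref (current-metas) 'labels))
```

Because hash-set returns a new immutable hash rather than mutating the old one, the metas stay immutable; only the parameter's current value is swapped.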

otherjoel commented 4 years ago (Migrated from github.com)

In terms of walking over lots of Pollen documents, I've found you can get some decent speed improvements by collecting metas/docs into vectors (https://docs.racket-lang.org/guide/vectors.html?q=parameters) rather than lists.

Something along the lines of:

(define (get-all-metas page-list)
  (for/vector ([page (in-list page-list)])
    (get-metas page)))

With this function (https://thelocalyarn.com/code/artifact?udc=1&ln=28-52&name=7c40c65350955235), which uses all the tricks, I got a complete insert of info for 48 Pollen documents into a SQLite database down to about 1.6 seconds on my 2015 laptop:

  • Separate thread for each file
  • Collect into a vector instead of a list
  • Put all INSERTs into a SQL transaction

As you can see, I'm doing some pretty heavy post-pollen.rkt processing of both docs and metas in that file. I'm going to rewrite it so that everything I need is in the metas though.
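The three tricks together might look something like the sketch below. This is not the linked function; it is a simplified illustration that assumes Racket's db library with an in-memory SQLite database, and `load-metas`, the `pages` table, and the sample page names are hypothetical stand-ins (real code would call Pollen's get-metas on actual source paths).

```racket
#lang racket/base
(require db)

;; Hypothetical stand-in for loading one page's metas.
(define (load-metas page)
  (hash 'page page 'title (string-append "Title of " page)))

(define pages '("one.html" "two.html" "three.html"))

;; Separate thread per file, each filling its own slot of a vector.
(define results (make-vector (length pages) #f))
(define workers
  (for/list ([page (in-list pages)]
             [i (in-naturals)])
    (thread (lambda () (vector-set! results i (load-metas page))))))
(for-each thread-wait workers)

;; In-memory SQLite database (assumption; a real project would use a file).
(define conn (sqlite3-connect #:database 'memory))
(query-exec conn "CREATE TABLE pages (path TEXT PRIMARY KEY, title TEXT)")

;; All INSERTs go into a single transaction.
(call-with-transaction
 conn
 (lambda ()
   (for ([m (in-vector results)])
     (query-exec conn
                 "INSERT INTO pages (path, title) VALUES (?, ?)"
                 (hash-ref m 'page)
                 (hash-ref m 'title)))))

(define row-count (query-value conn "SELECT count(*) FROM pages"))
```

Wrapping the inserts in one transaction avoids committing after every row, which is usually where most of the time goes with SQLite.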

miakramer commented 4 years ago (Migrated from github.com)

Thanks for the tips!

Reference: mbutterick/pollen-users#69