Debugging performance
#36
Open
opened 5 years ago by otherjoel
·
10 comments
Loading…
Reference in New Issue
There is no content yet.
Delete Branch '%!s(<nil>)'
Deleting a branch is permanent. It CANNOT be undone. Continue?
At some point in one of my projects in the past month or so, the total time to build the site increased by about 30 seconds. Specifically, when doing a parallel render (and this is after preheating the cache) the first batch of pages render in ≈3sec each, and the rest render in ≈1.5–3 sec each. I can’t be sure if it was a change in my code, or the new Racket version, or (less likely) a change in Pollen. I upgraded to Racket 7.6 but have stuck with the non-CS variant.
This has brought me back to the question: what is a good way to investigate performance in a Pollen project? I am trying to learn how to use DrRacket’s profiling tools, but not getting very far; there isn’t much documentation about them, and the most “interesting” functions listed in the profiler are designated
<< unknown >>
. I’ve so far tried using the profiler on a.poly.pm
source file, and on a separate.rkt
that callsrender-to-file
on that source file.Additionally, it would be nice to see messages when there is a caching miss, so I can rule that out. Maybe another command line switch?
I’ve never used the DrRacket profiler. If I notice a large drop in performance, I usually just reset the branch in git (either of the project, or Pollen itself) until I find the commit that caused the slowdown, and then investigate further.
Sometimes there is a bug. Sometimes a performance drop is a consequence of fixing another bug. For the parallel-rendering system generally, there’s definitely been a tradeoff between getting it to work reliably, and going fast (obviously, when it skips work it ought to be doing, it’s faster).
I’m sure there are plenty of dumb ideas still lingering in the code. It’s harder to test parallel processing that other parts of the system, because it introduces a new class of erratic / occasional errors. The biggest annoyances come from managing access to shared resources (e.g., a file cache) by multiple parallel operations.
I think the better way to do this is to rely on the existing logging infrastructure. For instance, standard Pollen messages are logged at level
info
. We could start emitting other messages atdebug
level (like cache miss). Then you could start your session like so:PLTSTDERR=debug@pollen raco pollen start
And the additional
debug
messages would be shown.I am pretty sure the cause of the behaviour I’m seeing is in my code and not in Pollen. It's a complicated project and I’ve made a lot of changes in the last several weeks.
Doing Advent of Code these last couple of years has been very helpful in getting a sense for where there can be big performance hits and gains in Racket overall. But I'm still hazy on where these optimization soft spots are in a Pollen project, particularly where templates are involved. For example, when a module is only needed within templates, I have a notion (which I have yet to test) that it's faster to
require
and re-provide
a module from withinpollen.rkt
than tolocal-require
it from within a template.I like that idea as well!
I’ve pushed an update that allows you to start Pollen commands in
debug
logging mode using thePLTSTDERR
environment variable, for instance:PLTSTDERR=debug@pollen raco pollen start
PLTSTDERR=debug@pollen raco pollen render
For now the only function that emits debug-level log messages is the file-cache manager (which will report every attempt to cache a file, and whether there is a cache miss). If there are other Pollen events you’re interested in seeing logged under
debug
, let me know.More broadly, the
message-debug
function inpollen/private/log
is what publishes the debug messages. I suppose it would be possible to use this within your own project code to emit debug messages. Though I think you’d probably want to create a separately named logger, so you could set individual log levels, e.g. —PLTSTDERR="debug@joel-project info@pollen" raco pollen start
Coda to my original post: I found that adding
raco make pollen.rkt
as a step prior toraco pollen setup -p
andraco pollen render -p
reduced the build time back down to where it was, erasing the 30-second slowdown I’d seen.Note that the scenarios I’m comparing all involve essentially
raco pollen reset
andrm *.html
as preparatory steps. So no cache or pre-existing output files.I suspect that, when starting with no cache,
raco pollen setup -p
starts up parallel processes that are each attempting to compilepollen.rkt
and its dependencies. And in my project at least, this causes a lot of infighting between places.Also for what it’s worth, the from-scratch build times on this project when using Racket 7.6 (non-CS) were 97 seconds without the
raco make
step and 67 seconds with it. When using the recently-released Racket 7.7 (CS variant) the difference was 157 seconds without and 97 seconds with (yikes).Thanks for that clue — though the Pollen caching system will convert necessary
rkt
files to bytecode (same asraco make
), it looks like I had set it up with the wrong compiler function so it was recompiling those files more often than it needed to. I have pushed a fix for that.As for the difference in Racket BC vs CS, CS is definitely slower (for now, though we can expect it to improve). One area of especially degenerate performance however is parallel processing — I found that for now, CS performance degrades on more than four cores. So I would recommend limiting
raco pollen setup
andrender
to four cores with the-j 4
switch.After updating Pollen, a fresh build of my project is a few seconds faster with my manual
raco make
step (64 seconds) but significantly slower without it (151 seconds)! ¯_(ツ)_/¯My 2015 MacBook has only four cores, so for me
-p
ought to be equivalent to-j 4
. But I do notice that on the latest Racket CS, telling it to use 4 cores still doesn’t actually max out all four cores like it does on Racket BC. I’ll definitely be sticking with BC for now.Strange. I’ll look into this more. The only file you’re passing to
raco make
ispollen.rkt
?Yes…if you’re curious, here is the line in my
makefile
. The process I’m testing is amake zap
followed bymake web
. If I manually do justraco make pollen.rkt
, I get.dep
and.zo
files inside ofcompiled/
forpollen.rkt
and all the files it requires.