Unexpected behavior in from/stop-before
#26
Closed
opened 3 years ago by tgbugs
·
7 comments
The code block below illustrates two issues with `from/stop-before` (implementation seen here: 6c161ae31d/brag-lib/brag/support.rkt (L115-L121)).

The primary issue is that `from/stop-before` only stops before the last element in a `:seq` and does not correctly back up to before the first element of the `stop-before` pattern.

There is a secondary issue, possibly a matter of documentation: `from/stop-before` will match eof to `stop-before`.

I'm fairly certain that the second issue is due to the behavior of `complement`, which is used in the implementation. I am not sure about the underlying cause of the first issue.
Output
I think that the implementation of `complement` is in 32fc3b68d1/br-parser-tools-lib/br-parser-tools/private-lex/stx.rkt (L84-L88).

edit: Further exploration of the expanded form suggests that the issue may be in the definition of `lexer-body` here: 32fc3b68d1/br-parser-tools-lib/br-parser-tools/lex.rkt (L215-L283). If this is ultimately an issue in br-parser-tools, let me know if I should create a new issue there.

aside: The expansion of `lexer` includes an unused reference to `lexeme-srcloc-p` that bloats the expanded form and probably slows down compile times.

As you can see,
`from/stop-before` is a dumb little lexer macro. So the question is whether you can get the matching behavior you want without using `from/stop-before`:

If so, then that should become the new implementation for `from/stop-before`.

If not, then it’s just a limitation of the underlying pattern matchers in `br-parser-tools` (which is basically just a fork of `parser-tools`). In that case, the only fix is to update the documentation to describe the limitation.

As I suspected. I'll dig around to see if I can find a way to implement the behavior I'm looking for and report back. If there is a lexer form that can do limited lookahead it should be possible.
As suspected, there isn't a way to fix this using lexer abbreviations alone, so the documentation of the behavior should be updated.
However, there is a way to fix the issue on the action side by using `file-position` to set the port to the correct position. The solution resolves both issues, but with some trade-offs, and I do not think that it is sufficiently general for inclusion in `brag/support`, though maybe with a few tweaks it could be?

For my narrow use-case `find-last` works because I only need to search backward to the last newline in the lexeme; however, more complicated stop-before patterns would likely require running something like `regexp-match-positions` for the stop pattern over the lexeme.

OK. I will update the docs. I think
`token-stop-before` would make more sense as an add-on or patch for the underlying `parser-tools` package, because that’s the primary residence of those lexer matchers.

Closed by 243f32a7bb
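For later readers, the action-side workaround discussed above can be modelled outside Racket. The following is only a rough Python sketch, not the code from this thread: `read_stop_before` is a hypothetical name, `seek`/`tell` on a byte stream stand in for `file-position` on a port, and a regex search over the lexeme stands in for `regexp-match-positions`. It assumes the lexer has already over-matched, so the lexeme contains the stop pattern.

```python
import io
import re


def read_stop_before(port, overmatched_len, stop_pattern):
    """Consume an over-matched lexeme, then rewind the port to the stop pattern.

    port            -- a seekable byte stream (stand-in for a Racket input port)
    overmatched_len -- number of bytes the lexer matched, stop pattern included
    stop_pattern    -- compiled bytes regex for the stop pattern
    """
    start = port.tell()
    lexeme = port.read(overmatched_len)
    match = stop_pattern.search(lexeme)
    if match is None:
        # No stop pattern in the lexeme (for example, the match ran to eof):
        # nothing to rewind, so return the lexeme unchanged.
        return lexeme
    # Put the port back at the first byte of the stop pattern so that the
    # next token starts there, and trim the lexeme to match.
    port.seek(start + match.start())
    return lexeme[:match.start()]


port = io.BytesIO(b"a 1 2\n\nb")
token = read_stop_before(port, 7, re.compile(rb"\n\n"))
rest = port.read()
print(token, rest)
```

The trade-off is the same one noted above: the lexeme has to be rescanned in the action, and the port must support repositioning.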
Just a note in case anyone comes across this in the future. `from/stop-before` can be used with `:seq` or any pattern that can match more than a single literal token, but it only ever stops before the final element of the pattern.
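To make that behavior concrete, here is a small Python model rather than Racket code. It assumes the stop condition is expressed as the complement of "any string that contains the stop pattern" and that the lexer takes the longest match. Both are assumptions drawn from the discussion above, and the function names are hypothetical.

```python
def complement_match_end(text: str, start: int, stop: str) -> int:
    """Return the end of the longest text[start:end] that does not contain stop."""
    best = start
    for end in range(start, len(text) + 1):
        if stop in text[start:end]:
            # Every longer candidate also contains stop, so give up here.
            break
        best = end
    return best


def from_stop_before(text: str, opener: str, stop: str) -> str:
    """Model of the lexeme produced by opener followed by the complement match."""
    if not text.startswith(opener):
        raise ValueError("input does not begin with the opener")
    return text[:complement_match_end(text, len(opener), stop)]


# The stop pattern is two newlines, but the match keeps the first of them:
print(repr(from_stop_before("a 1 2\n\nb", "a", "\n\n")))  # 'a 1 2\n'
# With no stop pattern in the input, the match runs to the end of input:
print(repr(from_stop_before("a 1 2", "a", "\n\n")))  # 'a 1 2'
```

In this model the longest string that does not contain the stop pattern may still end with all but the last element of it, which is why the match stops before the final element only, and why reaching eof without seeing the stop pattern still counts as a match.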