- Notifications
You must be signed in to change notification settings - Fork 14
Parse to AST and use printers to go from AST to specific targets #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Make parser produce an AST by default. - Switch to lexical binding and fix some broken var references. - Reorganizes reduce functions' signatures a bit.
All tests are separated now.
Thanks! I left some initial comments, but I'll have to have another look with a fresh mind. Could you look into why the tests are failing on Travis? |
Actually im going to start with writing up an overview of all the use cases this library aims to support, that should hopefully clarify some of the design decisions in there. |
It's failing because of |
Ah I see, no just leave it in. Either it'll get in soon or we bundle it here or we get the build script to clone it from github. On Jul 7, 2017 00:01, Daniel Barreto <notifications@github.com> wrote:It's failing because of a.el, I was hoping it was going to be approved =/. I could drop it though, or should we wait? —You are receiving this because you are subscribed to this thread.Reply to this email directly, view it on GitHub, or mute the thread. |
Cloning it for the repo seems like a good idea meanwhile. Let me know when you have that overview, should be useful. btw
Couldn't find them 😕 |
clj-parse-test.el Outdated
(with-temp-buffer | ||
(insert ,test-string) | ||
(goto-char 1) | ||
(should (equal (,parse-to-fn) ,expected))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of wrapping the whole ert-deftest
in a macro I'm thinking it might be better to only have a macro for the inner part, and still use regular ert-deftest
. I'm thinking of using the same approach for the integration tests in unrepl.el.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the unrepl-deftest
macro? It does include ert-deftest
in it, I was actually copying it when I did this macro 😅, although I agree it's better if we just do the internal part. I'll fix it.
clj-parse.el Outdated
(let ((stack nil)) | ||
| ||
(while (not (eq (clj-lex-token-type token) :eof)) | ||
(while (not (eq (clj-lex-token-type (setq token (clj-parse--next))) :eof)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser should not throw away whitespace, that's the job of the reducing functions (if they choose to do so). Otherwise the parser is no longer able to generate an AST that can round-trip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I left you a comment about it in the clj-parse--next
function definition. I was on the fence about this as well.
(let ((node (pop stack))) | ||
(funcall reduceN stack node coll)) | ||
;; Syntax error | ||
(error "Syntax Error")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser should not throw syntax errors. Instead it should continue parsing and return a partial result. You can inspect the result afterwards to find syntax errors, or raise them in the reducer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say "the reducer", do you mean the reduce1
/reduceN
functions? or the "to-elisp" / "to-clojure-string" functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, reduce1
and reduceN
, which maybe should be called reduce-leaf
and reduce-node
. Or even shift-leaf
and reduce-node
, since the first one will typically "shift" a single leaf node onto the stack, whereas the second one "reduces" a number of nodes into a single node.
you can think of parsing as three layers working together: the lexer, "the parser", "the reducer". The last part is pluggable to support different outputs.
to-clojure-string
would be "the printer"
Ugh this github review thing is confusing :( they should be there now. |
That was quick! thanks. |
clj-parse.el Outdated
(setq next (clj-lex-next)) | ||
(while (eq (clj-lex-token-type next) :whitespace) | ||
(setq next (clj-parse--next))) | ||
next) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this to simplify clj-parse--ast-reduce1
, and it's ok as long as we only use this parser to produce AST, but if we want to use the clj-parse--reduce
algorithm to produce CST (not entirely sure we would want to do that) then we can get rid of it and handle whitespaces when reducing leafs.
clj-parse.el Outdated
(:lparen (clj-parse--make-node :list subnodes)) | ||
(:lbracket (clj-parse--make-node :vector subnodes)) | ||
(:set (clj-parse--make-node :set subnodes)) | ||
(:lbrace (clj-parse--make-node :map subnodes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left maps as a collection of subnodes so that printers would transform them into whatever they need, but maybe we want kv alists in our AST instead?
clj-parse.el Outdated
(:lbracket (clj-parse--make-node :vector subnodes)) | ||
(:set (clj-parse--make-node :set subnodes)) | ||
(:lbrace (clj-parse--make-node :map subnodes)) | ||
(:discard (clj-parse--make-node :discard subnodes))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For :discard
, the subnodes
list only has one element, always. Right now the resulting node for a discard token includes this 1-sized list as the list of subnodes, but we could also get rid of the list and just add the its internal node (the discarded element). I'm guessing this depends on implementation details for the AST transversing funtions/API.
Same for :tag
nodes.
clj-parse-test.el Outdated
(with-temp-buffer | ||
(insert ,test-string) | ||
(goto-char 1) | ||
(should (equal (,parse-to-fn) ,expected))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the unrepl-deftest
macro? It does include ert-deftest
in it, I was actually copying it when I did this macro 😅, although I agree it's better if we just do the internal part. I'll fix it.
clj-parse.el Outdated
(let ((stack nil)) | ||
| ||
(while (not (eq (clj-lex-token-type token) :eof)) | ||
(while (not (eq (clj-lex-token-type (setq token (clj-parse--next))) :eof)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I left you a comment about it in the clj-parse--next
function definition. I was on the fence about this as well.
(let ((node (pop stack))) | ||
(funcall reduceN stack node coll)) | ||
;; Syntax error | ||
(error "Syntax Error")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say "the reducer", do you mean the reduce1
/reduceN
functions? or the "to-elisp" / "to-clojure-string" functions?
clj-parse-test.el Outdated
(ert-deftest clj-parse-to-elisp-simple-list () | ||
(clj-parse-eq-test clj-parse-to-elisp | ||
"(1 2 3)" | ||
'((1 2 3)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I don't know if this is the best way to indent this.
Turns out I didn't send my review either...... it is confusing. |
@plexus I did some adjustments regarding the DESIGN.md document to help land this PR into master and keep working on the rest of the implementation in further PRs. So far we have:
I rewrote tests again, and this time I went with including Regarding
|
When unsure about how to represent things like
That said discards are like whitespace or comments: they don't add any semantics and so should not be represented in the AST mode. In the whitespace aware source mode they should be present. |
This refactor includes the following:
--
convention.It also includes two printers for the AST, one that goes to elisp code (ala edn.el) and one that returns EDN/Clojure strings.