Parse to AST and use printers to go from AST to specific targets #1

volrath · 2017-07-05T18:30:51Z

This refactor includes the following:

Make parser produce an AST by default.
Switch to lexical binding and fix some broken var references.
Reorganize reduce functions a bit.
Rename some private functions to follow the -- convention.
Separate all test cases.

It also includes two printers for the AST: one that produces elisp code (à la edn.el) and one that returns EDN/Clojure strings.


plexus · 2017-07-06T21:32:01Z

Thanks! I left some initial comments, but I'll have to have another look with a fresh mind.

Could you look into why the tests are failing on Travis?

plexus · 2017-07-06T21:51:27Z

Actually, I'm going to start by writing up an overview of all the use cases this library aims to support. That should hopefully clarify some of the design decisions in there.

volrath · 2017-07-06T22:01:04Z

It's failing because of a.el, I was hoping it was going to be approved by now =/. I could drop it though, or should we wait?

lambdaisland · 2017-07-06T22:14:29Z

Ah I see, no just leave it in. Either it'll get in soon or we bundle it here or we get the build script to clone it from github.

volrath · 2017-07-07T10:37:17Z

Cloning it from the repo seems like a good idea in the meantime. Let me know when you have that overview, it should be useful.

btw,

"Thanks! I left some initial comments,"

Couldn't find them 😕

plexus · 2017-07-06T21:18:08Z

clj-parse-test.el

+ (with-temp-buffer
+ (insert ,test-string)
+ (goto-char 1)
+ (should (equal (,parse-to-fn) ,expected)))))


Instead of wrapping the whole ert-deftest in a macro I'm thinking it might be better to only have a macro for the inner part, and still use regular ert-deftest. I'm thinking of using the same approach for the integration tests in unrepl.el.

volrath · 2017-07-07T10:47:19Z

Do you mean the unrepl-deftest macro? It does include ert-deftest in it, I was actually copying it when I did this macro 😅, although I agree it's better if we just do the internal part. I'll fix it.
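A minimal sketch of the "macro for the inner part only" idea from this thread, so the test itself stays a regular ert-deftest. The names `demo-inner-read-form` and `demo-inner-should-parse` are hypothetical, and the stand-in parse function simply calls `read` so the block runs on its own; the actual clj-parse functions may look different.

```elisp
(require 'ert)

(defun demo-inner-read-form ()
  "Hypothetical stand-in for a parse function: read one form at point."
  (read (current-buffer)))

(defmacro demo-inner-should-parse (parse-fn input expected)
  "Assert that PARSE-FN, called at the start of a buffer holding INPUT,
returns EXPECTED."
  `(with-temp-buffer
     (insert ,input)
     (goto-char (point-min))
     (should (equal (,parse-fn) ,expected))))

;; The test definition itself is a plain `ert-deftest'; only the
;; buffer setup and the assertion live in the macro.
(ert-deftest demo-inner-simple-list ()
  (demo-inner-should-parse demo-inner-read-form "(1 2 3)" '(1 2 3)))
```

Keeping `ert-deftest` visible means the usual ERT tooling (test names, docstrings, `:tags`) works without the macro having to forward any of it.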

plexus · 2017-07-06T21:22:52Z

clj-parse.el

+ (let ((stack nil))

- (while (not (eq (clj-lex-token-type token) :eof))
+ (while (not (eq (clj-lex-token-type (setq token (clj-parse--next))) :eof))


The parser should not throw away whitespace, that's the job of the reducing functions (if they choose to do so). Otherwise the parser is no longer able to generate an AST that can round-trip.

volrath · 2017-07-07T10:49:46Z

Alright, I left you a comment about it in the clj-parse--next function definition. I was on the fence about this as well.

plexus · 2017-07-06T21:27:18Z

clj-parse.el

+ (let ((node (pop stack)))
+ (funcall reduceN stack node coll))
+ ;; Syntax error
+ (error "Syntax Error"))))


The parser should not throw syntax errors. Instead it should continue parsing and return a partial result. You can inspect the result afterwards to find syntax errors, or raise them in the reducer.

volrath · 2017-07-07T10:53:10Z

When you say "the reducer", do you mean the reduce1/reduceN functions? Or the "to-elisp" / "to-clojure-string" functions?

plexus · 2017-07-12

Yes, reduce1 and reduceN, which maybe should be called reduce-leaf and reduce-node. Or even shift-leaf and reduce-node, since the first one will typically "shift" a single leaf node onto the stack, whereas the second one "reduces" a number of nodes into a single node.

You can think of parsing as three layers working together: the lexer, "the parser", and "the reducer". The last part is pluggable to support different outputs.

to-clojure-string would be "the printer".
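To make the layering concrete, here is a toy sketch of a parser loop with pluggable shift-leaf and reduce-node functions. It is not the clj-parse implementation: every `demo-sr-` name is hypothetical, tokens are assumed to be simple `(TYPE . TEXT)` pairs written out by hand instead of coming from the lexer, and only parentheses, numbers, and whitespace are handled. The loop never drops whitespace and never signals an error; both decisions are left to the reducer or to whoever inspects the result.

```elisp
(defun demo-sr-parse (tokens shift-leaf reduce-node)
  "Toy shift-reduce loop over TOKENS, a list of (TYPE . TEXT) pairs.
SHIFT-LEAF is called with (STACK TOKEN); REDUCE-NODE is called with
\(STACK OPENER CHILDREN).  Both return the new stack, newest item first.
Return the final stack in source order without ever signalling an error."
  (let ((stack nil))
    (dolist (token tokens)
      (pcase (car token)
        ;; Openers are pushed untouched so a later closer can find them.
        (:lparen (push token stack))
        (:rparen
         (let ((children nil))
           ;; Collect everything back to the nearest unmatched opener.
           (while (and stack (not (eq (car-safe (car stack)) :lparen)))
             (push (pop stack) children))
           (if stack
               (setq stack (funcall reduce-node
                                    (cdr stack) (car stack) children))
             ;; Stray closer: restore what we popped and keep the token.
             (setq stack (cons token (reverse children))))))
        ;; Every other token, whitespace included, goes to the reducer.
        (_ (setq stack (funcall shift-leaf stack token)))))
    (reverse stack)))

;; Reducer 1: build Elisp values and choose to drop whitespace.
(defun demo-sr-values-shift-leaf (stack token)
  (pcase (car token)
    (:whitespace stack)
    (:number (cons (string-to-number (cdr token)) stack))
    (_ (cons token stack))))

(defun demo-sr-values-reduce-node (stack _opener children)
  (cons children stack))

;; Reducer 2: keep every token's text so the source round-trips.
(defun demo-sr-source-shift-leaf (stack token)
  (cons (cdr token) stack))

(defun demo-sr-source-reduce-node (stack opener children)
  (cons (concat (cdr opener) (apply #'concat children) ")") stack))

;; Hand-written tokens for the source text "(1 (2 3))".
(defvar demo-sr-tokens
  '((:lparen . "(") (:number . "1") (:whitespace . " ")
    (:lparen . "(") (:number . "2") (:whitespace . " ") (:number . "3")
    (:rparen . ")") (:rparen . ")")))
```

With the value reducer, `(demo-sr-parse demo-sr-tokens #'demo-sr-values-shift-leaf #'demo-sr-values-reduce-node)` returns `((1 (2 3)))`; with the source reducer the same loop returns `("(1 (2 3))")`, which is only possible because the loop itself kept the whitespace tokens. For truncated input such as `(1`, the unmatched opener token is simply left in the result, where a caller or a stricter reducer can detect it.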

plexus · 2017-07-07T10:38:29Z

Ugh this github review thing is confusing :( they should be there now.

volrath · 2017-07-07T10:40:29Z

That was quick! thanks.

volrath · 2017-07-05T18:34:39Z

clj-parse.el

+ (setq next (clj-lex-next))
+ (while (eq (clj-lex-token-type next) :whitespace)
+ (setq next (clj-parse--next)))
+ next)


I made this to simplify clj-parse--ast-reduce1, and it's ok as long as we only use this parser to produce an AST. If we want to use the clj-parse--reduce algorithm to produce a CST (not entirely sure we would want to do that), then we can get rid of it and handle whitespace when reducing leaves.

volrath · 2017-07-05T18:36:50Z

clj-parse.el

+ (:lparen (clj-parse--make-node :list subnodes))
+ (:lbracket (clj-parse--make-node :vector subnodes))
+ (:set (clj-parse--make-node :set subnodes))
+ (:lbrace (clj-parse--make-node :map subnodes))


I left maps as a collection of subnodes so that printers would transform them into whatever they need, but maybe we want kv alists in our AST instead?
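For illustration, this is roughly what the second option would involve: turning the flat list of map subnodes into key/value pairs. The helper name `demo-kv-pair-up` is hypothetical and plain Elisp values stand in for AST nodes here.

```elisp
(defun demo-kv-pair-up (children)
  "Group CHILDREN (K1 V1 K2 V2 ...) into an alist ((K1 . V1) (K2 . V2) ...).
If CHILDREN has an odd length, the last key is paired with nil."
  (let ((pairs nil))
    (while children
      (push (cons (car children) (cadr children)) pairs)
      (setq children (cddr children)))
    (nreverse pairs)))
```

For example, `(demo-kv-pair-up '(:a 1 :b 2))` returns `((:a . 1) (:b . 2))`. Since a printer can always run a step like this itself, keeping the flat list in the AST loses nothing, whereas pairing early would hide an odd number of forms inside the map.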

volrath · 2017-07-06T08:37:53Z

clj-parse.el

+ (:lbracket (clj-parse--make-node :vector subnodes))
+ (:set (clj-parse--make-node :set subnodes))
+ (:lbrace (clj-parse--make-node :map subnodes))
+ (:discard (clj-parse--make-node :discard subnodes)))


For :discard, the subnodes list always has exactly one element. Right now the resulting node for a discard token includes this one-element list as its list of subnodes, but we could also get rid of the list and just store its inner node (the discarded element) directly. I'm guessing this depends on implementation details of the AST traversing functions/API.

Same for :tag nodes.

volrath · 2017-07-07T11:13:06Z

clj-parse-test.el

+(ert-deftest clj-parse-to-elisp-simple-list ()
+ (clj-parse-eq-test clj-parse-to-elisp
+ "(1 2 3)"
+ '((1 2 3))))


tbh I don't know if this is the best way to indent this.

volrath · 2017-07-07T11:14:25Z

Turns out I didn't send my review either...... it is confusing.

volrath · 2017-07-09T10:55:11Z

@plexus I made some adjustments based on the DESIGN.md document to help land this PR in master and keep working on the rest of the implementation in further PRs. So far we have:

clj-parse-ast
clj-parse-ast-print
clj-parse-edn

I rewrote the tests again, and this time I went with wrapping ert-deftest in a macro (similar to unrepl-deftest) because I wanted to generate differently named tests for each parsing mode.

Regarding :discard nodes: before, I was adding them to the AST like any other node; now I'm completely ignoring them in the AST construction (similar to the elisp reducers) because, tbh, I didn't find an elegant way to handle the case where there are two #_ in a row. "(1 #_#_ 2 3)" should produce (1), but if we want to represent that in an AST, we would have to have something like:

- :root - :list - :number 1 - :discard - :number 2 - :discard - :number 3
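On the test macro mentioned earlier in this comment: a rough sketch of how a single macro can expand into one named ERT test per parsing mode. This is not the macro from the PR; `demo-modes-deftest` and the two stand-in parse functions are hypothetical, and each mode is assumed to be a zero-argument function that parses from point.

```elisp
(require 'ert)

(defun demo-modes-read-form ()
  "Hypothetical mode 1: read one Elisp form at point."
  (read (current-buffer)))

(defun demo-modes-read-text ()
  "Hypothetical mode 2: return the remaining buffer text unchanged."
  (buffer-substring-no-properties (point) (point-max)))

(defmacro demo-modes-deftest (name input &rest specs)
  "Define one ERT test per entry in SPECS for the source string INPUT.
Each entry is (SUFFIX PARSE-FN EXPECTED) and yields a test NAME-SUFFIX."
  (declare (indent 2))
  `(progn
     ,@(mapcar
        (lambda (spec)
          (let ((suffix (nth 0 spec))
                (parse-fn (nth 1 spec))
                (expected (nth 2 spec)))
            `(ert-deftest ,(intern (format "%s-%s" name suffix)) ()
               (with-temp-buffer
                 (insert ,input)
                 (goto-char (point-min))
                 (should (equal (,parse-fn) ,expected))))))
        specs)))

;; Expands into two tests: `demo-modes-simple-list-form' and
;; `demo-modes-simple-list-text'.
(demo-modes-deftest demo-modes-simple-list "(1 2 3)"
  (form demo-modes-read-form '(1 2 3))
  (text demo-modes-read-text "(1 2 3)"))
```

Because each mode gets its own test name, a failure report points directly at the mode that broke instead of at one combined test.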

plexus · 2017-07-12T16:30:00Z

When unsure about how to represent things like #_ or ^, just imagine them with lisp syntax:

#_ #_ 2 3 => (discard (discard 2) 3)

That said, discards are like whitespace or comments: they don't add any semantics and so should not be represented in the AST mode. In the whitespace-aware source mode they should be present.
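A small sketch of why stacked discards fall out naturally on a stack, assuming a flat list of already-parsed forms in which the symbol `discard` stands for `#_`. The function name is hypothetical, and nested collections are deliberately left out.

```elisp
(defun demo-discard-drop (items)
  "Return ITEMS without `#_' markers and the forms they discard.
ITEMS is a flat list in which the symbol `discard' stands for `#_'."
  (let ((stack nil))
    (dolist (item items)
      (cond
       ;; A marker waits on the stack for the next complete form.
       ((eq item 'discard) (push item stack))
       ;; A form right after a marker cancels that marker and itself.
       ((eq (car stack) 'discard) (pop stack))
       (t (push item stack))))
    (nreverse stack)))
```

`(demo-discard-drop '(1 discard discard 2 3))` returns `(1)`: the 2 cancels the inner marker and the 3 cancels the outer one, matching the `(discard (discard 2) 3)` reading. A marker with nothing after it stays in the result, so the caller can still notice it.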

volrath added 3 commits July 5, 2017 20:06

Refactor clj-parse.el

7ae887b

- Make parser produce an AST by default.
- Switch to lexical binding and fix some broken var references.
- Reorganizes reduce functions' signatures a bit.

Add AST to Elisp and Clojure/EDN printers

1b2b221

Refactor clj-parse-test.

a838160

All tests are separated now.

volrath mentioned this pull request Jul 6, 2017

Add support for tagged elements in Lexer #2

Merged

plexus reviewed Jul 7, 2017

volrath added 2 commits July 7, 2017 12:56

Avoid dropping whitespaces, handling them while reducing

43f59dc

Rework clj-parse-deftest macro

553f861

volrath commented Jul 7, 2017

Merge branch 'master' of https://github.com/lambdaisland/clj-parse

0702332

volrath added 2 commits July 10, 2017 01:18

DESIGN.md-related adjustments

7733985

Rewrite all tests, and add new tests for the AST "printer"

c906de3

volrath force-pushed the master branch from ba1397d to c906de3 Compare July 9, 2017 23:19

plexus merged commit 697618d into clojure-emacs:master Jul 13, 2017
