@MichaReiser
Contributor

This PR tries to reduce the number of elements that we copy by

  • Removing the need for calls to vec.extend
  • Removing the need for vec.insert(0, element)
  • Using lalrpop named arguments instead of .map(|x| x.1) to ignore some tokens
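
A minimal sketch in plain Rust of the three list-building patterns named above, outside of any grammar. The function names (`collect_with_extend`, `prepend_with_insert`, `append_with_push`) and the `char` separator type are hypothetical and only serve the illustration; they are not taken from the parser's source.

```rust
/// Old pattern: the rule yields the first item plus (separator, item) pairs,
/// so the action builds a second Vec and moves every item once more.
fn collect_with_extend<T>(first: T, rest: Vec<(char, T)>) -> Vec<T> {
    let mut items = vec![first];
    items.extend(rest.into_iter().map(|pair| pair.1));
    items
}

/// Old pattern: prepending with insert(0, ..) shifts every existing element.
fn prepend_with_insert<T>(head: T, mut tail: Vec<T>) -> Vec<T> {
    tail.insert(0, head);
    tail
}

/// New pattern: a left-recursive rule receives the Vec built so far
/// and appends one item per reduction.
fn append_with_push<T>(mut so_far: Vec<T>, next: T) -> Vec<T> {
    so_far.push(next);
    so_far
}

fn main() {
    let a = collect_with_extend(1, vec![(',', 2), (',', 3)]);
    let b = prepend_with_insert(1, vec![2, 3]);
    let c = append_with_push(append_with_push(vec![1], 2), 3);
    // All three strategies produce the same list; only the copying differs.
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```

All three produce the same result; the push-based variant avoids both the intermediate Vec of pairs and the element shift caused by inserting at index 0.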

# Performance

I hoped that this would improve performance more, but it remains mostly unchanged:

group                      base                                 opt
-----                      ----                                 ---
parser/large/dataset.py    1.00  3.5±0.01ms       11.7 MB/sec   1.00  3.5±0.05ms      11.7 MB/sec
parser/numpy/ctypeslib.py  1.02  664.1±1.66µs     25.1 MB/sec   1.00  654.0±1.27µs    25.5 MB/sec
parser/numpy/globals.py    1.02  67.1±0.56µs      44.0 MB/sec   1.00  65.5±0.44µs     45.1 MB/sec
parser/pydantic/types.py   1.00  1416.3±19.83µs   18.0 MB/sec   1.01  1430.9±4.21µs   17.8 MB/sec

Comment on lines 1613 to +1620
 OneOrMore<T>: Vec<T> = {
-    <i1: T> <i2:("," T)*> => {
-        let mut items = vec![i1];
-        items.extend(i2.into_iter().map(|e| e.1));
-        items
+    <e:T> => vec![e],
+    <mut v: OneOrMore<T>> "," <e:T> => {
+        v.push(e);
+        v
     }
 };
Contributor Author

This should be the same as

#[inline]
OneOrMore<T>: Vec<T> = {
    <mut v: (<T> ",")*> <last:T> => {
        v.push(last);
        v
    }
}

But this doesn't work for reasons I do not understand. I don't know if both macros will expand to the same code or not.

Member

I am not a parser professional, but it seems to be related to lookahead.
The generated parser uses a sort of table lookup.

@MichaReiser MichaReiser changed the title Reduce copying elements when parsing Avoid copying elements when parsing May 15, 2023
Member

@youknowone youknowone left a comment

Great, the generated parser is also a lot smaller

         )
     },
-    <location:@L> "match" <subject:TestOrStarNamedExpr> "," <subjects:OneOrMore<TestOrStarNamedExpr>> ","? ":" "\n" Indent <cases:MatchCase+> Dedent => {
+    <location:@L> "match" <subjects:TwoOrMore<TestOrStarNamedExpr, ",">> ","? ":" "\n" Indent <cases:MatchCase+> Dedent => {
Member

nice idea
