bpo-33899: Make tokenize module mirror end-of-file is end-of-line behavior #7891

ammaraskar · 2018-06-24T06:01:15Z

Most of the change involves fixing up the test suite, which previously made the assumption that there wouldn't be a new line if the input didn't end in one.

https://bugs.python.org/issue33899
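As a rough illustration of the behavior this PR targets, the sketch below tokenizes the same statement with and without a trailing newline. The helper name `token_names` is made up for this example and is not part of the PR. On an interpreter that includes this change, both inputs should produce a NEWLINE token before ENDMARKER.

```python
import io
import tokenize


def token_names(source: bytes):
    """Return the token type names that tokenize produces for `source`."""
    readline = io.BytesIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.tokenize(readline)]


# Same statement, once ending in a newline and once ending at end-of-file.
with_newline = token_names(b"x = 1\n")
without_newline = token_names(b"x = 1")

print(with_newline)
print(without_newline)
```

On interpreters that predate the change, the second list would be expected to lack the NEWLINE entry.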

taleinat · 2018-06-24T12:34:11Z

Lib/tokenize.py

+ line = b''
 while True: # loop over lines in stream
 try:
+ last_line = line


Shouldn't last_line only be set after StopIteration is caught? ISTM that in other cases we wouldn't want to be adding the newline at the end.

readline is one of the ancient APIs that existed before generators. There are two ways of stopping iteration: raising StopIteration or returning the empty string. The latter gets caught all the way down here: https://github.com/python/cpython/blob/master/Lib/tokenize.py#L528

What you're describing is what I had initially, but there are a few places in the loop where iteration can stop, so I thought this would be simpler.

Then perhaps add a comment to that end? It seems rather crucial to understanding that particular piece of the code.

Good point, added.
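The two ways of ending iteration mentioned in this thread can be sketched as follows. This is only an illustration, not code from the PR: `returns_empty`, `raises_stop`, and `summarize` are names invented here. The first callable behaves like a file's readline and returns `b""` at the end; the second is a bound `__next__` that raises StopIteration.

```python
import io
import tokenize

SOURCE_LINES = [b"x = 1\n", b"y = 2\n"]

# Style 1: a file-like readline, which returns b"" once the input is exhausted.
returns_empty = io.BytesIO(b"".join(SOURCE_LINES)).readline

# Style 2: a bound __next__, which raises StopIteration once exhausted.
raises_stop = iter(SOURCE_LINES).__next__


def summarize(readline):
    """Collect (type name, string) pairs for every token from `readline`."""
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.tokenize(readline)]


tokens_empty = summarize(returns_empty)
tokens_stop = summarize(raises_stop)

# Both termination styles should lead to the same token stream.
print(tokens_empty == tokens_stop)
```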

taleinat · 2018-07-03T09:23:34Z

Lib/tokenize.py

 try:
+ # This loop has multiple points where it can break out so we
+ # pick up the value for the last_line here, at one unified
+ # point to keep things simple.


"at one unified point to keep things simple" should be removed IMO (unnecessary).

I suggest mentioning briefly why last_line is kept (to the best of my understanding: the last line is needed for the newline check after the loop, but line will be overridden at the last loop iteration).
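A minimal sketch of why `last_line` has to be tracked, assuming a much simplified loop over text lines rather than the real tokenizer loop. The function name `read_lines_with_implicit_newline` is hypothetical.

```python
def read_lines_with_implicit_newline(readline):
    """Yield each line from `readline`, then a newline if the input lacked one."""
    line = ''
    while True:
        # Remember the previous line before `line` is overwritten: on the
        # final iteration `line` becomes '' and the real last line is lost.
        last_line = line
        try:
            line = readline()
        except StopIteration:
            line = ''
        if not line:
            break
        yield line

    # Here `line` is always ''; `last_line` still holds the final input line.
    if last_line and last_line[-1] not in '\r\n':
        yield '\n'


lines = iter(["a = 1\n", "b = 2"])
print(list(read_lines_with_implicit_newline(lines.__next__)))
```

Checking `line` after the loop would never work in this sketch, because the loop only exits once `line` is empty.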

bedevere-bot · 2018-07-03T09:23:41Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

ammaraskar · 2018-07-03T16:23:06Z

You're right, on a second read the comment didn't really explain much. Reworded.

I have made the requested changes; please review again

bedevere-bot · 2018-07-03T16:23:08Z

Thanks for making the requested changes!

@taleinat: please review the changes made to this pull request.

taleinat

There's a whitespace issue in one of the files. For details, run make patchcheck or take a look at the output on Travis.

bedevere-bot · 2018-07-04T08:09:29Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

ammaraskar · 2018-07-04T08:24:56Z

Whoops, that's what I get for being lazy and using the GitHub web editor.

I have made the requested changes; please review again

bedevere-bot · 2018-07-04T08:24:58Z

Thanks for making the requested changes!

@taleinat: please review the changes made to this pull request.

taleinat

I've reviewed the whole PR again and have a few final remarks. After these are fixed it would be good to go IMO.

taleinat · 2018-07-04T09:01:09Z

Lib/tokenize.py

 pos += 1

+ # Add an implicit NEWLINE if the input doesn't end in one
+ if len(last_line) > 0 and last_line[-1] not in '\r\n':


This should begin if last_line and ....
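The suggested form relies on an empty string being falsy, so the truthiness test can replace the explicit length check while still guarding the index. A small sketch, with `needs_implicit_newline` as a hypothetical helper name and `last_line` assumed to be a `str`:

```python
def needs_implicit_newline(last_line: str) -> bool:
    """Return True if a NEWLINE should be synthesized after `last_line`."""
    # An empty string is falsy, so `last_line and ...` also protects the
    # `last_line[-1]` lookup from an IndexError on empty input.
    return bool(last_line) and last_line[-1] not in '\r\n'


for candidate in ["", "x = 1", "x = 1\n", "x = 1\r"]:
    # The truthiness form agrees with the explicit length check.
    explicit = len(candidate) > 0 and candidate[-1] not in '\r\n'
    print(repr(candidate), needs_implicit_newline(candidate), explicit)
```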

taleinat · 2018-07-04T09:05:31Z

Lib/test/test_tokenize.py

 for type, token, start, end, line in tokenize(f.readline):
 if type == ENDMARKER:
 break
+ if s[-1] not in '\r\n' and type == NEWLINE and end[0] == num_lines:


This check is quite unreadable. Not simple to make more readable though. A short comment could help.

taleinat · 2018-07-04T09:06:19Z

Lib/test/test_tokenize.py

 if type == ENDMARKER:
 break
+ if s[-1] not in '\r\n' and type == NEWLINE and end[0] == num_lines:
+ continue


The check_tokenize() logic is repeated exactly, but it is now becoming rather involved. This should be a common function used by both classes.
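One way the check could be made more readable and shared is sketched below. It is not the helper that was eventually committed: `visible_token_names` and `missing_trailing_nl` are illustrative names, and the function assumes a non-empty `source` string.

```python
import io
import tokenize


def visible_token_names(source: str):
    """Token type names for `source`, hiding an implicit trailing NEWLINE."""
    num_lines = len(source.splitlines())
    # Named flag instead of repeating the raw character test in the loop.
    missing_trailing_nl = source[-1] not in '\r\n'

    names = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.ENDMARKER:
            break
        # Skip the NEWLINE the tokenizer adds for a final line with no newline.
        if (missing_trailing_nl and tok.type == tokenize.NEWLINE
                and tok.end[0] == num_lines):
            continue
        names.append(tokenize.tok_name[tok.type])
    return names


print(visible_token_names("x = 1"))
print(visible_token_names("x = 1\n"))
```

Both test classes could then call the one function, so the skip condition lives in a single place.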

bedevere-bot · 2018-07-04T09:06:56Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

ammaraskar · 2018-07-06T02:36:10Z

Thanks for the reviews Tal, I really appreciate it! I refactored out the common logic, does this seem fine? I also added a comment and added a variable in the check to convey the meaning better.

I have made the requested changes; please review again

miss-islington · 2018-07-06T07:19:10Z

Thanks @ammaraskar for the PR, and @taleinat for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7, 3.6, 3.7.
🐍🍒⛏🤖

miss-islington · 2018-07-06T07:20:15Z

Sorry, @ammaraskar and @taleinat, I could not cleanly backport this to 3.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker c4ef4896eac86a6759901c8546e26de4695a1389 3.7

miss-islington · 2018-07-06T07:21:13Z

Sorry, @ammaraskar and @taleinat, I could not cleanly backport this to 2.7 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker c4ef4896eac86a6759901c8546e26de4695a1389 2.7

miss-islington · 2018-07-06T07:22:12Z

Sorry, @ammaraskar and @taleinat, I could not cleanly backport this to 3.6 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker c4ef4896eac86a6759901c8546e26de4695a1389 3.6

ammaraskar · 2018-07-06T07:22:23Z

Will backport by hand in a bit

taleinat · 2018-07-06T07:26:54Z

@ammaraskar, ask me (via GitHub's "Reviewers" feature) to review the backport PRs when they're ready.

…ne behavior (pythonGH-7891) Most of the change involves fixing up the test suite, which previously made the assumption that there wouldn't be a new line if the input didn't end in one. Contributed by Ammar Askar.. (cherry picked from commit c4ef489) Co-authored-by: Ammar Askar <ammar_askar@hotmail.com>

ammaraskar · 2018-07-06T08:32:01Z

@taleinat
It seems like I can't touch the reviewers/labels etc, only people with access to the repo can.

2.7: #8133
3.6: #8134
3.7: #8132

taleinat · 2018-07-06T10:19:03Z

Thanks, @ammaraskar.

…ne behavior (GH-7891) (GH-8132) Most of the change involves fixing up the test suite, which previously made the assumption that there wouldn't be a new line if the input didn't end in one. Contributed by Ammar Askar. (cherry picked from commit c4ef489)

…ne behavior (GH-7891) (GH-8134) Most of the change involves fixing up the test suite, which previously made the assumption that there wouldn't be a new line if the input didn't end in one. Contributed by Ammar Askar. (cherry picked from commit c4ef489)

…ne behavior (GH-7891) (#8133) Most of the change involves fixing up the test suite, which previously made the assumption that there wouldn't be a new line if the input didn't end in one. Contributed by Ammar Askar. (cherry picked from commit c4ef489)

bpo-33899: Make tokenize module mirror end-of-file is end-of-line behavior

5d58e8a

the-knights-who-say-ni added the CLA signed label Jun 24, 2018

bedevere-bot added the awaiting review label Jun 24, 2018

Add a specific testcase for the change

43a1bd4

taleinat reviewed Jun 24, 2018

Add comment explaining why last_line is in the start of the loop

57c92d4

taleinat requested changes Jul 3, 2018

bedevere-bot added awaiting changes and removed awaiting review labels Jul 3, 2018

Update comment with Tal's suggestions

679cd89

bedevere-bot removed the awaiting changes label Jul 3, 2018

bedevere-bot added the awaiting change review label Jul 3, 2018

taleinat requested changes Jul 4, 2018

bedevere-bot added awaiting changes and removed awaiting change review labels Jul 4, 2018

Fix whitespace issue

24214a7

bedevere-bot added awaiting change review and removed awaiting changes labels Jul 4, 2018

taleinat requested changes Jul 4, 2018

bedevere-bot removed the awaiting change review label Jul 4, 2018

bedevere-bot added the awaiting changes label Jul 4, 2018

Move common code out to a function

055dffb

taleinat added needs backport to 2.7 labels Jul 6, 2018

update NEWS to mention contributor

ae032e7

ammaraskar closed this Jul 6, 2018

ammaraskar reopened this Jul 6, 2018

taleinat merged commit c4ef489 into python:master Jul 6, 2018

bedevere-bot removed the awaiting merge label Jul 6, 2018

miss-islington assigned taleinat Jul 6, 2018

taleinat removed needs backport to 2.7 labels Jul 6, 2018

duncanmmacleod mentioned this pull request Oct 24, 2018

Fixed parsing filter strings in bug-fix python releases gwpy/gwpy#946

Merged
