| msg233794 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2015-01-10 01:46 |
I am trying to make LZMAFile (which implements BufferedIOBase) use a BufferedReader in read mode. However this broke test_lzma.FileTestCase.test_read1_multistream(), which calls read1() with the default size argument. This is because BufferedReader.read1() does not accept size=-1: >>> stdin = open(0, "rb", closefd=False) >>> stdin <_io.BufferedReader name=0> >>> stdin.read1() # Parameter is mandatory Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: read1() takes exactly 1 argument (0 given) >>> stdin.read1(-1) # Does not accept the BufferedIOBase default Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: read length must be positive >>> stdin.read1(0) # Technically not positive b'' Also, the BufferedIOBase documentation does not say what the size=-1 value means, only that it reads and returns up to -1 bytes. |
| msg233864 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2015-01-11 23:22 |
Looking at the test suite: * read1() of LZMAFile and GzipFile (both implementing BufferedIOBase) are asserted to return a non-zero result until EOF * LZMAFile.read1(0) is asserted to return an empty string * BufferedReader.read1(-1) is asserted to raise ValueError * There are also tests of read1() methods on HTTPResponse and ZipFile.open() objects, but those methods are undocumented It seems the most consistent way forward would be to: * Define BufferedIOBase.read1(-1) to read and return an arbitrary number of bytes, more than zero unless none are available due to EOF or non-blocking mode. Maybe suggest that it would return the current buffered data or try to read a full buffer of data (with one call) and return it if applicable. * Change the signature to BufferedReader.read1(size=-1) and implement the size=-1 behaviour |
| msg261672 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2016-03-13 06:02 |
Looking at this again, I think a less intrusive way forward would be to: * Document that in 3.6, the required signature is now BufferedIOBase.read1(size). An implementation no longer has to provide a default size, and no longer has to accept negative sizes. * Explicitly document the behaviour of each concrete implementation like GzipFile.read1(-1) etc, if this behaviour is intentional * Fix the BufferedReader error so that “read length must not be negative” Relaxing the read1() signature would allow wider or easier use of BufferedReader, e.g. to implement HTTPResponse as I suggested in Issue 26499. The advantage would be using existing code that is well tested, used, optimized, etc, rather than a custom BufferedIOBase implementation which for the HTTP case is buggy. |
| msg261685 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) *  | Date: 2016-03-13 07:44 |
Shouldn't read1(-1) be the same as read1(sys.maxsize)? I.e. read from a buffer without limit. |
| msg261698 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2016-03-13 12:01 |
Calling BufferedReader.read1(sys.maxsize) gives me a MemoryError. Making read1(-1) equivalent to read1(sys.maxsize) only makes sense where the return value already has a predetermined size, and only a limited buffer needs to be allocated. Another interpretation is to return an arbitrary, modest buffer size. This is what I ended up doing with LZMAFile.read1() in Issue 23529: return no more than 8 KiB. It is not equivalent to sys.maxsize because more than 8 KiB is possible if you ask for it. HTTPResponse (for non-chunked responses) is similar, but uses a default of 16 KiB. |
| msg261703 - (view) | Author: Antoine Pitrou (pitrou) *  | Date: 2016-03-13 18:38 |
> Define BufferedIOBase.read1(-1) to read and return an arbitrary number > of bytes, more than zero unless none are available due to EOF or > non-blocking mode. Maybe suggest that it would return the current > buffered data or try to read a full buffer of data (with one call) and > return it if applicable. This sounds reasonable to me. |
| msg262019 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2016-03-19 06:32 |
Okay here is a patch implementing read1(-1) in BufferedReader and BytesIO (my original proposal). Other changes: * Changed read1(size=-1) → read1([size]), because BufferedReader and BytesIO do not accept keyword arguments (see also Issue 23738) * Defined size=-1 to mean an arbitrary non-zero size * Change BufferedReader.read1() to return a buffer of data * Change BytesIO.read1() to read until EOF * Add tests to complement existing read1(size) tests for BufferedReader (includes BufferedRandom), BufferedRWPair, and BytesIO |
| msg279093 - (view) | Author: Roundup Robot (python-dev)  | Date: 2016-10-21 00:17 |
New changeset b6886ac88e28 by Martin Panter in branch 'default': Issue #23214: Implement optional BufferedReader, BytesIO read1() argument https://hg.python.org/cpython/rev/b6886ac88e28 |
| msg279097 - (view) | Author: Roundup Robot (python-dev)  | Date: 2016-10-21 02:00 |
New changeset d4fce64b1c65 by Martin Panter in branch 'default': Issue #23214: Remove BufferedReader.read1(-1) workaround https://hg.python.org/cpython/rev/d4fce64b1c65 |
| msg279100 - (view) | Author: STINNER Victor (vstinner) *  | Date: 2016-10-21 02:28 |
- .. method:: read1(size=-1) + .. method:: read1([size]) I don't understand this change: the default size is documented as -1. Did I miss something? + If *size* is −1 (the default) Is the "−" character deliberate in "−1"? I would expected ``-1``. |
| msg279102 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2016-10-21 03:17 |
The minus sign might have been deliberate at some stage, but I realize it is more accessible etc if it was ASCII -1, so I will change that. The idea of changing the signature is to avoid people thinking it accepts a keyword argument. See e.g. Issue 25030 for seek(offset, whence=SEEK_SET), Issue 14586 for truncate(size=None). |
| msg279113 - (view) | Author: STINNER Victor (vstinner) *  | Date: 2016-10-21 08:20 |
2016-10-21 5:17 GMT+02:00 Martin Panter <report@bugs.python.org>: > The idea of changing the signature is to avoid people thinking it accepts a keyword argument. See e.g. Issue 25030 for seek(offset, whence=SEEK_SET), Issue 14586 for truncate(size=None). Ah. Maybe we should modify the code to accept a keyword argument? :-) Or we might use the strange "read1(\, size=1)" syntax of Argument Clinic. |
| msg279163 - (view) | Author: Martin Panter (martin.panter) *  | Date: 2016-10-21 21:51 |
Modifying the API to accept a keyword argument is not worthwhile IMO. That means modifying modules like http.client and zipfile (which use the wrong keyword name), and user-defined implementations may become incompatible. The question of documentation is discussed more in Issue 23738. I think one or two people were concerned about using the Arg Clinic slash. |
| msg279167 - (view) | Author: Roundup Robot (python-dev)  | Date: 2016-10-21 23:00 |
New changeset 2af6d94de492 by Martin Panter in branch 'default': Issue #23214: Fix formatting of -1 https://hg.python.org/cpython/rev/2af6d94de492 |
|
| Date | User | Action | Args |
| 2022-04-11 14:58:11 | admin | set | github: 67403 |
| 2017-03-31 16:36:34 | dstufft | set | pull_requests: + pull_request1074 |
| 2016-10-22 00:24:26 | martin.panter | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2016-10-21 23:00:51 | python-dev | set | messages: + msg279167 |
| 2016-10-21 21:51:35 | martin.panter | set | messages: + msg279163 |
| 2016-10-21 08:20:22 | vstinner | set | messages: + msg279113 |
| 2016-10-21 03:17:35 | martin.panter | set | messages: + msg279102 versions: + Python 3.7, - Python 3.6 |
| 2016-10-21 02:28:14 | vstinner | set | messages: + msg279100 |
| 2016-10-21 02:00:18 | python-dev | set | messages: + msg279097 |
| 2016-10-21 00:17:07 | python-dev | set | nosy: + python-dev messages: + msg279093
|
| 2016-04-14 12:55:26 | vstinner | set | nosy: + vstinner
|
| 2016-03-19 06:32:07 | martin.panter | set | files: + read1-arbitrary.patch keywords: + patch messages: + msg262019
stage: patch review |
| 2016-03-13 18:38:21 | pitrou | set | messages: + msg261703 |
| 2016-03-13 12:01:15 | martin.panter | set | messages: + msg261698 |
| 2016-03-13 07:44:24 | serhiy.storchaka | set | nosy: + stutzbach, serhiy.storchaka, pitrou, benjamin.peterson messages: + msg261685
|
| 2016-03-13 06:02:01 | martin.panter | set | messages: + msg261672 versions: + Python 3.6, - Python 3.4 |
| 2015-01-11 23:23:00 | martin.panter | set | messages: + msg233864 |
| 2015-01-10 01:46:22 | martin.panter | create | |