pandas-dev
diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
Lines changed: 82 additions & 76 deletions
@@ -147,6 +147,85 @@ If the compression method cannot be inferred, use the ``compression`` argument:
 (``mode`` being one of ``tarfile.open``'s modes: https://docs.python.org/3/library/tarfile.html#tarfile.open)
 
 
+.. _whatsnew_150.enhancements.read_xml_dtypes:
+
+read_xml now supports ``dtype``, ``converters``, and ``parse_dates``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Similar to other IO methods, :func:`pandas.read_xml` now supports assigning specific dtypes to columns,
+applying converter methods, and parsing dates (:issue:`43567`).
+
+.. ipython:: python
+
+    xml_dates = """<?xml version='1.0' encoding='utf-8'?>
+    <data>
+      <row>
+        <shape>square</shape>
+        <degrees>00360</degrees>
+        <sides>4.0</sides>
+        <date>2020-01-01</date>
+      </row>
+      <row>
+        <shape>circle</shape>
+        <degrees>00360</degrees>
+        <sides/>
+        <date>2021-01-01</date>
+      </row>
+      <row>
+        <shape>triangle</shape>
+        <degrees>00180</degrees>
+        <sides>3.0</sides>
+        <date>2022-01-01</date>
+      </row>
+    </data>"""
+
+    df = pd.read_xml(
+        xml_dates,
+        dtype={'sides': 'Int64'},
+        converters={'degrees': str},
+        parse_dates=['date']
+    )
+    df
+    df.dtypes
+
+
+.. _whatsnew_150.enhancements.read_xml_iterparse:
+
+read_xml now supports large XML using ``iterparse``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For very large XML files, which can range from hundreds of megabytes to gigabytes, :func:`pandas.read_xml`
+now supports parsing with `lxml's iterparse`_ and `etree's iterparse`_,
+memory-efficient methods that iterate through an XML tree and extract specific elements
+and attributes without holding the entire tree in memory (:issue:`45442`).
+
+.. code-block:: ipython
+
+    In [1]: df = pd.read_xml(
+       ...:     "/path/to/downloaded/enwikisource-latest-pages-articles.xml",
+       ...:     iterparse={"page": ["title", "ns", "id"]},
+       ...: )
+
+    In [2]: df
+    Out[2]:
+                                                         title   ns        id
+    0                                       Gettysburg Address    0     21450
+    1                                                Main Page    0     42950
+    2                            Declaration by United Nations    0      8435
+    3             Constitution of the United States of America    0      8435
+    4                     Declaration of Independence (Israel)    0     17858
+    ...                                                    ...  ...       ...
+    3578760               Page:Black cat 1897 07 v2 n10.pdf/17  104    219649
+    3578761               Page:Black cat 1897 07 v2 n10.pdf/43  104    219649
+    3578762               Page:Black cat 1897 07 v2 n10.pdf/44  104    219649
+    3578763      The History of Tom Jones, a Foundling/Book IX    0  12084291
+    3578764  Page:Shakespeare of Stratford (1926) Yale.djvu/91  104     21450
+
+    [3578765 rows x 3 columns]
+
+
+.. _`lxml's iterparse`: https://lxml.de/3.2/parsing.html#iterparse-and-iterwalk
+.. _`etree's iterparse`: https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse
+
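For readers unfamiliar with the technique behind the ``iterparse`` option, the following is a minimal sketch of streaming extraction using only the standard library's ``xml.etree.ElementTree.iterparse``. It illustrates the general idea and is not pandas' internal implementation; the helper ``iter_rows`` and the tiny ``SAMPLE`` document are hypothetical stand-ins for a real multi-gigabyte file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for a large dump; a real file would be
# opened from disk instead of held in a bytes literal.
SAMPLE = b"""<?xml version='1.0' encoding='utf-8'?>
<mediawiki>
  <page><title>Gettysburg Address</title><ns>0</ns><id>21450</id></page>
  <page><title>Main Page</title><ns>0</ns><id>42950</id></page>
</mediawiki>"""


def iter_rows(source, row_tag, fields):
    """Yield one dict per ``row_tag`` element, keeping only ``fields``."""
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == row_tag:
            yield {name: elem.findtext(name) for name in fields}
            # Discard the element's children and text so that processed
            # rows do not accumulate in memory.
            elem.clear()


rows = list(iter_rows(io.BytesIO(SAMPLE), "page", ["title", "ns", "id"]))
print(rows)
```

The ``row_tag`` and ``fields`` arguments mirror the shape of the ``{"page": ["title", "ns", "id"]}`` mapping shown above: one repeating element name and the child names to keep. All values come back as strings here, since this sketch performs no type inference.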
 .. _whatsnew_150.enhancements.other:
 
 Other enhancements
@@ -294,83 +373,10 @@ upon serialization. (Related issue :issue:`12997`)
 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-.. _whatsnew_150.api_breaking.read_xml_dtypes:
-
-read_xml now supports ``dtype``, ``converters``, and ``parse_dates``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Similar to other IO methods, :func:`pandas.read_xml` now supports assigning specific dtypes to columns,
-apply converter methods, and parse dates (:issue:`43567`).
-
-.. ipython:: python
-
- xml_dates = """<?xml version='1.0' encoding='utf-8'?>
- <data>
- <row>
- <shape>square</shape>
- <degrees>00360</degrees>
- <sides>4.0</sides>
- <date>2020-01-01</date>
- </row>
- <row>
- <shape>circle</shape>
- <degrees>00360</degrees>
- <sides/>
- <date>2021-01-01</date>
- </row>
- <row>
- <shape>triangle</shape>
- <degrees>00180</degrees>
- <sides>3.0</sides>
- <date>2022-01-01</date>
- </row>
- </data>"""
+.. _whatsnew_150.api_breaking.api_breaking1:
 
- df = pd.read_xml(
- xml_dates,
- dtype={'sides': 'Int64'},
- converters={'degrees': str},
- parse_dates=['date']
- )
- df
- df.dtypes
-
-.. _whatsnew_150.read_xml_iterparse:
-
-read_xml now supports large XML using ``iterparse``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-For very large XML files that can range in hundreds of megabytes to gigabytes, :func:`pandas.read_xml`
-now supports parsing such sizeable files using `lxml's iterparse`_ and `etree's iterparse`_
-which are memory-efficient methods to iterate through XML trees and extract specific elements
-and attributes without holding entire tree in memory (:issue:`#45442`).
-
-.. code-block:: ipython
-
- In [1]: df = pd.read_xml(
- ... "/path/to/downloaded/enwikisource-latest-pages-articles.xml",
- ... iterparse = {"page": ["title", "ns", "id"]})
- ... )
- df
- Out[2]:
- title ns id
- 0 Gettysburg Address 0 21450
- 1 Main Page 0 42950
- 2 Declaration by United Nations 0 8435
- 3 Constitution of the United States of America 0 8435
- 4 Declaration of Independence (Israel) 0 17858
- ... ... ... ...
- 3578760 Page:Black cat 1897 07 v2 n10.pdf/17 104 219649
- 3578761 Page:Black cat 1897 07 v2 n10.pdf/43 104 219649
- 3578762 Page:Black cat 1897 07 v2 n10.pdf/44 104 219649
- 3578763 The History of Tom Jones, a Foundling/Book IX 0 12084291
- 3578764 Page:Shakespeare of Stratford (1926) Yale.djvu/91 104 21450
-
- [3578765 rows x 3 columns]
-
-
-.. _`lxml's iterparse`: https://lxml.de/3.2/parsing.html#iterparse-and-iterwalk
-.. _`etree's iterparse`: https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse
+api_breaking_change1
+^^^^^^^^^^^^^^^^^^^^
 
 .. _whatsnew_150.api_breaking.api_breaking2: