@@ -8,16 +8,12 @@ Copy-on-Write (CoW)
88
99.. note::
1010
11-  Copy-on-Write will become the default in pandas 3.0. We recommend
12-  :ref:`turning it on now <copy_on_write_enabling>`
13-  to benefit from all improvements.
11+  Copy-on-Write is now the default with pandas 3.0.
1412
1513Copy-on-Write was first introduced in version 1.5.0. Starting from version 2.0 most of the
1614optimizations that become possible through CoW are implemented and supported. All possible
1715optimizations are supported starting from pandas 2.1.
1816
19- CoW will be enabled by default in version 3.0.
20- 
2117CoW will lead to more predictable behavior since it is not possible to update more than
2218one object with one statement, e.g. indexing operations or methods won't have side-effects. Additionally, through
2319delaying copies as long as possible, the average performance and memory usage will improve.
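
As an illustration of the delayed copies described above, here is a minimal sketch. It assumes pandas 2.0 or later; the ``set_option`` guard is only relevant on pandas < 3.0, where CoW is opt-in, and the variable names are illustrative. Whether the two objects share memory before the write depends on the pandas version, so the sketch prints that value rather than relying on it.

```python
import warnings

import numpy as np
import pandas as pd

# Only relevant on pandas < 3.0, where CoW is opt-in. On 3.0 the option is
# assumed to be unnecessary, so any warning or error from it is ignored.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except Exception:
        pass

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})

# rename returns a new object; with CoW it can share data with df
# instead of copying eagerly.
renamed = df.rename(columns=str.upper)
shared_before = np.shares_memory(df["foo"].to_numpy(), renamed["FOO"].to_numpy())

# The first write to `renamed` forces the deferred copy, so df is untouched.
renamed.iloc[0, 0] = 100
shared_after = np.shares_memory(df["foo"].to_numpy(), renamed["FOO"].to_numpy())

print("shares memory before write:", shared_before)
print("shares memory after write:", shared_after)
print(df)
```

The write only affects ``renamed``; ``df`` keeps its original values.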
@@ -29,21 +25,25 @@ pandas indexing behavior is tricky to understand. Some operations return views w
2925other return copies. Depending on the result of the operation, mutating one object
3026might accidentally mutate another:
3127
32- .. ipython:: python
28+ .. code-block:: ipython
3329
34-  df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
35-  subset = df["foo"]
36-  subset.iloc[0] = 100
37-  df
30+  In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
31+  In [2]: subset = df["foo"]
32+  In [3]: subset.iloc[0] = 100
33+  In [4]: df
34+  Out[4]:
35+     foo  bar
36+  0  100    4
37+  1    2    5
38+  2    3    6
3839
39- Mutating ``subset``, e.g. updating its values, also updates ``df``. The exact behavior is
40+ 
41+ Mutating ``subset``, e.g. updating its values, also updated ``df``. The exact behavior was
4042hard to predict. Copy-on-Write solves accidentally modifying more than one object,
41- it explicitly disallows this. With CoW enabled, ``df`` is unchanged:
43+ it explicitly disallows this. ``df`` is unchanged:
4244
4345.. ipython:: python
4446
45-  pd.options.mode.copy_on_write = True
46- 
4747 df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
4848 subset = df["foo"]
4949 subset.iloc[0] = 100
@@ -57,13 +57,13 @@ applications.
5757Migrating to Copy-on-Write
5858-------------------------- 
5959
60- Copy-on-Write will be the default and only mode in pandas 3.0. This means that users
60+ Copy-on-Write is the default and only mode in pandas 3.0. This means that users
6161need to migrate their code to be compliant with CoW rules.
6262
63- The default mode in pandas will raise warnings for certain cases that will actively
63+ The default mode in pandas < 3.0 raises warnings for certain cases that will actively
6464change behavior and thus change user intended behavior.
6565
66- We added another mode, e.g.
66+ pandas 2.2 has a warning mode
6767
6868.. code-block:: python
6969
@@ -84,7 +84,6 @@ The following few items describe the user visible changes:
8484
8585**Accessing the underlying array of a pandas object will return a read-only view**
8686
87- 
8887.. ipython:: python
8988
9089 ser = pd.Series([1, 2, 3])
@@ -101,16 +100,21 @@ for more details.
101100
102101**Only one pandas object is updated at once**
103102
104- The following code snippet updates both ``df`` and ``subset`` without CoW:
103+ The following code snippet updated both ``df`` and ``subset`` without CoW:
105104
106- .. ipython:: python
105+ .. code-block:: ipython
107106
108-  df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
109-  subset = df["foo"]
110-  subset.iloc[0] = 100
111-  df
107+  In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
108+  In [2]: subset = df["foo"]
109+  In [3]: subset.iloc[0] = 100
110+  In [4]: df
111+  Out[4]:
112+     foo  bar
113+  0  100    4
114+  1    2    5
115+  2    3    6
112116
113- won't be possible anymore with CoW, since the CoW rules explicitly forbid this.
117+ is not possible anymore with CoW, since the CoW rules explicitly forbid this.
114118This includes updating a single column as a :class:`Series` and relying on the change
115119propagating back to the parent :class:`DataFrame`.
116120This statement can be rewritten into a single statement with ``loc `` or ``iloc `` if
@@ -146,7 +150,7 @@ A different alternative would be to not use ``inplace``:
146150
147151**Constructors now copy NumPy arrays by default**
148152
149- The Series and DataFrame constructors will now copy NumPy array by default when not
153+ The Series and DataFrame constructors now copy a NumPy array by default when not
150154otherwise specified. This was changed to avoid mutating a pandas object when the
151155NumPy array is changed inplace outside of pandas. You can set ``copy=False`` to
152156avoid this copy.
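
A minimal sketch of the difference between the default constructor and ``copy=False``. The variable names are illustrative, and whether the default constructor copies depends on the pandas version and mode in use, so that result is printed rather than asserted.

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# Default constructor: per the text above, this copies the array in pandas 3.0.
default_ser = pd.Series(arr)
# Explicit opt-out: the Series keeps referencing the original array.
shared_ser = pd.Series(arr, copy=False)

default_shares = np.shares_memory(arr, default_ser.to_numpy())
shared_shares = np.shares_memory(arr, shared_ser.to_numpy())

# Mutating the NumPy array outside of pandas is visible through shared_ser.
arr[0] = 100

print("default shares memory:", default_shares, default_ser.tolist())
print("copy=False shares memory:", shared_shares, shared_ser.tolist())
```

With ``copy=False`` the caller is responsible for not modifying the array afterwards, since the change shows up in the pandas object.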
@@ -162,7 +166,7 @@ that shares data with another DataFrame or Series object inplace.
162166This avoids side-effects when modifying values and hence, most methods can avoid
163167actually copying the data and only trigger a copy when necessary.
164168
165- The following example will operate inplace with CoW:
169+ The following example will operate inplace:
166170
167171.. ipython:: python
168172
@@ -207,15 +211,17 @@ listed in :ref:`Copy-on-Write optimizations <copy_on_write.optimizations>`.
207211
208212Previously, when operating on views, the view and the parent object were modified:
209213
210- .. ipython:: python
211- 
212-  with pd.option_context("mode.copy_on_write", False):
213-      df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
214-      view = df[:]
215-      df.iloc[0, 0] = 100
214+ .. code-block:: ipython
216215
217-  df
218-  view
216+  In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
217+  In [2]: subset = df["foo"]
218+  In [3]: subset.iloc[0] = 100
219+  In [4]: df
220+  Out[4]:
221+     foo  bar
222+  0  100    4
223+  1    2    5
224+  2    3    6
219225
220226CoW triggers a copy when ``df`` is changed to avoid mutating ``view`` as well:
221227
@@ -236,16 +242,19 @@ Chained Assignment
236242Chained assignment references a technique where an object is updated through
237243two subsequent indexing operations, e.g.
238244
239- .. ipython:: python
240-  :okwarning:
245+ .. code-block:: ipython
241246
242-  with pd.option_context("mode.copy_on_write", False):
243-      df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
244-      df["foo"][df["bar"] > 5] = 100
245-  df
247+  In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
248+  In [2]: df["foo"][df["bar"] > 5] = 100
249+  In [3]: df
250+  Out[3]:
251+     foo  bar
252+  0    1    4
253+  1    2    5
254+  2  100    6
246255
247- The column ``foo`` is updated where the column ``bar`` is greater than 5.
248- This violates the CoW principles though, because it would have to modify the
256+ The column ``foo`` was updated where the column ``bar`` is greater than 5.
257+ This violated the CoW principles though, because it would have to modify the
249258view ``df["foo"]`` and ``df`` in one step. Hence, chained assignment will
250259consistently never work and raise a ``ChainedAssignmentError`` warning
251260with CoW enabled:
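
A minimal sketch of this behavior and of the single-statement ``loc`` rewrite. It assumes pandas 2.0 or later; the ``set_option`` guard is only relevant on pandas < 3.0, where CoW is opt-in. The names of any captured warnings are printed rather than assumed.

```python
import warnings

import pandas as pd

# Only relevant on pandas < 3.0, where CoW is opt-in. On 3.0 the option is
# assumed to be unnecessary, so any warning or error from it is ignored.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except Exception:
        pass

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})

# Chained assignment: df["foo"] is a temporary object, so with CoW the
# write never reaches df.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df["foo"][df["bar"] > 5] = 100

after_chained = df["foo"].tolist()
print("warnings:", [type(w.message).__name__ for w in caught])
print("after chained assignment:", after_chained)

# Single-statement rewrite: one indexing operation on df itself.
df.loc[df["bar"] > 5, "foo"] = 100
after_loc = df["foo"].tolist()
print("after loc:", after_loc)
```

Only the ``loc`` form updates ``df``, because it selects rows and column in one indexing operation on the object being modified.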
@@ -272,7 +281,6 @@ shares data with the initial DataFrame:
272281
273282The array is a copy if the initial DataFrame consists of more than one array:
274283
275- 
276284.. ipython:: python
277285
278286 df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
@@ -347,22 +355,3 @@ and :meth:`DataFrame.rename`.
347355
348356These methods return views when Copy-on-Write is enabled, which provides a significant
349357performance improvement compared to the regular execution.
350- 
351- .. _copy_on_write_enabling:
352- 
353- How to enable CoW
354- -----------------
355- 
356- Copy-on-Write can be enabled through the configuration option ``copy_on_write``. The option can
357- be turned on __globally__ through either of the following:
358- 
359- .. ipython:: python
360- 
361-  pd.set_option("mode.copy_on_write", True)
362- 
363-  pd.options.mode.copy_on_write = True
364- 
365- .. ipython:: python
366-  :suppress:
367- 
368-  pd.options.mode.copy_on_write = False