14 changes: 12 additions & 2 deletions Doc/library/string.rst
@@ -755,8 +755,18 @@ attributes:

* *idpattern* -- This is the regular expression describing the pattern for
non-braced placeholders. The default value is the regular expression
``[_a-z][_a-z0-9]*``. If this is given and *braceidpattern* is ``None``
this pattern will also apply to braced placeholders.
``(?-i:[_a-zA-Z][_a-zA-Z0-9]*)``. Since default *flags* is
``re.IGNORECASE``, ``[a-z]``Without local flag ``-i``, is used to avoid to match with non ASCII characters.
Member


There should be a space before the W, but I also feel like the sentence starting with "Since default..." should just be moved to, and consolidated with, the note:: section that just follows.

Member Author


Oh, I forgot to remove this sentence after writing the note section.

If this is given and *braceidpattern* is
``None`` this pattern will also apply to braced placeholders.

.. note::

Default *flags* is ``re.IGNORECASE``. So the pattern ``[a-z]`` can match
with some non ASCII characters. That's why We use local ``-i`` flag here.
Member


"non-ASCII"

Also s/We/we/


When overriding this class, please consider overriding *flags* with ``0``
or ``re.IGNORECASE | re.ASCII``.
Member


How about: "When subclassing, please..."

Why? Or in other words, can you add a short explanation of why they should consider this?
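
One possible answer to the "why", as a minimal sketch rather than wording proposed in this PR: a subclass that sets its own *idpattern* still inherits ``flags = re.IGNORECASE``, so a pattern such as ``[a-z]+`` can accept non-ASCII placeholders again. The class names `LooseTemplate` and `AsciiTemplate` and the helper `substitutes` are hypothetical and exist only for illustration.

```python
import re
from string import Template

DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I


class LooseTemplate(Template):
    # Hypothetical subclass: custom idpattern, inherited flags (re.IGNORECASE).
    idpattern = r"[a-z]+"


class AsciiTemplate(Template):
    # Hypothetical subclass: same idpattern, but flags restricted to ASCII.
    idpattern = r"[a-z]+"
    flags = re.IGNORECASE | re.ASCII


def substitutes(template_cls, name):
    """Return True if ``$name`` is accepted as a placeholder by template_cls."""
    try:
        template_cls("$" + name).substitute({name: "x"})
    except ValueError:
        return False
    return True


print(substitutes(LooseTemplate, DOTLESS_I))
print(substitutes(AsciiTemplate, DOTLESS_I))
```

With ``re.ASCII`` added, case-insensitive matching is limited to ASCII letters, which is the behaviour the note recommends for subclasses.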


.. versionchanged:: 3.7
*braceidpattern* can be used to define separate patterns used inside and
6 changes: 5 additions & 1 deletion Lib/string.py
@@ -79,7 +79,11 @@ class Template(metaclass=_TemplateMetaclass):
"""A string class for supporting $-substitutions."""

delimiter = '$'
idpattern = r'[_a-z][_a-z0-9]*'
# r'[a-z]' matches to non-ASCII letters when used with IGNORECASE,
# but without ASCII flag. We can't add re.ASCII to flags because of
# backward compatibility. So we use local -i flag and [a-zA-Z] pattern.
# See https://bugs.python.org/issue31672
idpattern = r'(?-i:[_a-zA-Z][_a-zA-Z0-9]*)'
braceidpattern = None
flags = _re.IGNORECASE
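
The comment in this hunk can be checked directly with the `re` module. The following is a small sketch, not part of the patch; the constant names and the `matches` helper are made up for illustration, and the two patterns are copied from the removed and added `idpattern` lines.

```python
import re

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_I = "\u0131"         # LATIN SMALL LETTER DOTLESS I

OLD_IDPATTERN = r"[_a-z][_a-z0-9]*"
NEW_IDPATTERN = r"(?-i:[_a-zA-Z][_a-zA-Z0-9]*)"


def matches(pattern, text, flags=re.IGNORECASE):
    """Return True if the whole text matches pattern under the given flags."""
    return re.fullmatch(pattern, text, flags) is not None


for char in (DOTTED_CAPITAL_I, DOTLESS_I):
    print(
        hex(ord(char)),
        matches(OLD_IDPATTERN, char),                             # old pattern, IGNORECASE
        matches(NEW_IDPATTERN, char),                             # new pattern, local -i
        matches(OLD_IDPATTERN, char, re.IGNORECASE | re.ASCII),   # old pattern plus ASCII
    )
```

The third column shows the alternative the comment rules out for compatibility reasons: adding ``re.ASCII`` to the flags would also reject these characters, but it would change matching for every user-supplied pattern as well.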

4 changes: 4 additions & 0 deletions Lib/test/test_string.py
@@ -270,6 +270,10 @@ def test_invalid_placeholders(self):
raises(ValueError, s.substitute, dict(who='tim'))
s = Template('$who likes $100')
raises(ValueError, s.substitute, dict(who='tim'))
# Template.idpattern should match to only ASCII characters.
# https://bugs.python.org/issue31672
s = Template("$who likes $ı") # (0x131, DOTLESS I)
Member


Also test 'İ' (0x130). Case-insensitive matching lowercases 'İ' to 'i', so in older Python versions [a-z] didn't match 'ı' but did match 'İ'.

raises(ValueError, s.substitute, dict(who='tim'))

def test_idpattern_override(self):
class PathPattern(Template):
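
A standalone sketch of the check requested in the review, covering both 'İ' (0x130) and 'ı' (0x131). It assumes an interpreter that already includes a fix for issue31672; the helper name `rejects_placeholder` is hypothetical and is not part of `test_string.py`.

```python
from string import Template

# Both characters named in the review thread.
NON_ASCII_CANDIDATES = {
    "\u0130": "LATIN CAPITAL LETTER I WITH DOT ABOVE",
    "\u0131": "LATIN SMALL LETTER DOTLESS I",
}


def rejects_placeholder(char):
    """Return True if ``$<char>`` is reported as an invalid placeholder."""
    template = Template("$who likes $" + char)
    try:
        # Supplying a value for char rules out KeyError, so only an
        # invalid-placeholder ValueError can be raised here.
        template.substitute({"who": "tim", char: "x"})
    except ValueError:
        return True
    return False


for char, name in NON_ASCII_CANDIDATES.items():
    print(name, rejects_placeholder(char))
```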
@@ -0,0 +1,2 @@
``idpattern`` in ``string.Template`` matched some non ASCII characters. Now
it uses ``-i`` regular expression local flag to avoid non ASCII characters.
Member


"non-ASCII" in two places.