Super Kai (Kazuya Ito)

String in Python (3)

encode() can encode a string (str) to bytes and decode() can decode the bytes back to a string (str) as shown below:

*Memos:

  • The 1st argument is encoding(Optional-Default:'utf-8'): *Memos:
    • 'utf-8', 'utf-7', 'utf-16', 'big5', 'ascii', etc. can be set to it.
    • You can see Standard Encodings for more possible values.
  • The 2nd argument is errors(Optional-Default:'strict'): *Memos:
    • It controls how an encoding or decoding error is handled, with the error handlers 'strict', 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace', etc.
    • 'strict' raises UnicodeError (UnicodeEncodeError for encode() or UnicodeDecodeError for decode()) if a character or byte which cannot be encoded or decoded exists.
    • 'ignore' drops the character or byte which cannot be encoded or decoded.
    • 'replace' replaces the character or byte, which cannot be encoded or decoded, with ? for encode() or � (U+FFFD) for decode().
    • 'xmlcharrefreplace' replaces the character, which cannot be encoded, with an XML numeric character reference e.g. &#1105;, &#966;, etc. It works only for encode().
    • 'backslashreplace' replaces the character, which cannot be encoded, with a backslash escape sequence for encode() e.g. \u0451 (shown as \\u0451 in the printed bytes), or replaces the byte, which cannot be decoded, with \xhh for decode() e.g. \xd1.
    • You can see more error handlers.
    • You can create your own error handler with codecs.register_error().
  • encode() returns bytes, which has decode() but not encode().
  • decode() returns a string (str), which has encode() but not decode().
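The last two memos can be checked with a minimal sketch, assuming Python 3. It only inspects the types returned by encode() and decode() and which of the two methods each type has:

```python
v = 'Hёllφ!'

ev = v.encode()  # str -> bytes
dv = ev.decode() # bytes -> str

print(type(ev).__name__)
# bytes

print(type(dv).__name__)
# str

# bytes has decode() but no encode().
print(hasattr(ev, 'encode'), hasattr(ev, 'decode'))
# False True

# str has encode() but no decode().
print(hasattr(dv, 'encode'), hasattr(dv, 'decode'))
# True False
```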
v = 'Hёllφ!'

ev = v.encode()
ev = v.encode(encoding='utf-8', errors='strict')

print(ev)
# b'H\xd1\x91ll\xcf\x86!'

dv = ev.decode()
dv = ev.decode(encoding='utf-8', errors='strict')

print(dv)
# Hёllφ!

v = 'Hёllφ!'

ev = v.encode(encoding='utf-7')

print(ev)
# b'H+BFE-ll+A8Y!'

dv = ev.decode(encoding='utf-7')

print(dv)
# Hёllφ!

v = 'Hёllφ!'

ev = v.encode(encoding='utf-16')

print(ev)
# b'\xff\xfeH\x00Q\x04l\x00l\x00\xc6\x03!\x00'

dv = ev.decode(encoding='utf-16')

print(dv)
# Hёllφ!
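
*Memos: In the 'utf-16' output, the first two bytes \xff\xfe are a byte order mark (BOM), and its value depends on the byte order of the platform. As a minimal sketch, assuming the standard 'utf-16-le' codec is acceptable for your data, you can fix the byte order explicitly, and then no BOM is added:

```python
import codecs

v = 'Hёllφ!'

ev = v.encode(encoding='utf-16')

# 'utf-16' prepends a BOM in the byte order of the platform.
has_bom = ev[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(has_bom)
# True

# 'utf-16-le' always uses little-endian and adds no BOM.
le = v.encode(encoding='utf-16-le')
print(le)
# b'H\x00Q\x04l\x00l\x00\xc6\x03!\x00'

print(le.decode(encoding='utf-16-le'))
# Hёllφ!
```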

v = 'Hёllφ!'

ev = v.encode(encoding='big5')

print(ev)
# b'H\xc7\xcell\xa3p!'

dv = ev.decode(encoding='big5')

print(dv)
# Hёllφ!

import codecs

# Custom error handler: replaces each failing character with '#'
# and resumes at the end of the failing range.
def hashreplace_handler(s):
    return ((s.end - s.start) * '#', s.end)

codecs.register_error('hashreplace', hashreplace_handler)

v = 'Hёllφ!'

print(v.encode(encoding='ascii', errors='ignore'))
# b'Hll!'

print(v.encode(encoding='ascii', errors='replace'))
# b'H?ll?!'

print(v.encode(encoding='ascii', errors='xmlcharrefreplace'))
# b'H&#1105;ll&#966;!'

print(v.encode(encoding='ascii', errors='backslashreplace'))
# b'H\\u0451ll\\u03c6!'

print(v.encode(encoding='ascii', errors='hashreplace'))
# b'H#ll#!'

print(v.encode(encoding='ascii'))
print(v.encode(encoding='ascii', errors='strict'))
# UnicodeEncodeError: 'ascii' codec can't encode character '\u0451'
# in position 1: ordinal not in range(128)

import codecs

def hashreplace_handler(s):
    return ((s.end - s.start) * '#', s.end)

codecs.register_error('hashreplace', hashreplace_handler)

v = 'Hёllφ!'

ev = v.encode()

print(ev)
# b'H\xd1\x91ll\xcf\x86!'

print(ev.decode(encoding='ascii', errors='ignore'))
# Hll!

print(ev.decode(encoding='ascii', errors='replace'))
# H��ll��!

print(ev.decode(encoding='ascii', errors='hashreplace'))
# H##ll##!

print(ev.decode(encoding='ascii', errors='strict'))
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1
# in position 1: ordinal not in range(128)
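
v = 'Hёllφ!'

ev = v.encode(encoding='ascii', errors='xmlcharrefreplace')

print(ev)
# b'H&#1105;ll&#966;!'

# The bytes are pure ASCII, so the error handler is never called here
# and the character references are kept as they are.
print(ev.decode(encoding='ascii', errors='xmlcharrefreplace'))
# H&#1105;ll&#966;!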

v = 'Hёllφ!'

ev = v.encode(encoding='ascii', errors='backslashreplace')

print(ev)
# b'H\\u0451ll\\u03c6!'

print(ev.decode(encoding='ascii', errors='backslashreplace'))
# H\u0451ll\u03c6!
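
*Memos: In the two examples above, the encoded bytes are pure ASCII, so decode() never needs to call the error handler. The following is a minimal sketch, assuming Python 3.5 or later (where 'backslashreplace' also works for decoding), of what the two handlers do when the bytes really cannot be decoded:

```python
ev = 'Hёllφ!'.encode() # UTF-8 bytes, which are not valid ASCII

print(ev)
# b'H\xd1\x91ll\xcf\x86!'

# For decode(), 'backslashreplace' turns each undecodable byte into \xhh.
dv = ev.decode(encoding='ascii', errors='backslashreplace')

print(dv)
# H\xd1\x91ll\xcf\x86!

# 'xmlcharrefreplace' is an encode-only handler, so decode() fails with it.
try:
    ev.decode(encoding='ascii', errors='xmlcharrefreplace')
    failed = False
except TypeError as e:
    failed = True
    print(type(e).__name__)
# TypeError
```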
