*Memos:
- My post explains a string.
encode() converts a string to a bytes object, and decode() converts a bytes object back to a string, as shown below:
*Memos:
- The 1st argument is encoding (Optional-Default:'utf-8'): *Memos:
  - 'utf-8', 'utf-7', 'utf-16', 'big5', 'ascii', etc. can be set to it.
  - You can see Standard Encodings for more possible values.
- The 2nd argument is errors (Optional-Default:'strict'): *Memos:
  - It controls how encoding and decoding errors are handled, using the error handlers 'strict', 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace', etc.
  - 'strict' raises a UnicodeError (UnicodeEncodeError for encode() or UnicodeDecodeError for decode()) if there is a character which cannot be encoded or decoded.
  - 'ignore' skips the character which cannot be encoded or decoded.
  - 'replace' replaces the character which cannot be encoded or decoded with ? for encode() or � (U+FFFD) for decode().
  - 'xmlcharrefreplace' replaces the character which cannot be encoded with an XML character reference, e.g. &#1105;, &#966;, etc. It only works for encode().
  - 'backslashreplace' replaces the character which cannot be encoded with a backslash escape such as \uxxxx for encode(), e.g. \u0451 (shown as \\u0451 in the printed bytes), or replaces the byte which cannot be decoded with \xhh for decode(), e.g. \xd1.
  - You can see more error handlers.
  - You can create your own error handler with codecs.register_error().
- After using encode(), decode() appears and encode() disappears because the result is a bytes object.
- After using decode(), encode() appears and decode() disappears because the result is a string.
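The last two memos follow from the types involved: encode() returns a bytes object and decode() returns a str. Here is a minimal sketch that checks this with isinstance-style output and hasattr(), reusing the sample string from this post:

```python
v = 'Hёllφ!'

ev = v.encode()   # str -> bytes
dv = ev.decode()  # bytes -> str

print(type(ev).__name__, type(dv).__name__)
# bytes str

# A str has encode() but no decode().
print(hasattr(v, 'encode'), hasattr(v, 'decode'))
# True False

# A bytes object has decode() but no encode().
print(hasattr(ev, 'encode'), hasattr(ev, 'decode'))
# False True
```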
    v = 'Hёllφ!'

    ev = v.encode()
    ev = v.encode(encoding='utf-8', errors='strict')
    print(ev)
    # b'H\xd1\x91ll\xcf\x86!'

    dv = ev.decode()
    dv = ev.decode(encoding='utf-8', errors='strict')
    print(dv)
    # Hёllφ!

    v = 'Hёllφ!'

    ev = v.encode(encoding='utf-7')
    print(ev)
    # b'H+BFE-ll+A8Y!'

    dv = ev.decode(encoding='utf-7')
    print(dv)
    # Hёllφ!

    v = 'Hёllφ!'

    ev = v.encode(encoding='utf-16')
    print(ev)
    # b'\xff\xfeH\x00Q\x04l\x00l\x00\xc6\x03!\x00'

    dv = ev.decode(encoding='utf-16')
    print(dv)
    # Hёllφ!

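The leading \xff\xfe in the 'utf-16' output is a byte order mark (BOM). As a sketch, assuming that output came from a little-endian machine, the explicit 'utf-16-le' codec produces the same bytes without the BOM, and codecs.BOM_UTF16_LE is the BOM itself:

```python
import codecs

v = 'Hёllφ!'

# An explicit byte order means no BOM is written.
le = v.encode(encoding='utf-16-le')
print(le)
# b'H\x00Q\x04l\x00l\x00\xc6\x03!\x00'

# Prepending the little-endian BOM by hand.
with_bom = codecs.BOM_UTF16_LE + le
print(with_bom)
# b'\xff\xfeH\x00Q\x04l\x00l\x00\xc6\x03!\x00'

# 'utf-16' reads the BOM to pick the byte order and then drops it.
dv = with_bom.decode(encoding='utf-16')
print(dv)
# Hёllφ!
```

Using 'utf-16-le' or 'utf-16-be' keeps the result the same on every machine, while plain 'utf-16' encodes in the machine's native byte order.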
    v = 'Hёllφ!'

    ev = v.encode(encoding='big5')
    print(ev)
    # b'H\xc7\xcell\xa3p!'

    dv = ev.decode(encoding='big5')
    print(dv)
    # Hёllφ!

    import codecs

    def hashreplace_handler(s):
        # Replace each character which cannot be encoded with '#'.
        return ((s.end - s.start) * '#', s.end)

    codecs.register_error('hashreplace', hashreplace_handler)

    v = 'Hёllφ!'

    print(v.encode(encoding='ascii', errors='ignore'))
    # b'Hll!'

    print(v.encode(encoding='ascii', errors='replace'))
    # b'H?ll?!'

    print(v.encode(encoding='ascii', errors='xmlcharrefreplace'))
    # b'H&#1105;ll&#966;!'

    print(v.encode(encoding='ascii', errors='backslashreplace'))
    # b'H\\u0451ll\\u03c6!'

    print(v.encode(encoding='ascii', errors='hashreplace'))
    # b'H#ll#!'

    print(v.encode(encoding='ascii'))
    print(v.encode(encoding='ascii', errors='strict'))
    # UnicodeEncodeError: 'ascii' codec can't encode character '\u0451'
    # in position 1: ordinal not in range(128)

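A handler registered with codecs.register_error() receives the exception object, so it can read the failing slice through its object, start and end attributes. The sketch below uses a hypothetical handler named codepoint_handler (not from the original post) that writes each character which cannot be encoded as [U+XXXX]. As a design choice for this example, it only handles encoding and re-raises anything else:

```python
import codecs

def codepoint_handler(err):
    # Hypothetical handler for illustration: encoding errors only.
    if not isinstance(err, UnicodeEncodeError):
        raise err
    bad = err.object[err.start:err.end]  # the characters which failed
    replacement = ''.join(f'[U+{ord(ch):04X}]' for ch in bad)
    # Return the replacement and the position where encoding resumes.
    return (replacement, err.end)

codecs.register_error('codepoint', codepoint_handler)

v = 'Hёllφ!'

result = v.encode(encoding='ascii', errors='codepoint')
print(result)
# b'H[U+0451]ll[U+03C6]!'
```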
    import codecs

    def hashreplace_handler(s):
        # Replace each byte which cannot be decoded with '#'.
        return ((s.end - s.start) * '#', s.end)

    codecs.register_error('hashreplace', hashreplace_handler)

    v = 'Hёllφ!'

    ev = v.encode()
    print(ev)
    # b'H\xd1\x91ll\xcf\x86!'

    print(ev.decode(encoding='ascii', errors='ignore'))
    # Hll!

    print(ev.decode(encoding='ascii', errors='replace'))
    # H��ll��!

    print(ev.decode(encoding='ascii', errors='hashreplace'))
    # H##ll##!

    print(ev.decode(encoding='ascii', errors='strict'))
    # UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1
    # in position 1: ordinal not in range(128)

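The decoding example above does not cover 'backslashreplace' and 'xmlcharrefreplace' on bytes that really fail to decode. As a small sketch of what happens in that case: 'backslashreplace' writes each bad byte as \xhh, while 'xmlcharrefreplace' only supports encoding and raises a TypeError when it is asked to handle a decoding error:

```python
v = 'Hёllφ!'
ev = v.encode()  # b'H\xd1\x91ll\xcf\x86!'

# Each byte which cannot be decoded becomes a \xhh escape.
escaped = ev.decode(encoding='ascii', errors='backslashreplace')
print(escaped)
# H\xd1\x91ll\xcf\x86!

# 'xmlcharrefreplace' cannot handle a decoding error.
xml_error = None
try:
    ev.decode(encoding='ascii', errors='xmlcharrefreplace')
except TypeError as e:
    xml_error = e
    print(type(e).__name__)
    # TypeError
```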
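    v = 'Hёllφ!'

    ev = v.encode(encoding='ascii', errors='xmlcharrefreplace')
    print(ev)
    # b'H&#1105;ll&#966;!'

    # The bytes are pure ASCII, so no decoding error occurs here
    # and the error handler is never called.
    print(ev.decode(encoding='ascii', errors='xmlcharrefreplace'))
    # H&#1105;ll&#966;!
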
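    v = 'Hёllφ!'

    ev = v.encode(encoding='ascii', errors='backslashreplace')
    print(ev)
    # b'H\\u0451ll\\u03c6!'

    # The bytes are pure ASCII, so no decoding error occurs here
    # and the escapes stay as literal text.
    print(ev.decode(encoding='ascii', errors='backslashreplace'))
    # H\u0451ll\u03c6!
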
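In the last two examples, decode() returns the escaped text rather than the original string, because the escapes are ordinary ASCII characters by then. One possible way to get the original string back, shown here only as a sketch, is the 'unicode_escape' codec for backslash escapes and html.unescape() for XML character references. Note that 'unicode_escape' also interprets any other backslash sequences in the data, so it suits input you produced yourself:

```python
import html

v = 'Hёllφ!'

# Backslash escapes: decode them with the 'unicode_escape' codec.
bs = v.encode(encoding='ascii', errors='backslashreplace')
from_bs = bs.decode(encoding='unicode_escape')
print(from_bs)
# Hёllφ!

# XML character references: decode to str, then unescape.
xml = v.encode(encoding='ascii', errors='xmlcharrefreplace')
from_xml = html.unescape(xml.decode(encoding='ascii'))
print(from_xml)
# Hёllφ!
```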