UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 34: character

Melcu54 · (This post was last modified: Sep-25-2022, 08:03 PM by Melcu54.)

Hello. I don't know what to do with this error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 34: character maps to <undefined>

This is the Python Code:

    import fileinput
    import glob
    import os
    import re

    with open('c:\\Folder6\\merged.txt', 'w', encoding='UTF-8') as f:
        for line in fileinput.input(sorted(glob.glob('c:\\Folder6\\*.txt'))):
            f.write(line)
        fileinput.close()
    print(f)

And this is the ERROR:

    Traceback (most recent call last):
      File "E:\Carte\BB\17 - Site Leadership\alte\Ionel Balauta\Aryeht\Task 1 - Traduce tot site-ul\Doar Google Web\Andreea\Meditatii\Sedinta 31 august 2022\merge txt - versiune 3 .py", line 8, in <module>
        for line in fileinput.input(sorted(glob.glob('c:\\Folder6\\*.txt'))):
      File "C:\Program Files\Python39\lib\fileinput.py", line 256, in __next__
        line = self._readline()
      File "C:\Program Files\Python39\lib\fileinput.py", line 389, in _readline
        return self._readline()
      File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 34: character maps to <undefined>

This is a print screen:

[Image: McUwGS.jpg]

What can I do so that this error does not appear again? Can anyone help me?

**deanhystad** · Sep-25-2022, 08:26 PM

You should set the encoding when you read the files (fileinput). Windows must think the encoding is something other than UTF-8.
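A minimal sketch of that advice, assuming the input files really are UTF-8 (the traceback above shows the read going through cp1252). The function name `merge_text_files` and the temporary demo folder are only illustrative; they are not part of the original script.

```python
import glob
import os
import tempfile


def merge_text_files(folder, output_name="merged.txt", encoding="utf-8"):
    """Concatenate every .txt file in `folder` into `output_name`."""
    output_path = os.path.join(folder, output_name)
    # Collect the inputs before creating the output, and skip the output
    # file itself in case it is left over from an earlier run.
    sources = [
        path
        for path in sorted(glob.glob(os.path.join(folder, "*.txt")))
        if os.path.abspath(path) != os.path.abspath(output_path)
    ]
    with open(output_path, "w", encoding=encoding) as merged:
        for path in sources:
            # Explicit encoding on the read side: without it, open() uses
            # the locale's default encoding (cp1252 in the traceback).
            with open(path, encoding=encoding) as source:
                merged.write(source.read())
    return output_path


# Demo in a throwaway folder, so no real files are touched.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "a.txt"), "w", encoding="utf-8") as f:
    f.write("ă â î ș ț\n")
with open(os.path.join(demo_dir, "b.txt"), "w", encoding="utf-8") as f:
    f.write("second file\n")

merged_path = merge_text_files(demo_dir)
with open(merged_path, encoding="utf-8") as f:
    merged_text = f.read()
```

For the real folder the call would be something like `merge_text_files('c:\\Folder6')`. If the files turn out not to be UTF-8, the `encoding` argument has to name whatever encoding they were actually saved in.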

Melcu54 · (This post was last modified: Sep-26-2022, 07:06 AM by Melcu54.)

Hello, sir. Thank you for your answer.

Can you modify my script so that it works with your solution? I don't know Python very well... I am a beginner.

I don't know if what I have now is correct. It doesn't do anything, but it gives no error.

    import fileinput
    import glob
    import os
    import re

    def read_text_from_file(file_path):
        with open(file_path, encoding='utf8') as f:
            text = f.read()
        return text

    def write_to_file(text, file_path):
        with open(file_path, 'wb') as f:
            f.write(text.encode('utf8', 'ignore'))

    with open('c:\\Folder6\\translated\\merged.txt', 'w', encoding='UTF-8') as f:
        for file_name in sorted(glob.glob('c:\\Folder6\\translated\\*.txt')):
            contents = read_text_from_file(file_name)
            f.write(line)
        fileinput.close()
    print(f)

OR, SECOND VERSION:

    import fileinput
    import glob
    import os
    import re

    with open('c:\\Folder6\\translated\\merged.txt', 'w', encoding='UTF-8') as f:
        current_content = f.read()
        modified = new_content != current_content
        if modified and args.diff:
            for line in fileinput.input(sorted(glob.glob('c:\\Folder6\\translated\\*.txt'))):
                f.write(line)
            fileinput.close()
    print(f)

OR, THIRD VERSION:

    import fileinput
    import glob
    import os
    import re

    read_files = sorted(glob.glob("c:\\Folder6\\translated\\merged.txt\\*.txt"))

    with open("c:\\Folder6\\translated\\merged.txt", "wb") as outfile:
        for f in read_files:
            with open(f, "rb") as infile:
                outfile.write(infile.read())
    fileinput.close()
    print(f)

None of them work. Each one creates the file but writes nothing to it.

Melcu54 · (This post was last modified: Sep-26-2022, 08:29 AM by Melcu54.)

I found a solution:

    import fileinput
    import glob
    import os
    import re

    def read_files(file_path):
        with open(file_path, encoding='utf8') as f:
            text = f.read()
        return text

    def read_files(text, file_path):
        with open(file_path, 'rb') as f:
            f.write(text.encode('utf8', 'ignore'))

    read_files = sorted(glob.glob("c:\\Folder6\\translated\\*.txt"))

    with open("c:\\Folder6\\translated\\merged.txt", "wb") as outfile:
        for f in read_files:
            with open(f, "rb") as infile:
                outfile.write(infile.read())
                outfile.write(b"\n\n")
    fileinput.close()
    print(f)

***snippsat*** · Sep-26-2022, 09:09 AM

For your first code, it would look like this.

    import fileinput
    import glob
    import os

    with open('c:\\Folder6\\merged.txt', 'w', encoding='UTF-8') as f:
        for line in fileinput.input(sorted(glob.glob('c:\\Folder6\\*.txt')), encoding="utf-8"):
            print(line)
            f.write(line)
        fileinput.close()

This needs Python 3.10 to work, as stated in the fileinput documentation:

Quote: Changed in version 3.10: The keyword-only parameter encoding and errors are added.
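On an older interpreter (the traceback in the first post comes from Python 3.9), one possible alternative is the `openhook` parameter together with `fileinput.hook_encoded`. This is only a sketch under the assumption that the files are UTF-8; the temporary folder and file names are made up for the demo.

```python
import fileinput
import glob
import os
import tempfile

# Demo inputs in a throwaway folder (illustrative only).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "a.txt"), "w", encoding="utf-8") as f:
    f.write("ș\n")
with open(os.path.join(demo_dir, "b.txt"), "w", encoding="utf-8") as f:
    f.write("ț\n")

paths = sorted(glob.glob(os.path.join(demo_dir, "*.txt")))

# openhook replaces the default open() call; hook_encoded makes fileinput
# decode every file as UTF-8 instead of the locale's default encoding.
lines = []
with fileinput.input(files=paths,
                     openhook=fileinput.hook_encoded("utf-8")) as stream:
    for line in stream:
        lines.append(line)
```

Using `fileinput.input(...)` in a `with` statement also closes it automatically, so no separate `fileinput.close()` call is needed.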

Melcu54 · (This post was last modified: Sep-26-2022, 09:25 AM by Melcu54.)


OK, thanks. But if I want to add f.write("\n\n") in order to have a dividing line between the files, where should I put it?

***snippsat*** · Sep-26-2022, 09:38 AM

(Sep-26-2022, 09:25 AM)Melcu54 Wrote: But if I want to add f.write("\n\n") in order to have a dividing line between the files

Change line 8:

f.write(f'{line}\n\n')

Melcu54 · (This post was last modified: Sep-26-2022, 10:09 AM by Melcu54.)

(Sep-26-2022, 09:38 AM)snippsat Wrote: Change line 8:
f.write(f'{line}\n\n')

I tried this too. But in this case it doubles all my lines from all the text files in the merged file.

See the duplicate lines after using your code. (It is better with f.write('\n'), except that this puts a new empty line between all paragraphs.)

[Image: zCgDSZ.jpg]
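The extra blank lines appear because the separator is written inside the loop over lines, so it is added after every line rather than once per file. A sketch of one way around it, assuming UTF-8 inputs, is to loop over files and write the separator only between them. The helper name `merge_with_separator` and the demo files are hypothetical.

```python
import glob
import os
import tempfile


def merge_with_separator(paths, output_path, separator="\n\n", encoding="utf-8"):
    """Write the files in `paths` to `output_path`, with `separator`
    between consecutive files (not after every line)."""
    with open(output_path, "w", encoding=encoding) as merged:
        for index, path in enumerate(paths):
            if index > 0:
                # Only between files, never before the first one.
                merged.write(separator)
            with open(path, encoding=encoding) as source:
                merged.write(source.read())


# Demo in a throwaway folder (illustrative only).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "a.txt"), "w", encoding="utf-8") as f:
    f.write("one\ntwo\n")
with open(os.path.join(demo_dir, "b.txt"), "w", encoding="utf-8") as f:
    f.write("three\n")

# Build the input list before the output file exists, so it is not
# picked up by the *.txt pattern.
paths = sorted(glob.glob(os.path.join(demo_dir, "*.txt")))
output_path = os.path.join(demo_dir, "merged.txt")
merge_with_separator(paths, output_path)

with open(output_path, encoding="utf-8") as f:
    result = f.read()
```

If you prefer to stay with fileinput, `fileinput.isfirstline()` reports whether the line just read is the first line of its file, which gives the same file boundary.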
