echo random text > text_file 

This saves text_file as plain text with ASCII encoding. To check the encoding, I run

chardetect text_file 

which tells me that the file is ASCII-encoded. I then run the same command on a JPEG file:

chardetect my_image_file 

but it doesn't recognize any encoding in that JPEG file.

I read these two answers (first and second) about the difference between file format and file encoding. My understanding is that a file encoding (ASCII, UTF-8, etc.) is how data is represented for users (because computers can't understand English) and is sometimes also used for integrity checks (Base64), while a file format describes how the data is laid out for an application to parse (HTML, JSON, etc.). I also understood that media and some other files (JPEG, MP4, DOC, PDF) are stored in a binary format.

Questions

  • Is my understanding correct? If so, do binary-format files (PDF, MP4, JPEG) have no encoding at all?
  • If an encoding does exist for binary files (media and others), how can I detect it from the terminal?
  • How can I detect which format a file uses (JSON, HTML, plain text, PDF, GIF, JPEG, etc.)? chardetect seems to report only the text encoding.

2 Answers

1

Use the file command to determine file type (man file).

Other utilities can help further identify specific types of files:

  • chardetect is a universal character encoding detector (man chardetect).
  • identify describes the format and characteristics of one or more image files.

Also, look up other file-specific utilities such as mediainfo, ffmpeg, and exiftool.
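As a rough illustration of the file command mentioned above, here is a minimal sketch that recreates the text file from the question and asks file for its type, MIME type, and character encoding. The exact wording of the output can differ between versions of file, so the values in the comments are what it typically prints rather than guaranteed output.

```shell
# Recreate a small plain-text sample, as in the question.
printf 'random text\n' > text_file

# Human-readable description of the detected type
# (typically: "text_file: ASCII text").
file text_file

# MIME type only; -b (brief) drops the "text_file:" prefix
# (typically: text/plain).
file -b --mime-type text_file

# Character encoding as guessed by file
# (typically: us-ascii).
file -b --mime-encoding text_file
```

For a text file, the last command gives roughly the same information as chardetect, while the first two answer the "what format is this?" part of the question.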

0

In this context (and, I'd guess, for the chardetect command), an "encoding" refers specifically to the way characters are represented as bytes in a text file. Other files have encodings too, but there the term means something different, for example how image or video data is compressed.

I think you're looking for the file command: it tries to guess what format a file uses.
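To make the distinction concrete, here is a small sketch that writes the first few bytes of a GIF header by hand and runs file on the result. The filename tiny.gif is just a hypothetical name for this example, and the file is not a complete, viewable image; it only carries enough of the header for file to recognize the format from its leading "magic" bytes.

```shell
# Write a GIF89a signature followed by a 1x1 logical screen descriptor.
# tiny.gif is a made-up name; the result is a header fragment, not a full image.
printf 'GIF89a\001\000\001\000\000\000\000;' > tiny.gif

# file identifies the format from the leading bytes
# (typically reports GIF image data).
file tiny.gif

# MIME type of the detected format (typically: image/gif).
file -b --mime-type tiny.gif

# There is no character encoding to report for non-text data,
# so file typically prints: binary
file -b --mime-encoding tiny.gif
```

This matches what chardetect showed for the JPEG in the question: the file has a format, but no text encoding for a character-set detector to find.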
