MODULE 2 – PART 3: FILES
By Ravi Kumar B N, Assistant Professor, Dept. of CSE, BMSIT & M
INTRODUCTION
✓ A file is a named location on system storage that records data for later access. It enables persistent storage in non-volatile memory, e.g. a hard disk.
✓ We work with files either to write data to a file or to read data from it. Data that must outlive the program has to be stored permanently in secondary storage.
✓ In Python, file processing takes place in the following order:
• Open a file, which returns a file handle.
• Use the handle to perform read or write actions.
• Close the file handle.
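The three steps above can be sketched as follows. This is only a minimal illustration: the file name demo.txt and the temporary directory are assumptions, chosen so that the example does not touch any existing file.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory (not part of the slides).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Step 1: open the file; open() returns a file handle.
fhand = open(path, "w")
# Step 2: use the handle to write.
fhand.write("first line\n")
# Step 3: close the handle so the data is flushed to disk.
fhand.close()

# The same three steps, this time for reading the data back.
fhand = open(path, "r")
content = fhand.read()
fhand.close()
print(content)
```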
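TYPES OF FILES & OPERATIONS ON FILES
Types of files:
1. Text files - plain-text files such as .txt, .csv and program source files
2. Binary files - audio files, video files, images, etc. (office documents such as .doc and .xlsx are also stored in binary formats)
Operations on a file:
▪ Open
▪ Close
▪ Read
▪ Write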
OPEN A FILE IN PYTHON
✓ To read or write to a file, you need to open it first. To open a file in Python, use the built-in open() function. This function returns a file object, i.e., a handle, which you can use to read or modify the file.
✓ Syntax of open():
    file_object = open("file_name", "access_mode")
  file_object - file handle that refers to the opened file
  file_name - name (path) of the file
  access_mode - read/write/append mode. By default, it is set to read-only ('r').
Ex:
    file1 = open("app.log", "w")
✓ Python stores a file on disk as bytes, so the bytes must be decoded into strings when reading, and strings must be encoded into bytes when writing text to the file. In text mode, the Python interpreter does this automatically.
✓ If the open is successful, the operating system returns a file handle. The file handle is not the actual data contained in the file; it is a "handle" that we can use to read the data. You are given a handle if the requested file exists and you have the proper permissions to read the file.
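The automatic encoding and decoding can be made visible by reading the same file once in binary mode and once in text mode. The sketch below is an illustration under assumptions: the file name greeting.txt, the sample text and the explicit utf-8 encoding are not from the slides.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "greeting.txt")  # hypothetical file

# Text mode: Python encodes the str to bytes using the given encoding.
f = open(path, "w", encoding="utf-8")
f.write("café")
f.close()

# Binary mode: we see the raw bytes stored on disk, with no decoding.
f = open(path, "rb")
raw = f.read()
f.close()

# Text mode again: the bytes are decoded back into a str.
f = open(path, "r", encoding="utf-8")
text = f.read()
f.close()

print(raw)                  # b'caf\xc3\xa9'
print(len(raw), len(text))  # 5 4  (five bytes on disk, four characters)
```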
✓ If the file does not exist, open() will fail with a traceback and you will not get a handle to access the contents.
✓ Opening a file may cause an error for reasons such as:
▪ The file may not exist at the specified path (when we try to read a file).
▪ The file may exist, but we may not have permission to read/write it.
▪ The file might be corrupted and not in a state that can be opened.
FILE MODES
Mode   Description
r      Opens a file for reading only (the default). The file must exist.
rb     Opens a file for reading only, in binary format.
w      Opens a file for writing only. Overwrites the file if it exists; otherwise creates a new file.
wb     Opens a file for writing only, in binary format.
a      Opens a file for appending. It does not overwrite the file, it just adds data at the end. If the file does not exist, it creates a new file.
ab     Opens a file for appending, in binary format.
r+     Opens a file for both reading and writing. The file must exist and is not truncated.
w+     Opens a file for both writing and reading. Overwrites the existing file if it exists. If the file does not exist, creates a new file for reading and writing.
a+     Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists; if it does not exist, a new file is created. The file opens in append mode.
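The difference between the 'w' and 'a' modes can be checked with a short experiment. This is a sketch only; the file name modes.txt and its temporary location are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

f = open(path, "w")      # "w" creates the file (or truncates an existing one)
f.write("one\n")
f.close()

f = open(path, "a")      # "a" keeps the old contents and adds at the end
f.write("two\n")
f.close()

f = open(path, "r")
after_append = f.read()
f.close()

f = open(path, "w")      # opening with "w" again erases "one" and "two"
f.write("three\n")
f.close()

f = open(path, "r")
after_rewrite = f.read()
f.close()

print(repr(after_append))   # 'one\ntwo\n'
print(repr(after_rewrite))  # 'three\n'
```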
TEXT FILES & LINES
A text file can be thought of as a sequence of lines, much like a Python string can be thought of as a sequence of characters. For example, a text file might record mail activity from various individuals in an open source project development team.
To break the file into lines, there is a special character that represents the "end of the line", called the newline character. In Python strings, the newline is written with the escape sequence \n.
    >>> stuff = 'X\nY'
    >>> print(stuff)
    X
    Y
READING THE FILES
When we successfully open a file to read data from it, the open() function returns the file handle (an object reference to the file object), which points to the first character in the file.
There are different ways in which we can read files in Python:
➢ read()       reads all the content
➢ read(n)      reads at most the next n characters
➢ readline()   reads a single line
➢ readlines()  reads all lines into a list
where n is the number of characters to be read in text mode (the number of bytes in binary mode).
Note: If the file is too large to fit in main memory, you should write your program to read the file in chunks using a for or while loop.
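Reading in chunks, as suggested in the note, might look like the sketch below. The file big.txt, its contents and the chunk size of 16 characters are assumptions picked only to keep the example small.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big.txt")  # hypothetical file
f = open(path, "w")
f.write("abcdefghij" * 5)   # 50 characters of sample data
f.close()

CHUNK_SIZE = 16             # assumed chunk size; tune for real workloads
total = 0
chunks = 0

f = open(path)
while True:
    chunk = f.read(CHUNK_SIZE)   # returns '' at end of file
    if chunk == "":
        break
    total += len(chunk)
    chunks += 1
f.close()

print(chunks, total)   # 4 50
```

Only one chunk is held in memory at a time, so the same loop works for files far larger than the available memory.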
SAMPLE PROGRAMS

1.txt
    Coronavirus Can Be Stopped Only by Harsh Steps
    Stay at home
    wear a facemask
    Clean your hands often
    Monitor your symptoms

readexample.py
    f1 = open("1.txt", "r")
    # read the first four characters
    print("---first four characters---")
    print(f1.read(4))
    # read the rest of the first line
    print("---first line---")
    print(f1.readline())
    # read the remainder of the file
    print("---Entire File---")
    print(f1.read())
Output
    ---first four characters---
    Coro
    ---first line---
    navirus Can Be Stopped Only by Harsh Steps
    ---Entire File---
    Stay at home
    wear a facemask
    Clean your hands often
    Monitor your symptoms
Note: Each read continues from where the previous one stopped, so readline() returns only the rest of the first line and read() returns only the remaining lines. readline() keeps the trailing newline and print() adds one more, so the real output contains an extra blank line there, which is omitted above.

countlines.py
    f1 = open("1.txt", "r")
    count = 0
    for line in f1:
        count = count + 1
    print('Line Count:', count)
Output
    Line Count: 5
Note: When the file is read using a for loop in this manner, Python takes care of splitting the data in the file into separate lines using the newline character.

Readlineexample.py
    f1 = open("1.txt")   # read mode by default
    # read line by line
    print(f1.readline())
    print(f1.readline())
Output
    Coronavirus Can Be Stopped Only by Harsh Steps
    Stay at home

Countchars.py
    # finds the length of the file
    f1 = open('1.txt')
    ch = f1.read()
    print(len(ch))
Output
    120
Note: The above code counts the number of characters including the newline characters (\n).
WRITING THE FILES
✓ To write data into a file, we need to use the mode 'w' in the open() function.
✓ The write() method is used to write data into a file.
    >>> fhand = open("mynewfile.txt", "w")
    >>> print(fhand)
    <_io.TextIOWrapper name='mynewfile.txt' mode='w' encoding='cp1252'>
  (The encoding shown depends on the platform; cp1252 is typical on Windows.)
✓ We have two methods for writing data into a file:
1. write(string)
2. writelines(list)
If the specified file already exists, its old contents are erased and it is ready for new data to be written into it. If the file does not exist, a new file with the given name is created.
write() method: It returns the number of characters successfully written into the file. The file object also keeps track of the position in the file. For example:

Writexample.py
    fhand = open("2.txt", 'w')
    s = "hello how are you?"
    print(fhand.write(s))
    fhand.close()
Output
    18

writelines() method: It writes every string of a list into the file. It does not add newlines itself, so each string must already end with \n. Example: this code adds the list of contents into the file, including the \n characters.

Writelist.py
    food = ["Citrus\n", "Garlic\n", "Almond\n", "Ginger\n"]
    my_file = open("immunity.txt", "w")
    my_file.writelines(food)
    my_file.close()   # make sure the data is flushed to disk
Output
It creates a file immunity.txt containing:
    Citrus
    Garlic
    Almond
    Ginger
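When the list items do not already end with a newline, one option is to add it while writing. The sketch below is an illustration, not part of the original programs; it writes to a temporary copy of immunity.txt and reads the result back with readlines().

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "immunity.txt")  # temporary copy
food = ["Citrus", "Garlic", "Almond", "Ginger"]          # no "\n" here

f = open(path, "w")
# writelines() writes the strings back to back, so add "\n" to each item.
f.writelines(item + "\n" for item in food)
f.close()

f = open(path)
lines = f.readlines()    # each element keeps its trailing "\n"
f.close()

print(lines)   # ['Citrus\n', 'Garlic\n', 'Almond\n', 'Ginger\n']
```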
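Example for read binary 'rb' and write binary 'wb'

Imagecopy.py
    f1 = open("bird.jpg", 'rb')       # input file
    f2 = open("birdcopy.jpg", 'wb')   # output file
    for i in f1:
        # write() returns the number of bytes written for each piece
        print(f2.write(i))
    f1.close()
    f2.close()

bird.jpg      # input file
birdcopy.jpg  # output file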
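Iterating over a binary file splits it wherever a newline byte happens to occur, so the pieces have unpredictable sizes. A common alternative is to copy in fixed-size blocks, sketched below. The block size of 4096 bytes and the generated sample file (a stand-in for bird.jpg) are assumptions for illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
src = os.path.join(folder, "bird.bin")       # stand-in for bird.jpg
dst = os.path.join(folder, "birdcopy.bin")

# Create some sample binary data so the sketch is self-contained.
f = open(src, "wb")
f.write(bytes(range(256)) * 40)              # 10240 bytes
f.close()

BLOCK = 4096                                 # assumed block size in bytes
fin = open(src, "rb")
fout = open(dst, "wb")
copied = 0
while True:
    block = fin.read(BLOCK)                  # b'' signals end of file
    if not block:
        break
    copied += fout.write(block)              # write() returns bytes written
fin.close()
fout.close()

print(copied)   # 10240
```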
SEARCHING THROUGH A FILE
✓ When you are searching through data in a file, it is a very common pattern to read through the file, ignoring most of the lines and only processing lines which meet a particular condition.
✓ Most of the time, we read a file in order to search for some specific data within it. This can be achieved by using string methods while reading the file.
✓ For example, we may be interested in printing only the lines which start with a specific string.
SEARCH EXAMPLE

mbox.txt
    From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
    Return-Path: <postmaster@collab.sakaiproject.org>
    From: louis@media.berkeley.edu
    Subject: [sakai] svn commit:
    From: zqian@umich.edu Fri Jan 4 16:10:39 2008
    Return-Path: <postmaster@collab.sakaiproject.org>

Search1.py
    fhand = open('mbox.txt')
    for line in fhand:
        line = line.rstrip()   # strips whitespace from the right side of a string
        if line.startswith('From:'):
            print(line)
Or
    fhand = open('mbox-short.txt')
    for line in fhand:
        line = line.rstrip()
        # Skip 'uninteresting lines'
        if not line.startswith('From:'):
            continue
        # Process our 'interesting' line
        print(line)
Output
    From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
    From: louis@media.berkeley.edu
    From: zqian@umich.edu Fri Jan 4 16:10:39 2008

Search2.py
Note: This finds lines where the search string is anywhere in the line. The find() method returns either the position of the search string or -1 if it is not found.
    fhand = open('mbox.txt')
    for line in fhand:
        line = line.rstrip()
        if line.find('@uct.ac.za') == -1:
            continue
        print(line)
Output
    From: stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
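The same search pattern can be wrapped in a function so that it can be reused with different files and prefixes. The function name matching_lines and the small sample file below are hypothetical; they only illustrate the read, filter, process pattern described above.

```python
import os
import tempfile

def matching_lines(path, prefix):
    """Return the lines of the file at `path` that start with `prefix`."""
    matches = []
    fhand = open(path)
    for line in fhand:
        line = line.rstrip()
        if line.startswith(prefix):
            matches.append(line)
    fhand.close()
    return matches

# Small stand-in for mbox.txt, written to a temporary file.
sample = (
    "From: stephen.marquard@uct.ac.za\n"
    "Return-Path: <postmaster@collab.sakaiproject.org>\n"
    "From: louis@media.berkeley.edu\n"
    "Subject: [sakai] svn commit:\n"
)
path = os.path.join(tempfile.mkdtemp(), "mbox-sample.txt")
f = open(path, "w")
f.write(sample)
f.close()

result = matching_lines(path, "From:")
print(result)
```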
LETTING THE USER CHOOSE THE FILE NAME
In real-world programs, it is better to ask the user to enter the name of the file they would like to open, instead of hard-coding the file name inside the program.

Fileuser.py
    fname = input("Enter a file name:")
    f1 = open(fname)
    count = 0
    for line in f1:
        count += 1
        # rstrip() removes the trailing newline so no blank lines are printed
        print("Line Number", count, ":", line.rstrip())
    print("Total lines=", count)
    f1.close()
Output
    Enter a file name:1.txt
    Line Number 1 : Coronavirus Can Be Stopped Only by Harsh Steps
    Line Number 2 : Stay at home
    Line Number 3 : wear a facemask
    Line Number 4 : Clean your hands often
    Line Number 5 : Monitor your symptoms
    Total lines= 5

In this program, the file name entered by the user is received in the variable fname, which is then passed as the argument to open(). If the user enters 1.txt (discussed before), the result is Total lines= 5.
Everything goes well if the user gives a proper file name as input. But what if the file cannot be opened (for example, the file doesn't exist or permission is denied)? Python raises an error. The programmer needs to handle such run-time errors, as discussed in the next section.
USING TRY, EXCEPT AND OPEN
When you try to open a file which doesn't exist, or if the file name is not valid, the interpreter raises an error. Assume that the open call might fail, and add recovery code for when it fails, as follows:

Tryfile.py
    fname = input('Enter the file name: ')
    try:
        fhand = open(fname)
    except:
        print('File cannot be opened:', fname)
        exit()
    count = 0
    for line in fhand:
        if line.startswith('From:'):
            count = count + 1
    print('count=', count)
Output1:
    Enter the file name: mbox.txt
    count= 3
Output2:
    Enter the file name: newmbox.txt
    File cannot be opened: newmbox.txt

In the above program, the call to open the file is kept within the try block. If the specified file cannot be opened for any reason, an error message saying "File cannot be opened" is displayed and the program is terminated. If the file is opened successfully, the program proceeds to perform the required task using that file.
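A bare except also hides errors that have nothing to do with the file. One possible refinement, sketched here under assumptions, is to catch the specific exceptions that open() raises and report them separately. The function name count_from_lines and the choice to return None on failure are illustrative, not part of the original program.

```python
import os
import tempfile

def count_from_lines(fname):
    """Return the number of lines starting with 'From:', or None on failure."""
    try:
        fhand = open(fname)
    except FileNotFoundError:
        print("No such file:", fname)
        return None
    except PermissionError:
        print("Permission denied:", fname)
        return None
    except OSError as err:          # any other operating-system level failure
        print("File cannot be opened:", fname, "-", err)
        return None

    count = 0
    for line in fhand:
        if line.startswith("From:"):
            count += 1
    fhand.close()
    return count

missing = os.path.join(tempfile.mkdtemp(), "newmbox.txt")  # does not exist
print(count_from_lines(missing))
```

FileNotFoundError and PermissionError are both subclasses of OSError, so they must be listed before the general OSError handler.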
CLOSE A FILE IN PYTHON
✓ It is always best practice to close a file when your work with it is finished. Python's garbage collector eventually cleans up unused file objects, but you should not rely on it: closing the file explicitly flushes any buffered data and frees all the resources allocated to the file straight away.
✓ The most basic way is to call the close() method of the file object:
    file_object.close()
Example:
    f = open("app.log")
    # do file operations.
    f.close()
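Python also offers the with statement, which is not covered in these slides but calls close() automatically when the block ends. A minimal sketch, assuming a temporary stand-in for app.log:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.log")  # temporary stand-in

with open(path, "w") as f:
    f.write("started\n")
# Leaving the with block closes the file, even if an exception was raised.

print(f.closed)   # True
```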
PROBLEMS WITH WHITE SPACE - REPR()
✓ When we are reading and writing files, we might run into problems with white space. These errors can be hard to debug because spaces, tabs and newlines are normally invisible.
    >>> s = '1 2 \t 3 \n 4'
    >>> print(s)
    1 2      3
     4
✓ The built-in function repr() takes any object as an argument and returns a string representation of the object. For strings, it represents whitespace characters with backslash sequences:
    >>> print(repr(s))
    '1 2 \t 3 \n 4'
✓ This is helpful for debugging.
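A typical case where repr() helps is a comparison that fails because of invisible trailing whitespace. The two strings below are made-up examples, not data from the slides.

```python
line_a = "user@example.com"
line_b = "user@example.com \n"   # trailing space and newline

print(line_a == line_b)           # False, although both look alike when printed
print(repr(line_a))               # 'user@example.com'
print(repr(line_b))               # 'user@example.com \n'
print(line_a == line_b.rstrip())  # True once trailing whitespace is stripped
```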
EXERCISE PROBLEMS:
1) Write a program to copy all the lines that begin with a vowel from one file to another (file1.txt to file2.txt), and also demonstrate the computational faults in the program.
2) Write a program to count the number of occurrences of a given word (accept the input from the user) in a file. (Hint: you can use the strip() and count() methods for each word in a line.)
3) Input a decimal number, convert it to a binary number and write it to another file, repeating until the user enters 0.
THANK YOU

File handling in Python
