 
Flight-price checker using Python and Selenium
Web scraping is a useful technique for extracting data from websites, and checking airline ticket prices is one common use. In this article, we will build a flight price checker using Selenium, a popular browser automation tool. With Selenium, we can automate the process of collecting and comparing flight prices across different airlines, saving users time and effort.
Setup
Firefox Executable
- Download the Firefox browser installer from here 
- Once downloaded, run the installer. By default, the Firefox executable is placed at C:\Program Files\Mozilla Firefox\firefox.exe. We will need this path later. 
Gecko Driver
- Windows users can download the Gecko driver (geckodriver) from here. For other platforms, see the releases page. 
- Extract the zip and place the "geckodriver.exe" file in the C:\ directory. We will reference this path later in our code. 
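Before moving on, it can help to confirm that both executables are where the code will look for them. The following is a minimal sketch using only the standard library. The helper name `missing_files` is hypothetical, and the two paths are simply the ones from the steps above, so adjust them if you installed elsewhere.

```python
from pathlib import Path

# Paths taken from the setup steps above (assumed default locations).
FIREFOX_BINARY = Path(r"C:\Program Files\Mozilla Firefox\firefox.exe")
GECKODRIVER = Path(r"C:\geckodriver.exe")


def missing_files(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [Path(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = missing_files([FIREFOX_BINARY, GECKODRIVER])
    if missing:
        for path in missing:
            print(f"Not found: {path}")
    else:
        print("Firefox and geckodriver are both in place.")
```

If either path is reported as missing, fix the installation or update the path before running the example.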
Selenium Python Package
We are going to work with the latest version of Selenium WebDriver, so install or upgrade the following packages with pip:

```shell
pip3 install -U selenium
pip3 install -U webdriver-manager
```
Algorithm
- Import the necessary libraries - Selenium and time 
- Set up the Firefox Gecko driver path 
- Open the website to be scraped 
- Identify the necessary elements to be scraped 
- Input the departure and arrival locations and the departure and return dates 
- Click the search button 
- Wait for the search results to load 
- Scrape the prices for the different airlines 
- Store the data in a format that's easy to read and analyze 
- Compare the prices and identify the cheapest option 
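In the example that follows, the route and travel date are not typed into a form. They are encoded directly in the search URL. A small helper can build such a URL from its parts. This is only a sketch: `build_search_url` is a hypothetical name, the `1/0/0/E` segment is copied verbatim from the example URL without interpreting it, and the function covers one-way searches only, since the example URL carries a single date.

```python
from datetime import date

BASE_URL = "https://paytm.com/flights/flightSearch"
# Copied as-is from the example URL; the meaning of each part is an assumption
# this sketch does not rely on.
FIXED_SEGMENT = "1/0/0/E"


def build_search_url(origin, destination, travel_date):
    """Build a search URL shaped like the one used in the example.

    origin and destination are 'CODE-City' strings such as 'BBI-Bhubaneshwar';
    travel_date is a datetime.date.
    """
    return (
        f"{BASE_URL}/{origin}/{destination}/"
        f"{FIXED_SEGMENT}/{travel_date.isoformat()}"
    )


print(build_search_url("BBI-Bhubaneshwar", "DEL-Delhi", date(2023, 4, 22)))
```

Building the URL this way makes it easy to loop over several dates or routes without editing the string by hand.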
Example
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service
import time

# Set Firefox options
firefox_options = Options()
firefox_options.binary_location = r'C:\Program Files\Mozilla Firefox\firefox.exe'

# Point Selenium at the Gecko driver. Recent Selenium 4 releases take the
# driver path through a Service object; the executable_path argument to
# webdriver.Firefox() has been removed.
service = Service(executable_path=r'C:\geckodriver.exe')

# Initialize webdriver with Firefox
driver = webdriver.Firefox(service=service, options=firefox_options)

# Set URL and date of travel
url = 'https://paytm.com/flights/flightSearch/BBI-Bhubaneshwar/DEL-Delhi/1/0/0/E/2023-04-22'
date_of_travel = "2023-04-22"

# Print URL
print(f"URL: {url}")

# Load webpage
driver.get(url)

# Wait for 5 seconds
time.sleep(5)

# Find search button and click it
search_button = driver.find_element(By.CLASS_NAME, "_3LRd")
search_button.click()

# Find all elements with class name '_2gMo'
prices_elements = driver.find_elements(By.CLASS_NAME, "_2gMo")

# Get text of all elements
prices_text = [price.text for price in prices_elements]

# Convert text to integers
prices = [int(p.replace(',', '')) for p in prices_text]

# Display the minimum airfare price
print(f"Minimum Airfare Price: {min(prices)}")

# Display all prices
print(f"All prices:\n {prices}")
```

Output
```
Minimum Airfare Price: 4471
All prices:
 [4471, 4472, 4544, 4544, 4679, 4838, 5497, 5497, 5866, 6991, 7969, 8393, 8393, 8393, 8393, 8393, 8445, 8445, 8445, 8445, 8445, 8498, 8498, 8498, 8540, 8898, 8898, 8898, 8898, 8898, 9203, 9207, 9385, 10396, 10554, 10896, 11390, 11433, 11766, 11838, 11838, 11838, 12518, 12678, 12678, 12678, 12735, 12735, 12735, 12735, 12767, 12767, 12787, 12787, 12787, 12787, 12840, 12945, 12966, 12981, 13069, 13145, 13145, 13145, 13145, 13152, 13525, 13537, 13537, 13571, 13610, 13633, 13828, 13956, 14358, 14630, 14630, 14828, 14838, 15198, 15528, 15849, 15954, 16479, 17748, 17748, 18506, 20818, 20818, 20818, 20818, 21992, 23590, 24468, 25483, 25483, 26628, 75271]
```
Explanation
- First, the necessary libraries are imported: webdriver and Options from selenium, By from selenium.webdriver.common.by, and time. 
- Next, Firefox options are set using Options() and the binary location for Firefox is set to C:\Program Files\Mozilla Firefox\firefox.exe. 
- A webdriver instance is then created with Firefox using the webdriver.Firefox() function, passing in the path to the Gecko driver executable and the Firefox options. 
- The webpage is loaded into the browser using driver.get(url). 
- The script then waits for 5 seconds using time.sleep(5) to give the page time to load. 
- The search button on the webpage is found using driver.find_element(By.CLASS_NAME, "_3LRd") and stored in the search_button variable. The click() method is then called on the search_button variable to simulate a click on the button. 
- All elements on the web page with class name _2gMo are found using driver.find_elements(By.CLASS_NAME, "_2gMo") and stored in the prices_elements list. 
- The text of all elements in the prices_elements list is extracted using a list comprehension and stored in the prices_text list. 
- The replace() method is used to remove commas from each element in prices_text and the resulting string is converted to an integer using int(). This is done using another list comprehension and the resulting list of integers is stored in the prices list. 
- The minimum value in prices is found using the min() function and printed to the console. 
- Finally, all values in prices are printed to the console. 
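Two steps in this walkthrough are fragile: the fixed five-second sleep, which may be too short on a slow connection, and the int() conversion, which fails if an element's text is empty or carries a currency symbol. The sketch below shows one possible hardening, using Selenium's WebDriverWait with an expected condition in place of a fixed delay, plus a more tolerant parser. The function names `parse_price` and `wait_for_prices` and the 15-second timeout are illustrative assumptions. The class name `_2gMo` is the one used in the example and may change whenever the site is updated.

```python
import re


def parse_price(text):
    """Return the first number in `text` as an int, or None if there is none.

    Thousands separators are dropped, so '4,471' becomes 4471.
    """
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def wait_for_prices(driver, class_name="_2gMo", timeout=15):
    """Wait until price elements are present, then return their parsed values.

    Raises selenium's TimeoutException if nothing appears within `timeout`
    seconds.
    """
    # Imported here so parse_price can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    elements = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CLASS_NAME, class_name))
    )
    parsed = [parse_price(element.text) for element in elements]
    # Skip elements whose text held no number instead of failing on them.
    return [price for price in parsed if price is not None]
```

With these helpers, the last part of the script could become `prices = wait_for_prices(driver)`, followed by the same min() and print() calls.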
Application
This code is a starting point for scraping airfare prices from Paytm's flight search page using Python and Selenium. From here, you can modify it to meet specific needs and add features such as storing the scraped data in a file or sending an email notification with the price.
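As a rough illustration of those two extensions, the sketch below appends scraped prices to a CSV file and prepares an email message when the cheapest fare drops to a chosen threshold. It uses only the standard library. The function names, the CSV column names, and the threshold logic are all assumptions made for this example, and actually sending the message (for instance with smtplib) is left out because it depends on your mail server.

```python
import csv
from datetime import datetime
from email.message import EmailMessage
from pathlib import Path


def append_prices(csv_path, date_of_travel, prices):
    """Append one row per price to csv_path, adding a header to a new file."""
    path = Path(csv_path)
    is_new = not path.exists()
    checked_at = datetime.now().isoformat(timespec="seconds")
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["checked_at", "date_of_travel", "price"])
        for price in prices:
            writer.writerow([checked_at, date_of_travel, price])


def build_price_alert(prices, threshold, sender, recipient):
    """Return an EmailMessage if the cheapest price is at or below threshold.

    Returns None when there are no prices or the cheapest one is too high.
    """
    if not prices or min(prices) > threshold:
        return None
    cheapest = min(prices)
    message = EmailMessage()
    message["Subject"] = f"Flight price alert: {cheapest}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        f"Cheapest fare found: {cheapest} (threshold {threshold})."
    )
    return message
```

Running the scraper on a schedule and calling `append_prices` each time builds up a price history that can later be analyzed or plotted.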
Conclusion
Selenium is a powerful web automation and scraping tool that can be used to collect information from websites that do not offer an API. Python's versatility, ease of use, and robust ecosystem of tools make it well suited to scraping. This script shows how to automate browser actions and retrieve data from a webpage with just a few lines of code.
