Find the text of the given tag using BeautifulSoup

Last Updated : 30 May, 2022

Web scraping is the process of using software bots, called web scrapers, to extract information from the HTML or XML content of a web page. Beautiful Soup is a Python library for scraping such data. It works with a parser and lets you iterate over, search, and modify the parse tree that the parser produces. With Beautiful Soup, finding the text of a given tag in a web page is straightforward.

This article shows how to find the text of a given tag.

Step-by-step Approach:

First, import the required libraries.

Python3

from bs4 import BeautifulSoup
import requests

Now assign the URL.

Python3

# Assign URL
url = "https://www.geeksforgeeks.org/"

Fetch the raw HTML content from the URL.

Python3

html_content = requests.get(url).text
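The one-line fetch above assumes the request always succeeds. As a hedged sketch, the helper below wraps the same `requests.get` call with a timeout and a status check. The function name `fetch_html` and the 10-second default are illustrative assumptions, not part of the original program.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page's HTML as text, or None if the request fails.

    `fetch_html` and the default timeout are hypothetical choices
    made for this illustration.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Raise an exception for 4xx/5xx status codes
        response.raise_for_status()
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return None
    return response.text
```

A call such as `html_content = fetch_html(url)` could then replace the direct `requests.get(url).text`, with a check for `None` before parsing.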

Now parse through the content.

Python3

# Now that the content is ready,
# parse it using BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
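To make the idea of a parse tree more concrete, here is a small sketch that parses an inline HTML string instead of a live page, so it runs without network access. The sample markup is an assumption made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used instead of a downloaded page
html_content = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Heading</h1><p>Body text</p></body></html>"
)
soup = BeautifulSoup(html_content, "html.parser")

# Tags can be reached as attributes of the soup object
print(soup.title.name)         # title
print(soup.title.parent.name)  # head

# Each tag exposes its direct children in document order
body_children = [child.name for child in soup.body.children]
print(body_children)           # ['h1', 'p']
```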

After the content is parsed, search for a specific tag and print its text.

Python3

print(soup.find('title').text)
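The difference between a tag and its text is easy to miss, so the following sketch shows both on a small inline document. The markup is a made-up example. Note that `find` returns `None` when no matching tag exists, so a guard is worth adding before reading `.text`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration
html = (
    "<html><head><title>Sample Page</title></head>"
    "<body><p>Hello, <b>world</b>!</p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

title_tag = soup.find("title")
print(title_tag)       # <title>Sample Page</title>
print(title_tag.text)  # Sample Page

# get_text() also collects text from nested tags
paragraph = soup.find("p")
print(paragraph.get_text())  # Hello, world!

# find() returns None if the tag is absent, so check before using .text
missing = soup.find("h1")
if missing is not None:
    print(missing.text)
```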

Below is the complete program.

Python3

from bs4 import BeautifulSoup
import requests

# Assign URL
url = "https://www.geeksforgeeks.org/"

# Fetch raw HTML content
html_content = requests.get(url).text

# Now that the content is ready,
# parse it using BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")

# Find the first <title> tag and print its text
print(soup.find('title').text)

Similarly to get all the occurrences of the given tag:

Python3

from bs4 import BeautifulSoup
import requests

# Assign URL
url = "https://www.geeksforgeeks.org/"

# Fetch raw HTML content
html_content = requests.get(url).text

# Now that the content is ready,
# parse it using BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")

# Get all the occurrences of a given tag
texts = soup.find_all('p')
for text in texts:
    print(text.get_text())
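Because the output of the program above depends on the live page, a self-contained sketch may be easier to experiment with. It applies `find_all` to a made-up HTML snippet and collects the text of every `<p>` tag in a list. The markup and variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration
html = """
<div>
  <p> First paragraph. </p>
  <p>Second <em>paragraph</em>.</p>
  <span>Not a paragraph.</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Collect the text of every <p> tag, trimming surrounding whitespace
paragraph_texts = [p.get_text().strip() for p in soup.find_all("p")]
print(paragraph_texts)  # ['First paragraph.', 'Second paragraph.']

# find_all() returns an empty result when nothing matches
headings = soup.find_all("h1")
print(len(headings))    # 0
```

Unlike `find`, which returns `None` for a missing tag, `find_all` returns an empty list-like result, so a `for` loop over it simply does nothing.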