Python Forum
scraping from a website that hides source code
#1
Hello Pythoners,

I want Python to get some data from this website: https://tritrypdb.org/


The idea is to then bring that into Excel, so I can have it look up the same piece of information for hundreds of genes and put it in my Excel sheet.


Right now my script looks up a gene on the website (I just append the gene ID to the URL) and reads the page's source code.
import urllib.request
import re

# This is just an example gene ID. It will be a list of gene IDs.
geneID = "Tb927.7.2390"

# Open the webpage, going directly to the right gene ID
page = urllib.request.urlopen("https://tritrypdb.org/tritrypdb/app/record/gene/" + geneID)

# Read the entire source code
scode = page.read()
The plan was then to search the source code for the information I need and return it. But it seems that the source code doesn't contain any of the actual text that is visible when the page is viewed in a browser. Instead there are huge blank spaces.


Is this webpage somehow hiding that information? And is there a way to still get the information out of there?

Thank you for your help people!
#2
You need to learn a bit more about scraping.
There's a good two part tutorial on this forum.
see:
web scraping part 1
https://python-forum.io/Thread-Web-scraping-part-2
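
A quick way to check what is going on: a page that looks empty in the raw source is often one where the visible text is filled in by JavaScript after loading. That is only an assumption about this particular site, not something confirmed here. The sketch below uses only the standard library to pull out the text a browser would show from an HTML string, ignoring scripts and styles. The two sample strings (`shell` and `static`) are made up for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the human-readable text found in an HTML string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical "shell" page: an empty container plus a script that would
# load the real content later. Not copied from any real website.
shell = (
    '<html><head><title></title></head><body><div id="root">   </div>'
    '<script>var app = "loads data later";</script></body></html>'
)

# Hypothetical static page where the text is already in the HTML.
static = "<html><body><h1>Gene Tb927.7.2390</h1><p>Product description</p></body></html>"

print(repr(visible_text(shell)))   # ''
print(repr(visible_text(static)))  # 'Gene Tb927.7.2390 Product description'
```

To try it on the page from the first post, pass in the decoded source, for example `visible_text(scode.decode("utf-8", errors="replace"))`. If that comes back nearly empty, the data is most likely not in the HTML that `urllib` receives, and a tool that runs the page's JavaScript (such as a browser driven from Python) would be needed instead.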