Python Forum
Running A Parser In VSCode - And Write The Results Into A Csv-File
#1
Hi there, and good day, dear Python experts.

I am running a parser in VS Code and want to write the results into a CSV file.
I've got a small error in this script:

import requests
from bs4 import BeautifulSoup
import re
import csv
from tqdm import tqdm

first = "https://path ?page={}"
second = "https://path /{}_en"


def catch(url):
    with requests.Session() as req:
        pages = []
        print("Loading All IDS\n")
        for item in tqdm(range(0, 347)):
            r = req.get(url.format(item))
            soup = BeautifulSoup(r.content, 'html.parser')
            numbers = [item.get("href").split("/")[-1].split("_")[0] for item in soup.findAll(
                "a", href=re.compile("^path/"), class_="btn btn-default")]
            pages.append(numbers)
        return numbers


def parse(url):
    links = catch(first)
    with requests.Session() as req:
        with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:
            writer = csv.writer(f)
            writer.writerow(["Name", "Address", "Site", "Phone",
                             "Description", "Scope", "Rec", "Send", "PIC", "OID", "Topic"])
            print("\nParsing Now... \n")
            for link in tqdm(links):
                r = req.get(url.format(link))
                soup = BeautifulSoup(r.content, 'html.parser')
                task = soup.find("section", class_="col-sm-12").contents
                name = task[1].text
                add = task[3].find(
                    "i", class_="fa fa-location-arrow fa-lg").parent.text.strip()
                try:
                    site = task[3].find("a", class_="link-default").get("href")
                except:
                    site = "N/A"
                try:
                    phone = task[3].find(
                        "i", class_="fa fa-phone").next_element.strip()
                except:
                    phone = "N/A"
                desc = task[3].find(
                    "h3", class_="eyp-project-heading underline").find_next("p").text
                scope = task[3].findAll("span", class_="pull-right")[1].text
                rec = task[3].select("tbody td")[1].text
                send = task[3].select("tbody td")[-1].text
                pic = task[3].select(
                    "span.vertical-space")[0].text.split(" ")[1]
                oid = task[3].select(
                    "span.vertical-space")[-1].text.split(" ")[1]
                topic = [item.next_element.strip() for item in task[3].select(
                    "i.fa.fa-check.fa-lg")]
                writer.writerow([name, add, site, phone, desc,
                                 scope, rec, send, pic, oid, "".join(topic)])


parse(second)
See the output:

martin@mx:~
$ python /home/martin/dev/vscode/euro.py
Loading All IDS

100%|████████████████████████████████████████████████████████████| 347/347 [08:01<00:00, 1.39s/it]
Traceback (most recent call last):
  File "/home/martin/dev/vscode/euro.py", line 65, in <module>
    parse(second)
  File "/home/martin/dev/vscode/euro.py", line 29, in parse
    with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:
TypeError: file() takes at most 3 arguments (4 given)
martin@mx:~
Well, I think the error is here:

with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:

I guess I need to take a closer look at the arguments.
#2
Looks like you might be using Python 3 options while running the script under Python 2.

$ python3 -c 'open("in", "w", newline="", encoding="UTF-8")'
$ python2 -c 'open("in", "w", newline="", encoding="UTF-8")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: file() takes at most 3 arguments (4 given)
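
One way to catch this mismatch early is to have the script check its interpreter before doing any work. The following is only a minimal sketch and is not part of the original script; the function name `require_python3` and the error message are hypothetical.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Return True on Python 3+, raise RuntimeError otherwise.

    Hypothetical helper: the name and the message are illustrative only.
    """
    if version_info[0] < 3:
        raise RuntimeError(
            "This script needs Python 3, but it is running under Python "
            "%d.%d" % (version_info[0], version_info[1])
        )
    return True


if __name__ == "__main__":
    # Fail with a readable message instead of a TypeError deep inside open().
    require_python3()
    print("Running under Python %d.%d" % sys.version_info[:2])
```

Calling `require_python3()` at the top of the script would stop a Python 2 run before the long download loop starts, rather than after it.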
#3
Yes indeed. Drop the newline and UTF parameters and I bet it works fine.
#4
Good day, dear JeffSummers,
many thanks for the quick answer; great to hear from you. I did as you advised, but I seem to have run into new errors in doing so.

I now run the code like so:

def parse(url):
    links = catch(first)
    with requests.Session() as req:
        with open("Data.csv", 'w') as f:
            writer = csv.writer(f)
            writer.writerow(["Name", "Address", "Site", "Phone",
                             "Description", "Scope", "Rec", "Send", "PIC", "OID", "Topic"])
            print("\nParsing Now... \n")
But now I have some issues:

martin@mx:~
$ python /home/martin/dev/vscode/euro.py
Loading All IDS

100%|████████████████████████████████████████████████████████████| 10/10 [00:15<00:00, 1.56s/it]

Parsing Now...

  5%|█████▎                                                      | 1/20 [00:02<00:40, 2.14s/it]
Traceback (most recent call last):
  File "/home/martin/dev/vscode/euro.py", line 65, in <module>
    parse(second)
  File "/home/martin/dev/vscode/euro.py", line 62, in parse
    scope, rec, send, pic, oid, "".join(topic)])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 9: ordinal not in range(128)
martin@mx:~
$
Well, I guess I have a UnicodeEncodeError; it seems that my system default encoding isn't UTF-8.
Should I therefore do something extra to avoid issues here?

Should I try df.to_csv("data.csv", index=False, encoding="utf-8")?

But then it will not work again...
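
Under Python 3, passing `encoding="utf-8"` to `open()` is usually enough to avoid this kind of `UnicodeEncodeError`, because the file object then encodes every row as UTF-8 instead of falling back to ASCII. A minimal sketch, assuming Python 3; the helper names `write_rows` and `read_rows`, the sample row, and the temporary file are made up for illustration and do not come from the original script:

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    """Write a header and rows to a UTF-8 encoded CSV file (Python 3)."""
    # newline="" lets the csv module control line endings itself;
    # encoding="utf-8" makes non-ASCII characters such as 'í' writable.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read the CSV back as a list of rows, decoding it as UTF-8."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


if __name__ == "__main__":
    # Illustrative data only; '\xed' is the character named in the traceback.
    path = os.path.join(tempfile.mkdtemp(), "Data.csv")
    write_rows(path, ["Name", "Address"], [["Andaluc\xeda", "N/A"]])
    print(read_rows(path))
```

This is the same `open()` call as in the first post, so once the script runs under Python 3 the original line should not need to change.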

#5
Hi again,

an update: I am running Python 2.7.1.

I will install updates so that the system runs a 3.x version.

I hope that I will be successful. I guess that I can then do this again:

$ python3 -c 'open("in", "w", newline="", encoding="UTF-8")'
$ python2 -c 'open("in", "w", newline="", encoding="UTF-8")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: file() takes at most 3 arguments (4 given)
since I need to take care of the encoding options; see the issues mentioned above:

a UnicodeEncodeError; it seems that my system default encoding isn't UTF-8. Should I therefore do something extra to avoid issues here? Should I try df.to_csv("data.csv", index=False, encoding="utf-8")?

Step one: I will update Python to 3.x.
Step two: I will add all the arguments back, so that we have:


$ python3 -c 'open("in", "w", newline="", encoding="UTF-8")'
$ python2 -c 'open("in", "w", newline="", encoding="UTF-8")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: file() takes at most 3 arguments (4 given)
I look forward to hearing from you.
#6
You should look at how to set up VS Code and how it works with Python.
It's not hard to see which version you are using, as it is always shown in the bottom-left corner.
VS Code from start
An overview image of my setup, with Python and Code Runner as the most important extensions.
[Image: vSxNpA.png]
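
If the status bar is easy to miss, the script itself can report which interpreter is running it. A small sketch using only the standard library; the function name `interpreter_info` is hypothetical:

```python
import sys


def interpreter_info():
    """Return the interpreter path and its major.minor version as strings."""
    version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    return sys.executable, version


if __name__ == "__main__":
    path, version = interpreter_info()
    print("Interpreter: %s (Python %s)" % (path, version))
```

Running this once from the VS Code run button and once from the terminal shows whether both use the same Python.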

