Python Forum
Need help with saving output into an excel file
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Need help with saving output into an excel file
#1
I have a python scraper for ranker.com. It scrapes the name and image url from the site.

import requests import pandas as pd from pandas import DataFrame #Check "ListID" for api id r = requests.get('https://api.ranker.com/lists/538997/items?limit=100') data = r.json()['listItems'] for i in data: print(i['name'], i['image'])
I wanted this output to be saved into a ranks excel file. I dont have much experience with python. Hence, can someone please help me with the excel part of the code. Thank you
Reply
#2
Here is a script I wrote a couple days ago. I have elided some parts. It scrapes a site to retrieve the part description. I am using openpyxl
import requests import re import sys import openpyxl ErrorLog = [] debug = False # ******************************************************************* # logError # Inputs: # s: a string to add to the error log # ******************************************************************* def logError(s): ErrorLog.append(s) # ******************************************************************* # msgProlog # Inputs: # C. which is a cell value # Result: string # The standard prolog to a message, which includes a trailing space # 'A1: "123" ' # ******************************************************************* def msgProlog(C): return (C.coordinate + ": \"" + str(C.value) + '" ') # ******************************************************************* # deHTML # Inputs: # s: A string # Result: string # The input string with all the &...; codes replaced # ******************************************************************* def deHTML(s): t = s t = t.replace('&quot;', '"') t = t.replace('&lt', '<') t = t.replace('&gt', ">") t = t.replace('&amp;', '&') return t # ******************************************************************* # Main program # ******************************************************************* # Open the workspace workspace_name = "inventory.xlsx" try: wb = openpyxl.load_workbook(filename = workspace_name) except: print("Error opening file ", sys.exc_info()[0]) raise sheet = wb.active # Set the default columns in case we can't figure out what is going on colPartNumber = 1 captionPartNumber = "#" colManufacturer = 2 captionManufacturer = "Mfg" colDescription = 5 captionDescription = "Description" # First, identify the columns that have the headings # "#" colPartNumber # "Mfg" colManufacturer # "Description" colDescription for col in range(1,20): cell = sheet.cell(1, col) caption = sheet.cell(1, col).value if caption == None: print(cell.coordinate, end=": ") print("<None>") break print(cell.coordinate + ":\"" + caption + "\"") if caption == captionPartNumber: colPartNumber = col print(captionPartNumber + ' = ' + str(col)) elif caption == captionManufacturer: colManufacturer = col print(captionManufacturer + ' = ' + str(col)) elif caption == captionDescription: colDescription = col print(captionDescription + ' = ' + str(col)) print("") # Now we can iterate through the cells row = 1 try: while(True): row += 1 # Get the part # partNo = sheet.cell(row = row, column = colPartNumber) part = partNo.value if(part == None): print(partNo.coordinate, end=": ") print("<None>") break; # do we already have a description? desc = sheet.cell(row = row, column = colDescription) if desc.value != None: # We have a description. If it has not been deHTMLed, do so nicedesc = deHTML(desc.value) if(desc.value != nicedesc): msg = msgProlog(partNo) + '"' + desc.value + '"=>"' + nicedesc + '"****************' print(msg) logError(msg) desc.value = nicedesc continue msg = msgProlog(partNo) + ' already has description "' + desc.value + '"' print(msg) continue # We do not have a description. Go find it try: r = requests.get('https://www.elided.com/products/' + str(partNo.value)) # If there was an error, throw an exception r.raise_for_status() except: msg = msgProlog(partNo) + ' "' + str(r) + '"' print(msg) logError(msg) print(r) continue # The best description is found in an entry that says # twitter:title<useless stuff>content="(.*)" x = re.search('twitter:title.*content="(.*)"', r.text) if x != None: msgProlog(partNo) + '"' + x.group(1) + '"' print(msg) caption = x.group(1) caption = deHTML(caption) desc.value = caption else: msg = msgProlog(partNo) + "<No attribute found>" logError(msg) print(msg) except: print("Exception occurred", sys.exc_info()[0]) # Now save the workbook wb.save(filename = workspace_name) print("File saved") print("=========================== Errors ===========================") for msg in ErrorLog: print(msg)
Reply
#3
I suggest using Pandas as an intermediary.
Convert your json to a Pandas dataframe, then convert the pandas dataframe to an xls file.
Reply
#4
(Mar-21-2021, 12:19 PM)jefsummers Wrote: I suggest using Pandas as an intermediary.
Convert your json to a Pandas dataframe, then convert the pandas dataframe to an xls file.

If I had JSON, this might apply, but I don't.

I used openpyxl because it was the first one I found. I've heard of Pandas, what is the advantage over what I have? I'm curious, since I don't actually know anything about Pandas.
Reply
#5
See OP llne 6. The request is converted to a json.

Pandas is widely used for tabular information, used with AI, graphing (matplotlib), etc.

Here is an introduction: Brandon Rhodes - Pandas from PyCon 2015
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  docx file to pandas dataframe/excel iitip92 1 5,483 Jun-27-2024, 05:28 AM
Last Post: Pedroski55
  Python openyxl not updating Excel file MrBean12 1 3,799 Mar-03-2024, 12:16 AM
Last Post: MrBean12
  Copy Paste excel files based on the first letters of the file name Viento 2 2,341 Feb-07-2024, 12:24 PM
Last Post: Viento
  Search Excel File with a list of values huzzug 4 4,772 Nov-03-2023, 05:35 PM
Last Post: huzzug
  Updating sharepoint excel file odd results cubangt 1 2,925 Nov-03-2023, 05:13 PM
Last Post: noisefloor
  Reading data from excel file –> process it >>then write to another excel output file Jennifer_Jone 0 3,044 Mar-14-2023, 07:59 PM
Last Post: Jennifer_Jone
  Save and Close Excel File avd88 0 9,435 Feb-20-2023, 07:19 PM
Last Post: avd88
  Trying to access excel file on our sharepoint server but getting errors cubangt 0 1,932 Feb-16-2023, 08:11 PM
Last Post: cubangt
  Import XML file directly into Excel spreadsheet demdej 0 2,067 Jan-24-2023, 02:48 PM
Last Post: demdej
  how to read txt file, and write into excel with multiply sheet jacklee26 14 20,567 Jan-21-2023, 06:57 AM
Last Post: jacklee26

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020
This forum uses Lukasz Tkacz MyBB addons.
Forum use Krzysztof "Supryk" Supryczynski addons.