Python Forum
Using Selenium to create a Google Search results list in Incognito mode with a specific location
#1
Hello to all,

A bit of context:

I have to do a deadly repetitive task for a colleague, beginning Monday:
make a list of Google Search results in Incognito/Secret mode. That means 300 results for 17 locations and, for each result, the link, title, and short description. ARGGHHHHH
I also want to help my colleague not to explode.

What I have done so far:

I tried Selenium first, but weird errors occurred (I will come back to those below), so I switched to fake_useragent and BeautifulSoup.
The code works, but I don't know whether it is possible to add a location and Incognito mode.
Here is the code:
import csv
import re
import urllib.parse

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

# Header row: rank, title, summary, link, related keywords
csv_list = [["順位", "タイトル", "要約", "リンク", "関連キーワード"]]

query = "'tour eifelle'"
query = urllib.parse.quote_plus(query)  # Format into URL encoding
number_result = 20

ua = UserAgent()

google_url = "https://www.google.com/search?q=" + query + "&num=" + str(number_result)
# Note: the second positional argument of requests.get is `params`, so this
# dict is sent as a query parameter, not as a request header.
response = requests.get(google_url, {"User-Agent": ua.random})
soup = BeautifulSoup(response.text, "html.parser")

result_div = soup.find_all('div', attrs={'class': 'ZINbbc'})

links = []
titles = []
descriptions = []
link2 = ""
for r in result_div:
    # Checks if each element is present, else, raise exception
    try:
        link = r.find('a', href=True)
        title = r.find('div', attrs={'class': 'vvjwJb'}).get_text()
        description = r.find('div', attrs={'class': 's3v9rd'}).get_text()

        # Check to make sure everything is present before appending
        if link != '' and title != '' and description != '':
            # Note: lstrip removes a set of leading characters, not a literal prefix
            link3 = link['href'].lstrip('/url?q=')
            link2 = re.sub(r'&sa.*', "", link3)
            links.append(link2)
            titles.append(title)
            descriptions.append(description)
    # Next loop if one element is not present
    except Exception:
        continue

#to_remove = []
#clean_links = []
#for i, l in enumerate(links):
#    clean = re.search('\/url\?q\=(.*)\&sa',l)
#    # Anything that doesn't fit the above pattern will be removed
#    if clean is None:
#        to_remove.append(i)
#        continue
#    clean_links.append(clean.group(1))
# Remove the corresponding titles & descriptions
#for x in to_remove:
#    del titles[x]
#    del descriptions[x]

for i in range(len(titles)):
    add_list = [i + 1, titles[i], descriptions[i], links[i]]
    csv_list.append(add_list)

# Save the title list to CSV
with open('Search_word.csv', 'w', encoding="utf-8_sig") as f:
    writecsv = csv.writer(f, lineterminator='\n')
    writecsv.writerows(csv_list)

#links
#titles
#descriptions
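A side note on the link clean-up in the script above: `str.lstrip('/url?q=')` strips any leading characters from that set rather than the literal prefix, and the regex leaves the URL percent-encoded. It happens to work for typical `https://` targets but is fragile. Below is a minimal sketch of an alternative using only the standard library. The function name `extract_target_url` and the sample href are illustrative assumptions, not part of the original script.

```python
from urllib.parse import parse_qs, urlsplit


def extract_target_url(href):
    """Return the destination of a '/url?q=...' redirect link, or None."""
    parts = urlsplit(href)
    if parts.path != "/url":
        return None  # not a redirect-style result link
    # parse_qs splits the query on '&' and decodes percent-escapes, so the
    # '&sa=...' tracking parameters end up under their own keys.
    targets = parse_qs(parts.query).get("q")
    return targets[0] if targets else None


# Hypothetical href in the shape the script above expects.
sample = "/url?q=https://example.com/page%3Fid%3D1&sa=U&ved=abc"
print(extract_target_url(sample))
```

Links that do not match the redirect shape return `None`, so they can be skipped in the loop instead of being stored in a half-cleaned form.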
After that, I tried going back to Selenium.

Here is the code:

import csv
import time  # needed for sleep
from selenium import webdriver  # automates the web browser (python -m pip install selenium)
import chromedriver_binary  # adds chromedriver to the PATH


def ranking(driver):
    i = 1  # loop counter / page number

    title_list = []  # empty list to store titles
    link_list = []  # empty list to store URLs
    summary_list = []
    RelatedKeywords = []

    # Loop until the current page exceeds the maximum number of pages to analyze
    while i <= i_max:
        # Titles and links are inside class="r"
        class_group = driver.find_elements_by_class_name('r')
        class_group1 = driver.find_elements_by_class_name('s')
        class_group2 = driver.find_elements_by_class_name('nVcaUb')
        # Loops that extract titles and links and append them to the lists
        for elem in class_group:
            title_list.append(elem.find_element_by_class_name('LC20lb').text)  # title (class="LC20lb")
            link_list.append(elem.find_element_by_tag_name('a').get_attribute('href'))  # link (href attribute of the a tag)

        for elem in class_group1:
            summary_list.append(elem.find_element_by_class_name('st').text)  # summary (class="st")

        for elem in class_group2:
            RelatedKeywords.append(elem.text)  # related keyword text

        # There is only one "Next" link, but search with find_elements on purpose:
        # an empty list means this is the last page.
        if driver.find_elements_by_id('pnnext') == []:
            i = i_max + 1
        else:
            # The URL of the next page is the href attribute of id="pnnext"
            next_page = driver.find_element_by_id('pnnext').get_attribute('href')
            driver.get(next_page)  # go to the next page
            i = i + 1  # update i
            time.sleep(3)  # wait 3 seconds
    return title_list, link_list, summary_list, RelatedKeywords  # return the collected lists


#
driver = webdriver.Chrome()  # prepare Chrome
# Open the sample HTML
driver.get('https://www.google.com/')  # open Google

i_max = 5  # maximum number of pages to analyze
search = driver.find_element_by_name('q')  # select the search box (name='q') in the HTML
search.send_keys('Test blender')  # send the search term
search.submit()  # run the search
time.sleep(1.5)  # wait 1.5 seconds

# Run the ranking function to get the title and URL lists
title, link, summary, RelatedKeywords = ranking(driver)

# Header row: rank, title, summary, link, related keywords
csv_list = [["順位", "タイトル", "要約", "リンク", "関連キーワード"]]

for i in range(len(title)):
    add_list = [i + 1, title[i], summary[i], link[i]]
    csv_list.append(add_list)

# Save the title list to CSV
with open('Search_word.csv', 'w', encoding="utf-8_sig") as f:
    writecsv = csv.writer(f, lineterminator='\n')
    writecsv.writerows(csv_list)
driver.quit()
I specified the path:
Quote: C:\Users\Name\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\chromedriver_binary

But I get this error message:

Error:
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 22:45:29) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>>
= RESTART: C:\Users\name\Desktop\B\Python\Recherche de mots\Cherche de mots.py
Traceback (most recent call last):
  File "C:\Users\name\Desktop\B\Python\Recherche de mots\Cherche de mots.py", line 49, in <module>
    driver = webdriver.Chrome() # prepare Chrome
  File "C:\Users\name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 76, in __init__
    RemoteWebDriver.__init__(
  File "C:\Users\name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "C:\Users\name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "C:\Users\name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
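A note on this traceback: "cannot find Chrome binary" indicates that chromedriver itself started but could not locate the Chrome browser executable, which is a different problem from chromedriver not being on the PATH. A minimal sketch of a check follows. The helper `find_chrome_binary` and the candidate install paths are assumptions for illustration and should be adjusted to the actual machine.

```python
import os
from pathlib import Path


def find_chrome_binary(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


# Assumed typical Windows install locations; these are guesses, not guarantees.
DEFAULT_CANDIDATES = [
    Path(os.environ.get("PROGRAMFILES", r"C:\Program Files"))
    / "Google" / "Chrome" / "Application" / "chrome.exe",
    Path(os.environ.get("PROGRAMFILES(X86)", r"C:\Program Files (x86)"))
    / "Google" / "Chrome" / "Application" / "chrome.exe",
    Path(os.environ.get("LOCALAPPDATA", ""))
    / "Google" / "Chrome" / "Application" / "chrome.exe",
]

chrome = find_chrome_binary(DEFAULT_CANDIDATES)
print(chrome if chrome else "Chrome not found in the assumed locations")

# If a binary is found, Selenium can be pointed at it (not executed here):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   options.binary_location = str(chrome)
#   driver = webdriver.Chrome(options=options)
```

If nothing is found in any of these locations, the likeliest explanations are that Chrome is not installed for this user or that it lives in a non-default directory.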
After that, I tried to specify the path directly in the code, changing this line:
driver = webdriver.Chrome()
to:
driver = webdriver.Chrome(r'C:\Users\Name\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\chromedriver_binary')
But I encountered another problem:
Error:
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\service.py", line 72, in start
    self.process = subprocess.Popen(cmd, env=self.env,
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Name\Desktop\Bilhaud\Python\Recherche de mots\Cherche de mots.py", line 49, in <module>
    driver = webdriver.Chrome(r'C:\Users\Name\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\chromedriver_binary')
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 73, in __init__
    self.service.start()
  File "C:\Users\Name\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
    raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver_binary' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
I tried all the solutions here,
but none of them worked.

If you can find something to help I will be extremely happy.
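On the Incognito and location part of the question, here is a minimal sketch under stated assumptions. `gl` (country) and `hl` (interface language) are parameters Google documents for its Custom Search API; that the public results page honors them the same way is an assumption here, and they select a country, not a city. A plain `requests.get` without a session keeps no cookies between calls, which is roughly what Incognito provides. The helper names and the location list are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(query, country, language, start=0, num=10):
    """Build a search URL for one results page of one location."""
    params = {"q": query, "gl": country, "hl": language, "num": num, "start": start}
    return "https://www.google.com/search?" + urlencode(params)


def chrome_arguments(language):
    """Chrome switches for a private window with a given UI language."""
    return ["--incognito", f"--lang={language}"]


# Hypothetical location list: (country code, language code) pairs.
LOCATIONS = [("fr", "fr"), ("jp", "ja")]

for country, language in LOCATIONS:
    # First three pages only; 300 results would be range(0, 300, 10).
    for start in range(0, 30, 10):
        print(build_search_url("tour eiffel", country, language, start=start))

# With Selenium, the switches could be applied like this (not executed here):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   for arg in chrome_arguments("fr"):
#       options.add_argument(arg)
#   driver = webdriver.Chrome(options=options)
```

Keeping the URL construction in one function means the same loop can drive either the requests-based script or the Selenium one.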
#2
I am not familiar with the chromedriver-binary package; just try downloading chromedriver from here: https://chromedriver.chromium.org/downloads and then use:
browser = webdriver.Chrome(executable_path=r"C:\path\to\chromedriver.exe")
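The second traceback in post #1 reports `'chromedriver_binary' executable needs to be in PATH`, which suggests that a directory was passed where Selenium expects the executable file itself. A minimal sketch of a check, assuming the executable is named `chromedriver.exe` on Windows (`chromedriver` elsewhere) and sits directly inside the given directory; `resolve_chromedriver` is a hypothetical helper, not part of Selenium.

```python
import sys
import tempfile
from pathlib import Path


def resolve_chromedriver(path):
    """Return the chromedriver executable for a file or directory path."""
    path = Path(path)
    if path.is_dir():
        # Assumed file name; adjust if the driver is named differently.
        name = "chromedriver.exe" if sys.platform == "win32" else "chromedriver"
        path = path / name
    if not path.is_file():
        raise FileNotFoundError(f"no chromedriver executable at {path}")
    return path


# Demonstration with a throwaway directory containing an empty stand-in file.
with tempfile.TemporaryDirectory() as tmp:
    name = "chromedriver.exe" if sys.platform == "win32" else "chromedriver"
    (Path(tmp) / name).touch()
    print(resolve_chromedriver(tmp).name)

# Intended use with the Selenium 3 call shown above (not executed here):
#   exe = resolve_chromedriver(r"C:\path\to\directory")
#   browser = webdriver.Chrome(executable_path=str(exe))
```

Failing early with the full path in the message makes it easier to see whether the problem is the directory, the file name, or a missing download.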
#3
The same error is discussed in the thread WebDriverException: 'chromedriver' executable needs to be in PATH
#4
mlieqo, believe me, I tried.
Yoriz, thank you also for the link.
I don't know whether working on a Japanese computer changes some parameters; I am just not finding a solution. I have spent 2 days on this, so I have to take another route.
Thanks.