
Commit 5ce11be

README.md Updated
Updating the information in Readme.md file
1 parent bf9e0ec commit 5ce11be


3 files changed: +56 -103 lines changed


README.md

Lines changed: 48 additions & 45 deletions
@@ -1,45 +1,48 @@
-usage: git [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]
-           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
-           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]
-           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
-           [--super-prefix=<path>] [--config-env=<name>=<envvar>]
-           <command> [<args>]
-
-These are common Git commands used in various situations:
-
-start a working area (see also: git help tutorial)
-   clone     Clone a repository into a new directory
-   init      Create an empty Git repository or reinitialize an existing one
-
-work on the current change (see also: git help everyday)
-   add       Add file contents to the index
-   mv        Move or rename a file, a directory, or a symlink
-   restore   Restore working tree files
-   rm        Remove files from the working tree and from the index
-
-examine the history and state (see also: git help revisions)
-   bisect    Use binary search to find the commit that introduced a bug
-   diff      Show changes between commits, commit and working tree, etc
-   grep      Print lines matching a pattern
-   log       Show commit logs
-   show      Show various types of objects
-   status    Show the working tree status
-
-grow, mark and tweak your common history
-   branch    List, create, or delete branches
-   commit    Record changes to the repository
-   merge     Join two or more development histories together
-   rebase    Reapply commits on top of another base tip
-   reset     Reset current HEAD to the specified state
-   switch    Switch branches
-   tag       Create, list, delete or verify a tag object signed with GPG
-
-collaborate (see also: git help workflows)
-   fetch     Download objects and refs from another repository
-   pull      Fetch from and integrate with another repository or a local branch
-   push      Update remote refs along with associated objects
-
-'git help -a' and 'git help -g' list available subcommands and some
-concept guides. See 'git help <command>' or 'git help <concept>'
-to read about a specific subcommand or concept.
-See 'git help git' for an overview of the system.
+# Web Scraping with Selenium WebDriver
+
+This repository contains a web scraping tool that uses **Selenium WebDriver** with the latest version of Firefox to scrape data from the web. The tool supports proxy rotation and custom user agents for additional privacy and flexibility.
+
+This is the initial build of the script, using proxy rotation, user agents, and other techniques to stay anonymous.
+
+I know this is not a professional script, but it should be useful for moderate scraping.
+
+## Requirements
+
+The following dependencies are required to run the web scraping tool:
+
+- Python 3.x
+- Selenium WebDriver
+- geckodriver (for Firefox)
+- Requests (for sending HTTP requests)
+- random (standard library, for randomly choosing proxies and user agents)
+- time (standard library, for pausing while rotating proxies)
+- Beautiful Soup (for parsing HTML)
+
+You can install the third-party dependencies using pip: `pip install selenium requests beautifulsoup4`
+
+## Usage
+
+To use the web scraping tool, edit the main file however you like, but make sure you use the `Anonymous` class to create the WebDriver object; the `Anonymous` class does the anonymization work on your behalf. The `main.py` file should contain the following information:
+
+- `base_url`: the base URL of the website you want to scrape
+- `search_query`: the search query used to fetch data
+
+The `Anonymous.py` file should contain the following:
+
+- `proxies`: a list of proxy servers to be used for scraping
+- `user_agents`: a list of user agents to be used for scraping
+- `setup_webdriver`: creates a web driver with the desired capabilities and options
+
+All the data, such as proxies and user agents, is stored in text files in the `Data` folder. You can edit these files if you want; before starting the actual work, the program will ask whether to download the proxies again, to improve efficiency and speed.
+
+- `proxies`: working HTTP proxies downloaded from `geonode.com`
+
+To start the web scraping tool, run: `python main.py`
+
+You can add additional command line arguments as needed.
+
+## Contact
+
+If you have any questions or issues, please contact the author at [hammadrafique029@gmail.com](mailto:hammadrafique029@gmail.com) or [codingmagician0@gmail.com](mailto:codingmagician0@gmail.com).
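
To make the Usage section above concrete, here is a minimal sketch of the workflow it describes. The `Anonymous` class and `setup_webDriver` come from this commit's `Scripts/anonymous_techniques.py`; the `base_url`/`search_query` wiring, the search-URL pattern, and the BeautifulSoup step are illustrative assumptions, not code from the repository.

```python
import time

from bs4 import BeautifulSoup
from anonymous_techniques import *  # provides the Anonymous class

base_url = "https://www.example.com"  # placeholder target site
search_query = "web scraping"         # placeholder query

obj = Anonymous()               # loads proxies and user agents from the data files
driver = obj.setup_webDriver()  # returns a Firefox driver with anonymization applied
driver.get(f"{base_url}/search?q={search_query}")  # assumed URL pattern
time.sleep(5)                   # give the page time to load

# Parse whatever the driver rendered; Beautiful Soup is listed in Requirements.
soup = BeautifulSoup(driver.page_source, "html.parser")
print(soup.title.string if soup.title else "no <title> found")

driver.quit()
```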

Scripts/anonymous_techniques.py

Lines changed: 2 additions & 4 deletions
@@ -1,4 +1,3 @@
-from selenium import webdriver
 from proxies import *
 import random
 
@@ -43,12 +42,11 @@ def setup_agents(self):
         except Exception as e:
             print("\n\tGOT ERROR IN DEFINING USER AGENTS! ERROR BELOW:\n\t" + str(e))
 
-    def setup_webDriver(self, Url):
+    def setup_webDriver(self):
         try:
             self.setup_proxies()
             driver = webdriver.Firefox(desired_capabilities=self.setup_desired_capabilities(),
                                        options=self.setup_agents())
-            driver.get(Url)
-            driver.close()
+            return driver
         except Exception as e:
             print("\n\tGOT ERROR IN DEFINING WEB DRIVER! ERROR BELOW:\n\t" + str(e))

Scripts/main.py

Lines changed: 6 additions & 54 deletions
@@ -1,58 +1,10 @@
 import time
 
-from bs4 import BeautifulSoup
-from selenium import webdriver
-from selenium.webdriver.common.keys import Keys
-
-from proxies import *
-import random
 from anonymous_techniques import *
 
-
-urls = ["https://www.google.com", "https://www.bing.com"]
-driver = webdriver.Firefox()
-driver.get(urls[0])
-
-driver.execute_script(f"window.open('{urls[1]}', '_blank');")
-driver.switch_to.window(driver.window_handles[-1])
-
-
-
-
-
-
-
-
-# get_proxies = FreeProxies()
-# print(get_proxies.verify_proxies())
-
-
-# # Initialize the Selenium webdriver
-# driver = webdriver.Firefox()
-#
-# # Use the webdriver to open a website
-# driver.get("https://www.google.com")
-#
-# # Get the HTML content of the page
-# html_content = driver.page_source
-#
-# # Use Beautiful Soup to parse the HTML content
-# soup = BeautifulSoup(html_content, 'html.parser')
-# print(soup.prettify())
-#
-#
-#
-# # Find elements in the HTML content
-# elements = soup.find_all('div', class_='example-class')
-#
-# # Extract data from the elements
-# data = []
-# for element in elements:
-#     data.append(element.text)
-#
-# # Close the webdriver
-# driver.quit()
-#
-# # Print the extracted data
-# print(data)
+obj = Anonymous()
+url = "https://www.google.com"
+driver = obj.setup_webDriver()
+driver.get(url)
+time.sleep(5)
+driver.close()
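
One caveat with the new entry point: when driver creation fails, `setup_webDriver()` prints the error and implicitly returns `None`, so the bare `driver.get(url)` above would raise an `AttributeError`. A defensive variant of the same script, where the `None` guard and the `try`/`finally` are additions for illustration, not part of the commit:

```python
import time

from anonymous_techniques import *  # provides the Anonymous class

obj = Anonymous()
driver = obj.setup_webDriver()

# setup_webDriver() returns None after printing the error if Firefox
# could not be started, so guard before touching the driver.
if driver is not None:
    try:
        driver.get("https://www.google.com")
        time.sleep(5)  # give the page time to load
    finally:
        driver.quit()  # quit() also shuts down the geckodriver process
```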
