@codegen-sh codegen-sh bot commented Mar 14, 2025

This PR adds a new web browser tool to Codegen that allows it to browse websites and extract content. This enhances Codegen's capabilities by enabling it to:

  1. Access web pages and extract their content
  2. Research documentation and references online
  3. Gather information from websites to assist users

Implementation Details

  • Added a new web_browser.py module with the browse_web function and WebBrowserObservation class
  • Created a Langchain tool wrapper WebBrowserTool in the tools.py file
  • Integrated the tool into the agent creation functions in agent.py
  • Added the tool to the workspace tools list
  • Updated tools/__init__.py to export the new module

Dependencies

The implementation uses:

  • requests for making HTTP requests
  • BeautifulSoup for parsing HTML content
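
The fetch-then-extract flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: to keep it runnable without installing anything, it swaps in the standard library's urllib.request and html.parser where the PR uses requests and BeautifulSoup. The fields of WebBrowserObservation, the extract_content helper, the timeout parameter, and the User-Agent string are all hypothetical; only the names browse_web, WebBrowserObservation, url, and extract_text_only come from the description.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.request import Request, urlopen


@dataclass
class WebBrowserObservation:
    """Hypothetical result container; the PR's actual fields are not shown."""
    status: str
    url: str
    title: str = ""
    content: str = ""
    error: str = ""


class _TextExtractor(HTMLParser):
    """Collects visible text and the <title>, skipping script/style bodies."""

    _SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._in_title = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth or not text:
            return
        (self.title_parts if self._in_title else self.text_parts).append(text)


def extract_content(html: str, url: str, extract_text_only: bool = True) -> WebBrowserObservation:
    """Parse already-fetched HTML (hypothetical helper, kept separate for testing)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    content = "\n".join(parser.text_parts) if extract_text_only else html
    return WebBrowserObservation(
        status="success", url=url, title=" ".join(parser.title_parts), content=content
    )


def browse_web(url: str, extract_text_only: bool = True, timeout: float = 10.0) -> WebBrowserObservation:
    """Fetch a page and return its content, or an error observation on failure."""
    # Only allow web URLs, so the tool cannot be pointed at local files.
    if not url.startswith(("http://", "https://")):
        return WebBrowserObservation(status="error", url=url, error="unsupported URL scheme")
    try:
        request = Request(url, headers={"User-Agent": "codegen-sketch/0.1"})
        with urlopen(request, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
    except (OSError, ValueError, LookupError) as exc:
        # Network failures, malformed URLs, and unknown charsets become error results.
        return WebBrowserObservation(status="error", url=url, error=str(exc))
    return extract_content(html, url, extract_text_only)
```

Returning an error observation instead of raising is a design choice made here so that an agent calling the tool always receives a structured result it can reason about; whether the PR handles failures the same way is not stated.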

Usage Example

# Example usage in a conversation
browse_web(url="https://example.com", extract_text_only=True)

This tool will be particularly useful for tasks that require accessing online resources, such as recreating websites, researching documentation, or gathering information from the web.

@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


codegen-bot seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
