Skip to content

Conversation

@Vasyl198
Copy link

@Vasyl198 Vasyl198 commented Dec 24, 2025

????????? ????????????? ????? ??????????? ???? ??????, ???? --allow_exec ??? ?????????? ?????????????, helper execute_code ? ????? (unit tests). ???????????? ???????????? ??? ?????????? agent-generated ????.

Summary by CodeRabbit

  • Chores

    • Added automated code quality checks and linting workflow.
    • Configured pre-commit hooks for code formatting and consistency.
    • Added development dependencies for testing and code analysis.
    • Updated repository ignore patterns for local configuration files.
  • Tests

    • Added comprehensive test coverage for core functionality and utilities.
    • Introduced smoke tests to validate basic agent workflows.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 24, 2025

📝 Walkthrough

Walkthrough

This PR establishes development infrastructure by introducing a GitHub Actions CI workflow, pre-commit configuration, development dependencies, and a comprehensive test suite covering code agent execution, utility functions, smoke testing, and worker behavior validation.

Changes

Cohort / File(s) Summary
CI/CD & Configuration Setup
.github/workflows/ci.yml, .gitignore, .pre-commit-config.yaml, requirements-dev.txt
Adds GitHub Actions CI pipeline (linting and pytest on push/PR), secrets management (.env to gitignore), pre-commit hooks (Black 25.11.0, isort 5.12.0, Flake8 7.1.0, file fixers), and dev dependencies (pytest, Pillow, Black, Flake8, isort, pre-commit).
Test Suite
tests/test_code_agent.py
Tests code extraction and execution (Python/Bash) via execute_code with DummyEnvController mock.
Test Suite
tests/test_smoke.py
Smoke tests with mocked pytesseract/pyautogui and FakeLMMAgent; verifies AgentS3 prediction flow and CLI help runs without external dependencies.
Test Suite
tests/test_utils_formatters.py
Tests parse_code_from_string and extract_agent_functions utility functions.
Test Suite
tests/test_worker.py
Tests AgentS3.predict behavior with monkeypatched FakeLMMAgent across mllm, code_agent, and grounding modules.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A rabbit hops through CI with glee, 🐰
Pre-commit hooks now help us see—
Black, isort, tests all aligned,
Infrastructure so refined!
DevTools configured, workflows blessed,
The codebase now will pass each test! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main feature: requiring user confirmation before executing agent-generated code. However, the changeset includes substantial infrastructure additions (CI workflow, pre-commit config, dev dependencies, and multiple test files) that are not reflected in the title.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Vasyl198
Copy link
Author

??????? ?????? ?????????:\n\n- ????????? ????????????? ????? ??????????? ????, ? GUI-???????? ? fallback ?? ?????????? ????.\n- ???????? ???? --allow_exec ??? ?????????? ????????????? (??????).\n- ??????? helper execute_code ??? ????????????/??????????.\n- ????????? unit-????? (11 ?????? ?????? ???????? ????????).\n\n????????????:\n- ?? ????????? ?????????? ???? ??????? ?????????????. ?????????? ????????????? ??????????? ????? dry-run ? sandbox ??? ?????????? ????.\n\n???????:\n- ??????????, ??????? ????????? ? ????? ?????, ???? ????????? ??????????/maintainer ??? ????? PR.\n- CI: workflow ???????? ? ?????????? ????????????? (???? ??? ??????????? check runs).\n\n???????!

@Vasyl198
Copy link
Author

@alckasoc @eric-simu @kylesimular ??????! ??????, ??????????, ??????????? ???? PR ?? ???????????? ?????????? ???? (????????? ????????????? ????? exec, ???? --allow_exec, unit-?????). ???????!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (6)
requirements-dev.txt (1)

1-6: Consider pinning dependency versions for reproducibility.

Without version constraints, different developers and CI runs may install different versions of these tools, potentially causing inconsistent behavior or build failures.

🔎 Example with version pinning
-pytest -pillow -black -flake8 -isort -pre-commit +pytest>=8.0.0,<9.0.0 +pillow>=10.0.0,<11.0.0 +black==25.11.0 +flake8>=7.0.0,<8.0.0 +isort>=5.12.0,<6.0.0 +pre-commit>=3.0.0,<4.0.0

Note: Adjust version ranges based on your compatibility requirements.

.github/workflows/ci.yml (2)

10-11: Consider adding a timeout to prevent hung tests.

Without a timeout, a hung test could consume CI resources indefinitely.

🔎 Proposed enhancement
 test: runs-on: ubuntu-latest + timeout-minutes: 30

16-17: Consider testing against multiple Python versions.

Currently only Python 3.11 is tested. If the project supports other Python versions, consider adding a matrix strategy to test compatibility.

🔎 Example matrix strategy
 test: runs-on: ubuntu-latest + strategy: + matrix: + python-version: ["3.9", "3.10", "3.11", "3.12"] steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: - python-version: "3.11" + python-version: ${{ matrix.python-version }}
tests/test_code_agent.py (1)

18-36: Consider adding edge case tests.

The current tests validate happy paths. Consider adding tests for:

  • Invalid code (syntax errors)
  • Empty code blocks
  • Malformed markdown code fences
  • Error handling in execute_code
tests/test_worker.py (1)

39-43: Consider using pytest's monkeypatch fixture for more robust mocking.

Direct module attribute assignment (lines 39-43) is fragile and sensitive to import order. Using pytest's monkeypatch fixture would provide automatic cleanup and clearer test isolation.

🔎 Alternative approach
import pytest @pytest.fixture(autouse=True) def mock_lmm_agent(monkeypatch): monkeypatch.setattr("gui_agents.s3.core.mllm.LMMAgent", FakeLMMAgent) monkeypatch.setattr("gui_agents.s3.agents.code_agent.LMMAgent", FakeLMMAgent) monkeypatch.setattr("gui_agents.s3.agents.grounding.LMMAgent", FakeLMMAgent) def test_worker_generate_next_action(): # test implementation...
tests/test_utils_formatters.py (1)

5-15: Tests validate core functionality correctly.

The tests appropriately verify code extraction and agent function parsing. Consider adding tests for edge cases like empty strings, malformed code blocks, or nested function calls if they're relevant to your use cases.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb57fb and 9902f3d.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml
  • .gitignore
  • .pre-commit-config.yaml
  • requirements-dev.txt
  • tests/test_code_agent.py
  • tests/test_smoke.py
  • tests/test_utils_formatters.py
  • tests/test_worker.py
🧰 Additional context used
🧬 Code graph analysis (4)
tests/test_utils_formatters.py (1)
gui_agents/s3/utils/common_utils.py (1)
  • extract_agent_functions (169-179)
tests/test_worker.py (2)
gui_agents/s3/agents/agent_s.py (1)
  • AgentS3 (48-94)
tests/test_smoke.py (5)
  • FakeLMMAgent (51-77)
  • reset (56-62)
  • add_system_prompt (64-65)
  • add_message (67-73)
  • get_response (75-77)
tests/test_smoke.py (2)
tests/test_worker.py (5)
  • FakeLMMAgent (11-36)
  • reset (16-22)
  • add_system_prompt (24-25)
  • add_message (27-33)
  • get_response (35-36)
gui_agents/s3/agents/agent_s.py (1)
  • AgentS3 (48-94)
tests/test_code_agent.py (1)
gui_agents/s3/agents/code_agent.py (2)
  • extract_code_block (11-29)
  • execute_code (32-49)
🪛 actionlint (1.7.9)
.github/workflows/ci.yml

15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Ruff (0.14.10)
tests/test_worker.py

12-12: Unused method argument: engine_params

(ARG002)


12-12: Unused method argument: engine

(ARG002)


27-27: Unused method argument: image_content

(ARG002)


27-27: Unused method argument: kwargs

(ARG002)


35-35: Unused method argument: args

(ARG002)


35-35: Unused method argument: kwargs

(ARG002)


73-73: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_smoke.py

13-13: Unused static method argument: image

(ARG004)


13-13: Unused static method argument: output_type

(ARG004)


48-48: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


52-52: Unused method argument: engine_params

(ARG002)


52-52: Unused method argument: engine

(ARG002)


67-67: Unused method argument: image_content

(ARG002)


67-67: Unused method argument: kwargs

(ARG002)


75-75: Unused method argument: args

(ARG002)


75-75: Unused method argument: kwargs

(ARG002)


119-119: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_code_agent.py

14-14: Unused method argument: timeout

(ARG002)

🔇 Additional comments (3)
tests/test_code_agent.py (1)

4-16: Test coverage looks good.

The DummyEnvController appropriately stubs the environment interface for testing code execution paths without actually running code.

.pre-commit-config.yaml (1)

1-20: Pre-commit configuration looks good.

The hook configuration aligns well with the CI workflow and uses appropriate linting/formatting tools. The Black, isort, and Flake8 settings are consistent with common Python best practices.

tests/test_smoke.py (1)

8-46: Clever approach to stub heavy dependencies.

Using dummy modules injected into sys.modules before imports is an effective way to run tests without requiring pytesseract and pyautogui installations. This keeps the test suite lightweight and fast.

Comment on lines +15 to +17
- uses: actions/setup-python@v4
with:
python-version: "3.11"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Update actions/setup-python to v5.

Version v4 of actions/setup-python is deprecated and may stop working on GitHub Actions runners.

🔎 Proposed fix
- - uses: actions/setup-python@v4 + - uses: actions/setup-python@v5 with: python-version: "3.11"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- uses: actions/setup-python@v4
with:
python-version: "3.11"
- uses: actions/setup-python@v5
with:
python-version: "3.11"
🧰 Tools
🪛 actionlint (1.7.9)

15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🤖 Prompt for AI Agents
.github/workflows/ci.yml around lines 15 to 17: the workflow uses actions/setup-python@v4 which is deprecated; update the action reference to actions/setup-python@v5 in the workflow file and ensure any 'with:' keys remain compatible with v5 (e.g., keep python-version or switch to 'python-version' input as before), then run a local or CI syntax check to confirm the workflow parses correctly. 
Comment on lines +166 to +167
# Local env file for secrets
.env
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove duplicate .env entry.

The .env pattern is already present at line 125, making this entry redundant.

🔎 Proposed fix
-# Local env file for secrets -.env
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Local env file for secrets
.env
🤖 Prompt for AI Agents
In .gitignore around lines 166-167 the pattern ".env" is duplicated (already present at line 125); remove the redundant ".env" entry at lines 166-167 so there is only a single .env line in the file, and quickly scan for any other duplicate identical patterns to avoid repeated entries. 
sys.modules.setdefault("pyautogui", DummyPyAutoGUI())

# ---- Monkeypatch LMMAgent to avoid external LLM calls ----
import gui_agents.s3.core.mllm as mllm # noqa: E402
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused noqa directive.

The # noqa: E402 comment is unnecessary as E402 is not an enabled check in your Flake8 configuration.

🔎 Proposed fix
-import gui_agents.s3.core.mllm as mllm # noqa: E402 +import gui_agents.s3.core.mllm as mllm
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import gui_agents.s3.core.mllm as mllm # noqa: E402
import gui_agents.s3.core.mllm as mllm
🧰 Tools
🪛 Ruff (0.14.10)

48-48: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)

🤖 Prompt for AI Agents
In tests/test_smoke.py around line 48, the import statement includes an unnecessary "# noqa: E402" directive; remove the trailing " # noqa: E402" from the import line so the import reads simply "import gui_agents.s3.core.mllm as mllm" and commit the change. 
Comment on lines +119 to +120
info, actions = agent.predict(
instruction="Wait a bit", observation={"screenshot": screenshot}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Prefix unused variable with underscore.

The info variable is unpacked but never used. Following Python conventions, prefix it with an underscore.

🔎 Proposed fix
- info, actions = agent.predict( + _info, actions = agent.predict( instruction="Wait a bit", observation={"screenshot": screenshot} )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
info, actions = agent.predict(
instruction="Wait a bit", observation={"screenshot": screenshot}
_info, actions = agent.predict(
instruction="Wait a bit", observation={"screenshot": screenshot}
)
🧰 Tools
🪛 Ruff (0.14.10)

119-119: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
In tests/test_smoke.py around lines 119 to 120, the unpacked variable "info" is not used; rename it to a prefixed unused variable (e.g., "_info" or simply "_") in the unpacking assignment so it follows Python conventions for unused variables and avoid linter warnings; update the assignment to use the chosen prefixed name and ensure there are no remaining references to the old "info" identifier. 
cli = importlib.import_module("gui_agents.s3.cli_app")

# Running help should exit with code 0
import sys as _sys
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove redundant import.

sys is already imported at line 3, so importing it again as _sys is unnecessary. Use the existing sys import instead.

🔎 Proposed fix
 # Running help should exit with code 0 - import sys as _sys - - prev_argv = _sys.argv.copy() + prev_argv = sys.argv.copy() try: - _sys.argv = ["agent_s", "--help"] + sys.argv = ["agent_s", "--help"] try: cli.main() except SystemExit as e: assert e.code == 0 finally: - _sys.argv = prev_argv + sys.argv = prev_argv

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In tests/test_smoke.py around line 132, there is a redundant import "import sys as _sys" (sys is already imported at line 3); remove this extra import and, if the code below references _sys, replace those references with the existing sys import so the single import at line 3 is used consistently. 
platform="linux",
)

info, actions = agent.predict(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Prefix unused variable with underscore.

The info variable is unpacked but never used. Following Python conventions, prefix it with an underscore to indicate it's intentionally unused.

🔎 Proposed fix
- info, actions = agent.predict( + _info, actions = agent.predict( instruction="Wait small", observation={"screenshot": screenshot} )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
info, actions = agent.predict(
_info, actions = agent.predict(
instruction="Wait small", observation={"screenshot": screenshot}
)
🧰 Tools
🪛 Ruff (0.14.10)

73-73: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents
In tests/test_worker.py around line 73, the test unpacks two values into variables `info, actions` but never uses `info`; update the unpacking to prefix the unused variable with an underscore (e.g., `_info, actions`) to follow Python conventions for intentionally unused variables and avoid linter warnings. 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant