An autonomous debugging assistant powered by LangGraph and multi-agent AI workflows that analyzes code, identifies errors, and provides intelligent fixes through collaborative AI agents.
The AI Code Debugger uses three specialized AI agents working in harmony:
- Parser Agent: Analyzes code and error logs to understand the root cause
- Fixer Agent: Generates intelligent code fixes based on the analysis
- Reviewer Agent: Validates fixes and provides feedback for improvement
The system runs in a feedback loop until a satisfactory solution is found or maximum iterations are reached.
- Multi-Agent Workflow: Three specialized AI agents collaborate to solve coding issues
- Intelligent Error Analysis: Deep understanding of error types, locations, and root causes
- Iterative Improvement: Continuous refinement through reviewer feedback loops
- Explainable AI: Clear reasoning steps and confidence scores for all fixes
- Web Interface: User-friendly Streamlit app for easy interaction
- Docker Support: Containerized deployment for scalability
- Multiple LLM Support: Works with GPT-4, GPT-3.5, and other OpenAI models
```mermaid
graph TD
    A[User Input: Code + Error] --> B[Parser Agent]
    B --> C{Error Analysis Complete?}
    C -->|Yes| D[Fixer Agent]
    C -->|No| E[Failed]
    D --> F[Reviewer Agent]
    F --> G{Fix Valid?}
    G -->|Yes| H[Success: Return Fix]
    G -->|No| I{Max Iterations?}
    I -->|Yes| J[Failed: Max Attempts]
    I -->|No| D
```

- Python 3.11+
- OpenAI API key
- Git
- Clone the repository

  ```bash
  git clone https://github.com/yourusername/ai-code-debugger.git
  cd ai-code-debugger
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables

  ```bash
  cp .env.example .env
  # Edit .env and add your OpenAI API key
  ```

- Run the application

  ```bash
  streamlit run src/app/streamlit_app.py
  ```

- Open your browser and navigate to http://localhost:8501
- Build and run with Docker Compose

  ```bash
  docker-compose up --build
  ```

- Access the app at http://localhost:8501
- Enter API Key: Add your OpenAI API key in the sidebar
- Configure Settings: Choose your model and max iterations
- Input Code: Paste your buggy code in the left panel
- Add Error Log: Paste the error message/traceback in the right panel
- Debug: Click "Debug Code" and watch the AI agents work
- Review Results: Examine the fix, analysis, and reasoning process
Input Code:

```python
def calculate_average(numbers):
    total = 0
    for num in numbers:
        total += num
    return total / len(numbers)

# Test with empty list
result = calculate_average([])
print(result)
```

Error Log:

```
Traceback (most recent call last):
  File "test.py", line 8, in <module>
    result = calculate_average([])
  File "test.py", line 5, in calculate_average
    return total / len(numbers)
ZeroDivisionError: division by zero
```

AI-Generated Fix:

```python
def calculate_average(numbers):
    if not numbers:  # Handle empty list
        return 0  # or raise ValueError("Cannot calculate average of empty list")
    total = 0
    for num in numbers:
        total += num
    return total / len(numbers)

# Test with empty list
result = calculate_average([])
print(result)  # Output: 0
```

Programmatic usage:

```python
from src.workflow.debug_workflow import DebugWorkflow
import os

# Set up API key
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Initialize workflow
debugger = DebugWorkflow(llm_model="gpt-4")

# Debug code
result = debugger.debug_code(
    code="""
def divide_numbers(a, b):
    return a / b

result = divide_numbers(10, 0)
""",
    error_log="ZeroDivisionError: division by zero",
    max_iterations=3
)

# Access results
if result["final_result"]:
    print("Fixed code:", result["final_result"].fixed_code)
    print("Explanation:", result["final_result"].explanation)
```

Environment variables:

| Variable | Description | Default |
|---|---|---|
| OPENAI_API_KEY | Your OpenAI API key | Required |
| DEFAULT_MODEL | Default LLM model | gpt-4 |
| TEMPERATURE | LLM temperature setting | 0.1 |
| MAX_ITERATIONS | Maximum fix attempts | 3 |
| DEBUG_MODE | Enable debug logging | false |
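These settings are read from the environment at startup. A minimal sketch of how that could look, assuming the `python-dotenv` package is used to load a local `.env` file (the exact loading code is an assumption, not the repository's actual implementation):

```python
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # read variables from a local .env file, if present

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]            # required
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "gpt-4")      # default: gpt-4
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.1"))     # default: 0.1
MAX_ITERATIONS = int(os.getenv("MAX_ITERATIONS", "3"))   # default: 3
DEBUG_MODE = os.getenv("DEBUG_MODE", "false").lower() == "true"
```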
Supported models:
- `gpt-4` (recommended)
- `gpt-4o-mini`
- `gpt-3.5-turbo`
Higher-tier models provide better analysis and fixes but cost more.
Parser Agent:
- Input: Source code + error log
- Output: Structured error analysis (see the sketch after this list)
- Capabilities:
- Error type classification
- Root cause identification
- Affected code location mapping
- Severity assessment
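The `ErrorAnalysis` object referenced in the shared state below could take roughly this shape; the field names here are assumptions inferred from the capabilities above, not the repository's exact model:

```python
from pydantic import BaseModel

class ErrorAnalysis(BaseModel):
    """Hypothetical shape of the Parser Agent's structured output."""
    error_type: str            # classification, e.g. "ZeroDivisionError"
    root_cause: str            # plain-language explanation of why the error occurs
    affected_lines: list[int]  # code locations implicated in the error
    severity: str              # e.g. "low" / "medium" / "high"
```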
Fixer Agent:
- Input: Error analysis + original code
- Output: Proposed code fix (see the sketch after this list)
- Capabilities:
- Context-aware code generation
- Minimal change optimization
- Style preservation
- Confidence scoring
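The usage example earlier reads `fixed_code` and `explanation` from the result, so a plausible `CodeFix` model might look like the sketch below; the confidence field is assumed from the capability list rather than taken from the repository:

```python
from pydantic import BaseModel

class CodeFix(BaseModel):
    """Hypothetical shape of the Fixer Agent's output (see CodeFix in the shared state)."""
    fixed_code: str    # full corrected source, accessed as final_result.fixed_code
    explanation: str   # reasoning behind the change, accessed as final_result.explanation
    confidence: float  # assumed 0.0-1.0 confidence score
```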
Reviewer Agent:
- Input: Original code + proposed fix + error context
- Output: Validation result + feedback (see the sketch after this list)
- Capabilities:
- Logic validation
- Side effect analysis
- Best practice enforcement
- Improvement suggestions
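A similarly hypothetical sketch of the reviewer's verdict; only the feedback text maps directly to the shared state's `review_feedback` field, the other fields are assumptions:

```python
from pydantic import BaseModel

class ReviewResult(BaseModel):
    """Hypothetical shape of the Reviewer Agent's output."""
    is_valid: bool          # whether the proposed fix resolves the error
    feedback: str           # comments fed back to the Fixer Agent (stored as review_feedback)
    suggestions: list[str]  # improvement suggestions, if any
```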
The system uses a shared state object that flows through all agents:
```python
class DebugState:
    original_code: str              # Input code
    error_log: str                  # Input error
    error_analysis: ErrorAnalysis   # Parser output
    proposed_fixes: List[CodeFix]   # All attempted fixes
    current_fix: CodeFix            # Latest fix attempt
    review_feedback: str            # Reviewer comments
    status: DebugStatus             # Current workflow state
    iteration_count: int            # Loop counter
    reasoning_steps: List[str]      # Explainability trail
    final_result: CodeFix           # Successful fix
```

- Initialization: Set up state with user input
- Parsing Phase: Analyze error and code structure
- Fixing Phase: Generate code improvements
- Review Phase: Validate and provide feedback
- Decision Point (see the sketch after this list):
  - If valid → Complete workflow
  - If invalid → Return to fixing (up to max iterations)
  - If max iterations reached → Mark as failed
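A minimal sketch of how this loop could be wired with LangGraph's `StateGraph`. The node functions, state schema, and routing helper below are illustrative stand-ins rather than the repository's actual implementation, and the parser-failure branch from the diagram is omitted for brevity:

```python
from typing import List, TypedDict
from langgraph.graph import StateGraph, END

class DebugGraphState(TypedDict, total=False):
    """Trimmed-down stand-in for the shared DebugState shown above."""
    original_code: str
    error_log: str
    status: str
    iteration_count: int
    max_iterations: int
    reasoning_steps: List[str]

# Placeholder nodes: the real agents would call the LLM and update the state.
def parser_node(state: DebugGraphState) -> DebugGraphState:
    return state

def fixer_node(state: DebugGraphState) -> DebugGraphState:
    return state

def reviewer_node(state: DebugGraphState) -> DebugGraphState:
    return state

def route_after_review(state: DebugGraphState) -> str:
    """Decision point: finish, fail, or loop back to the fixer."""
    if state.get("status") == "completed":
        return "done"
    if state.get("iteration_count", 0) >= state.get("max_iterations", 3):
        return "failed"
    return "retry"

graph = StateGraph(DebugGraphState)
graph.add_node("parser", parser_node)
graph.add_node("fixer", fixer_node)
graph.add_node("reviewer", reviewer_node)

graph.set_entry_point("parser")
graph.add_edge("parser", "fixer")
graph.add_edge("fixer", "reviewer")
graph.add_conditional_edges(
    "reviewer",
    route_after_review,
    {"done": END, "failed": END, "retry": "fixer"},
)

workflow = graph.compile()
```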
```bash
# Run all tests
pytest tests/

# Run specific test file
pytest tests/test_parser_agent.py -v

# Run with coverage
pytest --cov=src tests/
```

```python
# tests/test_workflow.py
import pytest
from src.workflow.debug_workflow import DebugWorkflow

def test_simple_syntax_error():
    debugger = DebugWorkflow()
    result = debugger.debug_code(
        code="print('Hello World'",  # Missing closing parenthesis
        error_log="SyntaxError: unexpected EOF while parsing",
        max_iterations=1
    )
    assert result["status"] == "completed"
    assert ")" in result["final_result"].fixed_code
```

- Syntax Errors: Missing brackets, quotes, colons
- Runtime Errors: Division by zero, index out of range
- Type Errors: String/integer operations, method calls
- Logic Errors: Incorrect algorithms, edge cases
- Import Errors: Missing modules, circular imports
- Success Rate: Percentage of successfully fixed bugs
- Iteration Efficiency: Average iterations per fix
- Agent Performance: Individual agent accuracy
- Response Time: Time to generate fixes
- Confidence Scores: AI certainty levels
The system logs:
- Agent decisions and reasoning
- API calls and response times
- Error patterns and frequencies
- User interactions and feedback
- Use GPT-4 for complex bugs (higher success rate)
- Limit iterations to 3-5 for cost efficiency
- Provide detailed error logs for better analysis
- Include test cases when possible
- Review AI suggestions before implementing
- No Code Storage: Code is processed in memory only
- API Security: Uses secure HTTPS connections
- Key Management: Environment variable isolation
- Session Isolation: Each debugging session is independent
- Never commit API keys to version control
- Use environment variables for sensitive data
- Regularly rotate API keys
- Monitor API usage and costs
- Review generated code before execution
- Cause: Invalid or missing API key
- Solution: Check API key in sidebar/environment variables
- Cause: Complex bug requiring multiple attempts
- Solution: Increase max_iterations or provide more context
- Cause: Ambiguous error or unsupported language
- Solution: Provide clearer error logs or use a different model
- Cause: Port already in use
- Solution: Run the app on a different port: `streamlit run src/app/streamlit_app.py --server.port 8502`
Enable detailed logging:
```bash
export DEBUG_MODE=true
streamlit run src/app/streamlit_app.py
```

- Push code to GitHub repository
- Connect to Streamlit Cloud
- Add secrets (API keys) in the dashboard (see the snippet after this list)
- Deploy automatically
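Inside the app, the key added in the dashboard can be read through Streamlit's secrets API, with a fallback to the environment variable used elsewhere in this README. A small sketch, not the repository's exact code:

```python
import os
import streamlit as st

# Prefer Streamlit secrets (Cloud dashboard or a local .streamlit/secrets.toml),
# falling back to the OPENAI_API_KEY environment variable used for local runs.
try:
    api_key = st.secrets["OPENAI_API_KEY"]
except (KeyError, FileNotFoundError):
    api_key = os.getenv("OPENAI_API_KEY")
```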
```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  ai-debugger:
    build: .
    ports:
      - "80:8501"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8501/healthz"]
      interval: 30s
      timeout: 10s
      retries: 3
```

- Use container services (ECS, Cloud Run, Container Instances)
- Set up load balancing for high availability
- Configure auto-scaling based on usage
- Implement monitoring and alerting
- API Rate Limits: Monitor OpenAI usage quotas
- Memory Usage: Large code files may require more RAM
- Concurrent Users: Use session state management
- Cost Management: Set usage alerts and budgets
- Fork the repository
- Create a feature branch

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Install development dependencies

  ```bash
  pip install -r requirements-dev.txt
  ```
- Make your changes
- Run tests and format the code

  ```bash
  pytest tests/
  black src/
  ```
- New Language Support: Extend beyond Python
- Additional LLM Providers: Anthropic, Google, etc.
- UI Improvements: Better visualizations, mobile support
- Performance Optimization: Caching, parallel processing
- Testing Coverage: More test scenarios and edge cases
- LangChain/LangGraph: For the powerful agent orchestration framework
- OpenAI: For providing the GPT models
- Streamlit: For the intuitive web framework