@lizard-boy
Pull Request Summary

In a Kubernetes environment we don't really need multiple workers in the same pod; it's simpler to let Kubernetes autoscale the number of pods. Based on some internal benchmarks, Gunicorn has some known load-balancing issues, and removing this layer also results in fewer errors and better latency.

Test Plan and Usage Guide

Will run simple load tests for GET requests with and without Gunicorn.
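A minimal sketch of what such a GET load test could look like, using only the Python standard library. The function names (`timed_get`, `run_load_test`), the request counts, and the reported percentiles are illustrative assumptions, not the tooling actually used for this PR.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url, timeout=5.0):
    """Issue one GET request and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            ok = 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        ok = False
    return ok, time.perf_counter() - start


def run_load_test(url, total_requests=200, concurrency=20, fetch=timed_get):
    """Send total_requests GETs from a thread pool and summarize the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch, [url] * total_requests))

    latencies = sorted(elapsed for _, elapsed in results)
    errors = sum(1 for ok, _ in results if not ok)
    p99_index = min(len(latencies) - 1, int(0.99 * len(latencies)))
    return {
        "requests": total_requests,
        "errors": errors,
        "p50_s": statistics.median(latencies),
        "p99_s": latencies[p99_index],
    }
```

Running `run_load_test` against the same endpoint once with the Gunicorn build and once with the Uvicorn-only build would give comparable error counts and latency percentiles. The `fetch` parameter is only there so the summary logic can be exercised without a live server.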

@greptile-apps greptile-apps bot left a comment

PR Summary

This PR removes Gunicorn and replaces it with Uvicorn for the FastAPI server in a Kubernetes environment, aiming to simplify deployment and potentially improve performance.

  • Updated start_fastapi_server.py to use Uvicorn directly with a high concurrency limit (10000)
  • Removed worker.py, eliminating the custom LaunchWorker class for Gunicorn workers
  • Upgraded Uvicorn from version 0.17.6 to 0.30.0 in requirements.in and requirements.txt
  • Adjusted server configuration parameters in start_fastapi_server.py for Uvicorn-only setup
  • Removed Gunicorn-related dependencies from requirements.txt
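As a rough illustration of the Uvicorn-only setup summarized above, the sketch below builds a command line that runs Uvicorn directly with a single worker and a concurrency limit of 10000. The helper name `build_uvicorn_command`, the app path `my_service.api:app`, and the default host and port are hypothetical; the actual contents of start_fastapi_server.py are not shown in this conversation.

```python
import sys


def build_uvicorn_command(app_path, host="0.0.0.0", port=5000, limit_concurrency=10000):
    """Build an argv list that starts Uvicorn directly, with no Gunicorn layer.

    app_path is a "module:attribute" import string (hypothetical example:
    "my_service.api:app"). One worker per pod is assumed, leaving scaling
    to Kubernetes.
    """
    return [
        sys.executable, "-m", "uvicorn",
        app_path,
        "--host", host,
        "--port", str(port),
        # Cap on concurrent connections/tasks; Uvicorn answers 503 beyond it.
        "--limit-concurrency", str(limit_concurrency),
        "--workers", "1",
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would start the server in the foreground. Keeping the command construction separate from execution makes the flags easy to inspect or test.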

5 file(s) reviewed, 1 comment(s)

def entrypoint():
    """Entrypoint for starting a local server."""

    # We can probably use asyncio since this service is going to be more I/O bound.

style: This comment about asyncio seems outdated now that we're using Uvicorn. Consider removing or updating it.

3 participants