A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI framework, it provides an efficient, scalable, and user-friendly solution for running MLX-based vision and language models locally with an OpenAI-compatible interface.
 flux queue speech-recognition image-generation whisper vision-api mlx fastapi apple-silicon structured-outputs mlx-lm mlx-vlm openai-compatible mlx-openai-server 
 -  Updated Oct 30, 2025 
- Python