Hoang Manh Cam
Building a Laravel Package for Ollama - The full guide

A step‑by‑step guide to designing, coding, testing, documenting, and releasing a Laravel package that talks to a local Ollama server.


Why Ollama + Laravel?

Ollama makes it dead‑simple to run large language models (LLMs) locally. Laravel gives you an expressive toolkit for building PHP applications. A dedicated package is the cleanest way to:

  • Centralize HTTP calls to the Ollama API (chat, generate, embeddings, models, etc.).
  • Offer a fluent, framework‑native API via Facades and dependency injection.
  • Provide config, caching, logging, and test fakes out of the box.

This article uses camh/laravel-ollama as the concrete example, but the structure applies to any Laravel package talking to Ollama.

(Architecture: Laravel app → laravel-ollama package → Ollama server)


Prerequisites

  • PHP 8.2+ and Composer
  • Laravel 10 or 11
  • Ollama installed locally and running (default: http://localhost:11434)
  • Basic familiarity with Laravel packages (service providers, facades, config)

Package Goals

We’ll build a package that:

  1. Wraps Ollama endpoints with a typed, ergonomic client.
  2. Supports both non‑streaming and streaming responses.
  3. Exposes a Facade (Ollama) and injectable interfaces.
  4. Adds config + environment variables for base URL, timeouts, and model defaults.
  5. Includes testing utilities (Http fakes and example fixtures).
  6. Ships with documentation and CI.

Project Skeleton

laravel-ollama/
├─ src/
│  ├─ Contracts/
│  │  └─ OllamaClient.php
│  ├─ DTOs/
│  │  ├─ ChatMessage.php
│  │  ├─ ChatResponse.php
│  │  └─ EmbeddingResponse.php
│  ├─ Http/
│  │  └─ Client.php
│  ├─ Facades/
│  │  └─ Ollama.php
│  ├─ OllamaServiceProvider.php
│  └─ Support/StreamIterator.php
├─ config/ollama.php
├─ tests/
│  ├─ Feature/
│  └─ Unit/
├─ composer.json
├─ README.md
└─ CHANGELOG.md
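The DTOs/ directory appears in the skeleton but its classes are not shown later in this guide. As a minimal sketch of what a ChatMessage value object could look like (the exact shape is an assumption, not code from the published package):

<?php

namespace CamH\LaravelOllama\DTOs;

// Minimal sketch of a chat message value object; the exact shape is an assumption.
final readonly class ChatMessage
{
    public function __construct(
        public string $role,    // 'system', 'user', or 'assistant'
        public string $content,
    ) {}

    /** Convert to the array shape Ollama's /api/chat expects. */
    public function toArray(): array
    {
        return ['role' => $this->role, 'content' => $this->content];
    }
}

ChatResponse and EmbeddingResponse can follow the same pattern, wrapping the raw arrays returned by the HTTP client.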

Composer Setup

composer.json

{ "name": "camh/laravel-ollama", "description": "Laravel wrapper for the Ollama local LLM API (chat, generate, embeddings).", "type": "library", "license": "MIT", "require": { "php": ">=8.2", "illuminate/support": "^10.0|^11.0" }, "autoload": { "psr-4": { "CamH\\LaravelOllama\\": "src/" } }, "extra": { "laravel": { "providers": [ "CamH\\LaravelOllama\\OllamaServiceProvider" ], "aliases": { "Ollama": "CamH\\LaravelOllama\\Facades\\Ollama" } } }, "minimum-stability": "stable", "prefer-stable": true } 
Enter fullscreen mode Exit fullscreen mode

Configuration

config/ollama.php

<?php

return [
    'base_url' => env('OLLAMA_BASE_URL', 'http://localhost:11434'),

    // default model to use if not provided explicitly
    'model' => env('OLLAMA_MODEL', 'llama3.1:8b'),

    // timeouts (in seconds)
    'timeout' => env('OLLAMA_TIMEOUT', 120),
    'connect_timeout' => env('OLLAMA_CONNECT_TIMEOUT', 5),
];

.env

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_TIMEOUT=120
OLLAMA_CONNECT_TIMEOUT=5

Add a publish group so users can copy the config into their app:

// in OllamaServiceProvider::boot()
$this->publishes([
    __DIR__.'/../config/ollama.php' => config_path('ollama.php'),
], 'ollama-config');

Service Provider & Container Bindings

src/OllamaServiceProvider.php

<?php

namespace CamH\LaravelOllama;

use CamH\LaravelOllama\Contracts\OllamaClient as OllamaClientContract;
use CamH\LaravelOllama\Http\Client as HttpClient;
use Illuminate\Support\ServiceProvider;

class OllamaServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        $this->mergeConfigFrom(__DIR__.'/../config/ollama.php', 'ollama');

        $this->app->bind(OllamaClientContract::class, function ($app) {
            $config = $app['config']['ollama'];

            return new HttpClient(
                baseUrl: $config['base_url'],
                defaultModel: $config['model'],
                timeout: (int) $config['timeout'],
                connectTimeout: (int) $config['connect_timeout'],
            );
        });
    }

    public function boot(): void
    {
        $this->publishes([
            __DIR__.'/../config/ollama.php' => config_path('ollama.php'),
        ], 'ollama-config');
    }
}

Contracts (Interface‑first Design)

src/Contracts/OllamaClient.php

<?php

namespace CamH\LaravelOllama\Contracts;

use Generator;

interface OllamaClient
{
    /** Simple one-shot completion (non-streaming). */
    public function generate(string $prompt, ?string $model = null, array $options = []): string;

    /** Chat with role-based messages (non-streaming). */
    public function chat(array $messages, ?string $model = null, array $options = []): array;

    /** Token-streaming chat. Yields partial text chunks as they arrive. */
    public function streamChat(array $messages, ?string $model = null, array $options = []): Generator;

    /** Create embeddings for given input text(s). */
    public function embeddings(string|array $input, ?string $model = null, array $options = []): array;

    /** List available local models. */
    public function models(): array;
}

The HTTP Client

We’ll use Laravel’s HTTP client (Illuminate\Support\Facades\Http) behind a thin adapter.

src/Http/Client.php

<?php

namespace CamH\LaravelOllama\Http;

use CamH\LaravelOllama\Contracts\OllamaClient;
use Generator;
use Illuminate\Support\Facades\Http;

class Client implements OllamaClient
{
    public function __construct(
        private readonly string $baseUrl,
        private readonly string $defaultModel,
        private readonly int $timeout = 120,
        private readonly int $connectTimeout = 5,
    ) {}

    protected function http()
    {
        return Http::baseUrl($this->baseUrl)
            ->timeout($this->timeout)
            ->connectTimeout($this->connectTimeout)
            ->acceptJson();
    }

    public function generate(string $prompt, ?string $model = null, array $options = []): string
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'prompt' => $prompt,
            'stream' => false,
        ], $options);

        $response = $this->http()->post('/api/generate', $payload)->throw();

        // Ollama returns { "response": "...", ... }
        return (string) $response->json('response', '');
    }

    public function chat(array $messages, ?string $model = null, array $options = []): array
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'messages' => $messages,
            'stream' => false,
        ], $options);

        $response = $this->http()->post('/api/chat', $payload)->throw();

        return $response->json();
    }

    public function streamChat(array $messages, ?string $model = null, array $options = []): Generator
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'messages' => $messages,
            'stream' => true,
        ], $options);

        $response = $this->http()
            ->withOptions(['stream' => true])
            ->post('/api/chat', $payload)
            ->throw();

        // Ollama streams newline-delimited JSON. Read the PSR-7 body in chunks,
        // buffer until a full line is available, then decode it and yield the delta.
        $body = $response->toPsrResponse()->getBody();
        $buffer = '';

        while (! $body->eof()) {
            $buffer .= $body->read(1024);

            while (($pos = strpos($buffer, "\n")) !== false) {
                $line = trim(substr($buffer, 0, $pos));
                $buffer = substr($buffer, $pos + 1);

                if ($line === '') {
                    continue;
                }

                $json = json_decode($line, true);

                if (isset($json['message']['content'])) {
                    yield $json['message']['content'];
                }
            }
        }

        // Flush a trailing line that arrived without a final newline.
        $line = trim($buffer);
        if ($line !== '' && ($json = json_decode($line, true)) && isset($json['message']['content'])) {
            yield $json['message']['content'];
        }
    }

    public function embeddings(string|array $input, ?string $model = null, array $options = []): array
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'input' => $input,
        ], $options);

        // /api/embed accepts a string or array "input" and returns an "embeddings" array.
        $response = $this->http()->post('/api/embed', $payload)->throw();

        return $response->json();
    }

    public function models(): array
    {
        return $this->http()->get('/api/tags')->throw()->json();
    }
}

Note: Ollama’s streaming endpoints return newline-delimited JSON (one JSON object per line). We read the PSR-7 response body, buffer until each newline, and decode one line at a time.


Facade for Ergonomics

src/Facades/Ollama.php

<?php

namespace CamH\LaravelOllama\Facades;

use CamH\LaravelOllama\Contracts\OllamaClient as OllamaClientContract;
use Illuminate\Support\Facades\Facade;

/**
 * @method static string generate(string $prompt, ?string $model = null, array $options = [])
 * @method static array chat(array $messages, ?string $model = null, array $options = [])
 * @method static \Generator streamChat(array $messages, ?string $model = null, array $options = [])
 * @method static array embeddings(string|array $input, ?string $model = null, array $options = [])
 * @method static array models()
 */
class Ollama extends Facade
{
    protected static function getFacadeAccessor()
    {
        return OllamaClientContract::class;
    }
}

Usage in a Laravel App

Install

composer require camh/laravel-ollama
php artisan vendor:publish --tag=ollama-config

Generate text (controller or job):

use CamH\LaravelOllama\Facades\Ollama;

$text = Ollama::generate('Write a haiku about monsoons.', model: 'llama3.1:8b');

Chat

$reply = Ollama::chat([
    ['role' => 'system', 'content' => 'You are a concise assistant.'],
    ['role' => 'user', 'content' => 'Summarize Laravel in 1 sentence.'],
]);

$assistant = data_get($reply, 'message.content');

Streaming chat (controller returning an SSE stream)

use CamH\LaravelOllama\Facades\Ollama;
use Symfony\Component\HttpFoundation\StreamedResponse;

return new StreamedResponse(function () {
    $messages = [
        ['role' => 'user', 'content' => 'Explain queues in Laravel concisely.'],
    ];

    foreach (Ollama::streamChat($messages) as $delta) {
        echo 'data: '.$delta."\n\n";

        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
    }
}, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'X-Accel-Buffering' => 'no',
]);

Embeddings

$response = Ollama::embeddings([
    'Laravel is a delightful PHP framework.',
    'Eloquent provides ActiveRecord-like models.',
]);

$firstVector = $response['embeddings'][0] ?? [];

List models

$models = Ollama::models(); 

Error Handling & Timeouts

  • Use $response->throw() so HTTP errors (status ≥ 400) raise exceptions.
  • Catch and convert RequestException into domain‑specific exceptions if you want (e.g. OllamaUnavailable, OllamaValidationError); a conversion sketch follows the class stubs below.
  • Let users override timeout and connect_timeout via config and the options arguments.

Example domain exceptions:

class OllamaUnavailable extends \RuntimeException {}
class OllamaValidationError extends \InvalidArgumentException {}
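A minimal sketch of the conversion, assuming a small wrapper around the facade (the askOllama() helper is illustrative, not part of the package):

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Http\Client\ConnectionException;
use Illuminate\Http\Client\RequestException;

// Hypothetical wrapper that maps HTTP-layer failures to domain exceptions.
function askOllama(string $prompt): string
{
    try {
        return Ollama::generate($prompt);
    } catch (ConnectionException $e) {
        throw new OllamaUnavailable('Ollama server is unreachable.', previous: $e);
    } catch (RequestException $e) {
        if ($e->response->status() === 400) {
            throw new OllamaValidationError($e->response->json('error', 'Invalid request.'), previous: $e);
        }

        throw new OllamaUnavailable('Ollama request failed.', previous: $e);
    }
}

Callers can then catch OllamaUnavailable or OllamaValidationError without knowing anything about the underlying HTTP client.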

Testing Strategy

  1. Unit tests for the client using Http::fake() to simulate Ollama responses.
  2. Feature tests for your routes / controllers consuming the facade.
  3. Provide fixtures (JSON lines for streaming) to test parsers.

Example

use CamH\LaravelOllama\Http\Client;
use Illuminate\Support\Facades\Http;

it('generates text', function () {
    Http::fake([
        'http://localhost:11434/api/generate' => Http::response([
            'response' => 'Hello world',
        ], 200),
    ]);

    $client = new Client('http://localhost:11434', 'llama3.1:8b');

    $text = $client->generate('Say hello');

    expect($text)->toBe('Hello world');
});

Streaming test helper (fake JSONL):

$stream = "{\"message\":{\"content\":\"Hel\"}}\n{\"message\":{\"content\":\"lo\"}}\n"; Http::fake([ '*' => Http::response($stream, 200, ['Content-Type' => 'application/x-ndjson']) ]); 
Enter fullscreen mode Exit fullscreen mode
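Building on that fake, a Pest test can drive the parser end to end; this sketch assumes the buffered streamChat() implementation shown earlier:

use CamH\LaravelOllama\Http\Client;
use Illuminate\Support\Facades\Http;

it('yields streamed chat deltas in order', function () {
    // Fake Ollama's NDJSON stream: two partial message chunks.
    $stream = "{\"message\":{\"content\":\"Hel\"}}\n{\"message\":{\"content\":\"lo\"}}\n";

    Http::fake([
        '*' => Http::response($stream, 200, ['Content-Type' => 'application/x-ndjson']),
    ]);

    $client = new Client('http://localhost:11434', 'llama3.1:8b');

    $output = '';
    foreach ($client->streamChat([['role' => 'user', 'content' => 'Hi']]) as $delta) {
        $output .= $delta;
    }

    expect($output)->toBe('Hello');
});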

Documentation & DX

  • Ship a README with install, configuration, and quickstart examples.
  • Add PHPDoc on public methods and return types.
  • Provide copy‑paste examples for SSE streaming and queue jobs.
  • Include a php artisan example command to prove the integration end‑to‑end.

Example command

<?php

// app/Console/Commands/OllamaAsk.php

namespace App\Console\Commands;

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Console\Command;

class OllamaAsk extends Command
{
    protected $signature = 'ollama:ask {prompt} {--model=}';

    protected $description = 'Ask the local Ollama model a question.';

    public function handle(): int
    {
        $answer = Ollama::generate($this->argument('prompt'), $this->option('model'));

        $this->line($answer);

        return self::SUCCESS;
    }
}

Releasing to Packagist

  1. Create a public Git repository.
  2. Ensure composer.json has correct name, autoload, and extra.laravel.
  3. Tag a release: git tag v1.0.0 && git push --tags.
  4. Submit the repo to packagist.org once (future tags auto‑sync).

Versioning tips

  • Follow SemVer.
  • Maintain a CHANGELOG.md.
  • Use GitHub Actions to run tests on PHP 8.2/8.3 and Laravel 10/11.

Example CI (GitHub Actions)

name: tests

on: [push, pull_request]

jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        php: ['8.2', '8.3']
        laravel: ['10.*', '11.*']
    steps:
      - uses: actions/checkout@v4
      - uses: shivammathur/setup-php@v2
        with:
          php-version: ${{ matrix.php }}
          tools: composer:v2
      - run: composer require "illuminate/support:${{ matrix.laravel }}" --no-interaction --no-update
      - run: composer update --prefer-dist --no-interaction --no-progress
      - run: vendor/bin/pest

Security, Performance & Ops Notes

  • Never trust prompts; validate and size‑limit user input.
  • Consider rate limiting endpoints that proxy to Ollama.
  • Add caching for models() (e.g., cache for 5–10 minutes) to avoid frequent calls, as in the snippet below.
  • Support retry/backoff on transient errors (a retry sketch follows the snippet).
  • Log latency and token usage (if available) for observability.

Cache the model list:

$models = cache()->remember('ollama:models', 600, fn () => Ollama::models());
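For retries, Laravel's HTTP client ships a retry() helper. A minimal sketch, assuming you extend the adapter's http() method (the attempt count, delay, and conditions are illustrative):

protected function http()
{
    return Http::baseUrl($this->baseUrl)
        ->timeout($this->timeout)
        ->connectTimeout($this->connectTimeout)
        ->acceptJson()
        // Up to 3 attempts with a 200 ms pause, but only for
        // connection errors or 5xx responses (illustrative values).
        ->retry(3, 200, function ($exception) {
            return $exception instanceof \Illuminate\Http\Client\ConnectionException
                || ($exception instanceof \Illuminate\Http\Client\RequestException
                    && $exception->response->serverError());
        });
}

Keeping the condition narrow avoids retrying requests that failed validation (4xx), which would only repeat the same error.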

Advanced: Middleware & Pipelines

  • Add middleware to inject system prompts, sanitize user input, or enforce max tokens.
  • Provide a pipeline API for RAG: retrieve docs → build context → call chat() (a rough sketch follows this list).
  • Consider stream transformers to emit SSE, console updates, or WebSockets.
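A rough sketch of the RAG pipeline idea, using Laravel's Pipeline facade; the MyDocRetriever helper and the state array shape are purely hypothetical and not part of the package:

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Support\Facades\Pipeline;

// Each stage receives and returns a state array: question, retrieved context, chat messages.
$answer = Pipeline::send(['question' => $question, 'context' => [], 'messages' => []])
    ->through([
        function (array $state, \Closure $next) {
            // Retrieve relevant documents (Scout, a vector store, ...); retrieval itself is out of scope here.
            $state['context'] = MyDocRetriever::search($state['question']); // hypothetical helper returning strings
            return $next($state);
        },
        function (array $state, \Closure $next) {
            // Build role-based messages with the retrieved context as a system prompt.
            $state['messages'] = [
                ['role' => 'system', 'content' => 'Answer using only this context: '.implode("\n", $state['context'])],
                ['role' => 'user', 'content' => $state['question']],
            ];
            return $next($state);
        },
    ])
    ->then(fn (array $state) => data_get(Ollama::chat($state['messages']), 'message.content'));

Each stage is a plain callable, so sanitization, max-token enforcement, or system-prompt injection can be dropped in as additional pipes.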

Troubleshooting

  • Connection refused → Verify Ollama is running and OLLAMA_BASE_URL is correct.
  • Model not found → Pull the model: ollama pull llama3.1:8b.
  • Timeouts → Increase OLLAMA_TIMEOUT or simplify prompts.
  • Large package download size → Exclude tests and fixtures from the distributed archive with export-ignore rules in .gitattributes.

Example End‑to‑End Controller

namespace App\Http\Controllers;

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Http\Request;

class AskController
{
    public function __invoke(Request $request)
    {
        $validated = $request->validate([
            'prompt' => ['required', 'string', 'max:4000'],
            'model' => ['nullable', 'string'],
        ]);

        $answer = Ollama::generate($validated['prompt'], $validated['model'] ?? null);

        return response()->json([
            'answer' => $answer,
        ]);
    }
}

Conclusion

The camh/laravel-ollama package is only the first step. The current focus has been on wrapping Ollama’s core APIs and providing a clean Laravel interface, but there is a clear roadmap ahead:

  • Tool & function calling: add helpers for structured outputs and integrations with external services.
  • Conversation memory: support multi‑turn conversation stores (database, cache, or Redis) to persist chat history.
  • RAG workflows: deeper integration with Laravel Scout or custom pipelines to enable retrieval‑augmented generation.
  • Monitoring & Observability: built‑in hooks for logging, metrics, and tracing of requests.
  • Community feedback: open issues and PRs will shape features for better developer experience.

In an upcoming article, we will move beyond package design and demonstrate how to use camh/laravel-ollama in a real Laravel project. That walkthrough will cover building controllers, jobs, and even front‑end integrations that take advantage of local LLMs through this package.

Stay tuned—the best part of Laravel + Ollama is seeing it power actual products and workflows!
